Commit Graph

649779 Commits

Author SHA1 Message Date
Olof Johansson
9aa29e9e5e Non-urgent fixes for omaps for v4.10 merge window:
- Fix mismatched interrupt numbers for tps65217, these are not yet
   used
 
 - Remove unused omapdss_early_init_of()
 
 - Use seq_putc() for pm-debug.c
 -----BEGIN PGP SIGNATURE-----
 
 iQIuBAABCAAYBQJYK1bEERx0b255QGF0b21pZGUuY29tAAoJEBvUPslcq6VzX9EQ
 AL7rLfFsVY39z2pu/QEdkiBfypIcJx1TE9jMR/9j4rJUxewhTkVvyK0fqHnUv7ZP
 znXM/v0ZGE7q4AoyaV2mBsrk+Gty8pkzW50swUxmKK6XwJSjmKAeX1RkzmmNmPYW
 xXC1h77GmbmGnxFJIqQrvbuJse2ItKUIq3ks7+ZELlAwe51SSM3CHHuEVKcPNcJS
 2gmwmaWEi8WaSLBEmlI4OverVlzZrAWYxGyG+QArY9RKRuzCeIKjyk4y+L6RVk6F
 J0Y6Wqbmd9LAwKGwkxBjhpOMTJnM9cHbje7w4wvKWOZ4PVNfneCQXGSwhI9CZIU9
 WqDCeTDkx42ElV7KQSrxSBOuXRv6uIoLYLQIVTQeSHf+EeN4v74CSPMNdgpMWoVu
 AvLgq0pUth4c2hEn8nl2M77jxdqtG7IbtrrOHhBI5TG6s4wPXOco7JKcZOQ7cKkN
 GG4RnvXPH4zFuX0pn9ONn3IAAQXXRP1fXolmMBewYOpaOhrNURasq/y75dPFohHx
 P7Frv/f6B5MUKciHe1nENJhL/BCubSHhqNGy2fyNXi3yHH3Z9LK0sfad6D938Kl8
 xSqLE1LUrp2rj3jvalj7masywssxpQMsyO26JtMqXUQxHXme5DR+m8pHOkz4/MKe
 rGaMSvE5lp84taz8/T8LHEybWog96ky2efsuTIcmMgWy
 =tMf4
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.10/fixes-not-urgent-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/fixes-non-critical

Non-urgent fixes for omaps for v4.10 merge window:

- Fix mismatched interrupt numbers for tps65217, these are not yet
  used

- Remove unused omapdss_early_init_of()

- Use seq_putc() for pm-debug.c

* tag 'omap-for-v4.10/fixes-not-urgent-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: pm-debug: Use seq_putc() in two functions
  ARM: OMAP2+: Remove the omapdss_early_init_of() function
  mfd: tps65217: Fix mismatched interrupt number

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 16:41:41 -08:00
Olof Johansson
4f23683ced Allwinner arm64 DT changes for 4.10
Support for the Allwinner A64, their first armv8 SoC.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYK3ohAAoJEBx+YmzsjxAgQ0cP/AtC7KSWnQ2afcEeIRxii9X+
 QbiyQqPvmRF/mBtZN/49wKwRdh1qtsGk29UxMf06hrk2oA/Kc9KoPed+5Q0qG2ac
 2m1KWN6G+qNk8bbd26A4sXRBsWUH7Gp7kCUjIBoRRUup6YQFySxIEjSU7dfmPHzi
 Rxbq5fo9JMpar8bjVr1mVeIL/0p4Uz2LYt4miqARlztGJaUEZtBTrWDncLM0TqTV
 hzeP1lKwelwkVDcrrSgZ5nBCcdmXFvIK6S3pNHcy11NCt7Qnf4pEn2heFs0Xh0/3
 3nTcr7TEEeEc1oVc6UyM6GCgV3SrBlLTXIhvULQoU0sdhjTQxUo1zpDCO81p0UfQ
 Qu7KxjhvXOjRYZLBQfXWN4VwOURrP9Bw2VwxN4vACUpQpQX/9eovIT8XVH9Jd8kO
 bnbmzWLJq+arnE81CVrL7oqTK0r6O1/D+6kOLbvP9o6+Pt/2oJ8kHrJQCLXzfOnE
 nhuZE489fTLeKtgloR5wez/Vu1JwaIhhHsfWkskljmi7Xx4XUJdu6XtBoS67qJE5
 xDg/ujD3S9TT0AhZ5XWn1f6fCPI4N+E8HvHUJNLZSi1iPnW4c2wZ9tBBEOu7hh2X
 7ZiZ6k+FOg+ZyY/jdChJ0SWvIzcutIbkP/0pKgLLPozEImv6ARqfyJbWauxbt/Hl
 JKw9pE7Uo7ztdAz/52DL
 =JkJ+
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-dt64-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into next/dt64

Allwinner arm64 DT changes for 4.10

Support for the Allwinner A64, their first armv8 SoC.

* tag 'sunxi-dt64-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  arm64: dts: add Pine64 support
  Documentation: devicetree: add vendor prefix for Pine64
  arm64: dts: add Allwinner A64 SoC .dtsi

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 16:41:14 -08:00
Linus Torvalds
aad931a30f One fix for an NFS/RDMA crash.
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYLznTAAoJECebzXlCjuG+ThgP/0tQByIqOKGqUNcEE0MLT4/+
 E2/V/Basi3tnCIatAjYZiAN/rKnO5f4iuyh7PKG/bzlYluLv5MLq68HUia8EorN8
 LExNvZAwYhjEQwDTzhBSityLHWmCwy3G0yYsJQ1DnbUdh8wAKr7lj4R0sr8RGJc1
 GxgWnlhi2lAJSGYRa8tYHzh0tTXGOCoR8POKXFJ91PTx8gEO6VzULvbIQm0RLSow
 +LGW36ov/ChQtJzVJsfcW6Hf4wHFevrtVTPtLWckMEtRq/DJ7hS2btgc02hpqFZm
 MK7wywHT35LV+DvU6QPmwUUaf5IXJjWx0W7thOsjWbYMbAHC/0D3De8bgGaAI3B1
 nB+B96BpGrALyhTX2pXQiQxsavXBl37BOGl3Ft03WrAVI4aJsfkaWDRS2X1jxfXI
 zhGBN2vseoiJblie95hLIgvMtkRmOq4E44oNDiP9zKTwrIkISoz5jmvLHY/8Mj7E
 NCof2P+K6ays8ywD2DqHlJKmiGA7PdNT87ZeeS4ZFvEjWSd4S1pfa0R+jg5FVxZl
 Vl7QQX5D/Ep+sXszJin4dYQnl844+sVMVaj6CdQOK0udml81UZTRO5fvjNexs3e4
 4Zd/ymC/XGs6Hz3pbPeIkAd/MzXCK0zNojNAdZnicOMzQpG2sZ76SJRZQg0sTCFH
 EP6QTWxOog4lnDfML13E
 =F2DQ
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfix from Bruce Fields:
 "Just one fix for an NFS/RDMA crash"

* tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
  sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports
2016-11-18 16:32:21 -08:00
Olof Johansson
7f9b2b1506 Allwinner defconfig changes for 4.10
Two patches to enable the dumb VGA bridges in the multi_v7 and sunxi
 defconfig.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYK3fVAAoJEBx+YmzsjxAgxQsQAJ5nxCMARLnLKaSuIu8OuusQ
 eZLogAq5TDTirkgHupOM1eEAsTwwsH/I2EtkR86Cl0cE4nhAEJfaCh5DnlG5rcaV
 wi1jeyDy5mEaxyD1eom1YZYnXdBbmiSK5f6snJV4Uhu915sgDA28JYtSNhVvTtMA
 igoPhGMFzmyOZY4LD2BgYk/BjTxpJEMnI4uqKv7mlp1noqwY67bl5/h204vUsmqt
 n3vZPg4h7IWZWMVlG+xMjfZ8o+xSJtQ7G2o84GDtdQ7f3dbpcxHfUrd7MK07ogCP
 vLLHgAvXDwKx3RaY/E8sJ9Qa4awASxFMEHGfPoFhSwEmm72LVr1JStvmstvLlmmC
 pdzhzQNRguiSX4VTJWwcl6ne1WI9Y/e+s3+OPXQQuNoAwqoS3waRqw9BUZQydNTu
 lcL0bMZZSO8a3MJdlIenISjt8LCWZLF/K9w+/nxf6HQigSF8GTfLTTbLu6/ew37H
 XfneT/zJN+cw7Fw5J/Q9Xn8tawF8CVqPxKzhCz5YFr/R2EznVgfdVfALQ2mkJqhS
 m0c+rFSCIKmAd+Txx3eabyoAoJlCwDFmanoohR5QFvNLxSRfeYP28Vkyrx5MKVMa
 Uw5l3rQ/3613GYTiMSEDqhuMbHklZ1ZVEWCDlih6+pO08nGIzv8F9CSWJV5yNYGM
 XQPblI7HJ1ExwoOiIqlS
 =P4mc
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-defconfig-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into next/defconfig

Allwinner defconfig changes for 4.10

Two patches to enable the dumb VGA bridges in the multi_v7 and sunxi
defconfig.

* tag 'sunxi-defconfig-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  ARM: multi_v7: enable VGA bridge
  ARM: sunxi: Enable VGA bridge

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 16:28:02 -08:00
Olof Johansson
78d375b97e Reset controller changes for v4.10
- remove obsolete STiH41[56] platform support
 - add Oxford Semiconductor OX820 support
 - add reset index include files for OX810SE and OX820
 - make drivers with boolean Kconfig options explicitly
   non-modular
 - allow shared pulsed resets via reset_control_reset, which
   in this case means that the reset must have been triggered
   once, but possibly earlier, after the function returns, and
   is never triggered again for the lifetime of the reset
   control
 -----BEGIN PGP SIGNATURE-----
 
 iQJMBAABCAA2FiEEBsBxhV1FaKwXuCOBUMKIHHCeYOsFAlgrXr8YHHBoaWxpcHAu
 emFiZWxAZ21haWwuY29tAAoJEFDCiBxwnmDrP7UQAIVZfUCBUfc5PF6u/pWrUOtn
 Eg7SWvo1TYcJPJKoICwlRwKXmzLVfmivCsxW8my5Ddb66JQ5hMkoqsXALQl0sAZK
 U2xXiy76z+EsL9tvvJ2b9Yqx3BIbzPkZyv419RWMVldqUtdbtHe70xLjF7DVfaTs
 f9BTh0OAN55yWhYS7HQDQKm2bvUe3jy0XDzXlbPXgdy/roAReBFSFO4r80cbbAfv
 DZwhrzmGSAGF0fE/TQA65wkkuZMM/rE3cjOjieVC4ZaeX7OQ+eqIMwph25uHCEnS
 61g1J8GcwKvtP1iXAg5VdLAW2TTGTHztBR8UchrxZfySnKWR0rGee3LkQX8LVfih
 mVI2jtzpq6fGYSask9B/fGnObfjWF0kHcgy525kdcNbo8HVZ0vdgZI/yoCh3ZViT
 RHpuXk5cHYM6t7aNGOBvVvdKOL3wtuFUrJKYEpOL3BB//SLZXxCDCawMt7H2XHX5
 3EbLfKikQQbMNs8XzeSfTOQi9ITcdwnLdTqjBunAbf0aMiF+SOtDS/yb6BsG7zjv
 muz4LUtcMq5YfXnZbB11ngz+DSN/g7MHIjl9MXG8kMO9JsVb1R9VtmbxNQ87EU92
 oKGy4ix0vSIRl7/gmmFZt4iV8RQOvffkB4iIkOXlWXdnsUrqIj3V/YwEr+8leoWk
 MHr75XggJOUH9VaMUP/2
 =e7kp
 -----END PGP SIGNATURE-----

Merge tag 'reset-for-4.10' of git://git.pengutronix.de/git/pza/linux into next/drivers

Reset controller changes for v4.10

- remove obsolete STiH41[56] platform support
- add Oxford Semiconductor OX820 support
- add reset index include files for OX810SE and OX820
- make drivers with boolean Kconfig options explicitly
  non-modular
- allow shared pulsed resets via reset_control_reset, which
  in this case means that the reset must have been triggered
  once, but possibly earlier, after the function returns, and
  is never triggered again for the lifetime of the reset
  control

* tag 'reset-for-4.10' of git://git.pengutronix.de/git/pza/linux:
  reset: allow using reset_control_reset with shared reset
  reset: lpc18xx: make it explicitly non-modular
  reset: zynq: make it explicitly non-modular
  reset: sunxi: make it explicitly non-modular
  reset: socfpga: make it explicitly non-modular
  reset: berlin: make it explicitly non-modular
  dt-bindings: reset: oxnas: Update for OX820
  dt-bindings: reset: oxnas: Add include file with reset indexes
  reset: oxnas: Add OX820 support
  reset: sti: softreset: Remove obsolete platforms from dt binding doc.
  reset: sti: Remove STiH415/6 reset support

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-18 16:17:45 -08:00
Jonathan Corbet
726d661fea Merge remote-tracking branch 'sound/topic/restize-docs' into sound
Bring in the sphinxification of the sound documentation.
2016-11-18 16:19:28 -07:00
Ulf Hansson
1d9174fbc5 PM / Runtime: Defer resuming of the device in pm_runtime_force_resume()
When the pm_runtime_force_suspend|resume() helpers were invented, we still
had CONFIG_PM_RUNTIME and CONFIG_PM_SLEEP as separate Kconfig options.

To make sure these helpers worked for all combinations and without
introducing too much of complexity, the device was always resumed in
pm_runtime_force_resume().

More precisely, when CONFIG_PM_SLEEP was set and CONFIG_PM_RUNTIME was
unset, we needed to resume the device as the subsystem/driver couldn't
rely on using runtime PM to do it.

As the CONFIG_PM_RUNTIME option was merged into CONFIG_PM a while ago, it
removed this combination, of using CONFIG_PM_SLEEP without the earlier
CONFIG_PM_RUNTIME.

For this reason we can now rely on the subsystem/driver to use runtime PM
to resume the device, instead of forcing that to be done in all cases. In
other words, let's defer the runtime resume to a later point when it's
actually needed.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-19 00:15:39 +01:00
Jonathan Corbet
917fef6f7e Linux 4.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJYHmoCAAoJEHm+PkMAQRiG7RMIAI2i7Y5hpL5yCxK5AFaL4u/G
 KxXfp1B1UanUTgjOmd7zGqtDYcFX9t7GTTUFixQ7/9Opr4PD9qbnatoDGSc3xjbT
 msDgA1B78F1/Q3kHWfeGq32MihQ4mj5NwUCo+igUcUvvWG7mHgzErj/Nh5RoobQX
 p/izdpTbrw3GX6xXB8olbG7XWHaVye/+TT3q6+gmgm8I/QEujcLeGoycE0zlhPN8
 FG/JX76At/+ZM2Py7Oxo3k+oKL9CHrtOQYDp/wN0uslV5eYvvkZz0/M1HMOGZt+c
 gZU5jzM17K7C4Nzo06WAuBU9wUBGc25m+cPicLlOmljnzfU+f50SKaDjZq3p7QI=
 =2KUF
 -----END PGP SIGNATURE-----

Merge tag 'v4.9-rc4' into sound

Bring in -rc4 patches so I can successfully merge the sound doc changes.
2016-11-18 16:13:41 -07:00
Pavel Machek
dbfa048db9 MAINTAINERS: Add LED subsystem co-maintainer
Mark me as a co-maintainer of LED subsystem.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-11-18 23:56:10 +01:00
Mauro Carvalho Chehab
f2709c206d Revert "[media] dvb_frontend: merge duplicate dvb_tuner_ops.release implementations"
While this patch sounded a good idea, unfortunately, it causes
bad dependencies, as drivers that would otherwise work without
the DVB core will now break:

ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tea5767.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tea5761.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tda827x.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tda18218.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/qt1010.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mt2266.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mt20xx.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mt2060.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mc44s803.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/fc0013.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/fc0012.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/fc0011.ko] undefined!

So, we have to revert it.

Note: as the argument for the release ops changed from "int"
to "void", we needed to change it at the revert patch, to
avoid compilation issues like:
	drivers/media/tuners/tea5767.c:437:23: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
	  .release           = tea5767_release,
	                       ^~~~~~~~~~~~~~~

This reverts commit 22a613e898.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-11-18 20:44:33 -02:00
Chris Wilson
05c348377d drm/i915: Skip final clflush if LLC is coherent
If the LLC is coherent with the object, we do not need to worry about
whether main memory and cache mismatch when we hand the object back to
the system.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161118211747.25197-2-chris@chris-wilson.co.uk
2016-11-18 22:33:49 +00:00
Chris Wilson
a6a7cc4b7d drm/i915: Always flush the dirty CPU cache when pinning the scanout
Currently we only clflush the scanout if it is in the CPU domain. Also
flush if we have a pending CPU clflush. We also want to treat the
dirtyfb path similar, and flush any pending writes there as well.

v2: Only send the fb flush message if flushing the dirt on flip
v3: Make flush-for-flip and dirtyfb look more alike since they serve
similar roles as end-of-frame marker.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: http://patchwork.freedesktop.org/patch/msgid/20161118211747.25197-1-chris@chris-wilson.co.uk
2016-11-18 22:33:22 +00:00
Krzysztof Kozlowski
8e35c48e70 ARM: dts: exynos: Fix invalid GIC interrupt flags in audio block of Exynos5410
Recently added audio block of Exynos5410 missed global fixup of GIC
interrupt flags.  Interrupt of type IRQ_TYPE_NONE is not allowed for GIC
interrupts so use level high.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Tested-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
2016-11-18 23:29:19 +02:00
Song Liu
5aabf7c49d md/r5cache: r5cache recovery: part 2
1. In previous patch, we:
      - add new data to r5l_recovery_ctx
      - add new functions to recovery write-back cache
   The new functions are not used in this patch, so this patch does not
   change the behavior of recovery.

2. In this patchpatch, we:
      - modify main recovery procedure r5l_recovery_log() to call new
        functions
      - remove old functions

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:28:28 -08:00
Song Liu
b4c625c673 md/r5cache: r5cache recovery: part 1
Recovery of write-back cache has different logic to write-through only
cache. Specifically, for write-back cache, the recovery need to scan
through all active journal entries before flushing data out. Therefore,
large portion of the recovery logic is rewritten here.

To make the diffs cleaner, we split the rewrite as follows:

1. In this patch, we:
      - add new data to r5l_recovery_ctx
      - add new functions to recovery write-back cache
   The new functions are not used in this patch, so this patch does not
   change the behavior of recovery.

2. In next patch, we:
      - modify main recovery procedure r5l_recovery_log() to call new
        functions
      - remove old functions

With cache feature, there are 2 different scenarios of recovery:
1. Data-Parity stripe: a stripe with complete parity in journal.
2. Data-Only stripe: a stripe with only data in journal (or partial
   parity).

The code differentiate Data-Parity stripe from Data-Only stripe with
flag STRIPE_R5C_CACHING.

For Data-Parity stripes, we use the same procedure as raid5 journal,
where all the data and parity are replayed to the RAID devices.

For Data-Only strips, we need to finish complete calculate parity and
finish the full reconstruct write or RMW write. For simplicity, in
the recovery, we load the stripe to stripe cache. Once the array is
started, the stripe cache state machine will handle these stripes
through normal write path.

r5c_recovery_flush_log contains the main procedure of recovery. The
recovery code first scans through the journal and loads data to
stripe cache. The code keeps tracks of all these stripes in a list
(use sh->lru and ctx->cached_list), stripes in the list are
organized in the order of its first appearance on the journal.
During the scan, the recovery code assesses each stripe as
Data-Parity or Data-Only.

During scan, the array may run out of stripe cache. In these cases,
the recovery code will also call raid5_set_cache_size to increase
stripe cache size. If the array still runs out of stripe cache
because there isn't enough memory, the array will not assemble.

At the end of scan, the recovery code replays all Data-Parity
stripes, and sets proper states for Data-Only stripes. The recovery
code also increases seq number by 10 and rewrites all Data-Only
stripes to journal. This is to avoid confusion after repeated
crashes. More details is explained in raid5-cache.c before
r5c_recovery_rewrite_data_only_stripes().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:28:14 -08:00
Song Liu
9ed988f5dc md/r5cache: refactoring journal recovery code
1. rename r5l_read_meta_block() as r5l_recovery_read_meta_block();
2. pull the code that initialize r5l_meta_block from
   r5l_log_write_empty_meta_block() to a separate function
   r5l_recovery_create_empty_meta_block(), so that we can reuse this
   piece of code.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:27:45 -08:00
Song Liu
2c7da14b90 md/r5cache: sysfs entry journal_mode
With write cache, journal_mode is the knob to switch between
write-back and write-through.

Below is an example:

root@virt-test:~/# cat /sys/block/md0/md/journal_mode
[write-through] write-back
root@virt-test:~/# echo write-back > /sys/block/md0/md/journal_mode
root@virt-test:~/# cat /sys/block/md0/md/journal_mode
write-through [write-back]

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:27:24 -08:00
Song Liu
a39f7afde3 md/r5cache: write-out phase and reclaim support
There are two limited resources, stripe cache and journal disk space.
For better performance, we priotize reclaim of full stripe writes.
To free up more journal space, we free earliest data on the journal.

In current implementation, reclaim happens when:
1. Periodically (every R5C_RECLAIM_WAKEUP_INTERVAL, 30 seconds) reclaim
   if there is no reclaim in the past 5 seconds.
2. when there are R5C_FULL_STRIPE_FLUSH_BATCH (256) cached full stripes,
   or cached stripes is enough for a full stripe (chunk size / 4k)
   (r5c_check_cached_full_stripe)
3. when there is pressure on stripe cache (r5c_check_stripe_cache_usage)
4. when there is pressure on journal space (r5l_write_stripe, r5c_cache_data)

r5c_do_reclaim() contains new logic of reclaim.

For stripe cache:

When stripe cache pressure is high (more than 3/4 stripes are cached,
or there is empty inactive lists), flush all full stripe. If fewer
than R5C_RECLAIM_STRIPE_GROUP (NR_STRIPE_HASH_LOCKS * 2) full stripes
are flushed, flush some paritial stripes. When stripe cache pressure
is moderate (1/2 to 3/4 of stripes are cached), flush all full stripes.

For log space:

To avoid deadlock due to log space, we need to reserve enough space
to flush cached data. The size of required log space depends on total
number of cached stripes (stripe_in_journal_count). In current
implementation, the writing-out phase automatically include pending
data writes with parity writes (similar to write through case).
Therefore, we need up to (conf->raid_disks + 1) pages for each cached
stripe (1 page for meta data, raid_disks pages for all data and
parity). r5c_log_required_to_flush_cache() calculates log space
required to flush cache. In the following, we refer to the space
calculated by r5c_log_required_to_flush_cache() as
reclaim_required_space.

Two flags are added to r5conf->cache_state: R5C_LOG_TIGHT and
R5C_LOG_CRITICAL. R5C_LOG_TIGHT is set when free space on the log
device is less than 3x of reclaim_required_space. R5C_LOG_CRITICAL
is set when free space on the log device is less than 2x of
reclaim_required_space.

r5c_cache keeps all data in cache (not fully committed to RAID) in
a list (stripe_in_journal_list). These stripes are in the order of their
first appearance on the journal. So the log tail (last_checkpoint)
should point to the journal_start of the first item in the list.

When R5C_LOG_TIGHT is set, r5l_reclaim_thread starts flushing out
stripes at the head of stripe_in_journal. When R5C_LOG_CRITICAL is
set, the state machine only writes data that are already in the
log device (in stripe_in_journal_list).

This patch includes a fix to improve performance by
Shaohua Li <shli@fb.com>.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:26:48 -08:00
Song Liu
1e6d690b93 md/r5cache: caching phase of r5cache
As described in previous patch, write back cache operates in two
phases: caching and writing-out. The caching phase works as:
1. write data to journal
   (r5c_handle_stripe_dirtying, r5c_cache_data)
2. call bio_endio
   (r5c_handle_data_cached, r5c_return_dev_pending_writes).

Then the writing-out phase is as:
1. Mark the stripe as write-out (r5c_make_stripe_write_out)
2. Calcualte parity (reconstruct or RMW)
3. Write parity (and maybe some other data) to journal device
4. Write data and parity to RAID disks

This patch implements caching phase. The cache is integrated with
stripe cache of raid456. It leverages code of r5l_log to write
data to journal device.

Writing-out phase of the cache is implemented in the next patch.

With r5cache, write operation does not wait for parity calculation
and write out, so the write latency is lower (1 write to journal
device vs. read and then write to raid disks). Also, r5cache will
reduce RAID overhead (multipile IO due to read-modify-write of
parity) and provide more opportunities of full stripe writes.

This patch adds 2 flags to stripe_head.state:
 - STRIPE_R5C_PARTIAL_STRIPE,
 - STRIPE_R5C_FULL_STRIPE,

Instead of inactive_list, stripes with cached data are tracked in
r5conf->r5c_full_stripe_list and r5conf->r5c_partial_stripe_list.
STRIPE_R5C_FULL_STRIPE and STRIPE_R5C_PARTIAL_STRIPE are flags for
stripes in these lists. Note: stripes in r5c_full/partial_stripe_list
are not considered as "active".

For RMW, the code allocates an extra page for each data block
being updated.  This is stored in r5dev->orig_page and the old data
is read into it.  Then the prexor calculation subtracts ->orig_page
from the parity block, and the reconstruct calculation adds the
->page data back into the parity block.

r5cache naturally excludes SkipCopy. When the array has write back
cache, async_copy_data() will not skip copy.

There are some known limitations of the cache implementation:

1. Write cache only covers full page writes (R5_OVERWRITE). Writes
   of smaller granularity are write through.
2. Only one log io (sh->log_io) for each stripe at anytime. Later
   writes for the same stripe have to wait. This can be improved by
   moving log_io to r5dev.
3. With writeback cache, read path must enter state machine, which
   is a significant bottleneck for some workloads.
4. There is no per stripe checkpoint (with r5l_payload_flush) in
   the log, so recovery code has to replay more than necessary data
   (sometimes all the log from last_checkpoint). This reduces
   availability of the array.

This patch includes a fix proposed by ZhengYuan Liu
<liuzhengyuan@kylinos.cn>

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:26:30 -08:00
Song Liu
2ded370373 md/r5cache: State machine for raid5-cache write back mode
This patch adds state machine for raid5-cache. With log device, the
raid456 array could operate in two different modes (r5c_journal_mode):
  - write-back (R5C_MODE_WRITE_BACK)
  - write-through (R5C_MODE_WRITE_THROUGH)

Existing code of raid5-cache only has write-through mode. For write-back
cache, it is necessary to extend the state machine.

With write-back cache, every stripe could operate in two different
phases:
  - caching
  - writing-out

In caching phase, the stripe handles writes as:
  - write to journal
  - return IO

In writing-out phase, the stripe behaviors as a stripe in write through
mode R5C_MODE_WRITE_THROUGH.

STRIPE_R5C_CACHING is added to sh->state to differentiate caching and
writing-out phase.

Please note: this is a "no-op" patch for raid5-cache write-through
mode.

The following detailed explanation is copied from the raid5-cache.c:

/*
 * raid5 cache state machine
 *
 * With rhe RAID cache, each stripe works in two phases:
 *      - caching phase
 *      - writing-out phase
 *
 * These two phases are controlled by bit STRIPE_R5C_CACHING:
 *   if STRIPE_R5C_CACHING == 0, the stripe is in writing-out phase
 *   if STRIPE_R5C_CACHING == 1, the stripe is in caching phase
 *
 * When there is no journal, or the journal is in write-through mode,
 * the stripe is always in writing-out phase.
 *
 * For write-back journal, the stripe is sent to caching phase on write
 * (r5c_handle_stripe_dirtying). r5c_make_stripe_write_out() kicks off
 * the write-out phase by clearing STRIPE_R5C_CACHING.
 *
 * Stripes in caching phase do not write the raid disks. Instead, all
 * writes are committed from the log device. Therefore, a stripe in
 * caching phase handles writes as:
 *      - write to log device
 *      - return IO
 *
 * Stripes in writing-out phase handle writes as:
 *      - calculate parity
 *      - write pending data and parity to journal
 *      - write data and parity to raid disks
 *      - return IO for pending writes
 */

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:26:07 -08:00
Song Liu
937621c36e md/r5cache: move some code to raid5.h
Move some define and inline functions to raid5.h, so they can be
used in raid5-cache.c

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:25:40 -08:00
Song Liu
c757ec95c2 md/r5cache: Check array size in r5l_init_log
Currently, r5l_write_stripe checks meta size for each stripe write,
which is not necessary.

With this patch, r5l_init_log checks maximal meta size of the array,
which is (r5l_meta_block + raid_disks x r5l_payload_data_parity).
If this is too big to fit in one page, r5l_init_log aborts.

With current meta data, r5l_log support raid_disks up to 203.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:24:46 -08:00
Chris Wilson
b17993b7b2 drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error
On the DMA mapping error path, sg may be NULL (it has already been
marked as the last scatterlist entry), and we should avoid dereferencing
it again.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e227330223 ("drm/i915: avoid leaking DMA mappings")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20161114112930.2033-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-11-18 20:51:53 +00:00
Chris Wilson
786d290cae drm/i915: Check that each request phase is completed before retiring
Trying to chase an impossible bug (ivb):

[  207.765411] [drm:i915_reset_and_wakeup [i915]] resetting chip
[  207.765734] [drm:i915_gem_reset [i915]] resetting render ring to restart from tail of request 0x4ee834
[  207.765791] [drm:intel_print_rc6_info [i915]] Enabling RC6 states: RC6 on RC6p on RC6pp off
[  207.767213] [drm:intel_guc_setup [i915]] GuC fw status: path (null), fetch NONE, load NONE
[  207.767515] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:203!
[  207.767551] invalid opcode: 0000 [#1] PREEMPT SMP
[  207.767576] Modules linked in: snd_hda_intel i915 cdc_ncm usbnet mii x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me mei snd_pcm sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915]
[  207.767808] CPU: 3 PID: 8855 Comm: gem_ringfill Tainted: G     U          4.9.0-rc5-CI-Patchwork_3052+ #1
[  207.767854] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012
[  207.767894] task: ffff88012c82a740 task.stack: ffffc9000383c000
[  207.767927] RIP: 0010:[<ffffffffa00a0a3a>]  [<ffffffffa00a0a3a>] i915_gem_request_retire+0x2a/0x4b0 [i915]
[  207.767999] RSP: 0018:ffffc9000383fb20  EFLAGS: 00010293
[  207.768027] RAX: 00000000004ee83c RBX: ffff880135dcb480 RCX: 00000000004ee83a
[  207.768062] RDX: ffff88012fea42a8 RSI: 0000000000000001 RDI: ffff88012c82af68
[  207.768095] RBP: ffffc9000383fb48 R08: 0000000000000000 R09: 0000000000000000
[  207.768129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880135dcb480
[  207.768163] R13: ffff88012fea42a8 R14: 0000000000000000 R15: 00000000000001d8
[  207.768200] FS:  00007f955f658740(0000) GS:ffff88013e2c0000(0000) knlGS:0000000000000000
[  207.768239] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  207.768258] CR2: 0000555899725930 CR3: 00000001316f6000 CR4: 00000000001406e0
[  207.768286] Stack:
[  207.768299]  ffff880135dcb480 ffff880135dcbe00 ffff88012fea42a8 0000000000000000
[  207.768350]  00000000000001d8 ffffc9000383fb70 ffffffffa00a1339 0000000000000000
[  207.768402]  ffff88012f296c88 00000000000003f0 ffffc9000383fbb0 ffffffffa00b582d
[  207.768453] Call Trace:
[  207.768493]  [<ffffffffa00a1339>] i915_gem_request_retire_upto+0x49/0x90 [i915]
[  207.768553]  [<ffffffffa00b582d>] intel_ring_begin+0x15d/0x2d0 [i915]
[  207.768608]  [<ffffffffa00b59cb>] intel_ring_alloc_request_extras+0x2b/0x40 [i915]
[  207.768667]  [<ffffffffa00a2fd9>] i915_gem_request_alloc+0x359/0x440 [i915]
[  207.768723]  [<ffffffffa008bd03>] i915_gem_do_execbuffer.isra.15+0x783/0x1a10 [i915]
[  207.768766]  [<ffffffff811a6a2e>] ? __might_fault+0x3e/0x90
[  207.768816]  [<ffffffffa008d380>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[  207.768854]  [<ffffffff815532a6>] drm_ioctl+0x1f6/0x480
[  207.768900]  [<ffffffffa008d2c0>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[  207.768939]  [<ffffffff81202f6e>] do_vfs_ioctl+0x8e/0x690
[  207.768972]  [<ffffffff818193ac>] ? retint_kernel+0x2d/0x2d
[  207.769004]  [<ffffffff810d6ef2>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  207.769039]  [<ffffffff812035ac>] SyS_ioctl+0x3c/0x70
[  207.769068]  [<ffffffff818189ae>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  207.769103] Code: 90 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 8b 35 fa 7b e1 e1 85 f6 0f 85 55 03 00 00 41 8b 84 24 80 02 00 00 85 c0 75 02 <0f> 0b 49 8b 94 24 a8 00 00 00 48 8b 8a e0 01 00 00 8b 89 c0 00
[  207.769400] RIP  [<ffffffffa00a0a3a>] i915_gem_request_retire+0x2a/0x4b0 [i915]
[  207.769463]  RSP <ffffc9000383fb20>

Let's add a couple more BUG_ONs before this to ascertain that the request
did make it to hardware. The impossible part of this stacktrace is that
request must have been considered completed by the i915_request_wait()
before we tried to retire it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161118143412.26508-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-11-18 20:49:46 +00:00
Matthew Auld
43e157fa1d drm/i915: i915_pages_create_for_stolen should return err ptr
When gathering the pages from our backing storage we expect get_pages()
to either give us our sg_table or an err ptr. However when gathering our
fake pages for stolen memory we may return NULL in the event of a
failure. To prevent any funny business we should therefore return the
proper err ptr value.

Fixes: 03ac84f183 ("drm/i915: Pass around sg_table to get_pages/put_pages backend")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479488536-6168-1-git-send-email-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-18 20:48:58 +00:00
Rafael J. Wysocki
aab0b243b9 Merge branches 'acpica-fixes', 'acpi-cppc-fixes' and 'acpi-tools-fixes'
* acpica-fixes:
  Revert "ACPICA: FADT support cleanup"

* acpi-cppc-fixes:
  mailbox: PCC: Fix lockdep warning when request PCC channel

* acpi-tools-fixes:
  tools/power/acpi: Remove direct kernel source include reference
2016-11-18 21:34:42 +01:00
Mauro Carvalho Chehab
a4afb3ed43 [media] Kconfig: fix breakages when DVB_CORE is not selected
On some weird randconfigs, it is possible to select DVB
drivers, without having the DVB_CORE:

CONFIG_DVB_AU8522=m
CONFIG_DVB_AU8522_V4L=m
CONFIG_DVB_TUNER_DIB0090=m

This was never supposed to work, but changeset 22a613e898
("[media] dvb_frontend: merge duplicate dvb_tuner_ops.release implementations")
caused it to be exposed:

   drivers/built-in.o: In function `fc0011_attach':
   (.text+0x1598fb): undefined reference to `dvb_tuner_simple_release'
   drivers/built-in.o:(.rodata+0x55e58): undefined reference to `dvb_tuner_simple_release'
   drivers/built-in.o:(.rodata+0x57398): undefined reference to `dvb_tuner_simple_release'

Fixes: 22a613e898 ("[media] dvb_frontend: merge duplicate dvb_tuner_ops.release implementations")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-11-18 17:59:17 -02:00
David S. Miller
79c3dcbabb Merge branch 'sparc-lockdep-small'
Babu Moger says:

====================
Adjust lockdep static allocations for sparc

These patches limit the static allocations for lockdep data structures
used for debugging locking correctness. For sparc, all the kernel's code,
data, and bss, must have locked translations in the TLB so that we don't
get TLB misses on kernel code and data. Current sparc chips have 8 TLB
entries available that may be locked down, and with a 4mb page size,
this gives a maximum of 32MB. With PROVE_LOCKING we could go over this
limit and cause system boot-up problems. These patches limit the static
allocations so that everything fits in current required size limit.

patch 1 : Adds new config parameter CONFIG_PROVE_LOCKING_SMALL
Patch 2 : Adjusts the sizes based on the new config parameter

v2-> v3:
   Some more comments from Sam Ravnborg and Peter Zijlstra.
   Defined PROVE_LOCKING_SMALL as invisible and moved the selection to
   arch/sparc/Kconfig.

v1-> v2:
   As suggested by Peter Zijlstra, keeping the default as is.
   Introduced new config variable CONFIG_PROVE_LOCKING_SMALL
   to handle sparc specific case.

v0:
   Initial revision.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:33:26 -08:00
Babu Moger
e245d99e6c lockdep: Limit static allocations if PROVE_LOCKING_SMALL is defined
Reduce the size of data structure for lockdep entries by half if
PROVE_LOCKING_SMALL if defined. This is used only for sparc.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:33:19 -08:00
Babu Moger
e6b5f1be7a config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc
This new config parameter limits the space used for "Lock debugging:
prove locking correctness" by about 4MB. The current sparc systems have
the limitation of 32MB size for kernel size including .text, .data and
.bss sections. With PROVE_LOCKING feature, the kernel size could grow
beyond this limit and causing system boot-up issues. With this option,
kernel limits the size of the entries of lock_chains, stack_trace etc.,
so that kernel fits in required size limit. This is not visible to user
and only used for sparc.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:33:19 -08:00
Benjamin Coddington
d41cbfc9a6 NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
Now that we're doing TEST_STATEID in nfs4_reclaim_open_state(), we can have
a NFS4ERR_OLD_STATEID returned from nfs41_open_expired() .  Instead of
marking state recovery as failed, mark the state for recovery again.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-18 14:27:27 -05:00
Tushar Dave
1a9bbccaf8 sunbmac: Fix compiler warning
sunbmac uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.

e.g.
drivers/net/ethernet/sun/sunbmac.c: In function ‘bigmac_ether_init’:
drivers/net/ethernet/sun/sunbmac.c:1166: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’

This patch resolves above compiler warning.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:18:27 -08:00
Tushar Dave
266439c94d sunqe: Fix compiler warnings
sunqe uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.

e.g.
drivers/net/ethernet/sun/sunqe.c: In function ‘qec_ether_init’:
drivers/net/ethernet/sun/sunqe.c:883: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
drivers/net/ethernet/sun/sunqe.c:885: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’

This patch resolves above compiler warnings.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:18:26 -08:00
Trond Myklebust
5cc7861eb5 NFSv4: Don't call close if the open stateid has already been cleared
Ensure we test to see if the open stateid is actually set, before we
send a CLOSE.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-18 14:18:02 -05:00
David S. Miller
49cc0c43d0 Merge branch 'sun4v-64bit-DMA'
Tushar Dave says:

====================
sparc: Enable sun4v hypervisor PCI IOMMU v2 APIs and ATU

ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
sun4v hypervisor PCI IOMMU v2 APIs.

Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 500MB DVMA space per instance.
When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.

For example, we recently experienced legacy IOMMU limitation while
using i40e driver in system with large number of CPUs (e.g. 128).
Four ports of i40e, each request 128 QP (Queue Pairs). Each queue has
512 (default) descriptors. So considering only RX queues (because RX
premap DMA buffers), i40e takes 4*128*512 number of DMA entries in
IOMMU table. Legacy IOMMU can have at max (2G/8K)- 1 entries available
in table. So bringing up four instance of i40e alone saturate existing
IOMMU resource.

ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.

ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.

The patch set is tested on sun4v (T1000, T2000, T3, T4, T5, T7, S7)
and sun4u SPARC.

Thanks.
-Tushar

v2->v3:
- Patch #5 addresses comment by Joe Perches.
 -- use %s, __func__ instead of embedding the function name.

v1->v2:
- Patch #2 addresses comments by Dave M.
 -- use page allocator to allocate IOTSB.
 -- use true/false with boolean variables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:17:10 -08:00
Tushar Dave
d30a6b84df sparc64: Enable 64-bit DMA
ATU 64bit addressing allows PCIe devices with 64bit DMA capabilities
to use ATU for 64bit DMA.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:17:00 -08:00
Tushar Dave
f08978b0fd sparc64: Enable sun4v dma ops to use IOMMU v2 APIs
Add Hypervisor IOMMU v2 APIs pci_iotsb_map(), pci_iotsb_demap() and
enable sun4v dma ops to use IOMMU v2 API for all PCIe devices with
64bit DMA mask.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:17:00 -08:00
Tushar Dave
5116ab4eab sparc64: Bind PCIe devices to use IOMMU v2 service
In order to use Hypervisor (HV) IOMMU v2 API for map/demap, each PCIe
device has to be bound to IOTSB using HV API pci_iotsb_bind().

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:16:59 -08:00
Tushar Dave
31f077dc7d sparc64: Initialize iommu_map_table and iommu_pool
Like legacy IOMMU, use common iommu_map_table and iommu_pool for ATU.
This change initializes iommu_map_table and iommu_pool for ATU.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:16:59 -08:00
Tushar Dave
f0248c1524 sparc64: Add ATU (new IOMMU) support
ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.

Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.

ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.

ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:16:59 -08:00
Dave Kleikamp
c88c545bf3 sparc64: Add FORCE_MAX_ZONEORDER and default to 13
This change allows ATU (new IOMMU) in SPARC systems to request
large (32M) contiguous memory during boot for creating IOTSB backing
store.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 11:16:58 -08:00
Sabrina Dubroca
f82ef3e10a rtnetlink: fix FDB size computation
Add missing NDA_VLAN attribute's size.

Fixes: 1e53d5bb88 ("net: Pass VLAN ID to rtnl_fdb_notify.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 14:09:42 -05:00
Heiner Kallweit
c044170fcf [media] media: rc: nuvoton: replace usage of spin_lock_irqsave in ISR
Kernel takes care that interrupts from one source are serialized.
So there's no need to use spinlock_irq_save.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-11-18 17:08:15 -02:00
Heiner Kallweit
73d4576d8f [media] media: rc: nuvoton: rename spinlock nvt_lock
Spinlock nvt_lock is a member of struct nvt_dev and there's no need
to prefix it with nvt_. So remove this prefix.

[mchehab@s-opensource.org: change the prefix also at the open function,
 as the patch removing it were not applied (yet?)]
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-11-18 17:07:07 -02:00
Heiner Kallweit
f7ceec4fa0 [media] media: rc: nuvoton: eliminate nvt->tx.lock
Using a separate spinlock to protect access to substruct tx of struct
nvt_dev doesn't provide any actual benefit. We can use spinlock
nvt_lock to protect all access to struct nvt_dev and get rid of
nvt->tx.lock.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-11-18 17:04:41 -02:00
Hariprasad Shenai
ab677ff4ad cxgb4: Allocate Tx queues dynamically
Allocate resources dynamically for Upper layer driver's (ULD) like
cxgbit, iw_cxgb4, cxgb4i and chcr. The resources allocated include Tx
queues which are allocated when ULD register with cxgb4 driver and freed
while un-registering. The Tx queues which are shared by ULD shall be
allocated by first registering driver and un-allocated by last
unregistering driver.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 14:04:29 -05:00
Heiner Kallweit
b24ccccaee [media] media: rc: nuvoton: eliminate member pdev from struct nvt_dev
Member pdev of struct nvt_dev is needed only to access &pdev->dev.
We can get rid of this it by using rdev->dev.parent instead
(both point to the same struct device).

Setting rdev->dev.parent can be removed from the probe function
as this is done by devm_rc_allocate_device now.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-11-18 17:03:32 -02:00
Dan Carpenter
c816061d27 liquidio CN23XX: bitwise vs logical AND typo
We obviously intended a bitwise AND here, not a logical one.

Fixes: 8c978d0592 ("liquidio CN23XX: Mailbox support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 14:03:32 -05:00
Stefan Hajnoczi
0f5258cd91 netns: fix get_net_ns_by_fd(int pid) typo
The argument to get_net_ns_by_fd() is a /proc/$PID/ns/net file
descriptor not a pid.  Fix the typo.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Rami Rosen <roszenrami@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 14:01:58 -05:00
David S. Miller
87305c4cd2 A few more bugfixes:
* limit # of scan results stored in memory - this is a long-standing bug
    Jouni and I only noticed while discussing other things in Santa Fe
  * revert AP_LINK_PS patch that was causing issues (Felix)
  * various A-MSDU/A-MPDU fixes for TXQ code (Felix)
  * interoperability workaround for peers with broken VHT capabilities
    (Filip Matusiak)
  * add bitrate definition for a VHT MCS that's supposed to be invalid
    but gets used by some hardware anyway (Thomas Pedersen)
  * beacon timer fix in hwsim (Benjamin Beichler)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJYLrKJAAoJEGt7eEactAAdshYP/Rwi/k9WZ0RKQIA1QA0fnmwv
 TbUUvPHQ4LA39Czs0JwOPDIbter7TBzFXV5EK0y39jGm8St4a/oHuN072kzBYvob
 YbBzM0YKV2uQCAjH3AjWjS3RPOlTzfUqSRWgEqtgUTC5VZrpNfhd2r0q+jfwpQ7s
 +qUsdwqvVJMZKfOEngIXzNXdI95N1d8tpGwzkUqDbOJ+7amdriiJKyTsKEOMqREA
 0Mdhu6roEcMDVO+xt5ZkilmpLZOUEHAzaWkwm7d6VToWi7k3pwMiEZ4fthvHZHEZ
 6K2bn1UMLdKAExRcQBE3wrmn5jFfUbts1mhmoN1drHocI12epcgw5I4ZjEbfpqMB
 IBkOM+2hbSdUVfE4KVH6iubqiJwtHB8YSTdKFmMO4jPx7UshPF7YCny6XWC+EqQm
 ktjyG/lhlXJ0HaOKC/MAwd/KSzfH0eJWGNBDV5s/FIWiqu2oWn+ZIX7vGTpp4K1Z
 Rery/ZUiJGzVITNG1ka+GXq2tUvDXfkMApr+jFH4G6zioWovxB6+4uyMEeclvC5q
 zjgYEpZhbn2wQUNEJ8btDDgXkyDGh3UPjy93fO8fMEvv8d9CCOKTQkIUpjPp6jbZ
 EoLCTW0NNjnAmQRiUFF3vO2O7e3ZTJ4hyNEVkc7mJ2X3DllXRSelTIFSfZVHpb2k
 P+gY7uC0cAPZ4s6q8GSb
 =q2mI
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2016-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A few more bugfixes:
 * limit # of scan results stored in memory - this is a long-standing bug
   Jouni and I only noticed while discussing other things in Santa Fe
 * revert AP_LINK_PS patch that was causing issues (Felix)
 * various A-MSDU/A-MPDU fixes for TXQ code (Felix)
 * interoperability workaround for peers with broken VHT capabilities
   (Filip Matusiak)
 * add bitrate definition for a VHT MCS that's supposed to be invalid
   but gets used by some hardware anyway (Thomas Pedersen)
 * beacon timer fix in hwsim (Benjamin Beichler)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 14:00:27 -05:00