- Fix mismatched interrupt numbers for tps65217, these are not yet
used
- Remove unused omapdss_early_init_of()
- Use seq_putc() for pm-debug.c
-----BEGIN PGP SIGNATURE-----
iQIuBAABCAAYBQJYK1bEERx0b255QGF0b21pZGUuY29tAAoJEBvUPslcq6VzX9EQ
AL7rLfFsVY39z2pu/QEdkiBfypIcJx1TE9jMR/9j4rJUxewhTkVvyK0fqHnUv7ZP
znXM/v0ZGE7q4AoyaV2mBsrk+Gty8pkzW50swUxmKK6XwJSjmKAeX1RkzmmNmPYW
xXC1h77GmbmGnxFJIqQrvbuJse2ItKUIq3ks7+ZELlAwe51SSM3CHHuEVKcPNcJS
2gmwmaWEi8WaSLBEmlI4OverVlzZrAWYxGyG+QArY9RKRuzCeIKjyk4y+L6RVk6F
J0Y6Wqbmd9LAwKGwkxBjhpOMTJnM9cHbje7w4wvKWOZ4PVNfneCQXGSwhI9CZIU9
WqDCeTDkx42ElV7KQSrxSBOuXRv6uIoLYLQIVTQeSHf+EeN4v74CSPMNdgpMWoVu
AvLgq0pUth4c2hEn8nl2M77jxdqtG7IbtrrOHhBI5TG6s4wPXOco7JKcZOQ7cKkN
GG4RnvXPH4zFuX0pn9ONn3IAAQXXRP1fXolmMBewYOpaOhrNURasq/y75dPFohHx
P7Frv/f6B5MUKciHe1nENJhL/BCubSHhqNGy2fyNXi3yHH3Z9LK0sfad6D938Kl8
xSqLE1LUrp2rj3jvalj7masywssxpQMsyO26JtMqXUQxHXme5DR+m8pHOkz4/MKe
rGaMSvE5lp84taz8/T8LHEybWog96ky2efsuTIcmMgWy
=tMf4
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.10/fixes-not-urgent-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/fixes-non-critical
Non-urgent fixes for omaps for v4.10 merge window:
- Fix mismatched interrupt numbers for tps65217, these are not yet
used
- Remove unused omapdss_early_init_of()
- Use seq_putc() for pm-debug.c
* tag 'omap-for-v4.10/fixes-not-urgent-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: pm-debug: Use seq_putc() in two functions
ARM: OMAP2+: Remove the omapdss_early_init_of() function
mfd: tps65217: Fix mismatched interrupt number
Signed-off-by: Olof Johansson <olof@lixom.net>
Support for the Allwinner A64, their first armv8 SoC.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYK3ohAAoJEBx+YmzsjxAgQ0cP/AtC7KSWnQ2afcEeIRxii9X+
QbiyQqPvmRF/mBtZN/49wKwRdh1qtsGk29UxMf06hrk2oA/Kc9KoPed+5Q0qG2ac
2m1KWN6G+qNk8bbd26A4sXRBsWUH7Gp7kCUjIBoRRUup6YQFySxIEjSU7dfmPHzi
Rxbq5fo9JMpar8bjVr1mVeIL/0p4Uz2LYt4miqARlztGJaUEZtBTrWDncLM0TqTV
hzeP1lKwelwkVDcrrSgZ5nBCcdmXFvIK6S3pNHcy11NCt7Qnf4pEn2heFs0Xh0/3
3nTcr7TEEeEc1oVc6UyM6GCgV3SrBlLTXIhvULQoU0sdhjTQxUo1zpDCO81p0UfQ
Qu7KxjhvXOjRYZLBQfXWN4VwOURrP9Bw2VwxN4vACUpQpQX/9eovIT8XVH9Jd8kO
bnbmzWLJq+arnE81CVrL7oqTK0r6O1/D+6kOLbvP9o6+Pt/2oJ8kHrJQCLXzfOnE
nhuZE489fTLeKtgloR5wez/Vu1JwaIhhHsfWkskljmi7Xx4XUJdu6XtBoS67qJE5
xDg/ujD3S9TT0AhZ5XWn1f6fCPI4N+E8HvHUJNLZSi1iPnW4c2wZ9tBBEOu7hh2X
7ZiZ6k+FOg+ZyY/jdChJ0SWvIzcutIbkP/0pKgLLPozEImv6ARqfyJbWauxbt/Hl
JKw9pE7Uo7ztdAz/52DL
=JkJ+
-----END PGP SIGNATURE-----
Merge tag 'sunxi-dt64-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into next/dt64
Allwinner arm64 DT changes for 4.10
Support for the Allwinner A64, their first armv8 SoC.
* tag 'sunxi-dt64-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
arm64: dts: add Pine64 support
Documentation: devicetree: add vendor prefix for Pine64
arm64: dts: add Allwinner A64 SoC .dtsi
Signed-off-by: Olof Johansson <olof@lixom.net>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYLznTAAoJECebzXlCjuG+ThgP/0tQByIqOKGqUNcEE0MLT4/+
E2/V/Basi3tnCIatAjYZiAN/rKnO5f4iuyh7PKG/bzlYluLv5MLq68HUia8EorN8
LExNvZAwYhjEQwDTzhBSityLHWmCwy3G0yYsJQ1DnbUdh8wAKr7lj4R0sr8RGJc1
GxgWnlhi2lAJSGYRa8tYHzh0tTXGOCoR8POKXFJ91PTx8gEO6VzULvbIQm0RLSow
+LGW36ov/ChQtJzVJsfcW6Hf4wHFevrtVTPtLWckMEtRq/DJ7hS2btgc02hpqFZm
MK7wywHT35LV+DvU6QPmwUUaf5IXJjWx0W7thOsjWbYMbAHC/0D3De8bgGaAI3B1
nB+B96BpGrALyhTX2pXQiQxsavXBl37BOGl3Ft03WrAVI4aJsfkaWDRS2X1jxfXI
zhGBN2vseoiJblie95hLIgvMtkRmOq4E44oNDiP9zKTwrIkISoz5jmvLHY/8Mj7E
NCof2P+K6ays8ywD2DqHlJKmiGA7PdNT87ZeeS4ZFvEjWSd4S1pfa0R+jg5FVxZl
Vl7QQX5D/Ep+sXszJin4dYQnl844+sVMVaj6CdQOK0udml81UZTRO5fvjNexs3e4
4Zd/ymC/XGs6Hz3pbPeIkAd/MzXCK0zNojNAdZnicOMzQpG2sZ76SJRZQg0sTCFH
EP6QTWxOog4lnDfML13E
=F2DQ
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfix from Bruce Fields:
"Just one fix for an NFS/RDMA crash"
* tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports
Two patches to enable the dumb VGA bridges in the multi_v7 and sunxi
defconfig.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYK3fVAAoJEBx+YmzsjxAgxQsQAJ5nxCMARLnLKaSuIu8OuusQ
eZLogAq5TDTirkgHupOM1eEAsTwwsH/I2EtkR86Cl0cE4nhAEJfaCh5DnlG5rcaV
wi1jeyDy5mEaxyD1eom1YZYnXdBbmiSK5f6snJV4Uhu915sgDA28JYtSNhVvTtMA
igoPhGMFzmyOZY4LD2BgYk/BjTxpJEMnI4uqKv7mlp1noqwY67bl5/h204vUsmqt
n3vZPg4h7IWZWMVlG+xMjfZ8o+xSJtQ7G2o84GDtdQ7f3dbpcxHfUrd7MK07ogCP
vLLHgAvXDwKx3RaY/E8sJ9Qa4awASxFMEHGfPoFhSwEmm72LVr1JStvmstvLlmmC
pdzhzQNRguiSX4VTJWwcl6ne1WI9Y/e+s3+OPXQQuNoAwqoS3waRqw9BUZQydNTu
lcL0bMZZSO8a3MJdlIenISjt8LCWZLF/K9w+/nxf6HQigSF8GTfLTTbLu6/ew37H
XfneT/zJN+cw7Fw5J/Q9Xn8tawF8CVqPxKzhCz5YFr/R2EznVgfdVfALQ2mkJqhS
m0c+rFSCIKmAd+Txx3eabyoAoJlCwDFmanoohR5QFvNLxSRfeYP28Vkyrx5MKVMa
Uw5l3rQ/3613GYTiMSEDqhuMbHklZ1ZVEWCDlih6+pO08nGIzv8F9CSWJV5yNYGM
XQPblI7HJ1ExwoOiIqlS
=P4mc
-----END PGP SIGNATURE-----
Merge tag 'sunxi-defconfig-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into next/defconfig
Allwinner defconfig changes for 4.10
Two patches to enable the dumb VGA bridges in the multi_v7 and sunxi
defconfig.
* tag 'sunxi-defconfig-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
ARM: multi_v7: enable VGA bridge
ARM: sunxi: Enable VGA bridge
Signed-off-by: Olof Johansson <olof@lixom.net>
- remove obsolete STiH41[56] platform support
- add Oxford Semiconductor OX820 support
- add reset index include files for OX810SE and OX820
- make drivers with boolean Kconfig options explicitly
non-modular
- allow shared pulsed resets via reset_control_reset, which
in this case means that the reset must have been triggered
once, but possibly earlier, after the function returns, and
is never triggered again for the lifetime of the reset
control
-----BEGIN PGP SIGNATURE-----
iQJMBAABCAA2FiEEBsBxhV1FaKwXuCOBUMKIHHCeYOsFAlgrXr8YHHBoaWxpcHAu
emFiZWxAZ21haWwuY29tAAoJEFDCiBxwnmDrP7UQAIVZfUCBUfc5PF6u/pWrUOtn
Eg7SWvo1TYcJPJKoICwlRwKXmzLVfmivCsxW8my5Ddb66JQ5hMkoqsXALQl0sAZK
U2xXiy76z+EsL9tvvJ2b9Yqx3BIbzPkZyv419RWMVldqUtdbtHe70xLjF7DVfaTs
f9BTh0OAN55yWhYS7HQDQKm2bvUe3jy0XDzXlbPXgdy/roAReBFSFO4r80cbbAfv
DZwhrzmGSAGF0fE/TQA65wkkuZMM/rE3cjOjieVC4ZaeX7OQ+eqIMwph25uHCEnS
61g1J8GcwKvtP1iXAg5VdLAW2TTGTHztBR8UchrxZfySnKWR0rGee3LkQX8LVfih
mVI2jtzpq6fGYSask9B/fGnObfjWF0kHcgy525kdcNbo8HVZ0vdgZI/yoCh3ZViT
RHpuXk5cHYM6t7aNGOBvVvdKOL3wtuFUrJKYEpOL3BB//SLZXxCDCawMt7H2XHX5
3EbLfKikQQbMNs8XzeSfTOQi9ITcdwnLdTqjBunAbf0aMiF+SOtDS/yb6BsG7zjv
muz4LUtcMq5YfXnZbB11ngz+DSN/g7MHIjl9MXG8kMO9JsVb1R9VtmbxNQ87EU92
oKGy4ix0vSIRl7/gmmFZt4iV8RQOvffkB4iIkOXlWXdnsUrqIj3V/YwEr+8leoWk
MHr75XggJOUH9VaMUP/2
=e7kp
-----END PGP SIGNATURE-----
Merge tag 'reset-for-4.10' of git://git.pengutronix.de/git/pza/linux into next/drivers
Reset controller changes for v4.10
- remove obsolete STiH41[56] platform support
- add Oxford Semiconductor OX820 support
- add reset index include files for OX810SE and OX820
- make drivers with boolean Kconfig options explicitly
non-modular
- allow shared pulsed resets via reset_control_reset, which
in this case means that the reset must have been triggered
once, but possibly earlier, after the function returns, and
is never triggered again for the lifetime of the reset
control
* tag 'reset-for-4.10' of git://git.pengutronix.de/git/pza/linux:
reset: allow using reset_control_reset with shared reset
reset: lpc18xx: make it explicitly non-modular
reset: zynq: make it explicitly non-modular
reset: sunxi: make it explicitly non-modular
reset: socfpga: make it explicitly non-modular
reset: berlin: make it explicitly non-modular
dt-bindings: reset: oxnas: Update for OX820
dt-bindings: reset: oxnas: Add include file with reset indexes
reset: oxnas: Add OX820 support
reset: sti: softreset: Remove obsolete platforms from dt binding doc.
reset: sti: Remove STiH415/6 reset support
Signed-off-by: Olof Johansson <olof@lixom.net>
When the pm_runtime_force_suspend|resume() helpers were invented, we still
had CONFIG_PM_RUNTIME and CONFIG_PM_SLEEP as separate Kconfig options.
To make sure these helpers worked for all combinations and without
introducing too much of complexity, the device was always resumed in
pm_runtime_force_resume().
More precisely, when CONFIG_PM_SLEEP was set and CONFIG_PM_RUNTIME was
unset, we needed to resume the device as the subsystem/driver couldn't
rely on using runtime PM to do it.
As the CONFIG_PM_RUNTIME option was merged into CONFIG_PM a while ago, it
removed this combination, of using CONFIG_PM_SLEEP without the earlier
CONFIG_PM_RUNTIME.
For this reason we can now rely on the subsystem/driver to use runtime PM
to resume the device, instead of forcing that to be done in all cases. In
other words, let's defer the runtime resume to a later point when it's
actually needed.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYHmoCAAoJEHm+PkMAQRiG7RMIAI2i7Y5hpL5yCxK5AFaL4u/G
KxXfp1B1UanUTgjOmd7zGqtDYcFX9t7GTTUFixQ7/9Opr4PD9qbnatoDGSc3xjbT
msDgA1B78F1/Q3kHWfeGq32MihQ4mj5NwUCo+igUcUvvWG7mHgzErj/Nh5RoobQX
p/izdpTbrw3GX6xXB8olbG7XWHaVye/+TT3q6+gmgm8I/QEujcLeGoycE0zlhPN8
FG/JX76At/+ZM2Py7Oxo3k+oKL9CHrtOQYDp/wN0uslV5eYvvkZz0/M1HMOGZt+c
gZU5jzM17K7C4Nzo06WAuBU9wUBGc25m+cPicLlOmljnzfU+f50SKaDjZq3p7QI=
=2KUF
-----END PGP SIGNATURE-----
Merge tag 'v4.9-rc4' into sound
Bring in -rc4 patches so I can successfully merge the sound doc changes.
While this patch sounded a good idea, unfortunately, it causes
bad dependencies, as drivers that would otherwise work without
the DVB core will now break:
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tea5767.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tea5761.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tda827x.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/tda18218.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/qt1010.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mt2266.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mt20xx.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mt2060.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/mc44s803.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/fc0013.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/fc0012.ko] undefined!
ERROR: "dvb_tuner_simple_release" [drivers/media/tuners/fc0011.ko] undefined!
So, we have to revert it.
Note: as the argument for the release ops changed from "int"
to "void", we needed to change it at the revert patch, to
avoid compilation issues like:
drivers/media/tuners/tea5767.c:437:23: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.release = tea5767_release,
^~~~~~~~~~~~~~~
This reverts commit 22a613e898.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Currently we only clflush the scanout if it is in the CPU domain. Also
flush if we have a pending CPU clflush. We also want to treat the
dirtyfb path similar, and flush any pending writes there as well.
v2: Only send the fb flush message if flushing the dirt on flip
v3: Make flush-for-flip and dirtyfb look more alike since they serve
similar roles as end-of-frame marker.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: http://patchwork.freedesktop.org/patch/msgid/20161118211747.25197-1-chris@chris-wilson.co.uk
Recently added audio block of Exynos5410 missed global fixup of GIC
interrupt flags. Interrupt of type IRQ_TYPE_NONE is not allowed for GIC
interrupts so use level high.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Tested-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
1. In previous patch, we:
- add new data to r5l_recovery_ctx
- add new functions to recovery write-back cache
The new functions are not used in this patch, so this patch does not
change the behavior of recovery.
2. In this patchpatch, we:
- modify main recovery procedure r5l_recovery_log() to call new
functions
- remove old functions
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Recovery of write-back cache has different logic to write-through only
cache. Specifically, for write-back cache, the recovery need to scan
through all active journal entries before flushing data out. Therefore,
large portion of the recovery logic is rewritten here.
To make the diffs cleaner, we split the rewrite as follows:
1. In this patch, we:
- add new data to r5l_recovery_ctx
- add new functions to recovery write-back cache
The new functions are not used in this patch, so this patch does not
change the behavior of recovery.
2. In next patch, we:
- modify main recovery procedure r5l_recovery_log() to call new
functions
- remove old functions
With cache feature, there are 2 different scenarios of recovery:
1. Data-Parity stripe: a stripe with complete parity in journal.
2. Data-Only stripe: a stripe with only data in journal (or partial
parity).
The code differentiate Data-Parity stripe from Data-Only stripe with
flag STRIPE_R5C_CACHING.
For Data-Parity stripes, we use the same procedure as raid5 journal,
where all the data and parity are replayed to the RAID devices.
For Data-Only strips, we need to finish complete calculate parity and
finish the full reconstruct write or RMW write. For simplicity, in
the recovery, we load the stripe to stripe cache. Once the array is
started, the stripe cache state machine will handle these stripes
through normal write path.
r5c_recovery_flush_log contains the main procedure of recovery. The
recovery code first scans through the journal and loads data to
stripe cache. The code keeps tracks of all these stripes in a list
(use sh->lru and ctx->cached_list), stripes in the list are
organized in the order of its first appearance on the journal.
During the scan, the recovery code assesses each stripe as
Data-Parity or Data-Only.
During scan, the array may run out of stripe cache. In these cases,
the recovery code will also call raid5_set_cache_size to increase
stripe cache size. If the array still runs out of stripe cache
because there isn't enough memory, the array will not assemble.
At the end of scan, the recovery code replays all Data-Parity
stripes, and sets proper states for Data-Only stripes. The recovery
code also increases seq number by 10 and rewrites all Data-Only
stripes to journal. This is to avoid confusion after repeated
crashes. More details is explained in raid5-cache.c before
r5c_recovery_rewrite_data_only_stripes().
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
1. rename r5l_read_meta_block() as r5l_recovery_read_meta_block();
2. pull the code that initialize r5l_meta_block from
r5l_log_write_empty_meta_block() to a separate function
r5l_recovery_create_empty_meta_block(), so that we can reuse this
piece of code.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
With write cache, journal_mode is the knob to switch between
write-back and write-through.
Below is an example:
root@virt-test:~/# cat /sys/block/md0/md/journal_mode
[write-through] write-back
root@virt-test:~/# echo write-back > /sys/block/md0/md/journal_mode
root@virt-test:~/# cat /sys/block/md0/md/journal_mode
write-through [write-back]
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
There are two limited resources, stripe cache and journal disk space.
For better performance, we priotize reclaim of full stripe writes.
To free up more journal space, we free earliest data on the journal.
In current implementation, reclaim happens when:
1. Periodically (every R5C_RECLAIM_WAKEUP_INTERVAL, 30 seconds) reclaim
if there is no reclaim in the past 5 seconds.
2. when there are R5C_FULL_STRIPE_FLUSH_BATCH (256) cached full stripes,
or cached stripes is enough for a full stripe (chunk size / 4k)
(r5c_check_cached_full_stripe)
3. when there is pressure on stripe cache (r5c_check_stripe_cache_usage)
4. when there is pressure on journal space (r5l_write_stripe, r5c_cache_data)
r5c_do_reclaim() contains new logic of reclaim.
For stripe cache:
When stripe cache pressure is high (more than 3/4 stripes are cached,
or there is empty inactive lists), flush all full stripe. If fewer
than R5C_RECLAIM_STRIPE_GROUP (NR_STRIPE_HASH_LOCKS * 2) full stripes
are flushed, flush some paritial stripes. When stripe cache pressure
is moderate (1/2 to 3/4 of stripes are cached), flush all full stripes.
For log space:
To avoid deadlock due to log space, we need to reserve enough space
to flush cached data. The size of required log space depends on total
number of cached stripes (stripe_in_journal_count). In current
implementation, the writing-out phase automatically include pending
data writes with parity writes (similar to write through case).
Therefore, we need up to (conf->raid_disks + 1) pages for each cached
stripe (1 page for meta data, raid_disks pages for all data and
parity). r5c_log_required_to_flush_cache() calculates log space
required to flush cache. In the following, we refer to the space
calculated by r5c_log_required_to_flush_cache() as
reclaim_required_space.
Two flags are added to r5conf->cache_state: R5C_LOG_TIGHT and
R5C_LOG_CRITICAL. R5C_LOG_TIGHT is set when free space on the log
device is less than 3x of reclaim_required_space. R5C_LOG_CRITICAL
is set when free space on the log device is less than 2x of
reclaim_required_space.
r5c_cache keeps all data in cache (not fully committed to RAID) in
a list (stripe_in_journal_list). These stripes are in the order of their
first appearance on the journal. So the log tail (last_checkpoint)
should point to the journal_start of the first item in the list.
When R5C_LOG_TIGHT is set, r5l_reclaim_thread starts flushing out
stripes at the head of stripe_in_journal. When R5C_LOG_CRITICAL is
set, the state machine only writes data that are already in the
log device (in stripe_in_journal_list).
This patch includes a fix to improve performance by
Shaohua Li <shli@fb.com>.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
As described in previous patch, write back cache operates in two
phases: caching and writing-out. The caching phase works as:
1. write data to journal
(r5c_handle_stripe_dirtying, r5c_cache_data)
2. call bio_endio
(r5c_handle_data_cached, r5c_return_dev_pending_writes).
Then the writing-out phase is as:
1. Mark the stripe as write-out (r5c_make_stripe_write_out)
2. Calcualte parity (reconstruct or RMW)
3. Write parity (and maybe some other data) to journal device
4. Write data and parity to RAID disks
This patch implements caching phase. The cache is integrated with
stripe cache of raid456. It leverages code of r5l_log to write
data to journal device.
Writing-out phase of the cache is implemented in the next patch.
With r5cache, write operation does not wait for parity calculation
and write out, so the write latency is lower (1 write to journal
device vs. read and then write to raid disks). Also, r5cache will
reduce RAID overhead (multipile IO due to read-modify-write of
parity) and provide more opportunities of full stripe writes.
This patch adds 2 flags to stripe_head.state:
- STRIPE_R5C_PARTIAL_STRIPE,
- STRIPE_R5C_FULL_STRIPE,
Instead of inactive_list, stripes with cached data are tracked in
r5conf->r5c_full_stripe_list and r5conf->r5c_partial_stripe_list.
STRIPE_R5C_FULL_STRIPE and STRIPE_R5C_PARTIAL_STRIPE are flags for
stripes in these lists. Note: stripes in r5c_full/partial_stripe_list
are not considered as "active".
For RMW, the code allocates an extra page for each data block
being updated. This is stored in r5dev->orig_page and the old data
is read into it. Then the prexor calculation subtracts ->orig_page
from the parity block, and the reconstruct calculation adds the
->page data back into the parity block.
r5cache naturally excludes SkipCopy. When the array has write back
cache, async_copy_data() will not skip copy.
There are some known limitations of the cache implementation:
1. Write cache only covers full page writes (R5_OVERWRITE). Writes
of smaller granularity are write through.
2. Only one log io (sh->log_io) for each stripe at anytime. Later
writes for the same stripe have to wait. This can be improved by
moving log_io to r5dev.
3. With writeback cache, read path must enter state machine, which
is a significant bottleneck for some workloads.
4. There is no per stripe checkpoint (with r5l_payload_flush) in
the log, so recovery code has to replay more than necessary data
(sometimes all the log from last_checkpoint). This reduces
availability of the array.
This patch includes a fix proposed by ZhengYuan Liu
<liuzhengyuan@kylinos.cn>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
This patch adds state machine for raid5-cache. With log device, the
raid456 array could operate in two different modes (r5c_journal_mode):
- write-back (R5C_MODE_WRITE_BACK)
- write-through (R5C_MODE_WRITE_THROUGH)
Existing code of raid5-cache only has write-through mode. For write-back
cache, it is necessary to extend the state machine.
With write-back cache, every stripe could operate in two different
phases:
- caching
- writing-out
In caching phase, the stripe handles writes as:
- write to journal
- return IO
In writing-out phase, the stripe behaviors as a stripe in write through
mode R5C_MODE_WRITE_THROUGH.
STRIPE_R5C_CACHING is added to sh->state to differentiate caching and
writing-out phase.
Please note: this is a "no-op" patch for raid5-cache write-through
mode.
The following detailed explanation is copied from the raid5-cache.c:
/*
* raid5 cache state machine
*
* With rhe RAID cache, each stripe works in two phases:
* - caching phase
* - writing-out phase
*
* These two phases are controlled by bit STRIPE_R5C_CACHING:
* if STRIPE_R5C_CACHING == 0, the stripe is in writing-out phase
* if STRIPE_R5C_CACHING == 1, the stripe is in caching phase
*
* When there is no journal, or the journal is in write-through mode,
* the stripe is always in writing-out phase.
*
* For write-back journal, the stripe is sent to caching phase on write
* (r5c_handle_stripe_dirtying). r5c_make_stripe_write_out() kicks off
* the write-out phase by clearing STRIPE_R5C_CACHING.
*
* Stripes in caching phase do not write the raid disks. Instead, all
* writes are committed from the log device. Therefore, a stripe in
* caching phase handles writes as:
* - write to log device
* - return IO
*
* Stripes in writing-out phase handle writes as:
* - calculate parity
* - write pending data and parity to journal
* - write data and parity to raid disks
* - return IO for pending writes
*/
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Move some define and inline functions to raid5.h, so they can be
used in raid5-cache.c
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Currently, r5l_write_stripe checks meta size for each stripe write,
which is not necessary.
With this patch, r5l_init_log checks maximal meta size of the array,
which is (r5l_meta_block + raid_disks x r5l_payload_data_parity).
If this is too big to fit in one page, r5l_init_log aborts.
With current meta data, r5l_log support raid_disks up to 203.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
On the DMA mapping error path, sg may be NULL (it has already been
marked as the last scatterlist entry), and we should avoid dereferencing
it again.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e227330223 ("drm/i915: avoid leaking DMA mappings")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20161114112930.2033-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
When gathering the pages from our backing storage we expect get_pages()
to either give us our sg_table or an err ptr. However when gathering our
fake pages for stolen memory we may return NULL in the event of a
failure. To prevent any funny business we should therefore return the
proper err ptr value.
Fixes: 03ac84f183 ("drm/i915: Pass around sg_table to get_pages/put_pages backend")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479488536-6168-1-git-send-email-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On some weird randconfigs, it is possible to select DVB
drivers, without having the DVB_CORE:
CONFIG_DVB_AU8522=m
CONFIG_DVB_AU8522_V4L=m
CONFIG_DVB_TUNER_DIB0090=m
This was never supposed to work, but changeset 22a613e898
("[media] dvb_frontend: merge duplicate dvb_tuner_ops.release implementations")
caused it to be exposed:
drivers/built-in.o: In function `fc0011_attach':
(.text+0x1598fb): undefined reference to `dvb_tuner_simple_release'
drivers/built-in.o:(.rodata+0x55e58): undefined reference to `dvb_tuner_simple_release'
drivers/built-in.o:(.rodata+0x57398): undefined reference to `dvb_tuner_simple_release'
Fixes: 22a613e898 ("[media] dvb_frontend: merge duplicate dvb_tuner_ops.release implementations")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Babu Moger says:
====================
Adjust lockdep static allocations for sparc
These patches limit the static allocations for lockdep data structures
used for debugging locking correctness. For sparc, all the kernel's code,
data, and bss, must have locked translations in the TLB so that we don't
get TLB misses on kernel code and data. Current sparc chips have 8 TLB
entries available that may be locked down, and with a 4mb page size,
this gives a maximum of 32MB. With PROVE_LOCKING we could go over this
limit and cause system boot-up problems. These patches limit the static
allocations so that everything fits in current required size limit.
patch 1 : Adds new config parameter CONFIG_PROVE_LOCKING_SMALL
Patch 2 : Adjusts the sizes based on the new config parameter
v2-> v3:
Some more comments from Sam Ravnborg and Peter Zijlstra.
Defined PROVE_LOCKING_SMALL as invisible and moved the selection to
arch/sparc/Kconfig.
v1-> v2:
As suggested by Peter Zijlstra, keeping the default as is.
Introduced new config variable CONFIG_PROVE_LOCKING_SMALL
to handle sparc specific case.
v0:
Initial revision.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Reduce the size of data structure for lockdep entries by half if
PROVE_LOCKING_SMALL if defined. This is used only for sparc.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This new config parameter limits the space used for "Lock debugging:
prove locking correctness" by about 4MB. The current sparc systems have
the limitation of 32MB size for kernel size including .text, .data and
.bss sections. With PROVE_LOCKING feature, the kernel size could grow
beyond this limit and causing system boot-up issues. With this option,
kernel limits the size of the entries of lock_chains, stack_trace etc.,
so that kernel fits in required size limit. This is not visible to user
and only used for sparc.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we're doing TEST_STATEID in nfs4_reclaim_open_state(), we can have
a NFS4ERR_OLD_STATEID returned from nfs41_open_expired() . Instead of
marking state recovery as failed, mark the state for recovery again.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
sunbmac uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.
e.g.
drivers/net/ethernet/sun/sunbmac.c: In function ‘bigmac_ether_init’:
drivers/net/ethernet/sun/sunbmac.c:1166: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
This patch resolves above compiler warning.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sunqe uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.
e.g.
drivers/net/ethernet/sun/sunqe.c: In function ‘qec_ether_init’:
drivers/net/ethernet/sun/sunqe.c:883: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
drivers/net/ethernet/sun/sunqe.c:885: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
This patch resolves above compiler warnings.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensure we test to see if the open stateid is actually set, before we
send a CLOSE.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Tushar Dave says:
====================
sparc: Enable sun4v hypervisor PCI IOMMU v2 APIs and ATU
ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
sun4v hypervisor PCI IOMMU v2 APIs.
Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 500MB DVMA space per instance.
When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.
For example, we recently experienced legacy IOMMU limitation while
using i40e driver in system with large number of CPUs (e.g. 128).
Four ports of i40e, each request 128 QP (Queue Pairs). Each queue has
512 (default) descriptors. So considering only RX queues (because RX
premap DMA buffers), i40e takes 4*128*512 number of DMA entries in
IOMMU table. Legacy IOMMU can have at max (2G/8K)- 1 entries available
in table. So bringing up four instance of i40e alone saturate existing
IOMMU resource.
ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.
ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.
The patch set is tested on sun4v (T1000, T2000, T3, T4, T5, T7, S7)
and sun4u SPARC.
Thanks.
-Tushar
v2->v3:
- Patch #5 addresses comment by Joe Perches.
-- use %s, __func__ instead of embedding the function name.
v1->v2:
- Patch #2 addresses comments by Dave M.
-- use page allocator to allocate IOTSB.
-- use true/false with boolean variables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ATU 64bit addressing allows PCIe devices with 64bit DMA capabilities
to use ATU for 64bit DMA.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Hypervisor IOMMU v2 APIs pci_iotsb_map(), pci_iotsb_demap() and
enable sun4v dma ops to use IOMMU v2 API for all PCIe devices with
64bit DMA mask.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use Hypervisor (HV) IOMMU v2 API for map/demap, each PCIe
device has to be bound to IOTSB using HV API pci_iotsb_bind().
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like legacy IOMMU, use common iommu_map_table and iommu_pool for ATU.
This change initializes iommu_map_table and iommu_pool for ATU.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.
Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.
ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.
ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change allows ATU (new IOMMU) in SPARC systems to request
large (32M) contiguous memory during boot for creating IOTSB backing
store.
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing NDA_VLAN attribute's size.
Fixes: 1e53d5bb88 ("net: Pass VLAN ID to rtnl_fdb_notify.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kernel takes care that interrupts from one source are serialized.
So there's no need to use spinlock_irq_save.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Spinlock nvt_lock is a member of struct nvt_dev and there's no need
to prefix it with nvt_. So remove this prefix.
[mchehab@s-opensource.org: change the prefix also at the open function,
as the patch removing it were not applied (yet?)]
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Using a separate spinlock to protect access to substruct tx of struct
nvt_dev doesn't provide any actual benefit. We can use spinlock
nvt_lock to protect all access to struct nvt_dev and get rid of
nvt->tx.lock.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Allocate resources dynamically for Upper layer driver's (ULD) like
cxgbit, iw_cxgb4, cxgb4i and chcr. The resources allocated include Tx
queues which are allocated when ULD register with cxgb4 driver and freed
while un-registering. The Tx queues which are shared by ULD shall be
allocated by first registering driver and un-allocated by last
unregistering driver.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Member pdev of struct nvt_dev is needed only to access &pdev->dev.
We can get rid of this it by using rdev->dev.parent instead
(both point to the same struct device).
Setting rdev->dev.parent can be removed from the probe function
as this is done by devm_rc_allocate_device now.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
We obviously intended a bitwise AND here, not a logical one.
Fixes: 8c978d0592 ("liquidio CN23XX: Mailbox support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The argument to get_net_ns_by_fd() is a /proc/$PID/ns/net file
descriptor not a pid. Fix the typo.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Rami Rosen <roszenrami@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* limit # of scan results stored in memory - this is a long-standing bug
Jouni and I only noticed while discussing other things in Santa Fe
* revert AP_LINK_PS patch that was causing issues (Felix)
* various A-MSDU/A-MPDU fixes for TXQ code (Felix)
* interoperability workaround for peers with broken VHT capabilities
(Filip Matusiak)
* add bitrate definition for a VHT MCS that's supposed to be invalid
but gets used by some hardware anyway (Thomas Pedersen)
* beacon timer fix in hwsim (Benjamin Beichler)
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJYLrKJAAoJEGt7eEactAAdshYP/Rwi/k9WZ0RKQIA1QA0fnmwv
TbUUvPHQ4LA39Czs0JwOPDIbter7TBzFXV5EK0y39jGm8St4a/oHuN072kzBYvob
YbBzM0YKV2uQCAjH3AjWjS3RPOlTzfUqSRWgEqtgUTC5VZrpNfhd2r0q+jfwpQ7s
+qUsdwqvVJMZKfOEngIXzNXdI95N1d8tpGwzkUqDbOJ+7amdriiJKyTsKEOMqREA
0Mdhu6roEcMDVO+xt5ZkilmpLZOUEHAzaWkwm7d6VToWi7k3pwMiEZ4fthvHZHEZ
6K2bn1UMLdKAExRcQBE3wrmn5jFfUbts1mhmoN1drHocI12epcgw5I4ZjEbfpqMB
IBkOM+2hbSdUVfE4KVH6iubqiJwtHB8YSTdKFmMO4jPx7UshPF7YCny6XWC+EqQm
ktjyG/lhlXJ0HaOKC/MAwd/KSzfH0eJWGNBDV5s/FIWiqu2oWn+ZIX7vGTpp4K1Z
Rery/ZUiJGzVITNG1ka+GXq2tUvDXfkMApr+jFH4G6zioWovxB6+4uyMEeclvC5q
zjgYEpZhbn2wQUNEJ8btDDgXkyDGh3UPjy93fO8fMEvv8d9CCOKTQkIUpjPp6jbZ
EoLCTW0NNjnAmQRiUFF3vO2O7e3ZTJ4hyNEVkc7mJ2X3DllXRSelTIFSfZVHpb2k
P+gY7uC0cAPZ4s6q8GSb
=q2mI
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2016-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A few more bugfixes:
* limit # of scan results stored in memory - this is a long-standing bug
Jouni and I only noticed while discussing other things in Santa Fe
* revert AP_LINK_PS patch that was causing issues (Felix)
* various A-MSDU/A-MPDU fixes for TXQ code (Felix)
* interoperability workaround for peers with broken VHT capabilities
(Filip Matusiak)
* add bitrate definition for a VHT MCS that's supposed to be invalid
but gets used by some hardware anyway (Thomas Pedersen)
* beacon timer fix in hwsim (Benjamin Beichler)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>