Commit Graph

480368 Commits

Author SHA1 Message Date
Krzysztof Kozlowski
0f5ebabdd0 dmaengine: pl330: Fix NULL pointer dereference on probe failure
If dma_async_device_register() returns error and probe should clean up
and return error, a NULL pointer exception happens because of
dereference of not allocated channel thread:

Dmesg log (from early printk):
dma-pl330 12680000.pdma: unable to register DMAC
DMA pl330_control: removing pch: eeac4000, chan: eeac4014, thread:   (null)
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
pgd = c0004000
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc3-next-20140904-00005-g6cc4c1937d90-dirty #427
task: ee80a800 ti: ee888000 task.ti: ee888000
PC is at _stop+0x8/0x2c8
LR is at pl330_control+0x70/0x2e8
pc : [<c0205dc8>]    lr : [<c020623c>]    psr: 60000193
sp : ee889df8  ip : 00000002  fp : 00000000
r10: eeac4014  r9 : ee0e62bc  r8 : 00000000
r7 : eeac405c  r6 : 60000113  r5 : ee0e6210  r4 : eeac4000
r3 : 00000002  r2 : 00000002  r1 : 00010000  r0 : 00000000
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 4000404a  DAC: 00000015
Process swapper/0 (pid: 1, stack limit = 0xee888240)
Stack: (0xee889df8 to 0xee88a000)
9de0:                                                       00000002 eeac4000
9e00: ee0e6210 eeac4000 ee0e6210 60000113 eeac405c c020623c 00000000 c020725c
9e20: ee889e20 ee889e20 ee0e6210 eeac4080 00200200 00100100 eeac4014 00000020
9e40: ee0e6218 c0208374 00000000 ee9bb340 ee0e6210 00000000 00000000 c0605cd8
9e60: ee970000 c0605c84 ee9700f8 00000000 c05c4270 00000000 00000000 c0203b3c
9e80: ee970000 c06624a8 00000000 c0605c84 00000000 c023f890 ee970000 c0605c84
9ea0: ee970034 00000000 c05b23d0 c023fa3c 00000000 c0605c84 c023f9b0 c023e0d4
9ec0: ee947e78 ee9b9440 c0605c84 eea1e780 c0605acc c023f094 c0513b50 c0605c84
9ee0: c05ecbd8 c0605c84 c05ecbd8 ee11ba40 c0626500 c0240064 00000000 c05ecbd8
9f00: c05ecbd8 c0008964 c040f13c 0000009f c0626500 c057465c ee80a800 60000113
9f20: 00000000 c05efdb0 60000113 00000000 ef7fc89d c0421168 0000008f c003787c
9f40: c0573d6c 00000006 ef7fc8bb 00000006 c05efd50 ef7fc800 c05dfbc4 00000006
9f60: c05c4264 c0626500 0000008f c05c4270 c059b518 c059bcb4 00000006 00000006
9f80: c059b518 c003c08c 00000000 c040091c 00000000 00000000 00000000 00000000
9fa0: 00000000 c0400924 00000000 c000e7b8 00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 c0c0c0c0 c0c0c0c0
[<c0205dc8>] (_stop) from [<c020623c>] (pl330_control+0x70/0x2e8)
[<c020623c>] (pl330_control) from [<c0208374>] (pl330_probe+0x594/0x75c)
[<c0208374>] (pl330_probe) from [<c0203b3c>] (amba_probe+0xb8/0x120)
[<c0203b3c>] (amba_probe) from [<c023f890>] (driver_probe_device+0x10c/0x22c)
[<c023f890>] (driver_probe_device) from [<c023fa3c>] (__driver_attach+0x8c/0x90)
[<c023fa3c>] (__driver_attach) from [<c023e0d4>] (bus_for_each_dev+0x54/0x88)
[<c023e0d4>] (bus_for_each_dev) from [<c023f094>] (bus_add_driver+0xd4/0x1d0)
[<c023f094>] (bus_add_driver) from [<c0240064>] (driver_register+0x78/0xf4)
[<c0240064>] (driver_register) from [<c0008964>] (do_one_initcall+0x80/0x1d0)
[<c0008964>] (do_one_initcall) from [<c059bcb4>] (kernel_init_freeable+0x108/0x1d4)
[<c059bcb4>] (kernel_init_freeable) from [<c0400924>] (kernel_init+0x8/0xec)
[<c0400924>] (kernel_init) from [<c000e7b8>] (ret_from_fork+0x14/0x3c)
Code: e5813010 e12fff1e e92d40f0 e24dd00c (e590200c)
---[ end trace c94b2f4f38dff3bf ]---

This happens because the necessary resources were not yet allocated - no
call to pl330_alloc_chan_resources().

Terminate the thread and free channel resource only if channel thread is not NULL.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Fixes: 0b94c57717 ("DMA: PL330: Add check if device tree compatible")
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-10-15 13:30:09 +05:30
Krzysztof Kozlowski
c3cb38f43c dmaengine: pl330: Remove unused 'regs' variable in pl330_submit_req()
The 'void __iomem *regs' is not used in pl330_submit_req() function.
Remove it.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-10-15 13:30:09 +05:30
Krzysztof Kozlowski
937cb2f249 dmaengine: pl330: Remove non-NULL check for pl330_submit_req parameters
The pl330_submit_req() checked supplied 'struct pl330_thread thrd' and
'struct dma_pl330_desc desc' parameters for non-NULL. However these
checks are useless because supplied arguments won't be NULL.

The pl330_submit_req() is called in only one place and:
1. 'desc' is already dereferenced in fill_queue() before calling
   pl330_submit_req().
2. 'thrd' is always dereferenced after calling
   fill_queue()->pl330_submit_req().

Removing the checks for non-NULL values fixes following warning:
drivers/dma/pl330.c:1376 pl330_submit_req() warn: variable dereferenced before check 'thrd' (see line 1367)

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-10-15 13:30:09 +05:30
Santosh Shilimkar
97215800e4 MAINTAINERS: Update Santosh Shilimkar's email id
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2014-10-14 23:35:19 -07:00
Olof Johansson
22414f776d Samsung defconfig, actually exynos_defconig updates for v3.18
- enable USB gadget support
 - enable Maxim77802 support
 - enable Maxim77693 and I2C GPIO drivers
 - enable Atmel maXTouch support
 - enable SBS battery support
 - enable Control Groups support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJUPbBbAAoJEA0Cl+kVi2xqatsP/32CiTFSh7lG/GOC2riKhHFK
 qpPoplPKXjJSM3rnFG7X9j1+uYoPAf+jfxhOmfniM9uBiYPLcfV0p4D4hAUOfBmg
 y1D0vIaBJrJEer3ZgLFyXQZWkWyNU9O0c/VlieUyxAncCCLVSVx+aJg3L4Qz9DR5
 26LjBoTGmpk3KExgDlnqKJ2kHdxhNC+weCjrQC6koWK+3eUjlYnjaYGuSacO7onv
 QWJkXaJV69jzQNBWkDlf3JeDBIl/YIlq+rj52vJCqgeuZAkOYelA+YTPm/F74cPO
 1m1byBpmNVcaVeMtdhCZRx+EuHDfWYPUb821W6R98soA+zJ45Vvol5HLD8EDrXZ9
 /F1AeU4xZsAhNav9qaqLqGJmcsOHYEWURyzhbzpXz4Xs0C1pjTzB06L0Ca62P8oP
 Ki2S1JaSSqMxXjwceebmg9N3nJnuGgF3gtHuUYGC3uoAWXLIqg/noQZh3g5i2RWZ
 WpvKd9DSrLfL2rgR5bjaPbdFeV5dAy0Buuq2cqTb0cWtriz3oEn2WS5muTx7e/II
 jtZ4gNUWt+wCerMtTQRdqRjsqk5/5VvNLuT2y8MTJjBaQuGmZv60TmBbSLIAtZqp
 7VMtTevcJqXlSdrgwMNNyakxsHndWJ4ExrqFSywX5O6mdX7Vhy4Q2svZQ8stlK9K
 aiW+wFhE3MSa+n5ekM4P
 =GSyQ
 -----END PGP SIGNATURE-----

Merge tag 'samsung-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

Merge "Samsung defconfig, actually exynos_defconig updates for v3.18" from
Kukjin Kim:

- enable USB gadget support
- enable Maxim77802 support
- enable Maxim77693 and I2C GPIO drivers
- enable Atmel maXTouch support
- enable SBS battery support
- enable Control Groups support

* tag 'samsung-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: exynos_defconfig: enable USB gadget support
  ARM: exynos_defconfig: Enable Maxim 77693 and I2C GPIO drivers
  ARM: exynos_defconfig: Enable SBS battery support
  ARM: exynos_defconfig: Enable Control Groups support
  ARM: exynos_defconfig: Enable Atmel maXTouch support
  ARM: exynos_defconfig: Enable MAX77802

Signed-off-by: Olof Johansson <olof@lixom.net>
2014-10-14 23:32:31 -07:00
Olof Johansson
e17fd8e58a Samsung fixes for v3.18
- fix ifdef around cpu_*_do_[suspend, resume] ops to check
   CONFIG_ARM_CPU_SUSPEND and not CONFIG_PM_SLEEP
 - fix exynos_defconfig build with PM_SLEEP=n and ARM_EXYNOS_CPUIDLE=n
 - fix enabling Samsung PM debug functionality due to recently merged
   patches and previous merge conflicts
 - fix pull-up setting in sd4_width8 pin group for exynos4x12
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJUPa89AAoJEA0Cl+kVi2xqDagP+gJMTHKBbGi6VUAGnV+6ojCa
 LgXjaCKfPlHFJ9C4d8OvW2g3Fc+3Ns/Tm2s93AWNz+NEJo4Ft5OZW+jHzIMF1Viz
 lAewxzoAgq5CeseyvmzUQ5aCVglzCGvgQ6WN9Lq1sRqD8BywHYDRpmrxmDGjGN0D
 51pIxgEJLRYsOsJVhIx1wBBWTxaQH2ViqACsak+HJkw0WzoDQUVzSv85NlzaMwyR
 55rikCPDC/RbSLI3McyNWx+BJ0bBy6HqXkfhFsTmROa0ewjPAR+miEKvF6Jqk6ij
 IZKpfD6hiHNuyhyQEpY3Mw5sOoleVLCuAM5x1f883b/ufISARGTZ0Zhb5+UQ2Rdh
 bf5bZw+ZuhMCZioN2tUgBCOUwRC/G+PGzy82Z46FQhp6e21VTwY345HhZfvIKHHy
 10veTu8RGFm0Y2VL6rENHhWN6tZzOrmM0oeN4nEnZQfXy7tsRy6qFLYv30wuY2y4
 6oPK6Vs1iB0BAz8HioAuupIFn8ZkfO4AoCW+Gu94Gt3n0dTOFP7JIBgjycvSxLb2
 FV2Zt/DXxjAj+ld/jf6lOIIYw/0JyaB58QeLKdASQxmiUbUuoQt9STe/c6MS1miy
 j8BPM4ILuhak+0aFUgSRZiewoGyqXEShdinmdJ1B/k1JVmFiwgEOjT7tQTQeEiCn
 4OPzu6fIYau0CdUqMy7w
 =6hVm
 -----END PGP SIGNATURE-----

Merge tag 'samsung-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

Merge "Samsung fixes for v3.18" from Kukjin Kim:

- fix ifdef around cpu_*_do_[suspend, resume] ops to check
  CONFIG_ARM_CPU_SUSPEND and not CONFIG_PM_SLEEP
- fix exynos_defconfig build with PM_SLEEP=n and ARM_EXYNOS_CPUIDLE=n
- fix enabling Samsung PM debug functionality due to recently merged
  patches and previous merge conflicts
- fix pull-up setting in sd4_width8 pin group for exynos4x12

* tag 'samsung-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: mm: Fix ifdef around cpu_*_do_[suspend, resume] ops
  ARM: EXYNOS: Fix build with PM_SLEEP=n and ARM_EXYNOS_CPUIDLE=n
  ARM: SAMSUNG: Restore Samsung PM Debug functionality
  ARM: dts: Fix pull setting in sd4_width8 pin group for exynos4x12

Signed-off-by: Olof Johansson <olof@lixom.net>
2014-10-14 23:31:13 -07:00
Olof Johansson
6d81dc87c0 Two omap fixes for issues noticed during the merge window:
- We need to enable ARM errata 430973 for omap3
 
 - The smc91x on some early n900 boards need to be disabled
   for now until the dependencies to specific a bootloader
   version are fixed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUOEX4AAoJEBvUPslcq6VzYz0P/3kudc/RoBxHnkcR2ii3MAPy
 BuQHcN/QFe8A8WO+fnjX+Pj5hf/CtajIrZlzq+5z6U1MKBwbAQbeE/uxAENgEwd7
 01pVxYxWXs2caIaCkcMXN0qcwizgMRBTGQr8BMwtcXobkjfrOcllfj0GFE7xLsHC
 j9cmpwB2Dw9Xip//dQZnHPhAjabI5/5BMlRi1KWUVngQJeiB0jasrWtWpi8pzFI0
 K59MPoiNCzhXRvSdsh5eLDJTFSMaLyyvtKONlZDM7fZHfaqCHmz9B5agd35JnzDb
 uL8x74QDIIDsm5hJJihTb+HvkY8+Jnj+QV7WiJTcby+pF5xayF7IsrM2ZJmwDsBI
 3LsSRhOqDGQjEw8TOttGg34Kcz65j4fKJEw2K42D40T9LvNs0r0mOwKE9/xMJ3mC
 Xny/iUUCeNdCK+xQJpj+LWGCdeBnOlFjCuG7GWmrw3J2Ynegwf/KQ5WLqKb8giH9
 ZVdgrNeYEMoHQOQRzqwl/N4ie0cIu39E9IPbh8ybglXX49HYNoTtkJNYi4/6nJTt
 50VT9aydPgHtgcnb3Mb+yWnrERaZNPvocDil5OBgmO2+Rbx51f51TyxyYgWGf+xM
 vfOPVxCf9muKt2SaMTlQYMiWFrEHMyujvir8vfxlOdEyaErRhwmP2oeDXC/v9AXE
 /AK3bdGe3FJio86/wIoB
 =B9g8
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-v3.18-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Merge "Two omap fixes for v3.18 merge window" from Tony Lindgren:

Two omap fixes for issues noticed during the merge window:

- We need to enable ARM errata 430973 for omap3

- The smc91x on some early n900 boards need to be disabled
  for now until the dependencies to specific a bootloader
  version are fixed

* tag 'fixes-for-v3.18-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: Disable smc91x on n900 until bootloader dependency is removed
  ARM: omap2plus_defconfig: Enable ARM erratum 430973 for omap3

Signed-off-by: Olof Johansson <olof@lixom.net>
2014-10-14 23:30:36 -07:00
Olof Johansson
9a2ad529ed ARM: sunxi_defconfig: enable CONFIG_REGULATOR
Commit 97a13e5289 ('net: phy: mdio-sun4i: don't select REGULATOR') removed
the select of REGULATOR, which means that it now has to be explicitly
enabled in the defconfig or things won't work very well.

In particular, this fixes a problem with SD/MMC not probing on my A31-based
board.

Cc: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2014-10-14 23:26:58 -07:00
Linus Torvalds
0429fbc0bd Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu consistent-ops changes from Tejun Heo:
 "Way back, before the current percpu allocator was implemented, static
  and dynamic percpu memory areas were allocated and handled separately
  and had their own accessors.  The distinction has been gone for many
  years now; however, the now duplicate two sets of accessors remained
  with the pointer based ones - this_cpu_*() - evolving various other
  operations over time.  During the process, we also accumulated other
  inconsistent operations.

  This pull request contains Christoph's patches to clean up the
  duplicate accessor situation.  __get_cpu_var() uses are replaced with
  with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

  Unfortunately, the former sometimes is tricky thanks to C being a bit
  messy with the distinction between lvalues and pointers, which led to
  a rather ugly solution for cpumask_var_t involving the introduction of
  this_cpu_cpumask_var_ptr().

  This converts most of the uses but not all.  Christoph will follow up
  with the remaining conversions in this merge window and hopefully
  remove the obsolete accessors"

* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
  irqchip: Properly fetch the per cpu offset
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
  ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
  Revert "powerpc: Replace __get_cpu_var uses"
  percpu: Remove __this_cpu_ptr
  clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  sparc: Replace __get_cpu_var uses
  avr32: Replace __get_cpu_var with __this_cpu_write
  blackfin: Replace __get_cpu_var uses
  tile: Use this_cpu_ptr() for hardware counters
  tile: Replace __get_cpu_var uses
  powerpc: Replace __get_cpu_var uses
  alpha: Replace __get_cpu_var
  ia64: Replace __get_cpu_var uses
  s390: cio driver &__get_cpu_var replacements
  s390: Replace __get_cpu_var uses
  mips: Replace __get_cpu_var uses
  MIPS: Replace __get_cpu_var uses in FPU emulator.
  arm: Replace __this_cpu_ptr with raw_cpu_ptr
  ...
2014-10-15 07:48:18 +02:00
Linus Torvalds
6929c35897 LLVMLinux patches for v3.18
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUPOQPAAoJEHKvRublQaWiZnQP/0bDfPvhaNKZtmDKYzbqQm9x
 Gh3fB53d7TtIei///5eD/LUGObg0Ze8g2k8aAC05hcq5Be8Haitma3iDDSvJq6ig
 umYU9BWkv47BHQy0gAVMXyaxHocICq7W9bnOvU4SSEW/sWeneBllyxgCfV7EKSTd
 B/OXP+Ovr4FtjAweOACCN8b0M9w4wdxNKfDNV+WDIHAddYBngxrwq7zASKNNCx2N
 0u9sXIdTp0Fvxmyx/lYLC5NXN0CUDjB3Ffdx3+eehrBp2lT6JdlkYU403c85cIMP
 oIPJVKFbOnM04MfCjhFNTrK9OtC2eD6PoWq+FLL0UJx3YW5HkbLsiHGm/2UTJ2Z1
 5QOwDebMxlvrb6f6Gv846ADl7YcByiXkieTDHRlOnplVRNV5Sj8UWAgq+zyq7sWq
 2uRuW2UvKx7vYoAwKRwCaIoqpIe3NIvZQzE7C9mGOprIawZ5e0YJzaR6OoBs4Y8i
 gmBeFx266URJun7isy1R7JJsMjYzxbEXju9zH/SkghbLHnf8yqafIG+pG1GD7n5R
 o2C/5TVXjmEhIoDn8j2ZozaElX4mD7REKILpIGaE4XltExNTexq8neRo9D3ajzif
 N5RrMCAkBzIxMz83evDppe3ObtkEaf0K43VCO3AVQ2g9jXg7ttKhR2hb8HRqCMHe
 Lp3d8qKyZyL/HZ5F62yX
 =zJyI
 -----END PGP SIGNATURE-----

Merge tag 'llvmlinux-for-v3.18' of git://git.linuxfoundation.org/llvmlinux/kernel

Pull LLVM updates from Behan Webster:
 "These patches remove the use of VLAIS using a new SHASH_DESC_ON_STACK
  macro.

  Some of the previously accepted VLAIS removal patches haven't used
  this macro.  I will push new patches to consistently use this macro in
  all those older cases for 3.19"

[ More LLVM patches coming in through subsystem trees, and LLVM itself
  needs some fixes that are already in many distributions but not in
  released versions of LLVM.  Some day this will all "just work"  - Linus ]

* tag 'llvmlinux-for-v3.18' of git://git.linuxfoundation.org/llvmlinux/kernel:
  crypto: LLVMLinux: Remove VLAIS usage from crypto/testmgr.c
  security, crypto: LLVMLinux: Remove VLAIS from ima_crypto.c
  crypto: LLVMLinux: Remove VLAIS usage from libcrc32c.c
  crypto: LLVMLinux: Remove VLAIS usage from crypto/hmac.c
  crypto, dm: LLVMLinux: Remove VLAIS usage from dm-crypt
  crypto: LLVMLinux: Remove VLAIS from crypto/.../qat_algs.c
  crypto: LLVMLinux: Remove VLAIS from crypto/omap_sham.c
  crypto: LLVMLinux: Remove VLAIS from crypto/n2_core.c
  crypto: LLVMLinux: Remove VLAIS from crypto/mv_cesa.c
  crypto: LLVMLinux: Remove VLAIS from crypto/ccp/ccp-crypto-sha.c
  btrfs: LLVMLinux: Remove VLAIS
  crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code
2014-10-15 07:30:52 +02:00
Linus Torvalds
23971bdfff IOMMU Updates for Linux v3.18
This pull-request includes:
 
 	* Change in the IOMMU-API to convert the former iommu_domain_capable
 	  function to just iommu_capable
 
 	* Various fixes in handling RMRR ranges for the VT-d driver (one fix
 	  requires a device driver core change which was acked
 	  by Greg KH)
 
 	* The AMD IOMMU driver now assigns and deassigns complete alias groups
 	  to fix issues with devices using the wrong PCI request-id
 
 	* MMU-401 support for the ARM SMMU driver
 
 	* Multi-master IOMMU group support for the ARM SMMU driver
 
 	* Various other small fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJUPNxYAAoJECvwRC2XARrjMwMP/RLSr+oA31rGVjLXcmcCHl7Q
 Uj7xpcnG19qB0aqNR1JeJuZNkK/tw44pE353MQPbz4N9UVUiogklGIVD1iJvFV53
 0qm84bvpDJIof4aP35B3H3Umft2USTn/lmsQg/RklQcNTW8DzNj63b8BTNR7k/GL
 G7bLg7F1BUCl0shZCCsFspOIulQPAJYN2OvHlfYBav/bfDvfouQ3lrV+loGrK44r
 F2Hmp+imXlIhUCjfbiWz6wKFxvPrxZx482vm2pXBCSnXEdW4/fz6nf9VHUK/Cfsq
 JAimY1CfiDo1aqH9/yVHUOw5SD/NYOXq6E5bFPg/WENbipbbae5cK2u6PX5MMBAn
 CG4BM8l9xicfGPqgn5YFSRY/6qC6K7NlxMnt9U8l18QIkDVDqEtUgJQISJuce7wx
 FWx6eSWaxpIe5yhq19/h2ELalUUyR/fPq+UXXjYDL1kLV/vcvC/lC3mbNAQU93zU
 WK0bG2tDg88JHavc25Ewa2aOn4BVM2BpwuLbYlgQReaEmsQRnEPgtmRNyLJHqbFE
 wwpCj8pBWdufsJWRyvpnXQ+CfA7oSz4e7hz1G+0/5uiDmagfvg16Ql5JtPmmuLUm
 Kc3dVIiG0s1ewohZIIJETGCqprQbCSqs8CCQqB6p2zDBWFKpNT7F38lm/KlehkCz
 JpAiI7Y2K9Jejp0VIPrt
 =OMOt
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "This pull-request includes:

   - change in the IOMMU-API to convert the former iommu_domain_capable
     function to just iommu_capable

   - various fixes in handling RMRR ranges for the VT-d driver (one fix
     requires a device driver core change which was acked by Greg KH)

   - the AMD IOMMU driver now assigns and deassigns complete alias
     groups to fix issues with devices using the wrong PCI request-id

   - MMU-401 support for the ARM SMMU driver

   - multi-master IOMMU group support for the ARM SMMU driver

   - various other small fixes all over the place"

* tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
  iommu/vt-d: Work around broken RMRR firmware entries
  iommu/vt-d: Store bus information in RMRR PCI device path
  iommu/vt-d: Only remove domain when device is removed
  driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
  iommu/amd: Fix devid mapping for ivrs_ioapic override
  iommu/irq_remapping: Fix the regression of hpet irq remapping
  iommu: Fix bus notifier breakage
  iommu/amd: Split init_iommu_group() from iommu_init_device()
  iommu: Rework iommu_group_get_for_pci_dev()
  iommu: Make of_device_id array const
  amd_iommu: do not dereference a NULL pointer address.
  iommu/omap: Remove omap_iommu unused owner field
  iommu: Remove iommu_domain_has_cap() API function
  IB/usnic: Convert to use new iommu_capable() API function
  vfio: Convert to use new iommu_capable() API function
  kvm: iommu: Convert to use new iommu_capable() API function
  iommu/tegra: Convert to iommu_capable() API function
  iommu/msm: Convert to iommu_capable() API function
  iommu/vt-d: Convert to iommu_capable() API function
  iommu/fsl: Convert to iommu_capable() API function
  ...
2014-10-15 07:23:49 +02:00
Linus Torvalds
c0fa2373f8 The clk tree changes for 3.18 are dominated by clock drivers. Mostly
fixes and enhancements to existing drivers as well as new drivers. This
 tag contains a bit more arch code than I usually take due to some OMAP2+
 changes. Additionally it contains the restart notifier handlers which
 are merged as a dependency into several trees.
 
 The PXA changes are the only messy part. Due to having a stable tree I
 had to revert one patch and follow up with one more fix near the tip of
 this tag. Some dead code is introduced but it will soon become live code
 after 3.18-rc1 is released as the rest of the PXA family is converted
 over to the common clock framework.
 
 Another trend in this tag is that multiple vendors have started to push
 the complexity of changing their CPU frequency into the clock driver,
 whereas this used to be done in CPUfreq drivers.
 
 Changes to the clk core include a generic gpio-clock type and a
 clk_set_phase() function added to the top-level clk.h api. Due to some
 confusion on the fbdev mailing list the kernel boot parameters
 documentation was updated to further explain the clk_ignore_unused
 parameter, which is often required by users of the simplefb driver.
 Finally some fixes to the locking around the clock debugfs stuff was
 done to prevent deadlocks when interacting with other subsystems.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJUMu8gAAoJEDqPOy9afJhJ+GwP/3aU1PzhEPooZ3sZ5hkhmRYc
 RTzNZAODuOGbGnAiNQcr8XW3LJ6wKz5TSzzUC8IQkTcYM1Tsc7s5B6v+nMOkR2Jh
 sfrlnDEV/dsW9/3QADFuBowCaZdsaZnHn96RDhTmyDlPjh4HRR2k8ITT+TREbFrd
 cHDWy4QnI0u4NzhKtitvgW2770HyBpr31v5IdoRhVi5whoiBNL49BPwhwDWhwZVe
 w6qvc0jV8FK9Ra/Q7Vw6r3tiKkpO/upqVFDrsO831mp2qDcQvtOgNW9H2fjcobaX
 3/KCbs1TZs39e71RsEGwCvmCudXkTgO1wUJ86MuCLHeb2o78Vx8EYie02/RApTOJ
 0KGR+kFouggy2naeH8pXiTZk2HWMCbut6NQ1+AVbea5Em7hgHbYaQN71wVFKR4L7
 QL+TugrIg81fGWSvxoTo6fsbEiKOUdhXvHFWP5etKHL+Ll+7ku05ojHLOZgEEwTf
 zFWSSF4XSFQtuQD1gup0pSfoLs6qVR57l8FsrxfRPK9jGttg5z1wyNkY+585ptim
 eyTn4mkvkx9t9Sx47VRj9WPcPr2SW1w8lTMw1WqKfHG7AEUJHHkRQThQmiU82b47
 dTls4BBZ6sVZ8wj0V4zvnvbmtdYohOmBqNDEYx+a0dzPKstcAJyZgcjWBc13zds4
 rIKKxhiU7jGWH4qnJLrx
 =w2rN
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus-3.18' of git://git.linaro.org/people/mike.turquette/linux

Pull clock tree updates from Mike Turquette:
 "The clk tree changes for 3.18 are dominated by clock drivers.  Mostly
  fixes and enhancements to existing drivers as well as new drivers.
  This tag contains a bit more arch code than I usually take due to some
  OMAP2+ changes.  Additionally it contains the restart notifier
  handlers which are merged as a dependency into several trees.

  The PXA changes are the only messy part.  Due to having a stable tree
  I had to revert one patch and follow up with one more fix near the tip
  of this tag.  Some dead code is introduced but it will soon become
  live code after 3.18-rc1 is released as the rest of the PXA family is
  converted over to the common clock framework.

  Another trend in this tag is that multiple vendors have started to
  push the complexity of changing their CPU frequency into the clock
  driver, whereas this used to be done in CPUfreq drivers.

  Changes to the clk core include a generic gpio-clock type and a
  clk_set_phase() function added to the top-level clk.h api.  Due to
  some confusion on the fbdev mailing list the kernel boot parameters
  documentation was updated to further explain the clk_ignore_unused
  parameter, which is often required by users of the simplefb driver.

  Finally some fixes to the locking around the clock debugfs stuff was
  done to prevent deadlocks when interacting with other subsystems."

* tag 'clk-for-linus-3.18' of git://git.linaro.org/people/mike.turquette/linux: (99 commits)
  clk: pxa clocks build system fix
  Revert "arm: pxa: Transition pxa27x to clk framework"
  clk: samsung: register restart handlers for s3c2412 and s3c2443
  clk: rockchip: add restart handler
  clk: rockchip: rk3288: i2s_frac adds flag to set parent's rate
  doc/kernel-parameters.txt: clarify clk_ignore_unused
  arm: pxa: Transition pxa27x to clk framework
  dts: add devicetree bindings for pxa27x clocks
  clk: add pxa27x clock drivers
  arm: pxa: add clock pll selection bits
  clk: dts: document pxa clock binding
  clk: add pxa clocks infrastructure
  clk: gpio-gate: Ensure gpiod_ APIs are prototyped
  clk: ti: dra7-atl-clock: Mark the device as pm_runtime_irq_safe
  clk: ti: LLVMLinux: Move __init outside of type definition
  clk: ti: consider the fact that of_clk_get() might return an error
  clk: ti: dra7-atl-clock: fix a memory leak
  clk: ti: change clock init to use generic of_clk_init
  clk: hix5hd2: add I2C clocks
  clk: hix5hd2: add watchdog0 clocks
  ...
2014-10-15 07:05:03 +02:00
Linus Torvalds
fcc3a5d277 Changes to existing drivers:
- DT clean-ups in da9055-core, max14577, rn5t618, arizona, hi6421, stmpe, twl4030
   - Export symbols for use in modules in max14577
   - Plenty of static code analysis/Coccinelle fixes throughout the SS
   - Regmap clean-ups in arizona, wm5102, wm5110, da9052, tps65217, rk808
   - Remove unused/duplicate code in da9052, 88pm860x, ti_ssp, lpc_sch, arizona
   - Bug fixes in ti_am335x_tscadc, da9052, ti_am335x_tscadc, rtsx_pcr
   - IRQ fixups in arizona, stmpe, max14577
   - Regulator related changes in axp20x
   - Pass DMA coherency information from parent => child in MFD core
   - Rename DT document files for consistency
   - Add ACPI support to the MFD core
   - Add Andreas Werner to MAINTAINERS for MEN F21BMC
 
 New drivers/supported devices:
   - New driver for MEN 14F021P00 Board Management Controller
   - New driver for Ricoh RN5T618 PMIC
   - New driver for Rockchip RK808
   - New driver for HiSilicon Hi6421 PMIC
   - New driver for Qualcomm SPMI PMICs
   - Add support for Intel Braswell in lpc_ich
   - Add support for Intel 9 Series PCH in lpc_ich
   - Add support for Intel Quark ILB in lpc_sch
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUMvv5AAoJEFGvii+H/Hdhq90P/3a7ed9Gc4SatQNJ8u68e8M+
 lllGPWXVKnbCR8yc/kCALBpNYUcyPzTp5u1l/ozwEgRDgCzNAvYC2h/aflpPjWSu
 5q1rE7V8Cz/hUxXU/fcEMcnJYiqdaRowgdFtUM+ClLQReOkmwQhWID+hLvTlCUIN
 6MkXCsAl6vrzBEtbKtlR5+6VDQ3Q84gqN2SadpxS+yQwIfGrq1ZWYATaPhdSNGR9
 4bde6YwAqgttQDHyHw0dsd9VtJ53KVk13QkHIHW6S6uPOaZSIvtt4noDUtghDUA1
 tN7d5e5x1Rm8lPREQ4PxMKqHJoRxGfYyAosqXlt3XA1wbjgOgN35nev3gqrbfho5
 eHIWfFJgPDOOwTRVT1drTOVSoxecsbrQq1YB7ChdnfREQbpFiwKhBIxjQKEpQNrI
 OjxXp4ngXwiz31Hvq+44Z6MEVVRCTXgAuBf9/cd8GkF772H7nKmT+wH1QvF+6BRG
 52qEwugTiINo3O+5g1xuDFjFWZ5GWrwUQuRHss13A0cgo+EUJKM6caH+375T7jIT
 vH+2hg0XrqAlWPqcPd1Ma9TVKqI6RJdF0XOk7YP+PcPRvN+SoW/TAGFpzfDHCd+K
 dj3/10nJZUi4CKz6PRcTxKFFpgYjsEGwhYHRWLtH+MXg3UcCyoqTrvfpkGh+hq37
 H9rkW3cNzeyHSAaeKtnk
 =xxsZ
 -----END PGP SIGNATURE-----

Merge tag 'mfd-for-linus-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "Changes to existing drivers:
  - DT clean-ups in da9055-core, max14577, rn5t618, arizona, hi6421, stmpe, twl4030
  - Export symbols for use in modules in max14577
  - Plenty of static code analysis/Coccinelle fixes throughout the SS
  - Regmap clean-ups in arizona, wm5102, wm5110, da9052, tps65217, rk808
  - Remove unused/duplicate code in da9052, 88pm860x, ti_ssp, lpc_sch, arizona
  - Bug fixes in ti_am335x_tscadc, da9052, ti_am335x_tscadc, rtsx_pcr
  - IRQ fixups in arizona, stmpe, max14577
  - Regulator related changes in axp20x
  - Pass DMA coherency information from parent => child in MFD core
  - Rename DT document files for consistency
  - Add ACPI support to the MFD core
  - Add Andreas Werner to MAINTAINERS for MEN F21BMC

 New drivers/supported devices:
  - New driver for MEN 14F021P00 Board Management Controller
  - New driver for Ricoh RN5T618 PMIC
  - New driver for Rockchip RK808
  - New driver for HiSilicon Hi6421 PMIC
  - New driver for Qualcomm SPMI PMICs
  - Add support for Intel Braswell in lpc_ich
  - Add support for Intel 9 Series PCH in lpc_ich
  - Add support for Intel Quark ILB in lpc_sch"

[ Delayed to after the poweer/reset pull due to Kconfig problems with
  recursive Kconfig select/depends-on chains.   - Linus ]

* tag 'mfd-for-linus-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (79 commits)
  mfd: cros_ec: wait for completion of commands that return IN_PROGRESS
  i2c: i2c-cros-ec-tunnel: Set retries to 3
  mfd: cros_ec: move locking into cros_ec_cmd_xfer
  mfd: cros_ec: stop calling ->cmd_xfer() directly
  mfd: cros_ec: Delay for 50ms when we see EC_CMD_REBOOT_EC
  MAINTAINERS: Adds Andreas Werner to maintainers list for MEN F21BMC
  mfd: arizona: Correct mask to allow setting micbias external cap
  mfd: Add ACPI support
  Revert "mfd: wm5102: Manually apply register patch"
  mfd: ti_am335x_tscadc: Update logic in CTRL register for 5-wire TS
  mfd: dt-bindings: atmel-gpbr: Rename doc file to conform to naming convention
  mfd: dt-bindings: qcom-pm8xxx: Rename doc file to conform to naming convention
  mfd: Inherit coherent_dma_mask from parent device
  mfd: Document DT bindings for Qualcomm SPMI PMICs
  mfd: Add support for Qualcomm SPMI PMICs
  mfd: dt-bindings: pm8xxx: Add new compatible string
  mfd: axp209x: Drop the parent supplies field
  mfd: twl4030-power: Use 'ti,system-power-controller' as alternative way to support system power off
  mfd: dt-bindings: twl4030-power: Use the standard property to mark power control
  mfd: syscon: Add Atmel GPBR DT bindings documention
  ...
2014-10-15 06:58:16 +02:00
Linus Torvalds
50fa86172b power supply and reset changes for the v3.18 series
- Initial support for the following chips
   * max77836 (charger)
   * max14577 (charger)
   * bq27742 (battery gauge)
   * ltc2952 (poweroff)
   * stih416 (restart)
   * syscon-reboot (restart)
   * gpio-restart (restart)
  - cleanup of power supply core
  - misc. fixes in power supply and reset drivers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJUOWD7AAoJENju1/PIO/qaj5QP/AtbG96NRpX6Ou8W+qulhSbe
 npTKItRMERZhV+lJUAdmLxQq8Cy6I5Cobdj140oIM6HIKZ7Dh6jChxd2t+RCLsO0
 Y16VMq45L9E/mQuVCLqSSNCPKrRjbF7hW+8gpym/Yc846o3Fv8nhTQzkWxCsVrDR
 rNerRnrUyvKpqRpCN88u4ZuOHjm/146nZtTsAzvL8nGnQBZFJHv4eu7D5nwSN0QS
 FrLiI/mE7h1KFcdK5LrZf51IfEP9w4RVV3vs1wEuP0iPS5dguv2dNzXXVZa5eD/2
 AYSSouVKkgdwryLAvIIaDLD405UYl5Xgt5J6d+QKXxm2+eeZMEY36+PEK5WtRtBX
 CykzO2BgvxyOdz2HU9D8irR1Li71jVSWwQITMJ6bQsbKEQAhltmkmsTOxtqB+ekd
 254rKcKWTqllmEhg1LR8PCidf8OZNEcKzXi4XmQdQIcKNF9tvdTXP5c0/FzuIAlQ
 tqWZW1kQK6/TTfx+XxeeszJWY5VknIaka2Bi83pOZYtu94CtUFBdEVkpaYmC433+
 qLOto7VMy+AKYViTJJDDEnSEFiLNZz/zF+SUDf+YB1RelULSF3zC8CbDcZQa5fNm
 4qTU1fl2gGrIZ1jDPPihm6xP8r9WkeuQWEytG2UJZoa+l4XmzfLHBUUDVo0zqTd3
 txfmKca8Y3GhXGwINi1m
 =sj9S
 -----END PGP SIGNATURE-----

Merge tag 'for-v3.18' of git://git.infradead.org/battery-2.6

Pull power supply and reset updates from Sebastian Reichel:
 - Initial support for the following chips
   * max77836 (charger)
   * max14577 (charger)
   * bq27742 (battery gauge)
   * ltc2952 (poweroff)
   * stih416 (restart)
   * syscon-reboot (restart)
   * gpio-restart (restart)
 - cleanup of power supply core
 - misc fixes in power supply and reset drivers

* tag 'for-v3.18' of git://git.infradead.org/battery-2.6: (48 commits)
  power: ab8500_fg: Fix build warning
  Documentation: charger: max14577: Update the date of introducing ABI
  power: reset: corrections for simple syscon reboot driver
  Documentation: power: reset: Add documentation for generic SYSCON reboot driver
  power: reset: Add generic SYSCON register mapped reset
  bq27x00_battery: Fix flag reading for bq27742
  power: reset: use restart_notifier mechanism for msm-poweroff
  power: Add simple gpio-restart driver
  power: reset: st: Provide DT bindings for ST's Power Reset driver
  power: reset: Add restart functionality for STiH41x platforms
  power: charger-manager: Fix NULL pointer exception with missing cm-fuel-gauge
  power: max14577: Fix circular config SYSFS dependency
  power: gpio-charger: do not use gpio value directly
  power: max8925: Use of_get_child_by_name
  power: max8925: Fix NULL ptr dereference on memory allocation failure
  bq27x00_battery: Add support to bq27742
  Documentation: charger: max14577: Document exported sysfs entry
  devicetree: mfd: max14577: Add device tree bindings document
  power: max17040: Add ID for MAX77836 Fuel Gauge block
  charger: max14577: Configure battery-dependent settings from DTS and sysfs
  ...

Conflicts:
	drivers/power/reset/Kconfig
	drivers/power/reset/Makefile
2014-10-15 06:56:23 +02:00
Linus Torvalds
6b04908166 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
 "There is the long-awaited discard support for RBD (Guangliang Zhao,
  Josh Durgin), a pile of RBD bug fixes that didn't belong in late -rc's
  (Ilya Dryomov, Li RongQing), a pile of fs/ceph bug fixes and
  performance and debugging improvements (Yan, Zheng, John Spray), and a
  smattering of cleanups (Chao Yu, Fabian Frederick, Joe Perches)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (40 commits)
  ceph: fix divide-by-zero in __validate_layout()
  rbd: rbd workqueues need a resque worker
  libceph: ceph-msgr workqueue needs a resque worker
  ceph: fix bool assignments
  libceph: separate multiple ops with commas in debugfs output
  libceph: sync osd op definitions in rados.h
  libceph: remove redundant declaration
  ceph: additional debugfs output
  ceph: export ceph_session_state_name function
  ceph: include the initial ACL in create/mkdir/mknod MDS requests
  ceph: use pagelist to present MDS request data
  libceph: reference counting pagelist
  ceph: fix llistxattr on symlink
  ceph: send client metadata to MDS
  ceph: remove redundant code for max file size verification
  ceph: remove redundant io_iter_advance()
  ceph: move ceph_find_inode() outside the s_mutex
  ceph: request xattrs if xattr_version is zero
  rbd: set the remaining discard properties to enable support
  rbd: use helpers to handle discard for layered images correctly
  ...
2014-10-15 06:46:01 +02:00
Linus Torvalds
ce9d7f7b45 Merge branch 'CVE-2014-7970' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux
Pull pivot_root() fix from Andy Lutomirski.

Prevent a leak of unreachable mounts.

* 'CVE-2014-7970' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux:
  mnt: Prevent pivot_root from creating a loop in the mount tree
2014-10-15 06:43:27 +02:00
David S. Miller
2ef1e9efeb Merge branch 'cxgb4'
Anish Bhatt says:

====================
ipv6 and related cleanup for cxgb4/cxgb4i

This patch set removes some duplicated/extraneous code from cxgb4i, guards
cxgb4 against compilation failure based on ipv6 tristate, make ipv6 related
code no longer be enabled by default irrespective of ipv6 tristate and fixes
a refcnt issue.
-Anish

v2 : Provide more detailed commit messages, make subject more concise as
recommended by Dave Miller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15 00:29:08 -04:00
Anish Bhatt
c5bbcb5822 cxgb4i: Remove duplicate call to dst_neigh_lookup()
There is an extra call to dst_neigh_lookup() leftover in cxgb4i that can cause
an unreleased refcnt issue. Remove extraneous call.

Signed-off-by: Anish Bhatt <anish@chelsio.com>

Fixes : 759a0cc5a3 ('cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api')
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15 00:28:59 -04:00
Anish Bhatt
f42bb57c61 cxgb4i : Fix -Wunused-function warning
A bunch of ipv6 related code is left on by default. While this causes no
compilation issues, there is no need to have this enabled by default. Guard
with an ipv6 check, which also takes care of a -Wunused-function warning.

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15 00:28:59 -04:00
Anish Bhatt
1bb60376cd cxgb4 : Fix build failure in cxgb4 when ipv6 is disabled/not in-built
cxgb4 ipv6 does not guard against ipv6 being disabled, or the standard
ipv6 module vs inbuilt tri-state issue. This was fixed for cxgb4i & iw_cxgb4
but missed for cxgb4.

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15 00:28:58 -04:00
Anish Bhatt
587ddfe2d2 cxgb4i : Remove duplicated CLIP handling code
cxgb4 already handles CLIP updates from a previous changeset for iw_cxgb4,
there is no need to have this functionality in cxgb4i. Remove duplicated code

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-15 00:28:58 -04:00
David S. Miller
f4da3628dc sparc64: Fix FPU register corruption with AES crypto offload.
The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
key material is preloaded into the FPU registers, and then we loop
over and over doing the crypt operation, reusing those pre-cooked key
registers.

There are intervening blkcipher*() calls between the crypt operation
calls.  And those might perform memcpy() and thus also try to use the
FPU.

The sparc64 kernel FPU usage mechanism is designed to allow such
recursive uses, but with a catch.

There has to be a trap between the two FPU using threads of control.

The mechanism works by, when the FPU is already in use by the kernel,
allocating a slot for FPU saving at trap time.  Then if, within the
trap handler, we try to use the FPU registers, the pre-trap FPU
register state is saved into the slot.  Then at trap return time we
notice this and restore the pre-trap FPU state.

Over the long term there are various more involved ways we can make
this work, but for a quick fix let's take advantage of the fact that
the situation where this happens is very limited.

All sparc64 chips that support the crypto instructiosn also are using
the Niagara4 memcpy routine, and that routine only uses the FPU for
large copies where we can't get the source aligned properly to a
multiple of 8 bytes.

We look to see if the FPU is already in use in this context, and if so
we use the non-large copy path which only uses integer registers.

Furthermore, we also limit this special logic to when we are doing
kernel copy, rather than a user copy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 19:37:58 -07:00
Michael S. Tsirkin
1bbc260627 virtio-rng: refactor probe error handling
Code like
	vi->vq = NULL;
	kfree(vi)
does not make sense.

Clean it up, use goto error labels for cleanup.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:14 +10:30
Michael S. Tsirkin
5d8f16d08b virtio_scsi: drop scan callback
Enable VQs early like we do for restore.
This makes it possible to drop the scan callback,
moving scanning into the probe function, and making
code simpler.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:14 +10:30
Michael S. Tsirkin
486d2e632c virtio_balloon: enable VQs early on restore
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after resume returns, virtio balloon
violated this rule by adding bufs, which causes the VQ to be used
directly within restore.

To fix, call virtio_device_ready before using VQ.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:13 +10:30
Michael S. Tsirkin
e67423c7b4 virtio_scsi: fix race on device removal
We cancel event work on device removal, but an interrupt
could trigger immediately after this, and queue it
again.

To fix, set a flag.

Loosely based on patch by Paolo Bonzini

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:12 +10:30
Paolo Bonzini
1fa5b2a784 virito_scsi: use freezable WQ for events
Michael S. Tsirkin noticed a race condition:
we reset device on freeze, but system WQ is still
running so it might try adding bufs to a VQ meanwhile.

To fix, switch to handling events from the freezable WQ.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:11 +10:30
Michael S. Tsirkin
e53fbd11e9 virtio_net: enable VQs early on restore
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio net violated this
rule by using receive VQs within restore.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:10 +10:30
Michael S. Tsirkin
401bbdc901 virtio_console: enable VQs early on restore
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after resume returns, virtio console violated this
rule by adding inbufs, which causes the VQ to be used directly within
restore.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:09 +10:30
Michael S. Tsirkin
52c9cf1ac3 virtio_scsi: enable VQs early on restore
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio scsi violated
this rule on restore by kicking event vq within restore.

To fix, call virtio_device_ready before using event queue.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:08 +10:30
Michael S. Tsirkin
6d62c37f19 virtio_blk: enable VQs early on restore
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio block violated
this rule on restore by restarting queues, which might in theory
cause the VQ to be used directly within restore.

To fix, call virtio_device_ready before using starting queues.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:07 +10:30
Michael S. Tsirkin
cd67904895 virtio_scsi: move kick event out from virtscsi_init
We currently kick event within virtscsi_init,
before host is fully initialized.

This can in theory confuse guest if device
consumes the buffers immediately.

To fix,  move virtscsi_kick_event_all out to scan/restore.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:06 +10:30
Michael S. Tsirkin
0246555550 virtio_net: fix use after free on allocation failure
In the extremely unlikely event that driver initialization fails after
RX buffers are added, virtio net frees RX buffers while VQs are
still active, potentially causing device to use a freed buffer.

To fix, reset device first - same as we do on device removal.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:05 +10:30
Michael S. Tsirkin
64b4cc3911 9p/trans_virtio: enable VQs early
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, but virtio 9p device
adds self to channel list within probe, at which point VQ can be
used in violation of the spec.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:04 +10:30
Michael S. Tsirkin
f5866db64f virtio_console: enable VQs early
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, virtio console violated this
rule by adding inbufs, which causes the VQ to be used directly within
probe.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:03 +10:30
Michael S. Tsirkin
7a11370e5e virtio_blk: enable VQs early
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, virtio block violated this
rule by calling add_disk, which causes the VQ to be used directly within
probe.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:02 +10:30
Michael S. Tsirkin
4baf1e33d0 virtio_net: enable VQs early
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after probe returns, virtio net violated this
rule by using receive VQs within probe.

To fix, call virtio_device_ready before using VQs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:02 +10:30
Michael S. Tsirkin
3569db5930 virtio: add API to enable VQs early
virtio spec 0.9.X requires DRIVER_OK to be set before
VQs are used, but some drivers use VQs before probe
function returns.
Since DRIVER_OK is set after probe, this violates the spec.

Even though under virtio 1.0 transitional devices support this
behaviour, we want to make it possible for those early callers to become
spec compliant and eventually support non-transitional devices.

Add API for drivers to call before using VQs.

Sets DRIVER_OK internally.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:01 +10:30
Michael S. Tsirkin
507613bf31 virtio_net: minor cleanup
goto done;
done:
	return;
is ugly, it was put there to make diff review easier.
replace by open-coded return.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:25:00 +10:30
Michael S. Tsirkin
080c637373 virtio-net: drop config_mutex
config_mutex served two purposes: prevent multiple concurrent config
change handlers, and synchronize access to config_enable flag.

Since commit dbf2576e37
    workqueue: make all workqueues non-reentrant
all workqueues are non-reentrant, and config_enable
is now gone.

Get rid of the unnecessary lock.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:59 +10:30
Michael S. Tsirkin
102a2786c9 virtio_net: drop config_enable
Now that virtio core ensures config changes don't arrive during probing,
drop config_enable flag in virtio net.
On removal, flush is now sufficient to guarantee that no change work is
queued.

This help simplify the driver, and will allow setting DRIVER_OK earlier
without losing config change notifications.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:58 +10:30
Michael S. Tsirkin
1f54b0c055 virtio-blk: drop config_mutex
config_mutex served two purposes: prevent multiple concurrent config
change handlers, and synchronize access to config_enable flag.

Since commit dbf2576e37
    workqueue: make all workqueues non-reentrant
all workqueues are non-reentrant, and config_enable
is now gone.

Get rid of the unnecessary lock.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:57 +10:30
Michael S. Tsirkin
cc74f71934 virtio_blk: drop config_enable
Now that virtio core ensures config changes don't
arrive during probing, drop config_enable flag
in virtio blk.
On removal, flush is now sufficient to guarantee that
no change work is queued.

This help simplify the driver, and will allow
setting DRIVER_OK earlier without losing config
change notifications.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:56 +10:30
Michael S. Tsirkin
22b7050a02 virtio: defer config changed notifications
Defer config changed notifications that arrive during
probe/scan/freeze/restore.

This will allow drivers to set DRIVER_OK earlier, without worrying about
racing with config change interrupts.

This change will also benefit old hypervisors (before 2009)
that send interrupts without checking DRIVER_OK: previously,
the callback could race with driver-specific initialization.

This will also help simplify drivers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cosmetic changes)
2014-10-15 10:24:56 +10:30
Michael S. Tsirkin
c6716bae52 virtio-pci: move freeze/restore to virtio core
This is in preparation to extending config changed event handling
in core.
Wrapping these in an API also seems to make for a cleaner code.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:55 +10:30
Michael S. Tsirkin
016c98c6fe virtio: unify config_changed handling
Replace duplicated code in all transports with a single wrapper in
virtio.c.

The only functional change is in virtio_mmio.c: if a buggy device sends
us an interrupt before driver is set, we previously returned IRQ_NONE,
now we return IRQ_HANDLED.

As this must not happen in practice, this does not look like a big deal.

See also commit 3fff0179e3
	virtio-pci: do not oops on config change if driver not loaded.
for the original motivation behind the driver check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:54 +10:30
Michael S. Tsirkin
6fbc198cf6 virtio_pci: fix virtio spec compliance on restore
On restore, virtio pci does the following:
+ set features
+ init vqs etc - device can be used at this point!
+ set ACKNOWLEDGE,DRIVER and DRIVER_OK status bits

This is in violation of the virtio spec, which
requires the following order:
- ACKNOWLEDGE
- DRIVER
- init vqs
- DRIVER_OK

This behaviour will break with hypervisors that assume spec compliant
behaviour.  It seems like a good idea to have this patch applied to
stable branches to reduce the support butden for the hypervisors.

Cc: stable@vger.kernel.org
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-10-15 10:24:53 +10:30
Prarit Bhargava
d3051b489a modules, lock around setting of MODULE_STATE_UNFORMED
A panic was seen in the following sitation.

There are two threads running on the system. The first thread is a system
monitoring thread that is reading /proc/modules. The second thread is
loading and unloading a module (in this example I'm using my simple
dummy-module.ko).  Note, in the "real world" this occurred with the qlogic
driver module.

When doing this, the following panic occurred:

 ------------[ cut here ]------------
 kernel BUG at kernel/module.c:3739!
 invalid opcode: 0000 [#1] SMP
 Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
 CPU: 37 PID: 186343 Comm: cat Tainted: GF          O--------------   3.10.0+ #7
 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
 task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
 RIP: 0010:[<ffffffff810d64c5>]  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
 RSP: 0018:ffff88080fa7fe18  EFLAGS: 00010246
 RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
 RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
 RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
 R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
 R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
 FS:  00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Stack:
  ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
  ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
  ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
 Call Trace:
  [<ffffffff810d666c>] m_show+0x19c/0x1e0
  [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
  [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
  [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
  [<ffffffff811c1a58>] SyS_read+0x58/0xb0
  [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
 Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
 RIP  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
  RSP <ffff88080fa7fe18>

    Consider the two processes running on the system.

    CPU 0 (/proc/modules reader)
    CPU 1 (loading/unloading module)

    CPU 0 opens /proc/modules, and starts displaying data for each module by
    traversing the modules list via fs/seq_file.c:seq_open() and
    fs/seq_file.c:seq_read().  For each module in the modules list, seq_read
    does

            op->start()  <-- this is a pointer to m_start()
            op->show()   <- this is a pointer to m_show()
            op->stop()   <-- this is a pointer to m_stop()

    The m_start(), m_show(), and m_stop() module functions are defined in
    kernel/module.c. The m_start() and m_stop() functions acquire and release
    the module_mutex respectively.

    ie) When reading /proc/modules, the module_mutex is acquired and released
    for each module.

    m_show() is called with the module_mutex held.  It accesses the module
    struct data and attempts to write out module data.  It is in this code
    path that the above BUG_ON() warning is encountered, specifically m_show()
    calls

    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;

            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
    ...

    The other thread, CPU 1, in unloading the module calls the syscall
    delete_module() defined in kernel/module.c.  The module_mutex is acquired
    for a short time, and then released.  free_module() is called without the
    module_mutex.  free_module() then sets mod->state = MODULE_STATE_UNFORMED,
    also without the module_mutex.  Some additional code is called and then the
    module_mutex is reacquired to remove the module from the modules list:

        /* Now we can delete it from the lists */
        mutex_lock(&module_mutex);
        stop_machine(__unlink_module, mod, NULL);
        mutex_unlock(&module_mutex);

This is the sequence of events that leads to the panic.

CPU 1 is removing dummy_module via delete_module().  It acquires the
module_mutex, and then releases it.  CPU 1 has NOT set dummy_module->state to
MODULE_STATE_UNFORMED yet.

CPU 0, which is reading the /proc/modules, acquires the module_mutex and
acquires a pointer to the dummy_module which is still in the modules list.
CPU 0 calls m_show for dummy_module.  The check in m_show() for
MODULE_STATE_UNFORMED passed for dummy_module even though it is being
torn down.

Meanwhile CPU 1, which has been continuing to remove dummy_module without
holding the module_mutex, now calls free_module() and sets
dummy_module->state to MODULE_STATE_UNFORMED.

CPU 0 now calls module_flags() with dummy_module and ...

static char *module_flags(struct module *mod, char *buf)
{
        int bx = 0;

        BUG_ON(mod->state == MODULE_STATE_UNFORMED);

and BOOM.

Acquire and release the module_mutex lock around the setting of
MODULE_STATE_UNFORMED in the teardown path, which should resolve the
problem.

Testing: In the unpatched kernel I can panic the system within 1 minute by
doing

while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done

and

while (true) do cat /proc/modules; done

in separate terminals.

In the patched kernel I was able to run just over one hour without seeing
any issues.  I also verified the output of panic via sysrq-c and the output
of /proc/modules looks correct for all three states for the dummy_module.

        dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
        dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
        dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2014-10-15 10:20:09 +10:30
Eric W. Biederman
0d0826019e mnt: Prevent pivot_root from creating a loop in the mount tree
Andy Lutomirski recently demonstrated that when chroot is used to set
the root path below the path for the new ``root'' passed to pivot_root
the pivot_root system call succeeds and leaks mounts.

In examining the code I see that starting with a new root that is
below the current root in the mount tree will result in a loop in the
mount tree after the mounts are detached and then reattached to one
another.  Resulting in all kinds of ugliness including a leak of that
mounts involved in the leak of the mount loop.

Prevent this problem by ensuring that the new mount is reachable from
the current root of the mount tree.

[Added stable cc.  Fixes CVE-2014-7970.  --Andy]

Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/87bnpmihks.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2014-10-14 14:27:19 -07:00
Eric Dumazet
9b462d02d6 tcp: TCP Small Queues and strange attractors
TCP Small queues tries to keep number of packets in qdisc
as small as possible, and depends on a tasklet to feed following
packets at TX completion time.
Choice of tasklet was driven by latencies requirements.

Then, TCP stack tries to avoid reorders, by locking flows with
outstanding packets in qdisc in a given TX queue.

What can happen is that many flows get attracted by a low performing
TX queue, and cpu servicing TX completion has to feed packets for all of
them, making this cpu 100% busy in softirq mode.

This became particularly visible with latest skb->xmit_more support

Strategy adopted in this patch is to detect when tcp_wfree() is called
from ksoftirqd and let the outstanding queue for this flow being drained
before feeding additional packets, so that skb->ooo_okay can be set
to allow select_queue() to select the optimal queue :

Incoming ACKS are normally handled by different cpus, so this patch
gives more chance for these cpus to take over the burden of feeding
qdisc with future packets.

Tested:

lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 &

lpaa23:~# sar -n DEV 1 10 | grep eth1
06:16:18 AM      eth1 595448.00 1190564.00  38381.09 1760253.12      0.00      0.00      1.00
06:16:19 AM      eth1 594858.00 1189686.00  38340.76 1758952.72      0.00      0.00      0.00
06:16:20 AM      eth1 597017.00 1194019.00  38480.79 1765370.29      0.00      0.00      1.00
06:16:21 AM      eth1 595450.00 1190936.00  38380.19 1760805.05      0.00      0.00      0.00
06:16:22 AM      eth1 596385.00 1193096.00  38442.56 1763976.29      0.00      0.00      1.00
06:16:23 AM      eth1 598155.00 1195978.00  38552.97 1768264.60      0.00      0.00      0.00
06:16:24 AM      eth1 594405.00 1188643.00  38312.57 1757414.89      0.00      0.00      1.00
06:16:25 AM      eth1 593366.00 1187154.00  38252.16 1755195.83      0.00      0.00      0.00
06:16:26 AM      eth1 593188.00 1186118.00  38232.88 1753682.57      0.00      0.00      1.00
06:16:27 AM      eth1 596301.00 1192241.00  38440.94 1762733.09      0.00      0.00      0.00
Average:         eth1 595457.30 1190843.50  38381.69 1760664.84      0.00      0.00      0.50
lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog
 backlog 7606336b 2513p requeues 167982
 backlog 224072b 74p requeues 566
 backlog 581376b 192p requeues 5598
 backlog 181680b 60p requeues 1070
 backlog 5305056b 1753p requeues 110166    // Here, this TX queue is attracting flows
 backlog 157456b 52p requeues 1758
 backlog 672216b 222p requeues 3025
 backlog 60560b 20p requeues 24541
 backlog 448144b 148p requeues 21258

lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect

Immediate jump to full bandwidth, and traffic is properly
shard on all tx queues.

lpaa23:~# sar -n DEV 1 10 | grep eth1
06:16:46 AM      eth1 1397632.00 2795397.00  90081.87 4133031.26      0.00      0.00      1.00
06:16:47 AM      eth1 1396874.00 2793614.00  90032.99 4130385.46      0.00      0.00      0.00
06:16:48 AM      eth1 1395842.00 2791600.00  89966.46 4127409.67      0.00      0.00      1.00
06:16:49 AM      eth1 1395528.00 2791017.00  89946.17 4126551.24      0.00      0.00      0.00
06:16:50 AM      eth1 1397891.00 2795716.00  90098.74 4133497.39      0.00      0.00      1.00
06:16:51 AM      eth1 1394951.00 2789984.00  89908.96 4125022.51      0.00      0.00      0.00
06:16:52 AM      eth1 1394608.00 2789190.00  89886.90 4123851.36      0.00      0.00      1.00
06:16:53 AM      eth1 1395314.00 2790653.00  89934.33 4125983.09      0.00      0.00      0.00
06:16:54 AM      eth1 1396115.00 2792276.00  89984.25 4128411.21      0.00      0.00      1.00
06:16:55 AM      eth1 1396829.00 2793523.00  90030.19 4130250.28      0.00      0.00      0.00
Average:         eth1 1396158.40 2792297.00  89987.09 4128439.35      0.00      0.00      0.50

lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog
 backlog 7900052b 2609p requeues 173287
 backlog 878120b 290p requeues 589
 backlog 1068884b 354p requeues 5621
 backlog 996212b 329p requeues 1088
 backlog 984100b 325p requeues 115316
 backlog 956848b 316p requeues 1781
 backlog 1080996b 357p requeues 3047
 backlog 975016b 322p requeues 24571
 backlog 990156b 327p requeues 21274

(All 8 TX queues get a fair share of the traffic)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 17:16:26 -04:00