Commit Graph

361645 Commits

Author SHA1 Message Date
Dave Airlie
e9f211ad7d Merge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next
restore debugfs vbios, fix multiple actions with supervisor intrs

* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau: restore debugfs/vbios.rom support
  drm/nv50-/kms: remove UPDATE methods after each encoder disconnect
  drm/nvd0/disp: handle multiple actions from one set of supervisor intrs
  drm/nv50/disp: handle multiple actions from one set of supervisor intrs
2013-02-21 07:13:56 +10:00
Nicholas Bellinger
7b745c84a9 target/file: Add WRITE_SAME w/ UNMAP=0 emulation support
This patch adds support for emulation of WRITE_SAME w/ UNMAP=0 within
fd_execute_write_same() backend code.

The emulation uses vfs_writev() to submit a locally populated buffer
from the received WRITE_SAME scatterlist block for duplication, and by
default enforces a limit of max_write_same_len=0x1000 (8192) sectors up
to the limit of 1024 iovec entries for the single call to vfs_writev().

It also sets max_write_same_len to the operational default at setup ->
fd_configure_device() time.

Tested with 512, 1k, 2k, and 4k block_sizes.

(asias: convert to vzalloc)

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-02-20 12:29:04 -08:00
YOSHIFUJI Hideaki / 吉藤英明
ecd9883724 ipv6: fix race condition regarding dst->expires and dst->from.
Eric Dumazet wrote:
| Some strange crashes happen in rt6_check_expired(), with access
| to random addresses.
|
| At first glance, it looks like the RTF_EXPIRES and
| stuff added in commit 1716a96101
| (ipv6: fix problem with expired dst cache)
| are racy : same dst could be manipulated at the same time
| on different cpus.
|
| At some point, our stack believes rt->dst.from contains a dst pointer,
| while its really a jiffie value (as rt->dst.expires shares the same area
| of memory)
|
| rt6_update_expires() should be fixed, or am I missing something ?
|
| CC Neil because of https://bugzilla.redhat.com/show_bug.cgi?id=892060

Because we do not have any locks for dst_entry, we cannot change
essential structure in the entry; e.g., we cannot change reference
to other entity.

To fix this issue, split 'from' and 'expires' field in dst_entry
out of union.  Once it is 'from' is assigned in the constructor,
keep the reference until the very last stage of the life time of
the object.

Of course, it is unsafe to change 'from', so make rt6_set_from simple
just for fresh entries.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Neil Horman <nhorman@tuxdriver.com>
CC: Gao Feng <gaofeng@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: Steinar H. Gunderson <sesse@google.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-20 15:11:45 -05:00
Amerigo Wang
68534c682e net: fix a wrong assignment in skb_split()
commit c9af6db4c1 (net: Fix possible wrong checksum generation)
has a suspicous piece:

	-       skb_shinfo(skb1)->gso_type = skb_shinfo(skb)->gso_type;
	-
	+       skb_shinfo(skb)->tx_flags = skb_shinfo(skb1)->tx_flags & SKBTX_SHARED_FRAG;

skb1 is the new skb, therefore should be on the left side of the assignment.
This patch fixes it.

Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-20 15:11:44 -05:00
Linus Torvalds
55529fa576 EDAC updates for 3.9
Only one: AMD F16h MCE decoding enablement from Jacob Shin.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRI2t7AAoJEBLB8Bhh3lVKmLAP/RbYNVVvLZZxXqB6heCkX7D3
 Avhi4GtuHqjY+79nqBoJEH9HjgzBzL6ffsJGPF3gtXrWybi6nzZkqog5QGRQFfGS
 gIoldbfP9MMF/RrMHbUF5x9Ha9EcdudDgYE3eqCq6KUfOMlUrBRECaU5YRrB+FKR
 qsuQNmF6J3uMPYwO9oGhTpA/vPwwapFaKgm+KND8q5h5JcpJrRpooKNQc1DheW+x
 nfYFpmSBW8eamVwV5DTjqKVhKeG3gB/3i6uSTpLvh4i0ZmV2vQ+drwC3aU0Eh0Q4
 wnslEpQENjODsvbZH6b0j2WTDgBGKEpALRkMUDTnnqvYrR96cV02MWUfp6cbKYCv
 +d/iPt3hEBCyDTDzfP66N+1b+Wqls4Dx8lhhqLdPhsIpY2Qu8OQ8/mjyIIs12qK5
 g9w3lsfy3shWmdsK/Ehd0IhNt1bzNV+63CINA2isC2QAp0Eqecj3jKrXor+MTPu1
 uRY3wM+8CrdeFNNhhhsMQYIb4+DFGLgMwY5mV6z8ceFJAjxvyUxjObnSMopjBKog
 9+JcyEHCADbpHsRup/zRIoGtSs+44sQ6l8l7pq6d4K4UfkG7Biq9lcxThID7rImn
 Rl6MOJVmJuM5n9JVIU4/bmhjfuTcI+gdshexDEH2HnqaXxd7ceIbnrFWFaBZEO5t
 t3WasBuIb30g1XCQmNT6
 =LbcU
 -----END PGP SIGNATURE-----

Merge tag 'edac_3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:
 "Mostly AMD's side of EDAC.  It is basically a new family enablement
  stuff: AMD F16h MCE decoding enablement from Jacob Shin.  The rest is
  trivial cleanups."

* tag 'edac_3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  mpc85xx_edac: Fix typo
  EDAC, MCE, AMD: Remove unneeded exports
  EDAC, MCE, AMD: Add MCE decoding support for Family 16h
  EDAC, MCE, AMD: Make MC2 decoding per-family
  amd64_edac: Remove dead code
2013-02-20 11:28:50 -08:00
Linus Torvalds
8793422fd9 ACPI and power management updates for 3.9-rc1
- Rework of the ACPI namespace scanning code from Rafael J. Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.
 
 - ACPI power resources handling and ACPI device PM update from
   Rafael J. Wysocki.
 
 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng
   with contributions from Aaron Lu, Chao Guan, Jesper Juhl, and
   Tim Gardner.
 
 - Support for Intel Lynxpoint LPSS from Mika Westerberg.
 
 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.
 
 - cpuidle fixes and cleanups from Daniel Lezcano.
 
 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri
   with contributions from Stratos Karafotis and Rickard Andersson.
 
 - Intel P-states driver for Sandy Bridge processors from
   Dirk Brandewie.
 
 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.
 
 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.
 
 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.
 
 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.
 
 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.
 
 - Support for "lightweight suspend" from Zhang Rui.
 
 - Removal of the deprecated power trace API from Paul Gortmaker.
 
 - Assorted updates from Andreas Fleig, Colin Ian King,
   Davidlohr Bueso, Joseph Salisbury, Kees Cook, Li Fei,
   Nishanth Menon, ShuoX Liu, Srinivas Pandruvada, Tejun Heo,
   Thomas Renninger, and Yasuaki Ishimatsu.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJRIsArAAoJEKhOf7ml8uNsD6MP/j7C4NA+GTq6RdwoJt+Yki0K
 9Ep8I4pEuRFoN/oskv24EyQhpGJIk6UxWcJ/DWFBc+1VhmKORta7k2Idv/wlJA77
 s7AcDveA9xcDh+TVfbh87TeuiMSXiSdDZbiaQO+wMizWJAF3F84AnjiAqqqyQcSK
 bA5/Siz/vWlt9PyYDaQtHTVE4lpvPuVcQdYewsdaH2PsmUjvIg/TUzg28CTrdyvv
 eHOdBK9R0/OLQLhzRbL0VOGJ//wEl+HJRO0QEhTKPgdQ1e/VH/4Zu5WSzF8P/x4C
 s2f8U4IKQqulDuDHXtpMpelFm7hRWgsOqZLkcyXLs+0dvSM9CTPO6P0ZaImxUctk
 5daHWEsXUnCErDQawt1mcZP8l6qnxofMQIfLXyPVzvlSnHyToTmrtXa1v2u4AuL/
 hOo4MYWsFNUmRdtGFFGlExGgEDZ4G5NwiYjRBl/6XJ3v4nhnnMbuzxP8scpoe5m1
 8tjroJHZFUUs/mFU/H+oRbHzSzXPmp1sddNaTg4OpVmTn3DDh6ljnFhiItd1Ndw0
 5ldVbSe6ETq5RoK0TbzvQOeVpa9F3JfqbrXLQPqfd2iz/No41LQYG1uShRYuXKuA
 wfEcc+c9VMd3FILu05pGwBnU8VS9VbxTYMz7xDxg6b29Ywnb7u+Q1ycCk2gFYtkS
 E2oZDuyewTJxaskzYsNr
 =wijn
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:

 - Rework of the ACPI namespace scanning code from Rafael J.  Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.

 - ACPI power resources handling and ACPI device PM update from Rafael
   J Wysocki.

 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng with
   contributions from Aaron Lu, Chao Guan, Jesper Juhl, and Tim Gardner.

 - Support for Intel Lynxpoint LPSS from Mika Westerberg.

 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.

 - cpuidle fixes and cleanups from Daniel Lezcano.

 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri with
   contributions from Stratos Karafotis and Rickard Andersson.

 - Intel P-states driver for Sandy Bridge processors from Dirk
   Brandewie.

 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.

 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.

 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.

 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.

 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.

 - Support for "lightweight suspend" from Zhang Rui.

 - Removal of the deprecated power trace API from Paul Gortmaker.

 - Assorted updates from Andreas Fleig, Colin Ian King, Davidlohr Bueso,
   Joseph Salisbury, Kees Cook, Li Fei, Nishanth Menon, ShuoX Liu,
   Srinivas Pandruvada, Tejun Heo, Thomas Renninger, and Yasuaki
   Ishimatsu.

* tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (267 commits)
  PM idle: remove global declaration of pm_idle
  unicore32 idle: delete stray pm_idle comment
  openrisc idle: delete pm_idle
  mn10300 idle: delete pm_idle
  microblaze idle: delete pm_idle
  m32r idle: delete pm_idle, and other dead idle code
  ia64 idle: delete pm_idle
  cris idle: delete idle and pm_idle
  ARM64 idle: delete pm_idle
  ARM idle: delete pm_idle
  blackfin idle: delete pm_idle
  sparc idle: rename pm_idle to sparc_idle
  sh idle: rename global pm_idle to static sh_idle
  x86 idle: rename global pm_idle to static x86_idle
  APM idle: register apm_cpu_idle via cpuidle
  cpufreq / intel_pstate: Add kernel command line option disable intel_pstate.
  cpufreq / intel_pstate: Change to disallow module build
  tools/power turbostat: display SMI count by default
  intel_idle: export both C1 and C1E
  ACPI / hotplug: Fix concurrency issues and memory leaks
  ...
2013-02-20 11:26:56 -08:00
Zach Brown
24542bf7ea btrfs: limit fallocate extent reservation to 256MB
Very large fallocate requests are cpu bound and result in extents with a
repeating pattern of ever decreasing size:

$ time fallocate -l 1T file
real	0m13.039s

( an excerpt of the extents from btrfs-debug-tree: )
  prealloc data disk byte 1536292564992 nr 397312
  prealloc data disk byte 1536292962304 nr 196608
  prealloc data disk byte 1536293158912 nr 98304
  prealloc data disk byte 1536293257216 nr 49152
  prealloc data disk byte 1536293306368 nr 24576
  prealloc data disk byte 1536293330944 nr 12288
  prealloc data disk byte 1536293343232 nr 8192
  prealloc data disk byte 1536293351424 nr 4096
  prealloc data disk byte 1536293355520 nr 4096
  prealloc data disk byte 1536293359616 nr 4096

The excessive cpu use comes from __btrfs_prealloc_file_range() trying to
allocate the entire remaining size after each extent is allocated.
btrfs_reserve_extent() repeatedly cuts this requested size in half until
it gets down to the size that the allocators can return.  We limit the
problem for now by capping each reservation at 256 meg.

The small extents come from a masking bug when decreasing the requested
reservation size.  The high 32bits are cleared and the remaining low
bits might happen to reserve a small size.   Fix this by using
round_down() which properly casts the mask.

After these fixes huge fallocate requests are fast and result in nice
large extents:

$ time fallocate -l 1T file
real	0m0.082s

  prealloc data disk byte 1112425889792 nr 268435456
  prealloc data disk byte 1112694325248 nr 268435456
  prealloc data disk byte 1112962760704 nr 268435456

Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-02-20 14:06:25 -05:00
Thomas Gleixner
1cba0cdf5e btrfs: Init io_lock after cloning btrfs device struct
__btrfs_close_devices() clones btrfs device structs with
memcpy(). Some of the fields in the clone are reinitialized, but it's
missing to init io_lock. In mainline this goes unnoticed, but on RT it
leaves the plist pointing to the original about to be freed lock
struct.

Initialize io_lock after cloning, so no references to the original
struct are left.

Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-02-20 14:06:20 -05:00
Chris Mason
e942f883bc Merge branch 'raid56-experimental' into for-linus-3.9
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Conflicts:
	fs/btrfs/ctree.h
	fs/btrfs/extent-tree.c
	fs/btrfs/inode.c
	fs/btrfs/volumes.c
2013-02-20 14:06:05 -05:00
Chris Mason
b2c6b3e061 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next into for-linus-3.9
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Conflicts:
	fs/btrfs/disk-io.c
2013-02-20 14:05:45 -05:00
Linus Torvalds
b3cdda2b4f Device tree changes for v3.9
All around device tree changes destined for v3.8. Aside from the
 documentation updates the highlights in this branch include:
 - Kbuild changes for using CPP with .dts files
 - locking fix from preempt_rt patchset
 - include DT alias names in device uevent
 - Selftest bugfixes and improvements
 - New function for counting phandles stanzas in a property
 - constify argument to of_node_full_name()
 - Various bug fixes
 
 This tree did also contain a commit to use platform_device_add instead
 of open-coding the device add code, but it caused problems with amba
 devices and needed to be reverted.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRI3ZoAAoJEEFnBt12D9kBW0EP/2hTN9cS3b0CMyhh+PVUWZKu
 U+pTMbXBfomYC/9vWEBEpnYQSZuBXA+Sow3ubkRk6p6qjwYi0NUuAE4reQHLwvji
 u9nL7v9WNr4WXqUpMSgSzoxzPkvr2mfrHWRU2adaKpr+p4UvjbYNG1SxppqTJGji
 HThYNsgfdFzNvO7xtFTJGpMe3UhWfazdnVc/rg1csqex2UCZMqmSf1VjNqQIGt+t
 zH6jcCSZY96rX9f+HgdL9rvZyGSjDSIHRllpuG+8u5N3N1CSzbKPe4zSia3mlsC3
 g6g3bOihGJYeG2sc1RzHSdI6ANCn3RTuuA4xQBe/xCKvZIMRNNtzsf2Kbbah0ISG
 NW1WW3KRnq85sEdwv9gtFeMoalZ/sTV1O9m3vG9Xz2XgzWgf7c0V/7ukpFuTpQby
 NiFoTbc7K2E8J/fa8NhKfR4myzNKr3peJ6mJEMMn6PkdQwnOh1AC9l5iuDzMsdvk
 IGY8YvR1qY32IW68E42JQdteZP45EBzEgD9NjU7gRGI6nu2g5czv/VFztjiom4qd
 XahvdsfuVlCitRG8g2CHgBtEsjStYUmCa+gnIoycX7HhMShwYRX/cxA7Yife2UGV
 k+GUKCkGpHOLIoiAEHq+BdEv7amuJsqglJ5kvjL01m80k7JDGJqq8H9UUZ6yOCqe
 iXERO1R4HezNVFtMDrLO
 =Zlkx
 -----END PGP SIGNATURE-----

Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux

Pull device tree changes from Grant Likely:
 "All around device tree changes destined for v3.8.  Aside from the
  documentation updates the highlights in this branch include:

   - Kbuild changes for using CPP with .dts files
   - locking fix from preempt_rt patchset
   - include DT alias names in device uevent
   - Selftest bugfixes and improvements
   - New function for counting phandles stanzas in a property
   - constify argument to of_node_full_name()
   - Various bug fixes

  This tree did also contain a commit to use platform_device_add instead
  of open-coding the device add code, but it caused problems with amba
  devices and needed to be reverted."

* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux: (23 commits)
  Revert "of: use platform_device_add"
  kbuild: limit dtc+cpp include path
  gpio: Make of_count_named_gpios() use new of_count_phandle_with_args()
  of: Create function for counting number of phandles in a property
  of/base: Clean up exit paths for of_parse_phandle_with_args()
  of/selftest: Use selftest() macro throughout
  of/selftest: Fix GPIOs selftest to cover the 7th case
  of: fix recursive locking in of_get_next_available_child()
  documentation/devicetree: Fix a typo in exynos-dw-mshc.txt
  OF: convert devtree lock from rw_lock to raw spinlock
  of/exynos_g2d: Add Bindings for exynos G2D driver
  kbuild: create a rule to run the pre-processor on *.dts files
  input: Extend matrix-keypad device tree binding
  devicetree: Move NS2 LEDs binding into LEDs directory
  of: use platform_device_add
  powerpc/5200: Fix size to request_mem_region() call
  documentation/devicetree: Fix typos
  of: add 'const' to of_node_full_name parameter
  of: Output devicetree alias names in uevent
  DT: add vendor prefixes for Renesas and Toshiba
  ...
2013-02-20 11:04:46 -08:00
Linus Torvalds
3aad3f03b2 SPI changes for v3.9
Changes to both core spi code and spi device drivers. The driver
 changes are the usual set of bug fixes and platform enablement. Core
 code changes include:
 - More intelligent assignment of SPI bus numbers when using DT
 - Common mechanism for using gpios as CS lines
 - Pull checks for bits_per_word and transfer speed out of drivers and
   into core code
 - Ensure temporary DMA buffers are DMA safe
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRI3g7AAoJEEFnBt12D9kB0acQAIaAOqN0Hvae/EtIl7Z40k3q
 DgLuFh6lSzhp690rbM/ULYQtJwxEyIvl0dMRPzMGcw+/tfbZIVEc8VmKJKu6/Qlr
 eMadyxlOtMn/KOcp0e8Yirhgg9o7aeeBo/tpF0jptu6VnVA4eJRMC/Z1EceJaF0+
 cWlwkd/53nzYyg+HLPHWRGiN0JjmC3qzFEBBosWRSBYH1qgjdkSXt7qCxQTHBM8x
 EfjSdIE2O5p4itJ+//M7qV1+MraPvjrFh3wdqIve+XCudqfjy/yUsxMu0TF2zkmd
 mim35J0q/2raxqtA4gc+kfhQsfmiSXXJ0gi054Z/PE9OnS0Ed4/+z5j2aIJDZ41W
 oOrac10Xs1Krdi1zMx0fMoRplKV7xsjXEJ/xDl4k/Gwa86q/3etyQbr+uAG0VCQb
 zN3a3iHx8TK+69sbMJwdImR8Wm5hWEqK0HGYFoV0F31oIeB1Z/pXoqUmpuI2J9Nb
 S2CZb2Ed1KEOokCQjF89m4eOjk8AwAqw6KHBGIY0oCqhvVbmKPXkG/pRZ8d11whV
 fm51MS4YWKgNw1+agExW+62bfJ0WQpu2CvSzgx7lXiEytxKcw7vCxW8Etp3+Ybs+
 1ZF95OnzhiCqwUfulfQ7br6Ax03ATMFJ/Xm63i8b4m2n1TOYLq3/Bn+POaL2K6o8
 chFiIdp7GCGqwT/Tgahy
 =DR18
 -----END PGP SIGNATURE-----

Merge tag 'spi-for-linus' of git://git.secretlab.ca/git/linux

Pull SPI changes from Grant Likely:
 "Changes to both core spi code and spi device drivers.  The driver
  changes are the usual set of bug fixes and platform enablement.

  Core code changes include:

   - More intelligent assignment of SPI bus numbers when using DT

   - Common mechanism for using gpios as CS lines

   - Pull checks for bits_per_word and transfer speed out of drivers and
     into core code

   - Ensure temporary DMA buffers are DMA safe"

* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux: (50 commits)
  spi: Document cs_gpios and cs_gpio in kernel-doc
  spi/of: Fix initialization of cs_gpios array
  spi/pxa2xx: add support for Lynxpoint SPI controllers
  spi/pxa2xx: add support for Intel Low Power Subsystem SPI
  spi/pxa2xx: add support for SPI_LOOP
  spi/pxa2xx: add support for runtime PM
  spi/pxa2xx: add support for DMA engine
  spi/pxa2xx: break out the private DMA API usage into a separate file
  spi/ath79: add shutdown handler
  spi/mips-lantiq: set SPI_MASTER_HALF_DUPLEX flag
  spi/mips-lantiq: make use of spi_finalize_current_message
  spi/bcm63xx: work around inability to keep CS up
  spi/davinci: use request_threaded_irq() to fix deadlock
  spi/orion: Use module_platform_driver()
  spi/bcm63xx: reject transfers unable to transfer
  spi: Ensure memory used for spi_write_then_read() is DMA safe
  spi/spi-mpc512x-psc: init mode bits supported by the driver
  spi/mpc512x-psc: don't use obsolet cell-index property
  spi: Remove erroneous __init, __exit and __exit_p() references in drivers
  spi/s3c64xx: fix checkpatch warnings and error
  ...
2013-02-20 11:03:22 -08:00
Linus Torvalds
10b6339e93 The common clock framework changes for 3.9 are almost entirely fixes.
None are dire enough to be Cc'd to stable which may be interpreted to
 mean that users of the framework are reaching stability.  Lots of new
 adoption of this framework is via DeviceTree data and that comes through
 the respective architecture and platform trees instead of through the
 clk framework tree.  Two new features are improved debugfs output and an
 improvement to how DT clocks are initialized by reusing a common method.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRI81VAAoJEDqPOy9afJhJzXIQAMZX0DFBio0k5yxmlUS4r+I+
 LV6aRyyvdafVZwx16oTXE4SkGYkTUKrRfD+EAhZA+UsWdlWrAUE623dnBpvDXmkF
 gR+sRrhhzfF2754mWDVDHqFINNgqIR2d1pMNbCXLP85CZLWvkUISV1hnREB5pwfj
 W0T2hohtBe7XtXtsCTCqoLke+D2jrSqg9obcMDyEAM3R2K42nujsTcp0STi/NJvS
 i1YbbYgq6E1/2SNGDOf0YYZscxUGNOI2MN7OckVk1c5781hzKdyeiLiMzY5v6UvT
 aTPXiqetO6UlnCjUPZ1edluHRELVTvrO3qwEqmWfqky6MbcE5uxBqpNabF614ozZ
 eywTXd4xSBKY5Z1WfYn4pgukIvqGsqgGxLaxazUQGR76B66Uk0I3Rca+vSy/O652
 ry0Ejcc67dGOPwYuKkvuwuDwrSSOoRF3q6H99SRLjN/1YopksICVkMU8YeiuUNGJ
 Rb5mD37dfeLhJYLG0mhUdO5olAOsDQJN/nNLQJTAPCpPt+nQURpOaOI+LC9xV6Rr
 pNeA9m1t1hWpfBeYdnUL7y+IUv7XYgRFhjFpRRVjfTOK5nym6MR5OJgdtCv/Kzt4
 tYEJiT5/OgJkUBnCWz5yA+jxYEVxkAeJqT+UMht2a98NYq4rfIiJmJsUj+op69bD
 P2S+GUp0NlpgeNXkjO2s
 =1pl+
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull clock framework update from Michael Turquette:
 "The common clock framework changes for 3.9 are almost entirely fixes.

  None are dire enough to be Cc'd to stable which may be interpreted to
  mean that users of the framework are reaching stability.  Lots of new
  adoption of this framework is via DeviceTree data and that comes
  through the respective architecture and platform trees instead of
  through the clk framework tree.

  Two new features are improved debugfs output and an improvement to how
  DT clocks are initialized by reusing a common method."

* tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux: (25 commits)
  clk: sunxi: remove stale Makefile entry
  clk: vexpress: Use common of_clk_init() function
  clk: zynq: Use common of_clk_init() function
  clk: vt8500: Use common of_clk_init() function
  clk: highbank: Use common of_clk_init() function
  clk: sunxi: Use common of_clk_init() function
  clk: add common of_clk_init() function
  clk: Deduplicate exit code in clk_set_rate
  clk: beautify Makefile
  clk-divider: fix macros
  clk: prima2: enable dt-binding clkdev mapping
  clk: mxs: Index is always positive
  clk: max77686: Avoid double free at remove time
  clk: remove exported function from __init section
  clk: vt8500: Add support for WM8750/WM8850 PLL clocks
  clk: vt8500: Fix division-by-0 when requested rate=0
  clk: vt8500: Fix device clock divisor calculations
  clk: vt8500: Fix error in PLL calculations on non-exact match.
  clk: max77686: Remove unnecessary NULL checking for container_of()
  clk: JSON debugfs clock tree summary
  ...
2013-02-20 11:02:10 -08:00
Linus Torvalds
c6699b58f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "Two new touchpad drivers - Cypress APA I2C Trackpad and Cypress PS/2
  touchpad and a big update to ALPS driver from Kevin Cernekee that adds
  support for "Rushmore" touchpads and paves way for adding support for
  "Dolphin" touchpads.

  There is also a new input driver for Goldfish emulator and also
  Android keyreset driver was folded into SysRq code.

  A few more drivers were updated with device tree bindings and others
  got some small cleanups and fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (55 commits)
  Input: cyttsp-spi - remove duplicate MODULE_ALIAS()
  Input: tsc2005 - add MODULE_ALIAS
  Input: tegra-kbc - require CONFIG_OF, remove platform data
  Input: synaptics - initialize pointer emulation usage
  Input: MT - do not apply filtering on emulated events
  Input: bma150 - make some defines public and fix some comments
  Input: bma150 - fix checking pm_runtime_get_sync() return value
  Input: ALPS - enable trackstick on Rushmore touchpads
  Input: ALPS - add support for "Rushmore" touchpads
  Input: ALPS - make the V3 packet field decoder "pluggable"
  Input: ALPS - move pixel and bitmap info into alps_data struct
  Input: ALPS - fix command mode check
  Input: ALPS - rework detection of Pinnacle AGx touchpads
  Input: ALPS - move {addr,nibble}_command settings into alps_set_defaults()
  Input: ALPS - use function pointers for different protocol handlers
  Input: ALPS - rework detection sequence
  Input: ALPS - introduce helper function for repeated commands
  Input: ALPS - move alps_get_model() down below hw_init code
  Input: ALPS - copy "model" info into alps_data struct
  Input: ALPS - document the alps.h data structures
  ...
2013-02-20 11:00:43 -08:00
Mauro Carvalho Chehab
1339730e73 Linux 3.8-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJRFWw4AAoJEHm+PkMAQRiGnTAH/jBHA2umNc3lc7VkUpusve4q
 GGIlNzYh6iuvIGwKQVj9YPsl37qtQnkDUmY8f8WxZjfxiIDw3TkRKDgNLJaM3Jy5
 E426/FGlRx/Iia5+4tuBeoVYMoIPnndgW5lEAMRK1SvhTByhIYAXsaM0UwPBetb+
 Z5NMdH1f1HVF7RCCmHAkzEk9z78UpdeCzI0t0XuasP2hp2ARAcE1okdO7fNaLiyo
 EmenGhRvy9bAsbRwV0rCdl0rQiZXEYM353DWS7n6q4fyVm8MXFwloUxnWCJTzOIL
 ZLJaz18adFj7Ip/X6ksnMQiQU2Q3B7aThs5ycv0QGxxL2rDFveYRRQ5ICrXOy3M=
 =jjBc
 -----END PGP SIGNATURE-----

Merge tag 'v3.8-rc7' into next

Linux 3.8-rc7

* tag 'v3.8-rc7': (12052 commits)
  Linux 3.8-rc7
  net: sctp: sctp_endpoint_free: zero out secret key data
  net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
  atm/iphase: rename fregt_t -> ffreg_t
  ARM: 7641/1: memory: fix broken mmap by ensuring TASK_UNMAPPED_BASE is aligned
  ARM: DMA mapping: fix bad atomic test
  ARM: realview: ensure that we have sufficient IRQs available
  ARM: GIC: fix GIC cpumask initialization
  net: usb: fix regression from FLAG_NOARP code
  l2tp: dont play with skb->truesize
  net: sctp: sctp_auth_key_put: use kzfree instead of kfree
  netback: correct netbk_tx_err to handle wrap around.
  xen/netback: free already allocated memory on failure in xen_netbk_get_requests
  xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
  xen/netback: shutdown the ring if it contains garbage.
  drm/ttm: fix fence locking in ttm_buffer_object_transfer, 2nd try
  virtio_console: Don't access uninitialized data.
  net: qmi_wwan: add more Huawei devices, including E320
  net: cdc_ncm: add another Huawei vendor specific device
  ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
  ...
2013-02-20 15:45:52 -03:00
Markus F.X.J. Oberhumer
0ec7382036 crypto: testmgr - update LZO compression test vectors
Update the LZO compression test vectors according to the latest compressor
version.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
2013-02-20 19:36:02 +01:00
Markus F.X.J. Oberhumer
8b975bd3f9 lib/lzo: Update LZO compression to current upstream version
This commit updates the kernel LZO code to the current upsteam version
which features a significant speed improvement - benchmarking the Calgary
and Silesia test corpora typically shows a doubled performance in
both compression and decompression on modern i386/x86_64/powerpc machines.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
2013-02-20 19:36:01 +01:00
Markus F.X.J. Oberhumer
b6bec26cea lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
Rename the source file to match the function name and thereby
also make room for a possible future even slightly faster
"non-safe" decompressor version.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
2013-02-20 19:36:00 +01:00
Greg Kroah-Hartman
6166805c3d Revert "USB: EHCI: make ehci-vt8500 a separate driver"
This reverts commit d57ada0c37.

All of these are wrong and need to be reverted for now.

Cc: Manjunath Goudar <manjunath.goudar@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tony Prisk <linux@prisktech.co.nz>
Cc: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-20 10:26:31 -08:00
Greg Kroah-Hartman
04867125e1 Revert "USB: EHCI: make ehci-orion a separate driver"
This reverts commit 6ed3c43d05.

All of these are wrong, and need to be reverted for now.

Cc: Manjunath Goudar <manjunath.goudar@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-20 10:25:44 -08:00
Greg Kroah-Hartman
e9a92b2b37 Revert "USB: update host controller Kconfig entries"
This reverts commit e2ced16661.

All of these are wrong, and need to be removed for now until they can
get reworked properly.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-20 10:25:05 -08:00
Linus Torvalds
5a1203914a - Four new drivers:
o goldfish_battery:
     This is Android Emulator battery driver. Originally from Google, but
     Intel folks reshaped it for mainline;
 
   o pm2301_charger:
     A new driver for ST-Ericsson 2301 Power Management chip, uses AB8500
     battery management core;
 
   o qnap-poweroff:
     The driver adds poweroff functionality for QNAP NAS boxes;
 
   o restart-poweroff:
     A generic driver that implements 'power off by restarting'. The actual
     poweroff functionality is implemented through a bootloader, so Linux'
     task is just to restart the box. The driver is useful on Buffalo
     Linkstation LS-XHL and LS-CHLv2 boards. Andrew Lunn worked on
     submitting the driver (as well as qnap-poweroff above).
 
 - A lot of fixes for ab8500 drivers. This is a part of efforts of syncing
   internal ST-Ericsson development tree with the mainline. Lee Jones @
   Linaro worked on compilation and reshaping these series;
 
 - New health properties for the power supplies: "Watchdog timer expire"
   and "Safety timer expire";
 
 - As usual, a bunch of fixes/cleanups here and there.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJRIww1AAoJEGgI9fZJve1bJ7sP/1hR8SYQZ/k0ayxA/kIxNEDA
 Qa2sAcdP5pa4F/A2Wi5MTrckSHuLlRRigWRCPM09emVOvt6g2MdM3Eq6ShuYb2p0
 g2bX0F9KBEa9gdiJC1wNVe9DilXbrJsk86V5E+ZkuVnNzM3S1FA1sPWGhdH66YVq
 9IKxm2XntyXFpuvhSYE01KQoFY+Crv1NEGXoIR7tB6Q1KOf3XswnsFwdKJ95IO5r
 8f/lVYRAYUfeplhrYd3eJTtb/GHhK7CutFV/E6kWlumGCWNVH2xt+dTlAmTUdQvO
 jKe9R5iq5eXFTQmN8QZDH7QdXO1DdmJZ9F112hIAud/93bmNrnc0WqMKM7CFHkfH
 S6ptGgeIdpGEM1KMgzKTNrQeNIYsVNtqbEiIiodR8S4e0cX2pb2n5IfzmELIVc0n
 QMZylORpGaH1lBxTE2RoT4raGeWF5jH0t0b0YRqccRPvQ4VjCJslDOq9IOETUe+/
 YxWpELJigW7FGLMfx0SB2/3s1P4EUaufWvHANCVaPuBsOZga854R4dzms4hKyKsL
 JQg+OKp/Bt78yAj/8/n0FWdxgldp/NjF4485ITQ1eOZyg8VSeIp93Lk6lgSTukux
 R1xLruxfPgdynpmYFCGF99JG2G8RmRuJFSTf8JttvyztkAxQw6zBtdmdiPk38wst
 qpUxdPpNDIvt65Ix7x+D
 =955B
 -----END PGP SIGNATURE-----

Merge tag 'for-v3.9' of git://git.infradead.org/battery-2.6

Pull battery updates from Anton Vorontsov:
 "Four new drivers:

   - goldfish_battery:

     This is Android Emulator battery driver.  Originally from Google,
     but Intel folks reshaped it for mainline

   - pm2301_charger:

     A new driver for ST-Ericsson 2301 Power Management chip, uses
     AB8500 battery management core

   - qnap-poweroff:

     The driver adds poweroff functionality for QNAP NAS boxes

   - restart-poweroff:

     A generic driver that implements 'power off by restarting'.  The
     actual poweroff functionality is implemented through a bootloader,
     so Linux' task is just to restart the box.  The driver is useful on
     Buffalo Linkstation LS-XHL and LS-CHLv2 boards.  Andrew Lunn worked
     on submitting the driver (as well as qnap-poweroff above).

  Additionally:

   - A lot of fixes for ab8500 drivers.  This is a part of efforts of
     syncing internal ST-Ericsson development tree with the mainline.
     Lee Jones @ Linaro worked on compilation and reshaping these
     series.

   - New health properties for the power supplies: "Watchdog timer
     expire" and "Safety timer expire"

   - As usual, a bunch of fixes/cleanups here and there"

* tag 'for-v3.9' of git://git.infradead.org/battery-2.6: (81 commits)
  bq2415x_charger: Add support for offline and 100mA mode
  generic-adc-battery: Fix forever loop in gab_remove()
  goldfish_battery: Add missing GENERIC_HARDIRQS dependency
  da9030_battery: Include notifier.h
  bq27x00_battery: Fix reporting battery temperature
  power/reset: Remove newly introduced __dev* annotations
  lp8727_charger: Small cleanup in naming
  ab8500_btemp: Demote initcall sequence
  ds2782_battery: Add power_supply_changed() calls for proper uevent support
  power: Add battery driver for goldfish emulator
  u8500-charger: Delay for USB enumeration
  ab8500-bm: Remove individual [charger|btemp|fg|chargalg] pdata structures
  ab8500-charger: Do not touch VBUSOVV bits
  ab8500-fg: Use correct battery charge full design
  pm2301: LPN mode control support
  pm2301: Enable vbat low monitoring
  ab8500-bm: Flush all work queues before suspending
  ab8500-fg: Go to INIT_RECOVERY when charger removed
  ab8500-charger: Add support for autopower on AB8505 and AB9540
  abx500-chargalg: Add new sysfs interface to get current charge status
  ...

Fix up fairly straightforward conflicts in the ab8500 driver.  But since
it seems to be ARM-specific, I can't even compile-test the result..
2013-02-20 10:19:07 -08:00
Miao Xie
272d26d0ad Btrfs: fix missing release of qgroup reservation in commit_transaction()
We forget to free qgroup reservation in commit_transaction(),fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Cc: Arne Jansen <sensille@gmx.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:09 -05:00
Wang Shilong
683cebda90 Btrfs: fix missing check before disabling quota
The original code forget to check whether quota has been disabled firstly,
and it will return 'EINVAL' and return error to users if quota has been
disabled,it will be unfriendly and confusing for users to see that.
So just return directly if quota has been disabled.

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Cc: Arne Jansen <sensille@gmx.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:07 -05:00
Liu Bo
fa6ac8765c Btrfs: fix cleaner thread not working with inode cache option
Right now inode cache inode is treated as the same as space cache
inode, ie. keep inode in memory till putting super.

But this leads to an awkward situation.

If we're going to delete a snapshot/subvolume, btrfs will not
actually delete it and return free space, but will add it to dead
roots list until the last inode on this snap/subvol being destroyed.
Then we'll fetch deleted roots and cleanup them via cleaner thread.

So here is the problem, if we enable inode cache option, each
snap/subvol has a cached inode which is used to store inode allcation
information.  And this cache inode will be kept in memory, as the above
said.  So with inode cache, snap/subvol can only be added into
dead roots list during freeing roots stage in umount, so that we can
ONLY get space back after another remount(we cleanup dead roots on mount).

But the real thing is we'll no more use the snap/subvol if we mark it
deleted, so we can safely iput its cache inode when we delete snap/subvol.

Another thing is that we need to change the rules of droping inode, we
don't keep snap/subvol's cache inode in memory till end so that we can
add snap/subvol into dead roots list in time.

Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:06 -05:00
Miao Xie
d4edf39bd5 Btrfs: fix uncompleted transaction
In some cases, we need commit the current transaction, but don't want
to start a new one if there is no running transaction, so we introduce
the function - btrfs_attach_transaction(), which can catch the current
transaction, and return -ENOENT if there is no running transaction.

But no running transaction doesn't mean the current transction completely,
because we removed the running transaction before it completes. In some
cases, it doesn't matter. But in some special cases, such as freeze fs, we
hope the transaction is fully on disk, it will introduce some bugs, for
example, we may feeze the fs and dump the data in the disk, if the transction
doesn't complete, we would dump inconsistent data. So we need fix the above
problem for those cases.

We fixes this problem by introducing a function:
	btrfs_attach_transaction_barrier()
if we hope all the transaction is fully on the disk, even they are not
running, we can use this function.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:05 -05:00
Miao Xie
178260b2c1 Btrfs: fix the deadlock between the transaction start/attach and commit
Now btrfs_commit_transaction() does this

ret = btrfs_run_ordered_operations(root, 0)

which async flushes all inodes on the ordered operations list, it introduced
a deadlock that transaction-start task, transaction-commit task and the flush
workers waited for each other.
(See the following URL to get the detail
 http://marc.info/?l=linux-btrfs&m=136070705732646&w=2)

As we know, if ->in_commit is set, it means someone is committing the
current transaction, we should not try to join it if we are not JOIN
or JOIN_NOLOCK, wait is the best choice for it. In this way, we can avoid
the above problem. In this way, there is another benefit: there is no new
transaction handle to block the transaction which is on the way of commit,
once we set ->in_commit.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:03 -05:00
Miao Xie
4b82490649 Btrfs: fix the qgroup reserved space is released prematurely
In start_transactio(), we will try to join the transaction again after
the current transaction is committed, so we should not release the
reserved space of the qgroup. Fix it.

Cc: Arne Jansen <sensille@gmx.net>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:02 -05:00
Zach Brown
cdb4c5748c btrfs: define BTRFS_MAGIC as a u64 value
super.magic is an le64 but it's treated as an unterminated string when
compared against BTRFS_MAGIC which is defined as a string.  Instead
define BTRFS_MAGIC as a normal hex value and use endian helpers to
compare it to the super's magic.

I tested this by mounting an fs made before the change and made sure
that it didn't introduce sparse errors.  This matches a similar cleanup
that is pending in btrfs-progs.  David Sterba pointed out that we should
fix the kernel side as well :).

Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 13:00:01 -05:00
jeff.liu
a8bfd4abea Btrfs: set/change the label of a mounted file system
With this new ioctl(2) BTRFS_IOC_SET_FSLABEL, we can set/change the label of a mounted file system.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Goffredo Baroncelli <kreijack@inwind.it>
Reviewed-by: David Sterba <dsterba@suse.cz>
Reviewed-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:59 -05:00
jeff.liu
867ab667e7 Btrfs: Add a new ioctl to get the label of a mounted file system
Add a new ioctl(2) BTRFS_IOC_GET_FSLABLE, so that we can get the label upon a mounted filesystem.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Goffredo Baroncelli <kreijack@inwind.it>
Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:58 -05:00
Josef Bacik
569e0f358c Btrfs: place ordered operations on a per transaction list
Miao made the ordered operations stuff run async, which introduced a
deadlock where we could get somebody (sync) racing in and committing the
transaction while a commit was already happening.  The new committer would
try and flush ordered operations which would hang waiting for the commit to
finish because it is done asynchronously and no longer inherits the callers
trans handle.  To fix this we need to make the ordered operations list a per
transaction list.  We can get new inodes added to the ordered operation list
by truncating them and then having another process writing to them, so this
makes it so that anybody trying to add an ordered operation _must_ start a
transaction in order to add itself to the list, which will keep new inodes
from getting added to the ordered operations list after we start committing.
This should fix the deadlock and also keeps us from doing a lot more work
than we need to during commit.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:57 -05:00
Josef Bacik
dde5740fdd Btrfs: relax the block group size limit for bitmaps
Dave pointed out that xfstests 273 will tell you that it failed to load the
space cache for a block group when it remounts.  This is because we run out
of space writing out the block group cache.  This is ok and is working as it
should, but let's try to be a bit nicer.  This happens because the block
group was 100mb, but bitmap entries cover 128mb, so we were only getting
extent entries for this block group, which ended up being too many to fit in
the free space cache.  So relax the bitmap size requirements to block groups
that are at least half the size a bitmap will cover or larger, that way we
can still keep the amount of space used in the free space cache low enough
to be able to write it out.  With this patch I no longer fail to write out
the free space cache.  Thanks,

Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:55 -05:00
Ilya Dryomov
3e39cea61c Btrfs: allow for selecting only completely empty chunks
Enhance balance usage filter by making it possible to balance out only
completely empty chunks.  Today, usage filter properly acts on values
from 1 to 99 inclusive, usage=100 selects all chunks, and usage=0
selects no chunks.  This commit changes the usage=0 case: the new
meaning is to restripe only completely empty chunks and nothing else.

Suggested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:54 -05:00
Ilya Dryomov
bf023ecfca Btrfs: eliminate a use-after-free in btrfs_balance()
Commit 5af3e8cc introduced a use-after-free at volumes.c:3139: bctl is freed
above in __cancel_balance() in all cases except for balance pause.  Fix this
by moving the offending check a couple statements above, the meaning of the
check is preserved.

Reported-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:53 -05:00
Josef Bacik
c8f2f24bd5 Btrfs: remove unused extent io tree ops V2
Nobody uses these io tree ops anymore so just remove them and clean up the code
a bit.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:52 -05:00
David Sterba
210549ebe9 btrfs: add cancellation points to defrag
The defrag operation can take very long, we want to have a way how to
cancel it. The code checks for a pending signal at safe points in the
defrag loops and returns EAGAIN. This means a user can press ^C after
running 'btrfs fi defrag', woks for both defrag modes, files and root.

Returning from the command was instant in my light tests, but may take
longer depending on the aging factor of the filesystem.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:51 -05:00
David Sterba
b069e0c345 btrfs: put some enospc messages under enospc_debug
The warning in use_block_rsv is not useful for users and may fill
the logs unnecessarily.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:49 -05:00
Miao Xie
38851cc19a Btrfs: implement unlocked dio write
This idea is from ext4. By this patch, we can make the dio write parallel,
and improve the performance. But because we can not update isize without
i_mutex, the unlocked dio write just can be done in front of the EOF.

We needn't worry about the race between dio write and truncate, because the
truncate need wait untill all the dio write end.

And we also needn't worry about the race between dio write and punch hole,
because we have extent lock to protect our operation.

I ran fio to test the performance of this feature.

== Hardware ==
CPU: Intel(R) Core(TM)2 Duo CPU     E7500  @ 2.93GHz
Mem: 2GB
SSD: Intel X25-M 120GB (Test Partition: 60GB)

== config file ==
[global]
ioengine=psync
direct=1
bs=4k
size=32G
runtime=60
directory=/mnt/btrfs/
filename=testfile
group_reporting
thread

[file1]
numjobs=1 # 2 4
rw=randwrite

== result (KBps) ==
write	1	2	4
lock	24936	24738	24726
nolock	24962	30866	32101

== result (iops) ==
write	1	2	4
lock	6234	6184	6181
nolock	6240	7716	8025

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:48 -05:00
Miao Xie
2e60a51e62 Btrfs: serialize unlocked dio reads with truncate
Currently, we can do unlocked dio reads, but the following race
is possible:

dio_read_task			truncate_task
				->btrfs_setattr()
->btrfs_direct_IO
    ->__blockdev_direct_IO
      ->btrfs_get_block
				  ->btrfs_truncate()
				 #alloc truncated blocks
				 #to other inode
      ->submit_io()
     #INFORMATION LEAK

In order to avoid this problem, we must serialize unlocked dio reads with
truncate. There are two approaches:
- use extent lock to protect the extent that we truncate
- use inode_dio_wait() to make sure the truncating task will wait for
  the read DIO.

If we use the 1st one, we will meet the endless truncation problem due to
the nonlocked read DIO after we implement the nonlocked write DIO. It is
because we still need invoke inode_dio_wait() avoid the race between write
DIO and truncation. By that time, we have to introduce

  btrfs_inode_{block, resume}_nolock_dio()

again. That is we have to implement this patch again, so I choose the 2nd
way to fix the problem.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:47 -05:00
Miao Xie
0934856d46 Btrfs: fix deadlock due to unsubmitted
The deadlock problem happened when running fsstress(a test program in LTP).

Steps to reproduce:
 # mkfs.btrfs -b 100M <partition>
 # mount <partition> <mnt>
 # <Path>/fsstress -p 3 -n 10000000 -d <mnt>

The reason is:
btrfs_direct_IO()
 |->do_direct_IO()
     |->get_page()
     |->get_blocks()
     |	 |->btrfs_delalloc_resereve_space()
     |	 |->btrfs_add_ordered_extent() -------	Add a new ordered extent
     |->dio_send_cur_page(page0) --------------	We didn't submit bio here
     |->get_page()
     |->get_blocks()
	 |->btrfs_delalloc_resereve_space()
	     |->flush_space()
		 |->btrfs_start_ordered_extent()
		     |->wait_event() ----------	Wait the completion of
						the ordered extent that is
						mentioned above

But because we didn't submit the bio that is mentioned above, the ordered
extent can not complete, we would wait for its completion forever.

There are two methods which can fix this deadlock problem:
1. submit the bio before we invoke get_blocks()
2. reserve the space before we do dio

Though the 1st is the simplest way, we need modify the code of VFS, and it
is likely to break contiguous requests, and introduce performance regression
for the other filesystems.

So we have to choose the 2nd way.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:45 -05:00
Josef Bacik
4a7d0f6854 Btrfs: cleanup orphan reservation if truncate fails
I noticed we were getting lots of warnings with xfstest 83 because we have
reservations outstanding.  This is because we moved the orphan add outside
of the truncate, but we don't actually cleanup our reservation if something
fails.  This fixes the problem and I no longer see warnings.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:44 -05:00
Josef Bacik
5d80366e9b Btrfs: steal from global reserve if we are cleaning up orphans
Sometimes xfstest 83 will fail to remount the scratch device because we've
gotten ourselves so full that we cannot cleanup the orphan items.  In this
case check to see if we're doing the orphan cleanup and if we are allow us
to steal our reservation from the global block rsv.  With this patch I've
not been able to reproduce the failed mount problem.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:42 -05:00
Miao Xie
8696c53304 Btrfs: fix memory leak of pending_snapshot->inherit
The argument "inherit" of btrfs_ioctl_snap_create_transid() was assigned
to NULL during we created the snapshots, so we didn't free it though we
called kfree() in the caller.

But since we are sure the snapshot creation is done after the function -
btrfs_ioctl_snap_create_transid() - completes, it is safe that we don't
assign the pointer "inherit" to NULL, and just free it in the caller of
btrfs_ioctl_snap_create_transid(). In this way, the code can become more
readable.

Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com>
Cc: Arne Jansen <sensille@gmx.net>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:41 -05:00
Miao Xie
2b8195bb57 Btrfs: fix the race between bio and btrfs_stop_workers
open_ctree() need read the metadata to initialize the global information
of btrfs. But it may fail after it submit some bio, and then it will jump
to the error path. Unfortunately, it doesn't check if there are some bios
in flight, and just stop all the worker threads. As a result, when the
submitted bios end, they can not find any worker thread which can deal with
subsequent work, then oops happen.

kernel BUG at fs/btrfs/async-thread.c:605!

Fix this problem by invoking invalidate_inode_pages2() before we stop the
worker threads. This function will wait until the bio end because it need
lock the pages which are going to be invalidated, and if a page is under
disk read IO, it must be locked. invalidate_inode_pages2() need wait until
end bio handler to unlocked it.

Reported-and-Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:40 -05:00
Mark Fasheh
cb95e7bf7b btrfs: add "no file data" flag to btrfs send ioctl
This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send
ioctl code. When this flag is set, the btrfs send code will never write file
data into the stream (thus also avoiding expensive reads of that data in the
first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of
BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in
question.

This patch does not affect the operation of BTRFS_SEND_C_CLONE commands -
they will continue to be sent when a search finds an appropriate extent to
clone from.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:39 -05:00
Liu Bo
2f697dc6a6 Btrfs: extend the checksum item as much as possible
For write, we also reserve some space for COW blocks during updating
the checksum tree, and we calculate the number of blocks by checking
if the number of bytes outstanding that are going to need csums needs
one more block for csum.

When we add these checksum into the checksum tree, we use ordered sums
list.
Every ordered sum contains csums for each sector, and we'll first try
to look up an existing csum item,
a) if we don't yet have a proper csum item, then we need to insert one,
b) or if we find one but the csum item is not big enough, then we need
to extend it.

The point is we'll unlock the whole path and then insert or extend.
So others can hack in and update the tree.

Each insert or extend needs update the tree with COW on, and we may need
to insert/extend for many times.

That means what we've reserved for updating checksum tree is NOT enough
indeed.

The case is even more serious with having several write threads at the
same time, it can end up eating our reserved space quickly and starting
eating globle reserve pool instead.

I don't yet come up with a way to calculate the worse case for updating
csum, but extending the checksum item as much as possible can be helpful
in my test.

The idea behind is that it can reduce the times we insert/extend so that
it saves us precious reserved space.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:37 -05:00
Eric Sandeen
de78b51a28 btrfs: remove cache only arguments from defrag path
The entry point at the defrag ioctl always sets "cache only" to 0;
the codepaths haven't run for a long time as far as I can
tell.  Chris says they're dead code, so remove them.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:36 -05:00
Josef Bacik
e4a2bcaca9 Btrfs: if we aren't committing just end the transaction if we error out
I hit a deadlock where transaction commit was waiting on num_writers to be
0.  This happened because somebody came into btrfs_commit_transaction and
noticed we had aborted and it went to cleanup_transaction.  This shouldn't
happen because cleanup_transaction is really to fixup a bad commit, it
doesn't do the normal trans handle cleanup things.  So if we have an error
just do the normal btrfs_end_transaction dance and return.  Once we are in
the actual commit path we can use cleanup_transaction and be good to go.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:35 -05:00
Josef Bacik
3e04e7f10b Btrfs: handle errors in compression submission path
I noticed we would deadlock if we aborted a transaction while doing
compressed io.  This is because we don't unlock our pages if something goes
horribly wrong.  To fix this we need to make sure that we call
extent_clear_unlock_delalloc in order to unlock all the pages.  If we have
to cow in the async submission thread we need to make sure to unlock our
locked_page as the cow error path will not unlock the locked page as it
depends on the caller to unlock that page.  With this patch we no longer
deadlock on the page lock when we have an aborted transaction.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-02-20 12:59:33 -05:00