Commit Graph

28305 Commits

Author SHA1 Message Date
Josef Bacik
eb838e73dc Btrfs: lock extents as we map them in DIO
A deadlock in xfstests 113 was uncovered by commit

d187663ef2

This is because we would not return EIOCBQUEUED for short AIO reads, instead
we'd wait for the DIO to complete and then return the amount of data we
transferred, which would allow our stuff to unlock the remaning amount.  But
with this change this no longer happens, so if we have a short AIO read (for
example if we try to read past EOF), we could leave the section from EOF to
the end of where we tried to read locked.  Fixing this is tricky since there
is no clear way to know exactly how much data DIO truly submitted for IO, so
to make this less hard on ourselves and less combersome we need to lock the
extents as we try to map them, and then we unlock any areas we didn't
actually map.  This makes us completely safe from deadlocks and reliance on
a particular behavior of the DIO code.  This also lays the groundwork for
allowing us to use the normal csum storage method for reads which means we
can remove an allocation.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:27 -04:00
Dan Carpenter
dadd1105ca Btrfs: fix some endian bugs handling the root times
"trans->transid" is cpu endian but we want to store the data as little
endian.  "item->ctime.nsec" is only 32 bits, not 64.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:26 -04:00
Dan Carpenter
55e591ffde Btrfs: unlock on error in btrfs_delalloc_reserve_metadata()
We should release this mutex before returning the error code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:25 -04:00
Dan Carpenter
57a5a88203 Btrfs: checking for NULL instead of IS_ERR
add_qgroup_rb() never returns NULL, only error pointers.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:25 -04:00
Dan Carpenter
5986802c2f Btrfs: fix some error codes in btrfs_qgroup_inherit()
These are returning zero when it should be returning a negative error
code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:24 -04:00
Stefan Behrens
aa2ffd0616 Btrfs: fix a misplaced address operator in a condition
This should obviously not be "if (&flag)" but "if (flag)".

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2012-08-28 16:53:23 -04:00
Linus Torvalds
89a897fbd8 Pull request from git://github.com/prasad-joshi/logfs_upstream.git
There are few important bug fixes for LogFS
 
 9f0bbd8 logfs: query block device for number of pages to send with bio
 	This BUG was found when LogFS was used on KVM. The patch fixes
 	the problem by asking for underlaying block device the number
 	of pages to send with each BIO.
 
 41b93bc logfs: maintain the ordering of meta-inode destruction
 	LogFS maintains file system meta-data in special inodes. These
 	inodes are releated to each other, therefore they must be
 	destroyed in a proper order.
 
 ddb24bb logfs: create a pagecache page if it is not present
 
 cd8bfa9 logfs: initialize the number of iovecs in bio
 	LogFS used to panic when it was created on an encrypted LVM
 	volume. The patch fixes the problem by properly initializing
 	the BIO.
 
 d2dcd90 logfs: destroy the reserved inodes while unmounting
 
 Diffstat:
  fs/logfs/dev_bdev.c  |   15 ++++++++-------
  fs/logfs/inode.c     |   18 +-----------------
  fs/logfs/journal.c   |    2 +-
  fs/logfs/readwrite.c |    1 +
  fs/logfs/segment.c   |    2 +-
  5 files changed, 12 insertions(+), 26 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNmnnAAoJEDFA/f+3K+ZNhc0QAJWWtcvqCIU2UHKw0DaQieH8
 X+YwoqdH5UBR0vylyoHCgCRSoRuttYEy47DYDhkpUdqad60atDhwIwhFaknlNZbt
 +hv2ve6hZTzcs42HZhu+0WUzMkUs0s6Xjuu9JlZND59zk1wWBfttDlKI2/SdjMlm
 GnW0nx5gLFZ1rmpaL8bdRpyLXxUvb9FmUc+YWsuinj8A2Lnqg5bNOmJZ3CiMYNVk
 UvbHDYJmNaMZndbYeXZtxXtUo9Uk4HsU+7whpmgD+OPz7h+VMOgfXnwkpsgmtbY2
 qTXAjyFVpN23zhBTbpCMXvpfbrffdBJBfkEW+2sXqpr9IHbMX0Y9cb/XJ3Ub1Qz+
 HRLdBh0iAZewWVRsGOKG2OU1WUrTNzKUAJ795QFL0c0bZsODzQ+5OlcD7rBzEMpq
 ioN49UtJWlmckWaott6PinSl/OKWlvz771Zayh9+ttuL3Dvt6coV7K3ns5MI6glN
 M9DTMd8GAW2Kdz80EyAjZYWw2M3lbs0/GghJB4ozYg3mrDpBq1w5ouEvSNw8PuJw
 yoUEe2xLGpLZdzQDouKMlrai8kohCyMoHKH1Kimx+iG4LZkYLy8VJepY1mfzQbEY
 1zSB1nCV/c67qI3fxRdPNmJ7YtjTC9ero0qOouXwSJWwZ+OEikGXOKWnjRutcY1C
 Jfbfg6gNqOrZZbYWp1ke
 =eS/c
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream

Pull LogFS bugfixes from Prasad Joshi:

 - "logfs: query block device for number of pages to send with bio"

	This BUG was found when LogFS was used on KVM. The patch fixes
	the problem by asking for underlaying block device the number
	of pages to send with each BIO.

 - "logfs: maintain the ordering of meta-inode destruction"

	LogFS maintains file system meta-data in special inodes. These
	inodes are releated to each other, therefore they must be
	destroyed in a proper order.

 - "logfs: initialize the number of iovecs in bio"

	LogFS used to panic when it was created on an encrypted LVM
	volume. The patch fixes the problem by properly initializing
	the BIO.

Plus a couple more:
 - logfs: create a pagecache page if it is not present
 - logfs: destroy the reserved inodes while unmounting

* tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream:
  logfs: query block device for number of pages to send with bio
  logfs: maintain the ordering of meta-inode destruction
  logfs: create a pagecache page if it is not present
  logfs: initialize the number of iovecs in bio
  logfs: destroy the reserved inodes while unmounting
2012-08-26 10:14:11 -07:00
Linus Torvalds
e1d33a5c63 xfs: bugfixes for 3.6-rc4
- fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQN7FMAAoJENaLyazVq6ZOrwsQAMB2Cg/t/XeiPFD5QtvFX7Ua
 9cS3klSo9p29Xr5oeo54b3mF+5dycS61PQu2zdg4XhU5ZqoWoS6xUIFxzo3yWiVi
 nC/BbgMQTgkfEQs+2ulmXRgP8A2DdakSDmvAZDQNURZQwH4zykU0tPKD9V009TMW
 SaIprIWQqhRrsd9prrv364FgKv3/vyuFL1Agho304o3ynk+2SLDzm3eqdTaXTD9Z
 9OGP5Hs5Mh9CZmr46MIpRnoaHppqNRZh7EbDyMFL8kMN8F2yOdJk5MgA1m7gPJ5Z
 ADfnv3laxEp04hhD5lazePYJwZlg/KB77Y4YXD5hmTSznIxcQNlfRcMV9ju4FGgr
 Nb5SgA/6iBk5dkZyuyIdqCzj1Mt6SimpIWb3+/vcS8+VaqQUsyoufccO0zrgeoIA
 /O1PtCe//haFENlD+hMc9QrQSISsaR/YYUv0yz4YcsxKsdmyYpbHdTyFhVjnkKoS
 BtA0ZOgWYebgEzr7J1ijW085CrDvAc9jsPujOWmZlBJWcfo1rtVfqhXWll0ms5cr
 ZRoWqLDqoOhjlUDqwMN4dk6ENxG4QbSl6AWXXVXrUvtvaNzoi0MBNX3HPt/RVmF5
 rGKFvLuIxBHAfD1xYCqtCfP9yzeQzDy1SPdASForEZOyM+RVUnOf9i+1nhbsu/97
 AhiW4lozQetWzx+RQwU+
 =+FuN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
 - fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim

* tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs:
  xfs: check for possible overflow in xfs_ioc_trim
  xfs: unlock the AGI buffer when looping in xfs_dialloc
  xfs: fix uninitialised variable in xfs_rtbuf_get()
2012-08-25 11:47:06 -07:00
Linus Torvalds
8497ae61d0 Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from J. Bruce Fields:
 "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie
  Heilman for their testing and debugging help."

* 'for-3.6' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
  svcrpc: sends on closed socket should stop immediately
  svcrpc: fix BUG() in svc_tcp_clear_pages
  nfsd4: fix security flavor of NFSv4.0 callback
2012-08-25 11:43:41 -07:00
Linus Torvalds
a7e546f175 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block-related fixes from Jens Axboe:

 - Improvements to the buffered and direct write IO plugging from
   Fengguang.

 - Abstract out the mapping of a bio in a request, and use that to
   provide a blk_bio_map_sg() helper.  Useful for mapping just a bio
   instead of a full request.

 - Regression fix from Hugh, fixing up a patch that went into the
   previous release cycle (and marked stable, too) attempting to prevent
   a loop in __getblk_slow().

 - Updates to discard requests, fixing up the sizing and how we align
   them.  Also a change to disallow merging of discard requests, since
   that doesn't really work properly yet.

 - A few drbd fixes.

 - Documentation updates.

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: replace __getblk_slow misfix by grow_dev_page fix
  drbd: Write all pages of the bitmap after an online resize
  drbd: Finish requests that completed while IO was frozen
  drbd: fix drbd wire compatibility for empty flushes
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update missing index files in block/00-INDEX
  block: move down direct IO plugging
  block: remove plugging at buffered write time
  block: disable discard request merge temporarily
  bio: Fix potential memory leak in bio_find_or_create_slab()
  block: Don't use static to define "void *p" in show_partition_start()
  block: Add blk_bio_map_sg() helper
  block: Introduce __blk_segment_map_sg() helper
  fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
  block: split discard into aligned requests
  block: reorganize rounding of max_discard_sectors
2012-08-25 11:36:43 -07:00
Linus Torvalds
62688e5b64 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF, ext3 & reiserfs fixes from Jan Kara:
 "A couple of fixes (udf, reiserfs, ext3) that accumulated over my
  vacation."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix retun value on error path in udf_load_logicalvol
  jbd: don't write superblock when unmounting an ro filesystem
  reiserfs: fix deadlocks with quotas
  quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
  UDF: During mount free lvid_bh before rescanning with different blocksize
  udf: fix udf_setsize() for file data in ICB
2012-08-23 21:56:22 -07:00
Linus Torvalds
f4673d6f18 UBIFS fixes for 3.6:
1. Fix crash on error which prevents emulated power-cut testing.
 2. Fix log reply regression introduced in 3.6-rc1.
 3. Fix UBIFS complaints about too small debug buffer size which.
 4. Fix error message spelling, and remove incorrect commentary.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNdHYAAoJECmIfjd9wqK0t+cQANEkF1Xgs8cd/5U++tQWCGyu
 EInLm+416ytTy4Vg2bLwlJ4OnUhgtTSTeKovtp3yoSKytxB4vf50qI5P/gInXSxo
 mI1KkvHZLraFzFPGtnbXGCg0YS8xZLIhw2p8VDuNFcxZIEiuAGZ2qMC9tPzzzle6
 em/yDIumR//oioFHOqVNu6wKE3do9QY5fo0meqE6QTliKppSDhWcO9MCfodkFOCp
 WtlD2dfIQv3SdaZW+X8skbqMFQ9KJRAzk59hl1Cm2jlXEHEHYmvGwqTCSOk8W/0c
 3ZSgUVW1YKBYq25yA3Xdz6aDCgRT6LYsi/EH0VNNMlt/H9RfcCNTGe96ioTcgsOT
 FroLZZUK9PiQYmqUqnk8eqmUyq888vKZY3Sbxt5/bbZhSoK6aYzNgfLlkZNfnHDL
 unzpyPqYq9VahuTgY5ravDclPKCc2p7xeWJYeOE9JtUY6/fVKJh9qERrtVoNP+nD
 LjS89G/tNBn9ZPk8KyXgikJ4JZ7sw30vySmBdGtx+ckzXqKTsDtLznvdSU3/XiJM
 zdqYV3gulgoJtIPXFVuI3i3GeKyXuJlPpjNimuFn+9mNElZfrvs6RcGbQ3Df/3Mc
 omfTrRxS+vRadGC9MZoIMCV27p7+cEynbLLVXtvPebwqiO1tV1fCspdJGjJmkbvL
 14hi7hTYvYCtMR+m8i2+
 =VvYb
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs

Pull UBIFS fixes from Artem Bityutskiy:
 - Fix crash on error which prevents emulated power-cut testing.
 - Fix log reply regression introduced in 3.6-rc1.
 - Fix UBIFS complaints about too small debug buffer size which.
 - Fix error message spelling, and remove incorrect commentary.

* tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix error messages spelling
  UBIFS: fix complaints about too small debug buffer size
  UBIFS: fix replay regression
  UBIFS: fix crash on error path
  UBIFS: remove stale commentary
2012-08-23 21:50:40 -07:00
Tomas Racek
a672e1be30 xfs: check for possible overflow in xfs_ioc_trim
If range.start or range.minlen is bigger than filesystem size, return
invalid value error. This fixes possible overflow in BTOBB macro when
passed value was nearly ULLONG_MAX.

Signed-off-by: Tomas Racek <tracek@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:44 -05:00
Christoph Hellwig
7612903099 xfs: unlock the AGI buffer when looping in xfs_dialloc
Also update some commens in the area to make the code easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:32 -05:00
Dave Chinner
0b9e3f6d84 xfs: fix uninitialised variable in xfs_rtbuf_get()
Results in this assert failure in generic/090:

XFS: Assertion failed: *nmap >= 1, file: fs/xfs/xfs_bmap.c, line: 4363
.....
Call Trace:
 [<ffffffff814680db>] xfs_bmapi_read+0x6b/0x370
 [<ffffffff814b64b2>] xfs_rtbuf_get+0x42/0x130
 [<ffffffff814b6f09>] xfs_rtget_summary+0x89/0x120
 [<ffffffff814b7bfe>] xfs_rtallocate_extent_size+0xce/0x340
 [<ffffffff814b89f0>] xfs_rtallocate_extent+0x240/0x290
 [<ffffffff81462c1a>] xfs_bmap_rtalloc+0x1ba/0x340
 [<ffffffff81463a65>] xfs_bmap_alloc+0x35/0x40
 [<ffffffff8146f111>] xfs_bmapi_allocate+0xf1/0x350
 [<ffffffff8146f9de>] xfs_bmapi_write+0x66e/0xa60
 [<ffffffff8144538a>] xfs_iomap_write_direct+0x22a/0x3f0
 [<ffffffff8143707b>] __xfs_get_blocks+0x38b/0x5d0
 [<ffffffff814372d4>] xfs_get_blocks_direct+0x14/0x20
 [<ffffffff811b0081>] do_blockdev_direct_IO+0xf71/0x1eb0
 [<ffffffff811b1015>] __blockdev_direct_IO+0x55/0x60
 [<ffffffff814355ca>] xfs_vm_direct_IO+0x11a/0x1e0
 [<ffffffff8112d617>] generic_file_direct_write+0xd7/0x1b0
 [<ffffffff8143e16c>] xfs_file_dio_aio_write+0x13c/0x320
 [<ffffffff8143e6f2>] xfs_file_aio_write+0x1c2/0x1d0
 [<ffffffff81174a07>] do_sync_write+0xa7/0xe0
 [<ffffffff81175288>] vfs_write+0xa8/0x160
 [<ffffffff81175702>] sys_pwrite64+0x92/0xb0
 [<ffffffff81b68f69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:16 -05:00
Hugh Dickins
676ce6d5ca block: replace __getblk_slow misfix by grow_dev_page fix
Commit 91f68c89d8 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.

Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.

I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5.  I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).

(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)

Revert 91f68c89d8, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix).  Simplify the interface between grow_buffers()
and grow_dev_page(), and avoid the infinite loop beyond end of device
by instead checking init_page_buffers()'s end_block there (I presume
that's more efficient than a repeated call to blkdev_max_block()),
returning -ENXIO to __getblk_slow() in that case.

And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-23 12:17:36 +02:00
Linus Torvalds
f753c4ec15 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
 "Jim's fix closes a narrow race introduced with the msgr changes.  One
  fix resolves problems with debugfs initialization that Yan found when
  multiple client instances are created (e.g., two clusters mounted, or
  rbd + cephfs), another one fixes problems with mounting a nonexistent
  server subdirectory, and the last one fixes a divide by zero error
  from unsanitized ioctl input that Dan Carpenter found."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: avoid divide by zero in __validate_layout()
  libceph: avoid truncation due to racing banners
  ceph: tolerate (and warn on) extraneous dentry from mds
  libceph: delay debugfs initialization until we learn global_id
2012-08-22 09:58:05 -07:00
Linus Torvalds
ad746be969 NFS client bugfixes for Linux 3.6
- NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNPyKAAoJEGcL54qWCgDyQlAP/AmnLaJmKMPj5P4GCiA/uxMV
 sr0XSSVAv+OgUtRxLNO5PDyunVIIEjwA5nGa2RZEG14+eYwNEeQehaGoaGqoKLhh
 ATOCMvHHEs5qSKLckWsZWflf6U/MLimYS3D/hHVUtP0QWbBMtol3ImABs2ht/VUc
 HW0wQoEG+5mhbhC87d4ku5E+OHQzYeNNmdX2VkKOmT0CwRd/bfsHxlGwMmcHldP2
 TSge+MRhoKNycpy0j7nugHj2pZ12UiX+O8371OlAketnPk0OA+6RNGMyNo49IQAn
 VEN9MFWc4ypH1+hJ0OvhzmUA6Zn2/lyQXUtwM1DcXeBmLI4aClb3bT3G3xKNh09O
 yLEvkh2aUUlmoTSgP7vDK/Ui3QwVnsaJULvg+gywQJG6qTSFgUQfErNKgJmabHQb
 d0EuumlSGnJ7qdC5nmrUp+M/CpbfGh15ax/JnTRGVK7C3jeCm18vc2np+z4EUDp/
 USzrvuBu1B8eI0nQNoht0BeNf7sEJGgFIn8BJzxDakP8buXPMqEvFwA8c1JmMeBX
 E6pbG2jHqPdrTJtxQTpMRqhA0MxbFlVNi5cCs6U7ifiBqbRWEXyNVgAHTwoIxIrn
 ASbL0s8gZH+EMEXNOmPmFiO2NUjBCeSzAjrsTmSpWa13YmGfiroZN8N/iPzLnFf+
 FR3SFUnoJHq7x4uLZoAX
 =lqfh
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"

* tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv3: Ensure that do_proc_get_root() reports errors correctly
  NFSv4: Ensure that nfs4_alloc_client cleans up on error.
  NFS: return -ENOKEY when the upcall fails to map the name
  NFS: Clear key construction data if the idmap upcall fails
  NFSv4: Don't use private xdr_stream fields in decode_getacl
  NFSv4: Fix the acl cache size calculation
  NFSv4: Fix pointer arithmetic in decode_getacl
  NFS: Alias the nfs module to nfs4
  NFS: Fix a regression when loading the NFS v4 module
  NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
  pnfs-obj: Better IO pattern in case of unaligned offset
  NFS41: add pg_layout_private to nfs_pageio_descriptor
  pnfs: nfs4_proc_layoutget returns void
  pnfs: defer release of pages in layoutget
  nfs: tear down caches in nfs_init_writepagecache when allocation fails
2012-08-22 09:57:25 -07:00
Linus Torvalds
467e9e51d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull assorted fixes - mostly vfs - from Al Viro:
 "Assorted fixes, with an unexpected detour into vfio refcounting logics
  (fell out when digging in an analog of eventpoll race in there)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  task_work: add a scheduling point in task_work_run()
  fs: fix fs/namei.c kernel-doc warnings
  eventpoll: use-after-possible-free in epoll_create1()
  vfio: grab vfio_device reference *before* exposing the sucker via fd_install()
  vfio: get rid of vfio_device_put()/vfio_group_get_device* races
  vfio: get rid of open-coding kref_put_mutex
  introduce kref_put_mutex()
  vfio: don't dereference after kfree...
  mqueue: lift mnt_want_write() outside ->i_mutex, clean up a bit
2012-08-22 09:56:06 -07:00
Artem Bityutskiy
69f9025894 UBIFS: fix error messages spelling
Corruptio -> corruption.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-22 17:41:09 +03:00
Randy Dunlap
55852635a8 fs: fix fs/namei.c kernel-doc warnings
Fix kernel-doc warnings in fs/namei.c:

Warning(fs/namei.c:360): No description found for parameter 'inode'
Warning(fs/namei.c:672): No description found for parameter 'nd'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc:	Alexander Viro <viro@zeniv.linux.org.uk>
Cc:	linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-22 10:30:10 -04:00
Al Viro
98022748f6 eventpoll: use-after-possible-free in epoll_create1()
As soon as we'd installed the file into descriptor table, it can
get closed by another thread.  Freeing ep in process...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-22 10:26:55 -04:00
Artem Bityutskiy
65b455b123 UBIFS: fix complaints about too small debug buffer size
When debugging is enabled, we use a temporary on-stack buffer for formatting
the key strings like "(11368871, direntry, 0xcd0750)". The buffer size is
32 bytes and sometimes it is not enough to fit the key string - e.g., when
inode numbers are high. This is not fatal, but the key strings are incomplete
and UBIFS complains like this:

	UBIFS assert failed in dbg_snprintf_key at 137 (pid 1)

This is a regression caused by "515315a UBIFS: fix key printing".

Fix the issue by increasing the buffer to 48 bytes.

Reported-by: Michael Hench <michaelhench@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Michael Hench <michaelhench@gmail.com>
Cc: stable@vger.kernel.org [v3.3+]
2012-08-22 11:51:33 +03:00
Sage Weil
45f2e081f5 ceph: avoid divide by zero in __validate_layout()
If "l->stripe_unit" is zero the the mod on the next line will cause a
divide by zero bug.  This comes from the copy_from_user() in
ceph_ioctl_set_layout_policy().  Passing 0 is valid, though (it means
"do not change") so avoid the % check in that case.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2012-08-21 15:55:28 -07:00
Sage Weil
6c5e50fa61 ceph: tolerate (and warn on) extraneous dentry from mds
If the MDS gives us a dentry and we weren't prepared to handle it,
WARN_ON_ONCE instead of crashing.

Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2012-08-21 15:55:25 -07:00
Artem Bityutskiy
c212f4020d UBIFS: fix replay regression
Commit "d51f17e UBIFS: simplify reply code a bit" introduces a bug with the
following symptoms:

UBIFS error (pid 1): replay_log_leb: first CS node at LEB 3:0 has wrong commit number 0 expected 1

The issue is that we start replaying the log from UBIFS_LOG_LNUM instead
of c->lhead_lnum. This patch fixes that.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-21 15:25:25 +03:00
Artem Bityutskiy
11e3be0be2 UBIFS: fix crash on error path
This patch fixes a regression introduced by
"4994297 UBIFS: make ubifs_lpt_init clean-up in case of failure" which
I've hit while running the 'integck -p' test. When remount the file-system
from R/O mode to R/W mode and 'lpt_init_wr()' fails, we free _all_ LPT
resources by calling 'ubifs_lpt_free(c, 0)', even those needed for R/O
mode. This leads to subsequent crashes, e.g., if we try to unmount
the file-system.

Cc: stable@vger.kernel.org [v3.5+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-21 15:25:24 +03:00
Artem Bityutskiy
73e8712aa0 UBIFS: remove stale commentary
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-21 15:25:24 +03:00
J. Bruce Fields
39307655a1 nfsd4: fix security flavor of NFSv4.0 callback
Commit d5497fc693 "nfsd4: move rq_flavor
into svc_cred" forgot to remove cl_flavor from the client, leaving two
places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
After that patch, the latter was the one that was updated, but the
former was the one that the callback used.

Symptoms were a long delay on utime().  This is because the utime()
generated a setattr which recalled a delegation, but the cb_recall was
ignored by the client because it had the wrong security flavor.

Cc: stable@vger.kernel.org
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-20 18:38:36 -04:00
Al Viro
0e665d5d11 vfs: missed source of ->f_pos races
compat_sys_{read,write}v() need the same "pass a copy of file->f_pos" thing
as sys_{read,write}{,v}().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-20 10:11:47 -07:00
Sage Weil
d1c338a509 libceph: delay debugfs initialization until we learn global_id
The debugfs directory includes the cluster fsid and our unique global_id.
We need to delay the initialization of the debug entry until we have
learned both the fsid and our global_id from the monitor or else the
second client can't create its debugfs entry and will fail (and multiple
client instances aren't properly reflected in debugfs).

Reported by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2012-08-20 10:03:15 -07:00
Trond Myklebust
0866004304 NFSv3: Ensure that do_proc_get_root() reports errors correctly
If the rpc call to NFS3PROC_FSINFO fails, then we need to report that
error so that the mount fails. Otherwise we can end up with a
superblock with completely unusable values for block sizes, maxfilesize,
etc.

Reported-by: Yuanming Chen <hikvision_linux@163.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-20 12:52:42 -04:00
Trond Myklebust
7653f6ff4e NFSv4: Ensure that nfs4_alloc_client cleans up on error.
Any pointer that was allocated through nfs_alloc_client() needs to be
freed via a call to nfs_free_client().

Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-20 12:12:29 -04:00
Pavel Shilovsky
ea7b4887e7 CIFS: Fix cifs_do_create error hadnling
Commit d2c127197d caused a regression
in cifs_do_create error handling. Fix this by closing a file handle
in the case of a get_inode_info(_unix) error. Also remove unnecessary
checks for newinode being NULL.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-08-19 22:30:18 -05:00
Steve French
985e4ff016 cifs: print error code if smb signature verification fails
While trying to debug a SMB signature related issue with Windows Servers
figured out it might be easier to debug if we print the error code from
cifs_verify_signature(). Also, fix indendation while at it.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-08-19 22:30:13 -05:00
Pavel Shilovsky
7411286088 CIFS: Fix log messages in packet checking for SMB2
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-08-19 22:30:07 -05:00
Steve French
b7ca692896 CIFS: Protect i_nlink from being negative
that can cause warning messages.  Pavel had initially
suggested a smaller patch around drop_nlink, after
a similar problem was discovered NFS.  Protecting
additional places where nlink is touched was
suggested by Jeff Layton and is included in this.

Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-08-19 22:30:00 -05:00
Linus Torvalds
20fb1936de Merge branch 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull vfs fixes from Miklos Szeredi.

This mainly fixes some confusion about whether the open 'mode' variable
passed around should contain the full file type (S_IFREG etc)
information or just the permission mode.  In particular, the lack of
proper file type information had confused fuse.

* 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  vfs: fix propagation of atomic_open create error on negative dentry
  fuse: check create mode in atomic open
  vfs: pass right create mode to may_o_create()
  vfs: atomic_open(): fix create mode usage
  vfs: canonicalize create mode in build_open_flags()
2012-08-18 10:02:17 -07:00
Linus Torvalds
ef824bfba2 The following are all bug fixes and regressions. The most notable are
the ones which cause problems for ext4 on RAID --- a performance
 problem when mounting very large filesystems, and a kernel OOPS when
 doing an rm -rf on large directory hierarchies on fast devices.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJQLlVSAAoJENNvdpvBGATwhOYQAKx22YF9+SHnnVv2GtCQrsE6
 N3acE/FoDYiO1/LRa5M6NDO3ZIL4Vqi6409LYFQky/SQL8ziM5CBeLOBD9qPTE1L
 AGrzWn4vzZcjEf90ZEBN99fS5Uj1A9Axz2denxy7Hfb2HcSXfcuuVQXpdS42cFo8
 gwtlX4jxmPkbjRlAdeqYVBNuWpTZ0a+UyLc4A6v3aLcbzPSJvPYmI7mKCksiSySU
 yt0atjMDb56blAQJ2TITdAZN6rQShNzyok2pPfxaLusl5g0Gtjq8sSEPof1PQ1zq
 gDFc+kpZvUyPdwQzV3IL8+TodFFZ0x/2OhqoYCTKajROLHGjQFsdkb8EJbnNeDff
 EDxIjeVJR0kzSuSNWu+n5028lmEd9Sk7Ykr37cHeUxG8/0SADUqSDQYNhvbOPQsj
 iq5dwF79tKjuMqjJrABuWA1ZNgHBISXgyBmHLXgEk3LrgucT9UIg8Zlkhq480SYO
 JXhmkO2Ka4UwkVUShoWgRtEzRgUxhINBShs6g67zwm6slS4s2CWHnqhUn6EQe6+r
 DY/hoUA8KbdG3Cf5iJBFM2kUO68CDIXeJjocA7JvlouRgQSxmkOceIuk7DusAitM
 nHJKAtSNgC/z7yMoNi7YN0S5YYcCebmO1MEPzYSpPH07YwLJVNmh9Fk6BIfb7vi3
 vJSQMBrgGrbaXrnhBA7z
 =Zulv
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bug fixes from Ted Ts'o:
 "The following are all bug fixes and regressions.  The most notable are
  the ones which cause problems for ext4 on RAID --- a performance
  problem when mounting very large filesystems, and a kernel OOPS when
  doing an rm -rf on large directory hierarchies on fast devices."

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix kernel BUG on large-scale rm -rf commands
  ext4: fix long mount times on very big file systems
  ext4: don't call ext4_error while block group is locked
  ext4: avoid kmemcheck complaint from reading uninitialized memory
  ext4: make sure the journal sb is written in ext4_clear_journal_err()
2012-08-17 08:04:47 -07:00
Ian Kent
d807ff838f autofs4 - fix expire check
In some cases when an autofs indirect mount is contained in a file
system that is marked as shared (such as when systemd does the
equivalent of "mount --make-rshared /" early in the boot), mounts
stop expiring.

When this happens the first expiry check on a mountpoint dentry in
autofs_expire_indirect() sees a mountpoint dentry with a higher
than minimal reference count. Consequently the dentry is condidered
busy and the actual expiry check is never done.

This particular check was originally meant as an optimisation to
detect a path walk in progress but with the addition of rcu-walk
it can be ineffective anyway.

Removing the test allows automounts to expire again since the
actual expire check doesn't rely on the dentry reference count.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-17 06:56:39 -07:00
Theodore Ts'o
89a4e48f84 ext4: fix kernel BUG on large-scale rm -rf commands
Commit 968dee7722: "ext4: fix hole punch failure when depth is greater
than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel
crashes when users ran run "rm -rf" on large directory hierarchy on
ext4 filesystems on RAID devices:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

    Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710)
    Call Trace:
     [<ffffffff81236483>] ? __ext4_handle_dirty_metadata+0x83/0x110
     [<ffffffff812353d3>] ext4_ext_truncate+0x193/0x1d0
     [<ffffffff8120a8cf>] ? ext4_mark_inode_dirty+0x7f/0x1f0
     [<ffffffff81207e05>] ext4_truncate+0xf5/0x100
     [<ffffffff8120cd51>] ext4_evict_inode+0x461/0x490
     [<ffffffff811a1312>] evict+0xa2/0x1a0
     [<ffffffff811a1513>] iput+0x103/0x1f0
     [<ffffffff81196d84>] do_unlinkat+0x154/0x1c0
     [<ffffffff8118cc3a>] ? sys_newfstatat+0x2a/0x40
     [<ffffffff81197b0b>] sys_unlinkat+0x1b/0x50
     [<ffffffff816135e9>] system_call_fastpath+0x16/0x1b
    Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 <48> 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00

    RIP  [<ffffffff81233164>] ext4_ext_remove_space+0xa34/0xdf0

This could be reproduced as follows:

The problem in commit 968dee7722 was that caused the variable 'i' to
be left uninitialized if the truncate required more space than was
available in the journal.  This resulted in the function
ext4_ext_truncate_extend_restart() returning -EAGAIN, which caused
ext4_ext_remove_space() to restart the truncate operation after
starting a new jbd2 handle.

Reported-by: Maciej Żenczykowski <maze@google.com>
Reported-by: Marti Raudsepp <marti@juffo.org>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-17 09:42:17 -04:00
Theodore Ts'o
0548bbb853 ext4: fix long mount times on very big file systems
Commit 8aeb00ff85a: "ext4: fix overhead calculation used by
ext4_statfs()" introduced a O(n**2) calculation which makes very large
file systems take forever to mount.  Fix this with an optimization for
non-bigalloc file systems.  (For bigalloc file systems the overhead
needs to be set in the the superblock.)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-17 09:23:00 -04:00
Theodore Ts'o
7a4c5de27e ext4: don't call ext4_error while block group is locked
While in ext4_validate_block_bitmap(), if an block allocation bitmap
is found to be invalid, we call ext4_error() while the block group is
still locked.  This causes ext4_commit_super() to call a function
which might sleep while in an atomic context.

There's no need to keep the block group locked at this point, so hoist
the ext4_error() call up to ext4_validate_block_bitmap() and release
the block group spinlock before calling ext4_error().

The reported stack trace can be found at:

	http://article.gmane.org/gmane.comp.file-systems.ext4/33731

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-17 09:06:06 -04:00
Bryan Schumaker
12dfd08055 NFS: return -ENOKEY when the upcall fails to map the name
This allows the normal error-paths to handle the error, rather than
making a special call to complete_request_key() just for this instance.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Cc: stable@vger.kernel.org [>= 3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 17:20:06 -04:00
Bryan Schumaker
c5066945b7 NFS: Clear key construction data if the idmap upcall fails
idmap_pipe_downcall already clears this field if the upcall succeeds,
but if it fails (rpc.idmapd isn't running) the field will still be set
on the next call triggering a BUG_ON().  This patch tries to handle all
possible ways that the upcall could fail and clear the idmap key data
for each one.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Cc: stable@vger.kernel.org [>= 3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 17:20:02 -04:00
Trond Myklebust
cff298c721 NFSv4: Don't use private xdr_stream fields in decode_getacl
Instead of using the private field xdr->p from struct xdr_stream,
use the public xdr_stream_pos().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:51 -04:00
Trond Myklebust
b291f1b1c8 NFSv4: Fix the acl cache size calculation
Currently, we do not take into account the size of the 16 byte
struct nfs4_cached_acl header, when deciding whether or not we should
cache the acl data.  Consequently, we will end up allocating an
8k buffer in order to fit a maximum size 4k acl.

This patch adjusts the calculation so that we limit the cache size
to 4k for the acl header+data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:50 -04:00
Trond Myklebust
519d3959e3 NFSv4: Fix pointer arithmetic in decode_getacl
Resetting the cursor xdr->p to a previous value is not a safe
practice: if the xdr_stream has crossed out of the initial iovec,
then a bunch of other fields would need to be reset too.

Fix this issue by using xdr_enter_page() so that the buffer gets
page aligned at the bitmap _before_ we decode it.

Also fix the confusion of the ACL length with the page buffer length
by not adding the base offset to the ACL length...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-08-16 16:15:50 -04:00
bjschuma@gmail.com
425e776d93 NFS: Alias the nfs module to nfs4
This allows distros to remove the line from their modprobe
configuration.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:49 -04:00
bjschuma@gmail.com
1ae811ee27 NFS: Fix a regression when loading the NFS v4 module
Some systems have a modprobe.d/nfs.conf file that sets an nfs4 alias
pointing to nfs.ko, rather than nfs4.ko.  This can prevent the v4 module
from loading on mount, since the kernel sees that something named "nfs4"
has already been loaded.  To work around this, I've renamed the modules
to "nfsv2.ko" "nfsv3.ko" and "nfsv4.ko".

I also had to move the nfs4_fs_type back to nfs.ko to ensure that `mount
-t nfs4` still works.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:49 -04:00
Ian Kent
a45440f05e autofs4 - fix get_next_positive_subdir()
Following a report of a crash during an automount expire I found that
the locking in fs/autofs4/expire.c:get_next_positive_subdir() was wrong.
Not only is the locking wrong but the function is more complex than it
needs to be.

The function is meant to calculate (and dget) the next entry in the list
of directories contained in the root of an autofs mount point (an autofs
indirect mount to be precise). The main problem was that the d_lock of
the owner of the list was not being taken when walking the list, which
lead to list corruption under load. The only other lock that needs to
be taken is against the next dentry candidate so it can be checked for
usability.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-16 11:58:28 -07:00
Linus Torvalds
2eac9eb8a2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: verify all ioctl retry iov elements
  fuse: add missing INIT flag descriptions
  fuse: add missing INIT flags
  fuse: update attributes on aio_read
  fuse: invalidate inode mapping if mtime changes
  fuse: add FUSE_AUTO_INVAL_DATA init flag
2012-08-16 11:46:31 -07:00
Sage Weil
62b2ce964b vfs: fix propagation of atomic_open create error on negative dentry
If ->atomic_open() returns -ENOENT, we take care to return the create
error (e.g., EACCES), if any.  Do the same when ->atomic_open() returns 1
and provides a negative dentry.

This fixes a regression where an unprivileged open O_CREAT fails with
ENOENT instead of EACCES, introduced with the new atomic_open code.  It
is tested by the open/08.t test in the pjd posix test suite, and was
observed on top of fuse (backed by ceph-fuse).

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-08-16 19:29:09 +02:00
Nikola Pajkovsky
68766a2edc udf: fix retun value on error path in udf_load_logicalvol
In case we detect a problem and bail out, we fail to set "ret" to a
nonzero value, and udf_load_logicalvol will mistakenly report success.

Signed-off-by: Nikola Pajkovsky <npajkovs@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 14:23:23 +02:00
Jan Kara
2e84f2641e jbd: don't write superblock when unmounting an ro filesystem
This sequence:

results in an IO error when unmounting the RO filesystem. The bug was
introduced by:

commit 9754e39c7b
Author: Jan Kara <jack@suse.cz>
Date:   Sat Apr 7 12:33:03 2012 +0200

    jbd: Split updating of journal superblock and marking journal empty

which lost some of the magic in journal_update_superblock() which
used to test for a journal with no outstanding transactions.

This is a port of a jbd2 fix by Eric Sandeen.

CC: <stable@vger.kernel.org> # 3.4.x
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 13:53:30 +02:00
Miklos Szeredi
af109bca94 fuse: check create mode in atomic open
Verify that the VFS is passing us a complete create mode with the S_IFREG to
atomic open.

Reported-by: Steve <steveamigauk@yahoo.co.uk>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15 13:01:24 +02:00
Miklos Szeredi
38227f78a5 vfs: pass right create mode to may_o_create()
Pass the umask-ed create mode to may_o_create() instead of the original one.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15 13:01:24 +02:00
Miklos Szeredi
62b259d8b3 vfs: atomic_open(): fix create mode usage
Don't mask S_ISREG off the create mode before passing to ->atomic_open().  Other
methods (->create, ->mknod) also get the complete file mode and filesystems
expect it.

Reported-by: Steve <steveamigauk@yahoo.co.uk>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15 13:01:24 +02:00
Miklos Szeredi
e68726ff72 vfs: canonicalize create mode in build_open_flags()
Userspace can pass weird create mode in open(2) that we canonicalize to 
"(mode & S_IALLUGO) | S_IFREG" in vfs_create().

The problem is that we use the uncanonicalized mode before calling vfs_create()
with unforseen consequences.

So do the canonicalization early in build_open_flags().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
CC: stable@vger.kernel.org
2012-08-15 13:01:24 +02:00
Jeff Mahoney
48d1788493 reiserfs: fix deadlocks with quotas
The BKL push-down for reiserfs made lock recursion a special case that needs
to be handled explicitly. One of the cases that was unhandled is dropping
the quota during inode eviction. Both reiserfs_evict_inode and
reiserfs_write_dquot take the write lock, but when the journal lock is
taken it only drops one the references. The locking rules are that the journal
lock be acquired before the write lock so leaving the reference open leads
to a ABBA deadlock.

This patch pushes the unlock up before clear_inode and avoids the recursive
locking.

Another ABBA situation can occur when the write lock is dropped while reading
the bitmap buffer while in the quota code. When the lock is reacquired, it
will deadlock against dquot->dq_lock and dqopt->dqio_mutex in the dquot_acquire
path. It's safe to retain the lock across the read and should be cached under
write load.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 00:22:57 +02:00
Jeff Liu
6ea2eea1fa quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
sb->s_dqopt->dqptr_sem is used to serialize ops using pointers from inode to
dquots.  But for __dquot_alloc_space(), it could be safely moved down after the
default warn[] array got initialized.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 00:22:57 +02:00
Ashish Sangwan
dc141a402b UDF: During mount free lvid_bh before rescanning with different blocksize
If s_lvid_bh is not freed and set to NULL before re-scanning partition
with default block size, we might end up using wrong lvid in case
s_lvid_bh is not updated in udf_load_logicalvolint during rescan.

Signed-off-by: Ashish Sangwan <ashish.sangwan2@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 00:22:56 +02:00
Ian Abbott
bb2b6d19ec udf: fix udf_setsize() for file data in ICB
If the new size is larger than the old size and the old file data was
stored in the ICB (iinfo->i_alloc_type == ICBTAG_FLAG_AD_IN_ICB) and the
new size still fits in the ICB, skip the call to udf_extend_file() as it
does not handle this i_alloc_type value (it calls BUG()).

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 00:21:58 +02:00
Linus Torvalds
15fc5deb1f Merge branch 'for-linus-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs merge fix from Chris Mason:
 "This fixes a merge error in rc1.  The calls to mnt_want_write should
  have been removed."

* 'for-linus-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: remove mnt_want_write call in btrfs_mksubvol
2012-08-12 21:28:41 +03:00
Alexander Block
e00da2067b Btrfs: remove mnt_want_write call in btrfs_mksubvol
We got a recursive lock in mksubvol because the caller already held
a lock. I think we got into this due to a merge error. Commit a874a63
removed the mnt_want_write call from btrfs_mksubvol and added a
replacement call to mnt_want_write_file in btrfs_ioctl_snap_create_transid.
Commit e7848683 however tried to move all calls to mnt_want_write above
i_mutex. So somewhere while merging this, it got mixed up. The
solution is to remove the mnt_want_write call completely from
mksubvol.

Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-09 11:01:54 -04:00
Fengguang Wu
647d1e4c52 block: move down direct IO plugging
Move unplugging for direct I/O from around ->direct_IO() down to
do_blockdev_direct_IO(). This implicitly adds plugging for direct
writes.

CC: Li Shaohua <shli@fusionio.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-09 15:23:09 +02:00
Alexey Khoroshilov
389d7b26d9 bio: Fix potential memory leak in bio_find_or_create_slab()
Do not leak memory by updating pointer with potentially NULL realloc return value.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-09 15:19:25 +02:00
Trond Myklebust
47fbf7976e NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
Ever since commit 0a57cdac3f (NFSv4.1 send layoutreturn to fence
disconnected data server) we've been sending layoutreturn calls
while there is potentially still outstanding I/O to the data
servers. The reason we do this is to avoid races between replayed
writes to the MDS and the original writes to the DS.

When this happens, the BUG_ON() in nfs4_layoutreturn_done can
be triggered because it assumes that we would never call
layoutreturn without knowing that all I/O to the DS is
finished. The fix is to remove the BUG_ON() now that the
assumptions behind the test are obsolete.

Reported-by: Boaz Harrosh <bharrosh@panasas.com>
Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>=3.5]
2012-08-08 16:03:13 -04:00
Zach Brown
fb6ccff667 fuse: verify all ioctl retry iov elements
Commit 7572777eef attempted to verify that
the total iovec from the client doesn't overflow iov_length() but it
only checked the first element.  The iovec could still overflow by
starting with a small element.  The obvious fix is to check all the
elements.

The overflow case doesn't look dangerous to the kernel as the copy is
limited by the length after the overflow.  This fix restores the
intention of returning an error instead of successfully copying less
than the iovec represented.

I found this by code inspection.  I built it but don't have a test case.
I'm cc:ing stable because the initial commit did as well.

Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: <stable@vger.kernel.org>         [2.6.37+]
2012-08-06 18:19:24 +02:00
Theodore Ts'o
7e731bc9a1 ext4: avoid kmemcheck complaint from reading uninitialized memory
Commit 03179fe923 introduced a kmemcheck complaint in
ext4_da_get_block_prep() because we save and restore
ei->i_da_metadata_calc_last_lblock even though it is left
uninitialized in the case where i_da_metadata_calc_len is zero.

This doesn't hurt anything, but silencing the kmemcheck complaint
makes it easier for people to find real bugs.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=45631
(which is marked as a regression).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-05 23:28:16 -04:00
Theodore Ts'o
d796c52ef0 ext4: make sure the journal sb is written in ext4_clear_journal_err()
After we transfer set the EXT4_ERROR_FS bit in the file system
superblock, it's not enough to call jbd2_journal_clear_err() to clear
the error indication from journal superblock --- we need to call
jbd2_journal_update_sb_errno() as well.  Otherwise, when the root file
system is mounted read-only, the journal is replayed, and the error
indicator is transferred to the superblock --- but the s_errno field
in the jbd2 superblock is left set (since although we cleared it in
memory, we never flushed it out to disk).

This can end up confusing e2fsck.  We should make e2fsck more robust
in this case, but the kernel shouldn't be leaving things in this
confused state, either.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
2012-08-05 19:04:57 -04:00
Al Viro
fe7c80518e missed mnt_drop_write() in do_dentry_open()
This one ought to be __mnt_drop_write(), to match __mnt_want_write()
in the beginning...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:41 +04:00
Artem Bityutskiy
5c57f20b82 UBIFS: nuke pdflush from comments
The pdflush thread is long gone, so this patch removes references to pdflush
from UBIFS comments.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:41 +04:00
Artem Bityutskiy
e76e0ec984 gfs2: nuke pdflush from comments
The pdflush thread is long gone, so this patch removes references to pdflush
from gfs comments.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:40 +04:00
Artem Bityutskiy
166ac34b74 nilfs2: nuke write_super from comments
The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from ntfs.

Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:38 +04:00
Artem Bityutskiy
50640bcc0a hfs: nuke write_super from comments
The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from hfs.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:38 +04:00
Artem Bityutskiy
0d5c3eba2e vfs: nuke pdflush from comments
The pdflush thread is long gone, so this patch removes references to pdflush
from vfs comments.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:37 +04:00
Artem Bityutskiy
12810ad708 jbd/jbd2: nuke write_super from comments
The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from various jbd and jbd2.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:36 +04:00
Artem Bityutskiy
b257031408 btrfs: nuke pdflush from comments
The pdflush thread is long gone, so this patch removes references to pdflush
from btrfs comments.

Cc: Chris Mason <chris.mason@fusionio.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:35 +04:00
Artem Bityutskiy
34eaadaf22 btrfs: nuke write_super from comments
The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from btrfs.

Cc: Chris Mason <chris.mason@fusionio.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:35 +04:00
Artem Bityutskiy
f6463b0da6 ext4: nuke pdflush from comments
The pdflush thread is long gone, so this patch removes references to pdflush
from ext4 comments.

Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:34 +04:00
Artem Bityutskiy
7652bdfcb5 ext4: nuke write_super from comments
The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from ext3.

Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:33 +04:00
Artem Bityutskiy
d3009c6cff ext3: nuke write_super from comments
The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from ext3.

Cc: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 12:15:32 +04:00
Artem Bityutskiy
f0cd2dbb6c vfs: kill write_super and sync_supers
Finally we can kill the 'sync_supers' kernel thread along with the
'->write_super()' superblock operation because all the users are gone.
Now every file-system is supposed to self-manage own superblock and
its dirty state.

The nice thing about killing this thread is that it improves power management.
Indeed, 'sync_supers' is a source of monotonic system wake-ups - it woke up
every 5 seconds no matter what - even if there were no dirty superblocks and
even if there were no file-systems using this service (e.g., btrfs and
journalled ext4 do not need it). So it was wasting power most of the time. And
because the thread was in the core of the kernel, all systems had to have it.
So I am quite happy to make it go away.

Interestingly, this thread is a left-over from the pdflush kernel thread which
was a self-forking kernel thread responsible for all the write-back in old
Linux kernels. It was turned into per-block device BDI threads, and
'sync_supers' was a left-over. Thus, R.I.P, pdflush as well.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-04 01:24:44 +04:00
Linus Torvalds
d42d1dabf3 Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
Pull exofs update from Boaz Harrosh:
 "They are all mostly fixes, except the most important patch by Artem
  Bityutskiy which removes the use of s_dirt.  After this patch s_dirt
  can be completely removed from the tree."

* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  ore: Fix out-of-bounds access in _ios_obj()
  exofs: Use proper max_IO calculations from ore
  exofs: Fix __r4w_get_page when offset is beyond i_size
  exofs: stop using s_dirt
  exofs: readpage_strip: Add a BUG_ON to check for PageLocked(page)
2012-08-03 13:24:07 -07:00
Boaz Harrosh
7de6e28417 pnfs-obj: Better IO pattern in case of unaligned offset
Depending on layout and ARCH, ORE has some limits on max IO sizes
which is communicated on (what else) ore_layout->max_io_length,
which is always stripe aligned.
This was considered as the pg_test boundary for splitting and starting
a new IO.

But in the case of a long IO where the start offset is not aligned
what would happen is that both end of IO[N] and start of IO[N+1]
would be unaligned, causing each IO boundary parity unit to be
calculated and written twice.

So what we do in this patch is split the very start of an unaligned
IO, up to a stripe boundary, and then next IO's can continue fully
aligned til the end.

We might be sacrificing the case where the full unaligned IO would
fit within a single max_io_length, but the sacrifice is well worth
the elimination of double calculation and parity units IO.
Actually the sacrificing is marginal and is almost unmeasurable.

TODO:
	If we know the total expected linear segment that will
	be received, at pg_init, we could use that information
	in many places:
	1. blocks-layout get_layout write segment size
	2. Better mds-threshold
	3. In above situation for a better clean split

	I will do this in future submission.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:42:51 -04:00
Peng Tao
f616638409 NFS41: add pg_layout_private to nfs_pageio_descriptor
To allow layout driver to pass private information around
pg_init/pg_doio.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:41:18 -04:00
Idan Kedar
21d1f58aed pnfs: nfs4_proc_layoutget returns void
since the only user of nfs4_proc_layoutget is send_layoutget, which
ignores its return value, there is no reason to return any value.

Signed-off-by: Idan Kedar <idank@tonian.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:39:06 -04:00
Idan Kedar
8554116e17 pnfs: defer release of pages in layoutget
we have encountered a bug whereby reading a lot of files (copying
fedora's /bin) from a pNFS mount and hitting Ctrl+C in the middle caused
a general protection fault in xdr_shrink_bufhead. this function is
called when decoding the response from LAYOUTGET. the decoding is done
by a worker thread, and the caller of LAYOUTGET waits for the worker
thread to complete.

hitting Ctrl+C caused the synchronous wait to end and the next thing the
caller does is to free the pages, so when the worker thread calls
xdr_shrink_bufhead, the pages are gone. therefore, the cleanup of these
pages has been moved to nfs4_layoutget_release.

Signed-off-by: Idan Kedar <idank@tonian.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:38:54 -04:00
Jeff Layton
3dd4765fce nfs: tear down caches in nfs_init_writepagecache when allocation fails
...and ensure that we tear down the nfs_commit_data cache too when
unloading the module.

Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:36:07 -04:00
Linus Torvalds
c8924234bd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull two ceph fixes from Sage Weil:
 "The first patch fixes up the old crufty open intent code to use the
  atomic_open stuff properly, and the second fixes a possible null deref
  and memory leak with the crypto keys."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: fix crypto key null deref, memory leak
  ceph: simplify+fix atomic_open
2012-08-02 10:57:31 -07:00
Linus Torvalds
410fc4ce8a - Fixes a bug when the lower filesystem mount options include 'acl', but the
eCryptfs mount options do not
 - Cleanups in the messaging code
 - Better handling of empty files in the lower filesystem to improve usability.
   Failed file creations are now cleaned up and empty lower files are converted
   into eCryptfs during open().
 - The write-through cache changes are being reverted due to bugs that are not
   easy to fix. Stability outweighs the performance enhancements here.
 - Improvement to the mount code to catch unsupported ciphers specified in the
   mount options
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCgAGBQJQGhVWAAoJENaSAD2qAscKjvcQAMgF66sl8KjwzFCgojh30a4u
 1hCXltxUCLjXxjXxyygUHb/pH0xdvC+Ss3FTtsPXAjgYm2lXjjoOmVG8WAvwHHx1
 2ZjDo+8fQ4XA8Rl9kYuvt/abF0IssNRK3csTWplR7lpoQ8AWbkpkag1I4WZhibey
 cgs/zECl8ACTJ5zQ+AyRGnrssq4jI7xZAKWLK0+KKk7S9yIRI7K/xdz1xK39jGK6
 N09Dw3VWY/bMcMq77ZXBtyHdP7KR7wKUtCeQttmCvdf20Ocy6AXzr1FRKvUxionF
 Sf31tJim0u9OO8hmy8cjCyWEy9LHnXnSd/5vn+Qd9ok9GvuiYmKw07rbXi/gjhBX
 ai5PKtl05WiQgp80BybUYfIY1Hq71MsppNi6h9Zgiid5rEvWWvCBWBWP95G8DTmC
 6TwLaCG0rh8uuZyeiVrs3xZQ2IG5Zmu0CX3XGyfsaLvqmQWhtT5ZQVMeMQEEBxyQ
 ur9SSU2O/nC8ceLB7fzGmZPTLZUWOuYQnd24NJNK+7j0P+Km7pqmDYpCdwmFpx3C
 CQ0gGaJGHeycvBF327bwxPmsPdO4fy+nmEL8vrEXPTyQ3ZVAPvxZK9t8Jpk4UFOl
 JSTWFiK0mvgE+5dX2kB0nzN7iD7hcMgmozht50qQP3OUkI1kkBrEHgHVO3KzgUIA
 +aRAljeLdelq44JBxjiB
 =4vDd
 -----END PGP SIGNATURE-----

Merge tag 'ecryptfs-3.6-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs

Pull ecryptfs fixes from Tyler Hicks:
 - Fixes a bug when the lower filesystem mount options include 'acl',
   but the eCryptfs mount options do not
 - Cleanups in the messaging code
 - Better handling of empty files in the lower filesystem to improve
   usability.  Failed file creations are now cleaned up and empty lower
   files are converted into eCryptfs during open().
 - The write-through cache changes are being reverted due to bugs that
   are not easy to fix.  Stability outweighs the performance
   enhancements here.
 - Improvement to the mount code to catch unsupported ciphers specified
   in the mount options

* tag 'ecryptfs-3.6-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  eCryptfs: check for eCryptfs cipher support at mount
  eCryptfs: Revert to a writethrough cache model
  eCryptfs: Initialize empty lower files when opening them
  eCryptfs: Unlink lower inode when ecryptfs_create() fails
  eCryptfs: Make all miscdev functions use daemon ptr in file private_data
  eCryptfs: Remove unused messaging declarations and function
  eCryptfs: Copy up POSIX ACL and read-only flags from lower mount
2012-08-02 10:56:34 -07:00
Linus Torvalds
630103ea2c Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS update from Steve French:
 "Adds SMB2 rmdir/mkdir capability to the SMB2/SMB2.1 support in cifs.

  I am holding up a few more days on merging the remainder of the
  SMB2/SMB2.1 enablement although it is nearing review completion, in
  order to address some review comments from Jeff Layton on a few of the
  subsequent SMB2 patches, and also to debug an unrelated cifs problem
  that Pavel discovered."

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Add SMB2 support for rmdir
  CIFS: Move rmdir code to ops struct
  CIFS: Add SMB2 support for mkdir operation
  CIFS: Separate protocol specific part from mkdir
  CIFS: Simplify cifs_mkdir call
2012-08-02 10:54:11 -07:00
Sage Weil
5ef50c3bec ceph: simplify+fix atomic_open
The initial ->atomic_open op was carried over from the old intent code,
which was incomplete and didn't really work.  Replace it with a fresh
method.  In particular:

 * always attempt to do an atomic open+lookup, both for the create case
   and for lookups of existing files.
 * fix symlink handling by returning 1 to the VFS so that we can follow
   the link to its destination. This fixes a longstanding ceph bug (#2392).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-02 09:11:19 -07:00
Boaz Harrosh
9e62bb4458 ore: Fix out-of-bounds access in _ios_obj()
_ios_obj() is accessed by group_index not device_table index.

The oc->comps array is only a group_full of devices at a time
it is not like ore_comp_dev() which is indexed by a global
device_table index.

This did not BUG until now because exofs only uses a single
COMP for all devices. But with other FSs like PanFS this is
not true.

This bug was only in the write_path, all other users were
using it correctly

[This is a bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-08-02 16:41:56 +03:00
Boaz Harrosh
be388f3d9a exofs: Use proper max_IO calculations from ore
exofs_max_io_pages should just use the ORE's
calculated layout->max_io_length,

And avoid unnecessary BUGs, calculations made here were
also a layering violation.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-08-02 16:39:17 +03:00
Boaz Harrosh
4b74f6ea84 exofs: Fix __r4w_get_page when offset is beyond i_size
It is very common for the end of the file to be unaligned on
stripe size. But since we know it's beyond file's end then
the XOR should be preformed with all zeros.

Old code used to just read zeros out of the OSD devices, which is a great
waist. But what scares me more about this situation is that, we now have
pages attached to the file's mapping that are beyond i_size. I don't
like the kind of bugs this calls for.

Fix both birds, by returning a global ZERO_PAGE, if offset is beyond
i_size.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-08-02 14:58:22 +03:00
Artem Bityutskiy
66153f6e0f exofs: stop using s_dirt
Exofs has the '->write_super()' handler and makes some use of the '->s_dirt'
superblock flag, but it really needs neither of them because it never sets
's_dirt' to one which means the VFS never calls its '->write_super()' handler.
Thus, remove both.

Note, I am trying to remove both 's_dirt' and 'write_super()' from VFS
altogether once all users are gone.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-08-02 14:52:13 +03:00
Kautuk Consul
0e8d96dd2c exofs: readpage_strip: Add a BUG_ON to check for PageLocked(page)
readpage_strip can be called from several code paths all of which
require that the page be locked before any operations are carried
out.

Since we export the exofs_readpage callback to the VFS, add a
BUG_ON to check for PageLocked(page) to make sure that this
understanding is never compromised.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-08-02 14:52:12 +03:00
Jianpeng Ma
53362a05ae fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
For regular file, write operaion used blk_plug function.But for block
file,write operation did not use blk_plug.
This patch is also for write-cache mode for block-device.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-02 09:50:39 +02:00