Commit Graph

28239 Commits

Author SHA1 Message Date
Linus Torvalds
e1d33a5c63 xfs: bugfixes for 3.6-rc4
- fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQN7FMAAoJENaLyazVq6ZOrwsQAMB2Cg/t/XeiPFD5QtvFX7Ua
 9cS3klSo9p29Xr5oeo54b3mF+5dycS61PQu2zdg4XhU5ZqoWoS6xUIFxzo3yWiVi
 nC/BbgMQTgkfEQs+2ulmXRgP8A2DdakSDmvAZDQNURZQwH4zykU0tPKD9V009TMW
 SaIprIWQqhRrsd9prrv364FgKv3/vyuFL1Agho304o3ynk+2SLDzm3eqdTaXTD9Z
 9OGP5Hs5Mh9CZmr46MIpRnoaHppqNRZh7EbDyMFL8kMN8F2yOdJk5MgA1m7gPJ5Z
 ADfnv3laxEp04hhD5lazePYJwZlg/KB77Y4YXD5hmTSznIxcQNlfRcMV9ju4FGgr
 Nb5SgA/6iBk5dkZyuyIdqCzj1Mt6SimpIWb3+/vcS8+VaqQUsyoufccO0zrgeoIA
 /O1PtCe//haFENlD+hMc9QrQSISsaR/YYUv0yz4YcsxKsdmyYpbHdTyFhVjnkKoS
 BtA0ZOgWYebgEzr7J1ijW085CrDvAc9jsPujOWmZlBJWcfo1rtVfqhXWll0ms5cr
 ZRoWqLDqoOhjlUDqwMN4dk6ENxG4QbSl6AWXXVXrUvtvaNzoi0MBNX3HPt/RVmF5
 rGKFvLuIxBHAfD1xYCqtCfP9yzeQzDy1SPdASForEZOyM+RVUnOf9i+1nhbsu/97
 AhiW4lozQetWzx+RQwU+
 =+FuN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
 - fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim

* tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs:
  xfs: check for possible overflow in xfs_ioc_trim
  xfs: unlock the AGI buffer when looping in xfs_dialloc
  xfs: fix uninitialised variable in xfs_rtbuf_get()
2012-08-25 11:47:06 -07:00
Linus Torvalds
8497ae61d0 Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from J. Bruce Fields:
 "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie
  Heilman for their testing and debugging help."

* 'for-3.6' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
  svcrpc: sends on closed socket should stop immediately
  svcrpc: fix BUG() in svc_tcp_clear_pages
  nfsd4: fix security flavor of NFSv4.0 callback
2012-08-25 11:43:41 -07:00
Linus Torvalds
a7e546f175 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block-related fixes from Jens Axboe:

 - Improvements to the buffered and direct write IO plugging from
   Fengguang.

 - Abstract out the mapping of a bio in a request, and use that to
   provide a blk_bio_map_sg() helper.  Useful for mapping just a bio
   instead of a full request.

 - Regression fix from Hugh, fixing up a patch that went into the
   previous release cycle (and marked stable, too) attempting to prevent
   a loop in __getblk_slow().

 - Updates to discard requests, fixing up the sizing and how we align
   them.  Also a change to disallow merging of discard requests, since
   that doesn't really work properly yet.

 - A few drbd fixes.

 - Documentation updates.

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: replace __getblk_slow misfix by grow_dev_page fix
  drbd: Write all pages of the bitmap after an online resize
  drbd: Finish requests that completed while IO was frozen
  drbd: fix drbd wire compatibility for empty flushes
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update missing index files in block/00-INDEX
  block: move down direct IO plugging
  block: remove plugging at buffered write time
  block: disable discard request merge temporarily
  bio: Fix potential memory leak in bio_find_or_create_slab()
  block: Don't use static to define "void *p" in show_partition_start()
  block: Add blk_bio_map_sg() helper
  block: Introduce __blk_segment_map_sg() helper
  fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
  block: split discard into aligned requests
  block: reorganize rounding of max_discard_sectors
2012-08-25 11:36:43 -07:00
Linus Torvalds
62688e5b64 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF, ext3 & reiserfs fixes from Jan Kara:
 "A couple of fixes (udf, reiserfs, ext3) that accumulated over my
  vacation."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix retun value on error path in udf_load_logicalvol
  jbd: don't write superblock when unmounting an ro filesystem
  reiserfs: fix deadlocks with quotas
  quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
  UDF: During mount free lvid_bh before rescanning with different blocksize
  udf: fix udf_setsize() for file data in ICB
2012-08-23 21:56:22 -07:00
Linus Torvalds
f4673d6f18 UBIFS fixes for 3.6:
1. Fix crash on error which prevents emulated power-cut testing.
 2. Fix log reply regression introduced in 3.6-rc1.
 3. Fix UBIFS complaints about too small debug buffer size which.
 4. Fix error message spelling, and remove incorrect commentary.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNdHYAAoJECmIfjd9wqK0t+cQANEkF1Xgs8cd/5U++tQWCGyu
 EInLm+416ytTy4Vg2bLwlJ4OnUhgtTSTeKovtp3yoSKytxB4vf50qI5P/gInXSxo
 mI1KkvHZLraFzFPGtnbXGCg0YS8xZLIhw2p8VDuNFcxZIEiuAGZ2qMC9tPzzzle6
 em/yDIumR//oioFHOqVNu6wKE3do9QY5fo0meqE6QTliKppSDhWcO9MCfodkFOCp
 WtlD2dfIQv3SdaZW+X8skbqMFQ9KJRAzk59hl1Cm2jlXEHEHYmvGwqTCSOk8W/0c
 3ZSgUVW1YKBYq25yA3Xdz6aDCgRT6LYsi/EH0VNNMlt/H9RfcCNTGe96ioTcgsOT
 FroLZZUK9PiQYmqUqnk8eqmUyq888vKZY3Sbxt5/bbZhSoK6aYzNgfLlkZNfnHDL
 unzpyPqYq9VahuTgY5ravDclPKCc2p7xeWJYeOE9JtUY6/fVKJh9qERrtVoNP+nD
 LjS89G/tNBn9ZPk8KyXgikJ4JZ7sw30vySmBdGtx+ckzXqKTsDtLznvdSU3/XiJM
 zdqYV3gulgoJtIPXFVuI3i3GeKyXuJlPpjNimuFn+9mNElZfrvs6RcGbQ3Df/3Mc
 omfTrRxS+vRadGC9MZoIMCV27p7+cEynbLLVXtvPebwqiO1tV1fCspdJGjJmkbvL
 14hi7hTYvYCtMR+m8i2+
 =VvYb
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs

Pull UBIFS fixes from Artem Bityutskiy:
 - Fix crash on error which prevents emulated power-cut testing.
 - Fix log reply regression introduced in 3.6-rc1.
 - Fix UBIFS complaints about too small debug buffer size which.
 - Fix error message spelling, and remove incorrect commentary.

* tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix error messages spelling
  UBIFS: fix complaints about too small debug buffer size
  UBIFS: fix replay regression
  UBIFS: fix crash on error path
  UBIFS: remove stale commentary
2012-08-23 21:50:40 -07:00
Tomas Racek
a672e1be30 xfs: check for possible overflow in xfs_ioc_trim
If range.start or range.minlen is bigger than filesystem size, return
invalid value error. This fixes possible overflow in BTOBB macro when
passed value was nearly ULLONG_MAX.

Signed-off-by: Tomas Racek <tracek@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:44 -05:00
Christoph Hellwig
7612903099 xfs: unlock the AGI buffer when looping in xfs_dialloc
Also update some commens in the area to make the code easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:32 -05:00
Dave Chinner
0b9e3f6d84 xfs: fix uninitialised variable in xfs_rtbuf_get()
Results in this assert failure in generic/090:

XFS: Assertion failed: *nmap >= 1, file: fs/xfs/xfs_bmap.c, line: 4363
.....
Call Trace:
 [<ffffffff814680db>] xfs_bmapi_read+0x6b/0x370
 [<ffffffff814b64b2>] xfs_rtbuf_get+0x42/0x130
 [<ffffffff814b6f09>] xfs_rtget_summary+0x89/0x120
 [<ffffffff814b7bfe>] xfs_rtallocate_extent_size+0xce/0x340
 [<ffffffff814b89f0>] xfs_rtallocate_extent+0x240/0x290
 [<ffffffff81462c1a>] xfs_bmap_rtalloc+0x1ba/0x340
 [<ffffffff81463a65>] xfs_bmap_alloc+0x35/0x40
 [<ffffffff8146f111>] xfs_bmapi_allocate+0xf1/0x350
 [<ffffffff8146f9de>] xfs_bmapi_write+0x66e/0xa60
 [<ffffffff8144538a>] xfs_iomap_write_direct+0x22a/0x3f0
 [<ffffffff8143707b>] __xfs_get_blocks+0x38b/0x5d0
 [<ffffffff814372d4>] xfs_get_blocks_direct+0x14/0x20
 [<ffffffff811b0081>] do_blockdev_direct_IO+0xf71/0x1eb0
 [<ffffffff811b1015>] __blockdev_direct_IO+0x55/0x60
 [<ffffffff814355ca>] xfs_vm_direct_IO+0x11a/0x1e0
 [<ffffffff8112d617>] generic_file_direct_write+0xd7/0x1b0
 [<ffffffff8143e16c>] xfs_file_dio_aio_write+0x13c/0x320
 [<ffffffff8143e6f2>] xfs_file_aio_write+0x1c2/0x1d0
 [<ffffffff81174a07>] do_sync_write+0xa7/0xe0
 [<ffffffff81175288>] vfs_write+0xa8/0x160
 [<ffffffff81175702>] sys_pwrite64+0x92/0xb0
 [<ffffffff81b68f69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:16 -05:00
Hugh Dickins
676ce6d5ca block: replace __getblk_slow misfix by grow_dev_page fix
Commit 91f68c89d8 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.

Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.

I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5.  I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).

(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)

Revert 91f68c89d8, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix).  Simplify the interface between grow_buffers()
and grow_dev_page(), and avoid the infinite loop beyond end of device
by instead checking init_page_buffers()'s end_block there (I presume
that's more efficient than a repeated call to blkdev_max_block()),
returning -ENXIO to __getblk_slow() in that case.

And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-23 12:17:36 +02:00
Linus Torvalds
f753c4ec15 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
 "Jim's fix closes a narrow race introduced with the msgr changes.  One
  fix resolves problems with debugfs initialization that Yan found when
  multiple client instances are created (e.g., two clusters mounted, or
  rbd + cephfs), another one fixes problems with mounting a nonexistent
  server subdirectory, and the last one fixes a divide by zero error
  from unsanitized ioctl input that Dan Carpenter found."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: avoid divide by zero in __validate_layout()
  libceph: avoid truncation due to racing banners
  ceph: tolerate (and warn on) extraneous dentry from mds
  libceph: delay debugfs initialization until we learn global_id
2012-08-22 09:58:05 -07:00
Linus Torvalds
ad746be969 NFS client bugfixes for Linux 3.6
- NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNPyKAAoJEGcL54qWCgDyQlAP/AmnLaJmKMPj5P4GCiA/uxMV
 sr0XSSVAv+OgUtRxLNO5PDyunVIIEjwA5nGa2RZEG14+eYwNEeQehaGoaGqoKLhh
 ATOCMvHHEs5qSKLckWsZWflf6U/MLimYS3D/hHVUtP0QWbBMtol3ImABs2ht/VUc
 HW0wQoEG+5mhbhC87d4ku5E+OHQzYeNNmdX2VkKOmT0CwRd/bfsHxlGwMmcHldP2
 TSge+MRhoKNycpy0j7nugHj2pZ12UiX+O8371OlAketnPk0OA+6RNGMyNo49IQAn
 VEN9MFWc4ypH1+hJ0OvhzmUA6Zn2/lyQXUtwM1DcXeBmLI4aClb3bT3G3xKNh09O
 yLEvkh2aUUlmoTSgP7vDK/Ui3QwVnsaJULvg+gywQJG6qTSFgUQfErNKgJmabHQb
 d0EuumlSGnJ7qdC5nmrUp+M/CpbfGh15ax/JnTRGVK7C3jeCm18vc2np+z4EUDp/
 USzrvuBu1B8eI0nQNoht0BeNf7sEJGgFIn8BJzxDakP8buXPMqEvFwA8c1JmMeBX
 E6pbG2jHqPdrTJtxQTpMRqhA0MxbFlVNi5cCs6U7ifiBqbRWEXyNVgAHTwoIxIrn
 ASbL0s8gZH+EMEXNOmPmFiO2NUjBCeSzAjrsTmSpWa13YmGfiroZN8N/iPzLnFf+
 FR3SFUnoJHq7x4uLZoAX
 =lqfh
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"

* tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv3: Ensure that do_proc_get_root() reports errors correctly
  NFSv4: Ensure that nfs4_alloc_client cleans up on error.
  NFS: return -ENOKEY when the upcall fails to map the name
  NFS: Clear key construction data if the idmap upcall fails
  NFSv4: Don't use private xdr_stream fields in decode_getacl
  NFSv4: Fix the acl cache size calculation
  NFSv4: Fix pointer arithmetic in decode_getacl
  NFS: Alias the nfs module to nfs4
  NFS: Fix a regression when loading the NFS v4 module
  NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
  pnfs-obj: Better IO pattern in case of unaligned offset
  NFS41: add pg_layout_private to nfs_pageio_descriptor
  pnfs: nfs4_proc_layoutget returns void
  pnfs: defer release of pages in layoutget
  nfs: tear down caches in nfs_init_writepagecache when allocation fails
2012-08-22 09:57:25 -07:00
Linus Torvalds
467e9e51d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull assorted fixes - mostly vfs - from Al Viro:
 "Assorted fixes, with an unexpected detour into vfio refcounting logics
  (fell out when digging in an analog of eventpoll race in there)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  task_work: add a scheduling point in task_work_run()
  fs: fix fs/namei.c kernel-doc warnings
  eventpoll: use-after-possible-free in epoll_create1()
  vfio: grab vfio_device reference *before* exposing the sucker via fd_install()
  vfio: get rid of vfio_device_put()/vfio_group_get_device* races
  vfio: get rid of open-coding kref_put_mutex
  introduce kref_put_mutex()
  vfio: don't dereference after kfree...
  mqueue: lift mnt_want_write() outside ->i_mutex, clean up a bit
2012-08-22 09:56:06 -07:00
Artem Bityutskiy
69f9025894 UBIFS: fix error messages spelling
Corruptio -> corruption.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-22 17:41:09 +03:00
Randy Dunlap
55852635a8 fs: fix fs/namei.c kernel-doc warnings
Fix kernel-doc warnings in fs/namei.c:

Warning(fs/namei.c:360): No description found for parameter 'inode'
Warning(fs/namei.c:672): No description found for parameter 'nd'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc:	Alexander Viro <viro@zeniv.linux.org.uk>
Cc:	linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-22 10:30:10 -04:00
Al Viro
98022748f6 eventpoll: use-after-possible-free in epoll_create1()
As soon as we'd installed the file into descriptor table, it can
get closed by another thread.  Freeing ep in process...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-22 10:26:55 -04:00
Artem Bityutskiy
65b455b123 UBIFS: fix complaints about too small debug buffer size
When debugging is enabled, we use a temporary on-stack buffer for formatting
the key strings like "(11368871, direntry, 0xcd0750)". The buffer size is
32 bytes and sometimes it is not enough to fit the key string - e.g., when
inode numbers are high. This is not fatal, but the key strings are incomplete
and UBIFS complains like this:

	UBIFS assert failed in dbg_snprintf_key at 137 (pid 1)

This is a regression caused by "515315a UBIFS: fix key printing".

Fix the issue by increasing the buffer to 48 bytes.

Reported-by: Michael Hench <michaelhench@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Michael Hench <michaelhench@gmail.com>
Cc: stable@vger.kernel.org [v3.3+]
2012-08-22 11:51:33 +03:00
Sage Weil
45f2e081f5 ceph: avoid divide by zero in __validate_layout()
If "l->stripe_unit" is zero the the mod on the next line will cause a
divide by zero bug.  This comes from the copy_from_user() in
ceph_ioctl_set_layout_policy().  Passing 0 is valid, though (it means
"do not change") so avoid the % check in that case.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2012-08-21 15:55:28 -07:00
Sage Weil
6c5e50fa61 ceph: tolerate (and warn on) extraneous dentry from mds
If the MDS gives us a dentry and we weren't prepared to handle it,
WARN_ON_ONCE instead of crashing.

Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2012-08-21 15:55:25 -07:00
Artem Bityutskiy
c212f4020d UBIFS: fix replay regression
Commit "d51f17e UBIFS: simplify reply code a bit" introduces a bug with the
following symptoms:

UBIFS error (pid 1): replay_log_leb: first CS node at LEB 3:0 has wrong commit number 0 expected 1

The issue is that we start replaying the log from UBIFS_LOG_LNUM instead
of c->lhead_lnum. This patch fixes that.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-21 15:25:25 +03:00
Artem Bityutskiy
11e3be0be2 UBIFS: fix crash on error path
This patch fixes a regression introduced by
"4994297 UBIFS: make ubifs_lpt_init clean-up in case of failure" which
I've hit while running the 'integck -p' test. When remount the file-system
from R/O mode to R/W mode and 'lpt_init_wr()' fails, we free _all_ LPT
resources by calling 'ubifs_lpt_free(c, 0)', even those needed for R/O
mode. This leads to subsequent crashes, e.g., if we try to unmount
the file-system.

Cc: stable@vger.kernel.org [v3.5+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-21 15:25:24 +03:00
Artem Bityutskiy
73e8712aa0 UBIFS: remove stale commentary
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-21 15:25:24 +03:00
J. Bruce Fields
39307655a1 nfsd4: fix security flavor of NFSv4.0 callback
Commit d5497fc693 "nfsd4: move rq_flavor
into svc_cred" forgot to remove cl_flavor from the client, leaving two
places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
After that patch, the latter was the one that was updated, but the
former was the one that the callback used.

Symptoms were a long delay on utime().  This is because the utime()
generated a setattr which recalled a delegation, but the cb_recall was
ignored by the client because it had the wrong security flavor.

Cc: stable@vger.kernel.org
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-20 18:38:36 -04:00
Al Viro
0e665d5d11 vfs: missed source of ->f_pos races
compat_sys_{read,write}v() need the same "pass a copy of file->f_pos" thing
as sys_{read,write}{,v}().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-20 10:11:47 -07:00
Sage Weil
d1c338a509 libceph: delay debugfs initialization until we learn global_id
The debugfs directory includes the cluster fsid and our unique global_id.
We need to delay the initialization of the debug entry until we have
learned both the fsid and our global_id from the monitor or else the
second client can't create its debugfs entry and will fail (and multiple
client instances aren't properly reflected in debugfs).

Reported by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2012-08-20 10:03:15 -07:00
Trond Myklebust
0866004304 NFSv3: Ensure that do_proc_get_root() reports errors correctly
If the rpc call to NFS3PROC_FSINFO fails, then we need to report that
error so that the mount fails. Otherwise we can end up with a
superblock with completely unusable values for block sizes, maxfilesize,
etc.

Reported-by: Yuanming Chen <hikvision_linux@163.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-20 12:52:42 -04:00
Trond Myklebust
7653f6ff4e NFSv4: Ensure that nfs4_alloc_client cleans up on error.
Any pointer that was allocated through nfs_alloc_client() needs to be
freed via a call to nfs_free_client().

Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-20 12:12:29 -04:00
Linus Torvalds
20fb1936de Merge branch 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull vfs fixes from Miklos Szeredi.

This mainly fixes some confusion about whether the open 'mode' variable
passed around should contain the full file type (S_IFREG etc)
information or just the permission mode.  In particular, the lack of
proper file type information had confused fuse.

* 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  vfs: fix propagation of atomic_open create error on negative dentry
  fuse: check create mode in atomic open
  vfs: pass right create mode to may_o_create()
  vfs: atomic_open(): fix create mode usage
  vfs: canonicalize create mode in build_open_flags()
2012-08-18 10:02:17 -07:00
Linus Torvalds
ef824bfba2 The following are all bug fixes and regressions. The most notable are
the ones which cause problems for ext4 on RAID --- a performance
 problem when mounting very large filesystems, and a kernel OOPS when
 doing an rm -rf on large directory hierarchies on fast devices.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJQLlVSAAoJENNvdpvBGATwhOYQAKx22YF9+SHnnVv2GtCQrsE6
 N3acE/FoDYiO1/LRa5M6NDO3ZIL4Vqi6409LYFQky/SQL8ziM5CBeLOBD9qPTE1L
 AGrzWn4vzZcjEf90ZEBN99fS5Uj1A9Axz2denxy7Hfb2HcSXfcuuVQXpdS42cFo8
 gwtlX4jxmPkbjRlAdeqYVBNuWpTZ0a+UyLc4A6v3aLcbzPSJvPYmI7mKCksiSySU
 yt0atjMDb56blAQJ2TITdAZN6rQShNzyok2pPfxaLusl5g0Gtjq8sSEPof1PQ1zq
 gDFc+kpZvUyPdwQzV3IL8+TodFFZ0x/2OhqoYCTKajROLHGjQFsdkb8EJbnNeDff
 EDxIjeVJR0kzSuSNWu+n5028lmEd9Sk7Ykr37cHeUxG8/0SADUqSDQYNhvbOPQsj
 iq5dwF79tKjuMqjJrABuWA1ZNgHBISXgyBmHLXgEk3LrgucT9UIg8Zlkhq480SYO
 JXhmkO2Ka4UwkVUShoWgRtEzRgUxhINBShs6g67zwm6slS4s2CWHnqhUn6EQe6+r
 DY/hoUA8KbdG3Cf5iJBFM2kUO68CDIXeJjocA7JvlouRgQSxmkOceIuk7DusAitM
 nHJKAtSNgC/z7yMoNi7YN0S5YYcCebmO1MEPzYSpPH07YwLJVNmh9Fk6BIfb7vi3
 vJSQMBrgGrbaXrnhBA7z
 =Zulv
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bug fixes from Ted Ts'o:
 "The following are all bug fixes and regressions.  The most notable are
  the ones which cause problems for ext4 on RAID --- a performance
  problem when mounting very large filesystems, and a kernel OOPS when
  doing an rm -rf on large directory hierarchies on fast devices."

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix kernel BUG on large-scale rm -rf commands
  ext4: fix long mount times on very big file systems
  ext4: don't call ext4_error while block group is locked
  ext4: avoid kmemcheck complaint from reading uninitialized memory
  ext4: make sure the journal sb is written in ext4_clear_journal_err()
2012-08-17 08:04:47 -07:00
Ian Kent
d807ff838f autofs4 - fix expire check
In some cases when an autofs indirect mount is contained in a file
system that is marked as shared (such as when systemd does the
equivalent of "mount --make-rshared /" early in the boot), mounts
stop expiring.

When this happens the first expiry check on a mountpoint dentry in
autofs_expire_indirect() sees a mountpoint dentry with a higher
than minimal reference count. Consequently the dentry is condidered
busy and the actual expiry check is never done.

This particular check was originally meant as an optimisation to
detect a path walk in progress but with the addition of rcu-walk
it can be ineffective anyway.

Removing the test allows automounts to expire again since the
actual expire check doesn't rely on the dentry reference count.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-17 06:56:39 -07:00
Theodore Ts'o
89a4e48f84 ext4: fix kernel BUG on large-scale rm -rf commands
Commit 968dee7722: "ext4: fix hole punch failure when depth is greater
than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel
crashes when users ran run "rm -rf" on large directory hierarchy on
ext4 filesystems on RAID devices:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

    Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710)
    Call Trace:
     [<ffffffff81236483>] ? __ext4_handle_dirty_metadata+0x83/0x110
     [<ffffffff812353d3>] ext4_ext_truncate+0x193/0x1d0
     [<ffffffff8120a8cf>] ? ext4_mark_inode_dirty+0x7f/0x1f0
     [<ffffffff81207e05>] ext4_truncate+0xf5/0x100
     [<ffffffff8120cd51>] ext4_evict_inode+0x461/0x490
     [<ffffffff811a1312>] evict+0xa2/0x1a0
     [<ffffffff811a1513>] iput+0x103/0x1f0
     [<ffffffff81196d84>] do_unlinkat+0x154/0x1c0
     [<ffffffff8118cc3a>] ? sys_newfstatat+0x2a/0x40
     [<ffffffff81197b0b>] sys_unlinkat+0x1b/0x50
     [<ffffffff816135e9>] system_call_fastpath+0x16/0x1b
    Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 <48> 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00

    RIP  [<ffffffff81233164>] ext4_ext_remove_space+0xa34/0xdf0

This could be reproduced as follows:

The problem in commit 968dee7722 was that caused the variable 'i' to
be left uninitialized if the truncate required more space than was
available in the journal.  This resulted in the function
ext4_ext_truncate_extend_restart() returning -EAGAIN, which caused
ext4_ext_remove_space() to restart the truncate operation after
starting a new jbd2 handle.

Reported-by: Maciej Żenczykowski <maze@google.com>
Reported-by: Marti Raudsepp <marti@juffo.org>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-17 09:42:17 -04:00
Theodore Ts'o
0548bbb853 ext4: fix long mount times on very big file systems
Commit 8aeb00ff85a: "ext4: fix overhead calculation used by
ext4_statfs()" introduced a O(n**2) calculation which makes very large
file systems take forever to mount.  Fix this with an optimization for
non-bigalloc file systems.  (For bigalloc file systems the overhead
needs to be set in the the superblock.)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-17 09:23:00 -04:00
Theodore Ts'o
7a4c5de27e ext4: don't call ext4_error while block group is locked
While in ext4_validate_block_bitmap(), if an block allocation bitmap
is found to be invalid, we call ext4_error() while the block group is
still locked.  This causes ext4_commit_super() to call a function
which might sleep while in an atomic context.

There's no need to keep the block group locked at this point, so hoist
the ext4_error() call up to ext4_validate_block_bitmap() and release
the block group spinlock before calling ext4_error().

The reported stack trace can be found at:

	http://article.gmane.org/gmane.comp.file-systems.ext4/33731

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2012-08-17 09:06:06 -04:00
Bryan Schumaker
12dfd08055 NFS: return -ENOKEY when the upcall fails to map the name
This allows the normal error-paths to handle the error, rather than
making a special call to complete_request_key() just for this instance.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Cc: stable@vger.kernel.org [>= 3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 17:20:06 -04:00
Bryan Schumaker
c5066945b7 NFS: Clear key construction data if the idmap upcall fails
idmap_pipe_downcall already clears this field if the upcall succeeds,
but if it fails (rpc.idmapd isn't running) the field will still be set
on the next call triggering a BUG_ON().  This patch tries to handle all
possible ways that the upcall could fail and clear the idmap key data
for each one.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Cc: stable@vger.kernel.org [>= 3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 17:20:02 -04:00
Trond Myklebust
cff298c721 NFSv4: Don't use private xdr_stream fields in decode_getacl
Instead of using the private field xdr->p from struct xdr_stream,
use the public xdr_stream_pos().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:51 -04:00
Trond Myklebust
b291f1b1c8 NFSv4: Fix the acl cache size calculation
Currently, we do not take into account the size of the 16 byte
struct nfs4_cached_acl header, when deciding whether or not we should
cache the acl data.  Consequently, we will end up allocating an
8k buffer in order to fit a maximum size 4k acl.

This patch adjusts the calculation so that we limit the cache size
to 4k for the acl header+data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:50 -04:00
Trond Myklebust
519d3959e3 NFSv4: Fix pointer arithmetic in decode_getacl
Resetting the cursor xdr->p to a previous value is not a safe
practice: if the xdr_stream has crossed out of the initial iovec,
then a bunch of other fields would need to be reset too.

Fix this issue by using xdr_enter_page() so that the buffer gets
page aligned at the bitmap _before_ we decode it.

Also fix the confusion of the ACL length with the page buffer length
by not adding the base offset to the ACL length...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-08-16 16:15:50 -04:00
bjschuma@gmail.com
425e776d93 NFS: Alias the nfs module to nfs4
This allows distros to remove the line from their modprobe
configuration.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:49 -04:00
bjschuma@gmail.com
1ae811ee27 NFS: Fix a regression when loading the NFS v4 module
Some systems have a modprobe.d/nfs.conf file that sets an nfs4 alias
pointing to nfs.ko, rather than nfs4.ko.  This can prevent the v4 module
from loading on mount, since the kernel sees that something named "nfs4"
has already been loaded.  To work around this, I've renamed the modules
to "nfsv2.ko" "nfsv3.ko" and "nfsv4.ko".

I also had to move the nfs4_fs_type back to nfs.ko to ensure that `mount
-t nfs4` still works.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:49 -04:00
Ian Kent
a45440f05e autofs4 - fix get_next_positive_subdir()
Following a report of a crash during an automount expire I found that
the locking in fs/autofs4/expire.c:get_next_positive_subdir() was wrong.
Not only is the locking wrong but the function is more complex than it
needs to be.

The function is meant to calculate (and dget) the next entry in the list
of directories contained in the root of an autofs mount point (an autofs
indirect mount to be precise). The main problem was that the d_lock of
the owner of the list was not being taken when walking the list, which
lead to list corruption under load. The only other lock that needs to
be taken is against the next dentry candidate so it can be checked for
usability.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-16 11:58:28 -07:00
Linus Torvalds
2eac9eb8a2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: verify all ioctl retry iov elements
  fuse: add missing INIT flag descriptions
  fuse: add missing INIT flags
  fuse: update attributes on aio_read
  fuse: invalidate inode mapping if mtime changes
  fuse: add FUSE_AUTO_INVAL_DATA init flag
2012-08-16 11:46:31 -07:00
Sage Weil
62b2ce964b vfs: fix propagation of atomic_open create error on negative dentry
If ->atomic_open() returns -ENOENT, we take care to return the create
error (e.g., EACCES), if any.  Do the same when ->atomic_open() returns 1
and provides a negative dentry.

This fixes a regression where an unprivileged open O_CREAT fails with
ENOENT instead of EACCES, introduced with the new atomic_open code.  It
is tested by the open/08.t test in the pjd posix test suite, and was
observed on top of fuse (backed by ceph-fuse).

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-08-16 19:29:09 +02:00
Nikola Pajkovsky
68766a2edc udf: fix retun value on error path in udf_load_logicalvol
In case we detect a problem and bail out, we fail to set "ret" to a
nonzero value, and udf_load_logicalvol will mistakenly report success.

Signed-off-by: Nikola Pajkovsky <npajkovs@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 14:23:23 +02:00
Jan Kara
2e84f2641e jbd: don't write superblock when unmounting an ro filesystem
This sequence:

results in an IO error when unmounting the RO filesystem. The bug was
introduced by:

commit 9754e39c7b
Author: Jan Kara <jack@suse.cz>
Date:   Sat Apr 7 12:33:03 2012 +0200

    jbd: Split updating of journal superblock and marking journal empty

which lost some of the magic in journal_update_superblock() which
used to test for a journal with no outstanding transactions.

This is a port of a jbd2 fix by Eric Sandeen.

CC: <stable@vger.kernel.org> # 3.4.x
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 13:53:30 +02:00
Miklos Szeredi
af109bca94 fuse: check create mode in atomic open
Verify that the VFS is passing us a complete create mode with the S_IFREG to
atomic open.

Reported-by: Steve <steveamigauk@yahoo.co.uk>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15 13:01:24 +02:00
Miklos Szeredi
38227f78a5 vfs: pass right create mode to may_o_create()
Pass the umask-ed create mode to may_o_create() instead of the original one.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15 13:01:24 +02:00
Miklos Szeredi
62b259d8b3 vfs: atomic_open(): fix create mode usage
Don't mask S_ISREG off the create mode before passing to ->atomic_open().  Other
methods (->create, ->mknod) also get the complete file mode and filesystems
expect it.

Reported-by: Steve <steveamigauk@yahoo.co.uk>
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15 13:01:24 +02:00
Miklos Szeredi
e68726ff72 vfs: canonicalize create mode in build_open_flags()
Userspace can pass weird create mode in open(2) that we canonicalize to 
"(mode & S_IALLUGO) | S_IFREG" in vfs_create().

The problem is that we use the uncanonicalized mode before calling vfs_create()
with unforseen consequences.

So do the canonicalization early in build_open_flags().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
CC: stable@vger.kernel.org
2012-08-15 13:01:24 +02:00
Jeff Mahoney
48d1788493 reiserfs: fix deadlocks with quotas
The BKL push-down for reiserfs made lock recursion a special case that needs
to be handled explicitly. One of the cases that was unhandled is dropping
the quota during inode eviction. Both reiserfs_evict_inode and
reiserfs_write_dquot take the write lock, but when the journal lock is
taken it only drops one the references. The locking rules are that the journal
lock be acquired before the write lock so leaving the reference open leads
to a ABBA deadlock.

This patch pushes the unlock up before clear_inode and avoids the recursive
locking.

Another ABBA situation can occur when the write lock is dropped while reading
the bitmap buffer while in the quota code. When the lock is reacquired, it
will deadlock against dquot->dq_lock and dqopt->dqio_mutex in the dquot_acquire
path. It's safe to retain the lock across the read and should be cached under
write load.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 00:22:57 +02:00
Jeff Liu
6ea2eea1fa quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
sb->s_dqopt->dqptr_sem is used to serialize ops using pointers from inode to
dquots.  But for __dquot_alloc_space(), it could be safely moved down after the
default warn[] array got initialized.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15 00:22:57 +02:00