Commit Graph

3400 Commits

Author SHA1 Message Date
Badari Pulavarty
ade1a29e16 [PATCH] ext3: Add "-o bh" option
This patch adds a "-o bh" option to force use of buffer_heads.  This option
is needed once we make "nobh" the default, in case we run into problems.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
Alexey Dobriyan
9637f28f8b [PATCH] reiserfs: remove reiserfs_aio_write()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:19 -07:00
Michael LeMay
4eb582cf1f [PATCH] keys: add a way to store the appropriate context for newly-created keys
Add a /proc/<pid>/attr/keycreate entry that stores the appropriate context for
newly-created keys.  Modify the selinux_key_alloc hook to make use of the new
entry.  Update the flask headers to include a new "setkeycreate" permission
for processes.  Update the flask headers to include a new "create" permission
for keys.  Use the create permission to restrict which SIDs each task can
assign to newly-created keys.  Add a new parameter to the security hook
"security_key_alloc" to indicate whether it is being invoked by the kernel, or
from userspace.  If it is being invoked by the kernel, the security hook
should never fail.  Update the documentation to reflect these changes.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:18 -07:00
Akinobu Mita
f116629d03 [PATCH] fs: use list_move()
This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B) under fs/.
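
For illustration only, here is a minimal userspace sketch of the conversion.
The list type below is a cut-down stand-in modelled on the kernel's
list_head, not the kernel header itself, and list_init() and list_len() are
helper names assumed for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from whatever list it is currently on. */
static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;
}

/* One call that replaces the list_del(A) + list_add(A, B) pair. */
static void list_move(struct list_head *entry, struct list_head *head)
{
    assert(entry != head);
    list_del(entry);
    list_add(entry, head);
}

/* Count the entries on a list (helper assumed for this example). */
static size_t list_len(const struct list_head *head)
{
    size_t n = 0;

    for (const struct list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

A caller that used to write list_del(&x); list_add(&x, &b); would now write
list_move(&x, &b); with the same resulting list.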

Cc: Ian Kent <raven@themaw.net>
Acked-by: Joel Becker <joel.becker@oracle.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Hans Reiser <reiserfs-dev@namesys.com>
Cc: Urban Widmark <urban@teststation.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:18 -07:00
Akinobu Mita
1bfba4e8ea [PATCH] core: use list_move()
This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B).

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:17 -07:00
Akinobu Mita
8e13059a37 [PATCH] use list_add_tail() instead of list_add()
This patch converts list_add(A, B.prev) to list_add_tail(A, &B) for
readability.
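
As a rough sketch of why the two spellings are equivalent, using a cut-down
userspace list rather than the kernel header (list_init() and
list_insert_between() are names assumed for this example):

```c
#include <assert.h>

struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

/* Link entry between two neighbours that are currently adjacent. */
static void list_insert_between(struct list_head *entry,
                                struct list_head *prev,
                                struct list_head *next)
{
    assert(prev->next == next);
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
}

/* Insert right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    list_insert_between(entry, head, head->next);
}

/* Insert right before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    list_insert_between(entry, head->prev, head);
}
```

Adding after B.prev and adding before B both place the new entry between
B.prev and B, so list_add_tail() states the intent directly.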

Acked-by: Karsten Keil <kkeil@suse.de>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Acked-by: Jan Kara <jack@suse.cz>
AOLed-by: David Woodhouse <dwmw2@infradead.org>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:17 -07:00
Andreas Mohr
d6e05edc59 spelling fixes
acquired (aquired)
contiguous (contigious)
successful (succesful, succesfull)
surprise (suprise)
whether (weather)
some other misspellings

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:35:02 +02:00
Ingo Molnar
124a27fe32 [CIFS] Remove calls to take f_owner.lock
CIFS takes/releases f_owner.lock - why?  It does not change anything in the
fowner state.  Remove this locking.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-26 13:47:59 +00:00
David S. Miller
3d824a46b7 [OPENPROMFS]: Rewrite using in-kernel device tree and seq_file.
We lose property writing functionality for the time being, but
that will be easy to add back.  The code and framework are so
much simpler now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:19:14 -07:00
Steve French
cd49b492fe [CIFS] remove some redundant null pointer checks
some of them pointed out by Dave Jones

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-26 04:22:36 +00:00
Malcolm Parsons
fcc18e83e1 [PATCH] uclinux: use PER_LINUX_32BIT in binfmt_flat
binfmt_flat.c calls set_personality with PER_LINUX as the personality.
On the arm architecture this results in the program running in 26bit
usermode.  PER_LINUX_32BIT should be used instead.  This doesn't affect
other architectures that use binfmt_flat.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 21:04:24 -07:00
Alexey Dobriyan
1e788f8d1a [PATCH] xfs: update ->flush method proto
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 17:43:32 -07:00
Linus Torvalds
f36f44de72 Fix NFS2 compile error
Trond had apparently merged the same patch twice, causing a duplicate
include of the "internal.h" file, with resulting obvious confusion.

Tssk.  I'm the only one allowed to send out trees that don't even
compile! Who does this Trond guy think he is?

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 12:30:33 -07:00
Linus Torvalds
1d77062b14 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits)
  nfs: remove nfs_put_link()
  nfs-build-fix-99
  git-nfs-build-fixes
  Merge branch 'odirect'
  NFS: alloc nfs_read/write_data as direct I/O is scheduled
  NFS: Eliminate nfs_get_user_pages()
  NFS: refactor nfs_direct_free_user_pages
  NFS: remove user_addr, user_count, and pos from nfs_direct_req
  NFS: "open code" the NFS direct write rescheduler
  NFS: Separate functions for counting outstanding NFS direct I/Os
  NLM: Fix reclaim races
  NLM: sem to mutex conversion
  locks.c: add the fl_owner to nlm_compare_locks
  NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
  NFS: Split fs/nfs/inode.c
  NFS: Fix typo in nfs_do_clone_mount()
  NFS: Fix compile errors introduced by referrals patches
  NFSv4: Ensure that referral mounts bind to a reserved port
  NFSv4: A root pathname is sent as a zero component4
  NFSv4: Follow a referral
  ...
2006-06-25 10:54:14 -07:00
Linus Torvalds
25581ad107 Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (244 commits)
  V4L/DVB (4210b): git-dvb: tea575x-tuner build fix
  V4L/DVB (4210a): git-dvb versus matroxfb
  V4L/DVB (4209): Added some BTTV PCI IDs for newer boards
  Fixes some sync issues between V4L/DVB development and GIT
  V4L/DVB (4206): Cx88-blackbird: always set encoder height based on tvnorm->id
  V4L/DVB (4205): Merge tda9887 module into tuner.
  V4L/DVB (4203): Explicitly set the enum values.
  V4L/DVB (4202): allow selecting CX2341x port mode
  V4L/DVB (4200): Disable bitrate_mode when encoding mpeg-1.
  V4L/DVB (4199): Add cx2341x-specific control array to cx2341x.c
  V4L/DVB (4198): Avoid newer usages of obsoleted experimental MPEGCOMP API
  V4L/DVB (4197): Port new MPEG API to saa7134-empress with saa6752hs
  V4L/DVB (4196): Port cx88-blackbird to the new MPEG API.
  V4L/DVB (4193): Update cx2341x fw encoding API doc.
  V4L/DVB (4192): Use control helpers for saa7115, cx25840, msp3400.
  V4L/DVB (4191): Add CX2341X MPEG encoder module.
  V4L/DVB (4190): Add helper functions for control processing to v4l2-common.
  V4L/DVB (4189): Add videodev support for VIDIOC_S/G/TRY_EXT_CTRLS.
  V4L/DVB (4188): Add new MPEG control/ioctl definitions to videodev2.h
  V4L/DVB (4186): Add support for the DNTV Live! mini DVB-T card.
  ...
2006-06-25 10:09:31 -07:00
Hua Zhong
f58a1ebb22 [PATCH] remove unlikely(sb) in prune_dcache
likely profiling shows that the following is a miss.

After boot:
[+- ] Type | # True | # False | Function:Filename@Line
+unlikely |     1074|        0  prune_dcache()@:fs/dcache.c@409

After a bonnie++ run:
+unlikely |    66716|    19584  prune_dcache()@:fs/dcache.c@409

So remove it.
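
To make the numbers concrete, here is a small userspace sketch.  It assumes
that "# True" counts evaluations where the condition held (so the unlikely()
hint was wrong); the struct and function names are hypothetical, and
__builtin_expect is the GCC/Clang builtin.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's branch hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical counters in the spirit of the profiling table. */
struct branch_stats {
    unsigned long n_true;   /* condition held: the hint was wrong */
    unsigned long n_false;  /* condition did not hold: the hint was right */
};

/* Evaluate cond under the hint, record the outcome, return it unchanged. */
static int profiled_unlikely(struct branch_stats *s, int cond)
{
    if (unlikely(cond)) {
        s->n_true++;
        return 1;
    }
    s->n_false++;
    return 0;
}

/* Percentage of evaluations for which the hint mispredicted. */
static unsigned long miss_percent(const struct branch_stats *s)
{
    unsigned long total = s->n_true + s->n_false;

    return total ? (s->n_true * 100) / total : 0;
}
```

With the bonnie++ figures (66716 true, 19584 false) miss_percent() gives 77,
and with the after-boot figures (1074 true, 0 false) it gives 100.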

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:26 -07:00
Evgeniy Dushistov
7d93a1a53a [PATCH] ext2: cleanup: put_page and comment fix
Things which force me to think a little: why so?

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:25 -07:00
Ulrich Drepper
45c9b11a1d [PATCH] Implement AT_SYMLINK_FOLLOW flag for linkat
When the linkat() syscall was added, the flag parameter was added at the
last minute, but it has not been used so far.  The following patch should
change that.  My tests show that this is all that's needed.

If OLDNAME is a symlink, setting the flag causes linkat to follow the
symlink and create a hard link to the target.  This is actually the
behavior POSIX demands for link() as well but Linux wisely does not do
this.  With this flag (which will most likely be in the next POSIX
revision) the programmer can choose the behavior, defaulting to the safe
variant.  As a side effect it is now possible to implement a
POSIX-compliant link(2) function for those who are interested.

  touch file
  ln -s file symlink

  linkat(fd, "symlink", fd, "newlink", 0)
    -> newlink is hardlink of symlink

  linkat(fd, "symlink", fd, "newlink", AT_SYMLINK_FOLLOW)
    -> newlink is hardlink of file

The value of AT_SYMLINK_FOLLOW is determined by the definition we already
use in glibc.
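
The example above can be tried from userspace with a sketch along these
lines.  It assumes a Linux system whose filesystem allows hard links to
symlinks; link_is_symlink() and make_scratch_dir() are hypothetical helper
names, and the scratch directory under /tmp is left behind.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * In directory dirfd, hard-link oldname to newname with the given flags
 * (0 or AT_SYMLINK_FOLLOW) and report what newname turned out to be:
 * 1 if it is itself a symlink, 0 if it is not, -1 on error.
 */
static int link_is_symlink(int dirfd, const char *oldname,
                           const char *newname, int flags)
{
    struct stat st;

    if (linkat(dirfd, oldname, dirfd, newname, flags) != 0)
        return -1;
    if (fstatat(dirfd, newname, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Create a scratch directory holding "file" and "symlink" -> "file". */
static int make_scratch_dir(void)
{
    char path[] = "/tmp/linkat-demo-XXXXXX";
    int dirfd, fd;

    if (mkdtemp(path) == NULL)
        return -1;
    dirfd = open(path, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;
    fd = openat(dirfd, "file", O_CREAT | O_WRONLY, 0600);
    if (fd < 0) {
        close(dirfd);
        return -1;
    }
    close(fd);
    if (symlinkat("file", dirfd, "symlink") != 0) {
        close(dirfd);
        return -1;
    }
    return dirfd;
}
```

With flags 0 the new name is a second link to the symlink itself; with
AT_SYMLINK_FOLLOW it is a link to the regular file the symlink points at.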

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:22 -07:00
Frode Isaksen
04a3446c90 [PATCH] fs: sys_poll with timeout -1 bug fix
If you do a poll() call with timeout -1, the wait will be a big number
(depending on HZ) instead of infinite wait, since -1 is passed to the
msecs_to_jiffies function.
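
A simplified userspace model of the problem, not the kernel's actual code:
msecs_to_ticks() is a rough stand-in for msecs_to_jiffies(), WAIT_FOREVER is
a hypothetical marker for an infinite wait, and a 32-bit int is assumed.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical marker meaning "sleep until an event arrives". */
#define WAIT_FOREVER ULLONG_MAX

/* Simplified conversion: round msecs up to whole ticks at hz ticks/second. */
static unsigned long long msecs_to_ticks(unsigned int msecs, unsigned int hz)
{
    assert(hz > 0);
    return ((unsigned long long)msecs * hz + 999) / 1000;
}

/* Buggy: -1 is converted to unsigned and becomes a large but finite wait. */
static unsigned long long poll_ticks_buggy(int timeout_msecs, unsigned int hz)
{
    return msecs_to_ticks((unsigned int)timeout_msecs, hz);
}

/* Fixed: a negative timeout is recognised before any conversion happens. */
static unsigned long long poll_ticks_fixed(int timeout_msecs, unsigned int hz)
{
    if (timeout_msecs < 0)
        return WAIT_FOREVER;
    return msecs_to_ticks((unsigned int)timeout_msecs, hz);
}
```

The point is only that the negative case has to be handled before the
millisecond value reaches the conversion routine.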

Signed-off-by: Frode Isaksen <frode.isaksen@gmail.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:22 -07:00
Al Viro
2b943cf09d [PATCH] fix %s in affs_fill_super()
%s is only valid if array is known to contain NUL or precision is given and
does not exceed the size of array.
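
The same rule holds for userspace printf-style functions, which makes it
easy to sketch.  format_field() and the "name=" prefix are made up for this
example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a fixed-size field that may lack a terminating NUL.
 * The precision caps how many bytes "%s" may read from field.
 */
static int format_field(char *out, size_t out_size,
                        const char *field, size_t field_size)
{
    assert(field_size <= INT_MAX);
    return snprintf(out, out_size, "name=%.*s", (int)field_size, field);
}
```

Passing sizeof(array) as the precision keeps the read inside the array even
when every byte of it is in use.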

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:22 -07:00
Serge E. Hallyn
fa366ad5d7 [PATCH] kthread: convert smbiod
Update smbiod to use kthread instead of deprecated kernel_thread.

[akpm@osdl.org: cleanup]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:21 -07:00
Eric Sesterhenn
099a71d995 [PATCH] Remove needless checks in fs/9p/vfs_inode.c
Coverity found two needless checks in vfs_inode.c (cid #1165 and #1164).
In both cases inode is always NULL when we goto error; either because it
is still initialized to NULL or is set to NULL explicitly. This patch
simply removes these checks to save some code.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@lanl.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:20 -07:00
Miklos Szeredi
9c8ef5614d [PATCH] fuse: scramble lock owner ID
VFS uses current->files pointer as lock owner ID, and it wouldn't be
prudent to expose this value to userspace.  So scramble it with XTEA using
a per connection random key, known only to the kernel.  Only one direction
needs to be implemented, since the ID is never sent in the reverse
direction.

The XTEA algorithm is implemented inline since it's simple enough to do so,
and this adds less complexity than if the crypto API were used.
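
For readers unfamiliar with XTEA, a standalone sketch of the cipher follows.
It is not the code from this patch: the 32-round count, the way the 64-bit
ID is split into two words, and the function names are assumptions, and the
inverse is included only so the sketch can be checked.

```c
#include <assert.h>
#include <stdint.h>

#define XTEA_ROUNDS 32           /* assumed round count */
#define XTEA_DELTA  0x9E3779B9u

/* Encipher one 64-bit value with a 128-bit key. */
static uint64_t xtea_scramble(uint64_t id, const uint32_t key[4])
{
    uint32_t v0 = (uint32_t)id, v1 = (uint32_t)(id >> 32), sum = 0;

    for (int i = 0; i < XTEA_ROUNDS; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += XTEA_DELTA;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    return ((uint64_t)v1 << 32) | v0;
}

/* Inverse of xtea_scramble(), present only for verification. */
static uint64_t xtea_unscramble(uint64_t id, const uint32_t key[4])
{
    uint32_t v0 = (uint32_t)id, v1 = (uint32_t)(id >> 32);
    uint32_t sum = XTEA_DELTA * XTEA_ROUNDS;

    for (int i = 0; i < XTEA_ROUNDS; i++) {
        v1 -= (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
        sum -= XTEA_DELTA;
        v0 -= (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
    }
    return ((uint64_t)v1 << 32) | v0;
}
```

Because the same key always maps a given owner to the same scrambled value,
userspace can still compare IDs for equality without learning the pointer.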

Thanks to Jesper Juhl for the idea.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:20 -07:00
Miklos Szeredi
a4d27e75ff [PATCH] fuse: add request interruption
Add synchronous request interruption.  This is needed for file locking
operations which have to be interruptible.  However, a filesystem may also
implement interruptibility of other operations (e.g. like the NFS 'intr'
mount option).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi
f9a2842e56 [PATCH] fuse: rename the interrupted flag
Rename the 'interrupted' flag to 'aborted', since it indicates exactly that,
and next patch will introduce an 'interrupted' flag for a

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi
33649c91a3 [PATCH] fuse: ensure FLUSH reaches userspace
All POSIX locks owned by the current task are removed on close().  If the
FLUSH request initiated by close() fails to reach userspace, there might be
locks remaining, which cannot be removed.

The only reason it could fail, is if allocating the request fails.  In this
case use the request reserved for RELEASE, or if that is currently used by
another FLUSH, wait for it to become available.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi
7142125937 [PATCH] fuse: add POSIX file locking support
This patch adds POSIX file locking support to the fuse interface.

This implementation doesn't keep any locking state in kernel.  Unlocking on
close() is handled by the FLUSH message, which now contains the lock owner id.

Mandatory locking is not supported.  The filesystem may enforce mandatory
locking in userspace if needed.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi
bafa96541b [PATCH] fuse: add control filesystem
Add a control filesystem to fuse, replacing the attributes currently exported
through sysfs.  An empty directory '/sys/fs/fuse/connections' is still created
in sysfs, and mounting the control filesystem here provides backward
compatibility.

Advantages of the control filesystem over the previous solution:

  - allows the object directory and the attributes to be owned by the
    filesystem owner, hence letting unprivileged users abort the
    filesystem connection

  - does not suffer from module unload race

[akpm@osdl.org: fix this fs for recent dhowells depredations]
[akpm@osdl.org: fix 64-bit printk warnings]
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi
51eb01e735 [PATCH] fuse: no backgrounding on interrupt
Don't put requests into the background when a fatal interrupt occurs while the
request is in userspace.  This removes a major wart from the implementation.

Backgrounding of requests was introduced to allow breaking of deadlocks.
However now the same can be achieved by aborting the filesystem through the
'abort' sysfs attribute.

This is a change in the interface, but should not cause problems, since these
kinds of deadlocks never happen during normal operation.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Ian Kent
f9022f6633 [PATCH] autofs4: need to invalidate children on tree mount expire
I've found a case where invalid dentries in a mount tree, waiting to be
cleaned up by d_invalidate, prevent the expected expire.

In this case dentries created during a lookup for which a mount fails or has
no entry in the mount map contribute to the d_count of the parent dentry.
These dentries may not be invalidated prior to comparing the internal usage
count of valid autofs dentries against the dentry d_count, which makes a
mount tree appear busy so it doesn't expire.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:18 -07:00
Peter Staubach
6e656be899 [PATCH] ftruncate does not always update m/ctime
In the course of trying to track down a bug where a file mtime was not
being updated correctly, it was discovered that the m/ctime updates were
not quite being handled correctly for ftruncate() calls.

Quoth SUSv3:

open(2):

        If O_TRUNC is set and the file did previously exist, upon
        successful completion, open() shall mark for update the st_ctime
        and st_mtime fields of the file.

truncate(2):

        Upon successful completion, if the file size is changed, this
        function shall mark for update the st_ctime and st_mtime fields
        of the file, and the S_ISUID and S_ISGID bits of the file mode
        may be cleared.

ftruncate(2):

        Upon successful completion, if fildes refers to a regular file,
        the ftruncate() function shall mark for update the st_ctime and
        st_mtime fields of the file and the S_ISUID and S_ISGID bits of
        the file mode may be cleared. If the ftruncate() function is
        unsuccessful, the file is unaffected.

The open(O_TRUNC) and truncate cases were being handled correctly, but the
ftruncate case was being handled like the truncate case.  The semantics of
truncate and ftruncate don't quite match, so ftruncate needs to be handled
slightly differently.

The attached patch addresses this issue for ftruncate(2).

My thanx to Stephen Tweedie and Trond Myklebust for their help in
understanding the situation and semantics.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:15 -07:00
Johann Lombardi
92eeccd8ba [PATCH] ext3: cleanup dead code in ext3_add_entry()
The variables nlen and rlen are defined/initialized but not used in
ext3_add_entry().

Signed-off-by: Johann Lombardi <johann.lombardi@bull.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:15 -07:00
Florin Malita
0710d36a0f [PATCH] 9pfs: missing result check in v9fs_vfs_readlink() and v9fs_vfs_link()
__getname() may fail and return NULL (as pointed out by Coverity 437 &
1220).

Signed-off-by: Florin Malita <fmalita@gmail.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: <rminnich@lanl.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:15 -07:00
Davide Libenzi
3419b23a91 [PATCH] epoll: use unlocked wqueue operations
A few days ago Arjan signaled a lockdep red flag on epoll locks, and
precisely between the epoll's device structure lock (->lock) and the wait
queue head lock (->lock).

Like I explained in another email, and directly to Arjan, this can't happen
in reality because of the explicit check at eventpoll.c:592, that does not
allow to drop an epoll fd inside the same epoll fd.  Since lockdep is
working on per-structure locks, it will never be able to know of policies
enforced in other parts of the code.

It was decided some time ago to allow dropping epoll fds inside other epoll
fds, which triggers very tricky wakeup operations (due to possibly
reentrant callback-driven wakeups) handled by the ep_poll_safewake()
function.  While looking again at the code though, I noticed that all the
operations done on the epoll's main structure wait queue head (->wq) are
already protected by the epoll lock (->lock), so that locked-style
functions can be used to manipulate the ->wq member.  This both saves a
lock acquire and makes lockdep happy.

Running totalmess on my dual opteron for a while did not reveal any problem
so far:

http://www.xmailserver.org/totalmess.c

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:13 -07:00
Valerie Henson
21730eed11 [PATCH] Make EXT2_DEBUG work again
This patch makes EXT2_DEBUG work again.  Due to lack of proper include
file, EXT2_DEBUG was undefined in bitmap.c and ext2_count_free() is left
out.  Moved to balloc.c and removed bitmap.c entirely.

Second, the debug versions of ext2_count_free_{inodes/blocks} reacquire the
superblock lock.  Moved the lock into the callers.

Signed-off-by: Val Henson <val_henson@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:12 -07:00
H. Peter Anvin
69755652c9 [PATCH] Make procfs obligatory except under CONFIG_EMBEDDED
Make procfs non-optional unless EMBEDDED is set, just like sysfs.  procfs
is already de facto required for a large subset of Linux functionality.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:11 -07:00
Mingming Cao
43d23f9039 [PATCH] ext3_fsblk_t: the rest of in-kernel filesystem blocks conversion
Convert the ext3 in-kernel filesystem blocks to ext3_fsblk_t.  Convert the
rest of the unsigned long in-kernel filesystem block variables to
ext3_fsblk_t, and replace the printk format strings correspondingly.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:10 -07:00
Mingming Cao
1c2bf374a4 [PATCH] ext3_fsblk_t: filesystem, group blocks and bug fixes
Some of the in-kernel ext3 block variables are treated as signed 4-byte
ints, which limits an ext3 filesystem to 8TB (with a 4k block size).  While
trying to fix them, the ext3 code proved quite confusing: some blocks are
filesystem-wide blocks, while others are group-relative offsets that need
to be signed values (as -1 has a special meaning).  So it seems saner to
define two types of physical blocks: one for filesystem-wide blocks, another
for group-relative blocks.  The following patches clarify these two types of
blocks in the ext3 code, and fix the type bugs which limit the current 32 bit
ext3 filesystem to 8TB.

With this series of patches and the percpu counter data type changes in the mm
tree, we are able to extend the ext3 filesystem limit to 16TB.
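
The 8TB and 16TB figures follow from simple arithmetic, sketched here in
userspace.  max_fs_bytes() and TIB are names made up for this example; a
signed 32 bit block number leaves 31 usable bits, an unsigned one 32.

```c
#include <assert.h>
#include <stdint.h>

#define TIB ((uint64_t)1 << 40)

/*
 * Largest filesystem size in bytes when block numbers have block_bits
 * usable bits and each block is block_size bytes.  The caller must pick
 * values whose product fits in 64 bits.
 */
static uint64_t max_fs_bytes(unsigned int block_bits, uint32_t block_size)
{
    assert(block_bits < 64);
    return ((uint64_t)1 << block_bits) * block_size;
}
```

With 4k blocks, 31 bits give 2^31 * 4096 bytes = 8TB and 32 bits give
2^32 * 4096 bytes = 16TB.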

This work is also a prerequisite for the recent >32 bit ext3 work, and makes
it a lot easier for the kernel to address 48 bit ext3 blocks: simply redefine
ext3_fsblk_t from unsigned long to sector_t and redefine the format string for
ext3 filesystem blocks correspondingly.

Two RFCs, each with a series of patches, have been posted to the ext2-devel
list and have been reviewed and discussed:
http://marc.theaimsgroup.com/?l=ext2-devel&m=114722190816690&w=2

http://marc.theaimsgroup.com/?l=ext2-devel&m=114784919525942&w=2

Patches are tested on both 32 bit and 64 bit machines, with <8TB and >8TB
ext3 filesystems (using the latest, to-be-released e2fsprogs-1.39).  Tests
include overnight fsx, tiobench, dbench and fsstress.

This patch:

Defines ext3_fsblk_t and ext3_grpblk_t, and the printk format string for
filesystem wide blocks.

This patch classifies all block group relative blocks and ext3_fsblk_t
blocks that occur in the same function, which used to be confusing.  It also
includes kernel bug fixes for filesystem-wide in-kernel block variables.
Some filesystem-wide blocks are currently treated as int/unsigned int in the
kernel, especially in the ext3 block allocation and reservation code.  This
patch fixes those bugs by converting those variables to the ext3_fsblk_t
(unsigned long) type.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:10 -07:00
NeilBrown
01408c4939 [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes
The problem is that when we write to a file, the copy from userspace to
pagecache is first done with preemption disabled, so if the source address is
not immediately available the copy fails *and* *zeros* *the* *destination*.

This is a problem because a concurrent read (which admittedly is an odd thing
to do) might see zeros rather than what was there before the write, or what was
there after, or some mixture of the two (any of these being a reasonable thing
to see).

If the copy did fail, it will immediately be retried with preemption
re-enabled so any transient problem with accessing the source won't cause an
error.

The first copying does not need to zero any uncopied bytes, and doing so
causes the problem.  It uses copy_from_user_atomic rather than copy_from_user
so the simple expedient is to change copy_from_user_atomic to *not* zero out
bytes on failure.

The first of these two patches prepares for the change by fixing two places
which assume copy_from_user_atomic does zero the tail.  The two usages are
very similar pieces of code which copy from a userspace iovec into one or more
page-cache pages.  These are changed to remove the assumption.

The second patch changes __copy_from_user_inatomic* to not zero the tail.
Once these are accepted, I will look at similar patches of other architectures
where this is important (ppc, mips and sparc being the ones I can find).

This patch:

There is a problem with __copy_from_user_inatomic zeroing the tail of the
buffer in the case of an error.  As it is called in atomic context, the error
may be transient, so it results in zeros being written where maybe they
shouldn't be.

In the usage in filemap, this opens a window for a well timed read to see data
(zeros) which is not consistent with any ordering of reads and writes.

Most cases where __copy_from_user_inatomic is called, a failure results in
__copy_from_user being called immediately.  As long as the latter zeros the
tail, the former doesn't need to.  However in *copy_from_user_iovec
implementations (in both filemap and ntfs/file), it is assumed that
copy_from_user_inatomic will zero the tail.

This patch removes that assumption, so that after this patch it will
be safe for copy_from_user_inatomic to not zero the tail.
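
A userspace model may help picture the difference.  Nothing below is kernel
code: the model_* functions and copy_with_retry() are hypothetical, and a
"fault" is simulated by limiting how many source bytes are readable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of an atomic-context user copy: only the first `readable` bytes of
 * src can be read without faulting.  Returns the number of bytes NOT copied.
 * With zero_tail set, the uncopied part of dst is cleared, which is the
 * behaviour callers should stop relying on.
 */
static size_t model_copy_inatomic(char *dst, const char *src, size_t n,
                                  size_t readable, int zero_tail)
{
    size_t done = n < readable ? n : readable;

    memcpy(dst, src, done);
    if (zero_tail && done < n)
        memset(dst + done, 0, n - done);
    return n - done;
}

/* Model of the sleeping copy: it may fault pages in, so it always succeeds. */
static size_t model_copy(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Caller pattern: try the atomic copy, retry with the full copy if short. */
static size_t copy_with_retry(char *dst, const char *src, size_t n,
                              size_t readable)
{
    size_t left = model_copy_inatomic(dst, src, n, readable, 0);

    if (left)
        left = model_copy(dst, src, n);
    return left;
}
```

Without tail zeroing, a reader racing with the retry sees either old or new
bytes in the destination, never zeros that were not part of either.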

This patch also adds some commentary to filemap.h and asm-i386/uaccess.h.

After this patch, all architectures that might disable preempt when
kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed".
This includes
 - powerpc
 - i386
 - mips
 - sparc

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
Theodore Ts'o
d2e5b13c4a [PATCH] ext3: remove inconsistent space before exclamation point in mount code
This was reported as Debian bug #336604.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:07 -07:00
Theodore Ts'o
e8f1c6227a [PATCH] ext3: fix memory leak when the journal file is corrupted
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:07 -07:00
Theodore Ts'o
f16fdadba2 [PATCH] ext2: clean up dead code from mount code
The variable i is guaranteed to be the same as db_count given the previous
for loop.  So get rid of it since it's dead code.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:07 -07:00
Mingming Cao
fcd5df3588 [PATCH] Avoid disk sector_t overflow for >2TB ext3 filesystem
If an ext3 filesystem is larger than 2TB, and sector_t is a u32 (i.e.
CONFIG_LBD is not defined in the kernel), the calculation of the disk sector
will overflow.  Add checks in ext3_fill_super() and ext3_group_extend() to
prevent mounting, remounting or resizing a >2TB ext3 filesystem if sector_t
is 4 bytes in size.
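
The overflow condition can be sketched in userspace as follows.  This is an
illustration of the arithmetic, not the check added by the patch;
sectors_fit() is a hypothetical name and sectors are taken to be 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if every 512-byte sector of a filesystem with `blocks` blocks of
 * `block_size` bytes can be counted in a sector number of `sector_bytes`
 * bytes (4 without large block device support, 8 with it), else 0.
 */
static int sectors_fit(uint64_t blocks, uint32_t block_size,
                       unsigned int sector_bytes)
{
    uint64_t sectors;

    assert(block_size >= 512 && block_size % 512 == 0);
    sectors = blocks * (block_size / 512);
    if (sector_bytes >= 8)
        return 1;
    return sectors <= UINT32_MAX;
}
```

A 10TB filesystem with 4k blocks has 2684354560 blocks, i.e. 21474836480
sectors, which does not fit in 32 bits; a 1TB one has 2^31 sectors and does.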

Verified this patch on a 32 bit platform without CONFIG_LBD defined
(sector_t is 32 bits long): mount refuses to mount a 10TB ext3 filesystem.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:07 -07:00
Jan Engelhardt
a04ee14636 [PATCH] openpromfs: factorize out
"Move" "common code" out to PTR_NOD, which does the conversion from private
pointer to node number.  This is to reduce potential casting/conversion errors
due to redundancy.  (The naming PTR_NOD follows PTR_ERR, turning a pointer
into xyz.)

[akpm@osdl.org: cleanups]
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:05 -07:00
Jan Engelhardt
515decdccf [PATCH] openpromfs: remove unnecessary casts
Remove unnecessary casts in fs/openpromfs/inode.c

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:05 -07:00
Jan Engelhardt
0928d68056 [PATCH] openpromfs: fix missing NUL
tchars is not '\0'-terminated so the strtoul may run into problems.  Fix that.
Also make tchars as big as a long in hexadecimal form would take rather than
just 16.
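
A userspace sketch of the same idea, assuming the digits arrive in a buffer
with no terminator.  parse_hex_field() and HEX_LONG_BUF are names invented
for this example; only tchars is taken from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hex digits needed for any unsigned long, plus one byte for the NUL. */
#define HEX_LONG_BUF (sizeof(unsigned long) * 2 + 1)

/*
 * Parse up to len bytes of src (not necessarily NUL-terminated) as a
 * hexadecimal number.  The bytes are copied into a local buffer that is
 * explicitly terminated before strtoul() sees it.
 */
static unsigned long parse_hex_field(const char *src, size_t len)
{
    char tchars[HEX_LONG_BUF];

    if (len > sizeof(tchars) - 1)
        len = sizeof(tchars) - 1;
    memcpy(tchars, src, len);
    tchars[len] = '\0';
    return strtoul(tchars, NULL, 16);
}
```

Sizing the buffer from sizeof(unsigned long) keeps it correct on both 32 bit
and 64 bit builds.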

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:05 -07:00
Adrian Bunk
138bb68ac9 [PATCH] fs/ufs/inode.c: make 2 functions static
Make two needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
098d5af7be [PATCH] ufs: ubh_ll_rw_block cleanup
The ufs code has a function, ubh_ll_rw_block, with a parameter saying how many
ufs_buffer_heads it should handle, but it is always called with "1" for this
parameter.  This patch removes the unused parameter of "ubh_ll_rw_block".

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
ee3ffd6c12 [PATCH] ufs: make fsck -f happy
The ufs super block contains some statistics about the file system, like the
number of directories, free blocks, inodes and so on.

UFS1 holds this information in one location and uses 32bit integers for it;
UFS2 holds the statistics in another location and uses 64bit integers.

There is a transitional variant: if UFS1 has type 44BSD and the flags field in
the super block has a special value, the statistics are handled the way UFS2
does.  This also means that nobody cares about the old (UFS1) statistics.

So if fsck is run against such a file system after the linux ufs driver has
used it, it finds errors: at present only the UFS1-style statistics are updated.

This patch should fix this.  It also contains some minor cleanup: CodingStyle
fixes and removal of unused variables.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
577a82752f [PATCH] ufs: fsync implementation
Presently ufs doesn't support "fsync", which makes some applications unhappy,
for example vim.  This patch fixes this.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
647b7e87b5 [PATCH] ufs: one way to access super block
The UFS super block is usually larger than 512 bytes, and since the fragment
size may be 512, this causes some problems.

Currently, there are two methods of working with the ufs super block:

1) split the structure which describes the ufs super block into structures
   of size <=512

2) use one structure which describes the ufs super block, and hope that the
   array of "buffer_head" which holds the super block has this layout:

	bh[n]->b_data + bh[n]->b_size == bh[n + 1]->b_data

The second variant may cause problems in the future, and using two
variants causes unnecessary code duplication.

This patch removes the second variant.  It also contains some CodingStyle
fixes.
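
The layout assumption behind the second method can be written out as a check.
This is only a userspace sketch: struct toy_bh and toy_bhs_contiguous are
hypothetical stand-ins, not kernel types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two buffer_head fields that matter here. */
struct toy_bh {
	char *b_data;
	size_t b_size;
};

/*
 * Return 1 if every buffer ends exactly where the next one begins, i.e.
 * bh[n].b_data + bh[n].b_size == bh[n + 1].b_data for all n; 0 otherwise.
 */
static int toy_bhs_contiguous(const struct toy_bh *bh, size_t n)
{
	size_t i;

	assert(bh != NULL);
	for (i = 0; i + 1 < n; i++)
		if (bh[i].b_data + bh[i].b_size != bh[i + 1].b_data)
			return 0;
	return 1;
}
```

Nothing guarantees that such a check passes for separately obtained buffers,
which is why a single structure overlaid on several of them is fragile.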

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
f391475812 [PATCH] ufs: missed brelse and wrong baseblk
This patch fixes two bugs which were introduced by previous patches:

1) Missed "brelse"

2) Sometimes "baseblk" may be wrongly calculated if i_size is equal to
   zero, which leads to an infinite loop in "mpage_writepages".

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Andrew Morton
96710b29e0 [PATCH] ufs: printk warning fixes
fs/ufs/super.c: In function `ufs_print_super_stuff':
fs/ufs/super.c:103: warning: unsigned int format, different type arg (arg 2)
fs/ufs/super.c: In function `ufs2_print_super_stuff':
fs/ufs/super.c:147: warning: unsigned int format, different type arg (arg 2)
fs/ufs/super.c: In function `ufs_print_cylinder_stuff':
fs/ufs/super.c:175: warning: unsigned int format, different type arg (arg 2)

Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
022a6dc5f4 [PATCH] ufs: zero metadata
Presently, if we allocate several "metadata" blocks (pointers to indirect
blocks, for example), we fill only the first block with zeroes.  This causes
some problems in the "truncate" function.  This patch also removes some unused
arguments from several functions and adds comments.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
2e006393ba [PATCH] ufs: unlock_super without lock
The ufs_free_blocks function currently looks like this:

	if (err)
		goto failed;
	lock_super();
failed:
	unlock_super();

So if an error happens, we unlock a super block that was never locked.
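
One possible shape for a corrected control flow, as a userspace sketch. It is
not necessarily how this patch restructures ufs_free_blocks; toy_lock,
toy_unlock and toy_free_blocks are hypothetical names, and the lock is only a
counter.

```c
#include <assert.h>

/* Toy stand-in for the super block lock: counts outstanding holds. */
static int lock_depth;

static void toy_lock(void)
{
	lock_depth++;
}

static void toy_unlock(void)
{
	assert(lock_depth > 0);	/* catches "unlock without lock" */
	lock_depth--;
}

/* Hypothetical free routine: returns 0 on success, -1 on an early error. */
static int toy_free_blocks(int err)
{
	if (err)
		goto failed;	/* nothing has been locked yet */

	toy_lock();
	/* ... work done under the lock ... */
	toy_unlock();
	return 0;

failed:
	/* The error label sits after the unlock, so it never unlocks. */
	return -1;
}
```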

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
50aa4eb0b9 [PATCH] ufs: i_blocks wrong count
At present the UFS code uses the DQUOT_* mechanism, but it also updates
inode->i_blocks manually, which results in a wrong i_blocks value.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
dd187a2603 [PATCH] ufs: little directory lookup optimization
This patch makes a small optimization to ufs_find_entry, like "ext2" does:
save the page number and reuse it in the next call.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
abf5d15fd2 [PATCH] ufs: easy debug
Currently, to turn on debug mode the user has to edit ~10 files, and to turn
it off he has to do it again.

This patch introduces these changes:
1) turn debug messages on (off) via ".config"
2) remove unnecessary duplication of code
3) make the "UFSD" macro more similar to a function
4) fix some compiler warnings

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
5afb3145c9 [PATCH] ufs: Unmark CONFIG_UFS_FS_WRITE as BROKEN
To find new bugs, I suggest reverting this patch:
http://lkml.org/lkml/2006/1/31/275 in the -mm tree.

So others can test the "write support" of UFS.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
3e41f597b1 [PATCH] ufs: not usual amounts of fragments per block
Writing to a UFS file system with block/fragment!=8 may cause bogus
behaviour.  The problem is in the "ufs_bitmap_search" function, which doesn't
work correctly in the "block/fragment!=8" case.  The idea is stolen from BSD
code.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:02 -07:00
Evgeniy Dushistov
9695ef16ed [PATCH] ufs: wrong type cast
There are two ugly macros in the ufs code:

	#define UCPI_UBH ((struct ufs_buffer_head *)ucpi)
	#define USPI_UBH ((struct ufs_buffer_head *)uspi)

uspi looks like

	struct {
		struct ufs_buffer_head ;
	}

so USPI_UBH makes some sense, but ucpi looks like

	struct {
		struct not_ufs_buffer_head;
	}

To prevent bugs in the future, this patch converts the macros to inline
functions and fixes the "ucpi" structure.
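
A small userspace sketch of why an accessor function is safer than a cast
macro. All toy_* types and functions are hypothetical and do not reproduce the
real ufs structures.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buffer_head {
	int count;
};

/* The buffer head is the first member, so a pointer cast happens to work. */
struct toy_sb_info {
	struct toy_buffer_head ubh;
	int other;
};

/* Here it is not the first member, so a cast would point at `cgno`. */
struct toy_cg_info {
	int cgno;
	struct toy_buffer_head ubh;
};

/* Accessors name the member; the compiler checks the argument type. */
static inline struct toy_buffer_head *toy_sb_ubh(struct toy_sb_info *spi)
{
	assert(spi != NULL);
	return &spi->ubh;
}

static inline struct toy_buffer_head *toy_cg_ubh(struct toy_cg_info *cpi)
{
	assert(cpi != NULL);
	return &cpi->ubh;
}
```

Passing the wrong structure to either accessor draws a compiler diagnostic,
whereas the cast macro accepts any pointer silently.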

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:02 -07:00
Evgeniy Dushistov
b71034e5e6 [PATCH] ufs: directory and page cache: from blocks to pages
Change the functions in fs/ufs/dir.c and fs/ufs/namei.c to work with pages
instead of working directly with blocks.  This fixes these bugs:

* for i in `seq 1 1000`; do touch $i; done - crashes the system
* mkdir creates a directory without "." and ".." entries

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:02 -07:00
Evgeniy Dushistov
826843a347 [PATCH] ufs: directory and page cache: install aops
This series of patches finishes the "bug fixing" mentioned
here: http://lkml.org/lkml/2006/1/31/275 .

The main bugs:
* for i in `seq 1 1000`; do touch $i; done - crashes the system
* mkdir creates a directory without "." and ".." entries

The suggested solution is to work with the page cache instead of working
directly with blocks.  This solution has the following advantages:

* reduces code size and complexity
* some global locks go away
* fixes bugs

Most of the code is stolen from ext2, because it has a similar
directory structure.

Patches tested with UFS1 and UFS2 file systems.

This patch installs i_mapping->a_ops for directory inodes and removes some
duplicated code.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:02 -07:00
Evgeniy Dushistov
6ef4d6bf86 [PATCH] ufs: change block number on the fly
First of all, some necessary notes about UFS itself: to avoid wasting disk
space, the tail of a file consists not of blocks (which are ordinarily quite
big, usually 16K) but of fragments (which are ordinarily 2K).  As a file
grows, its tail occupies 1 fragment, 2 fragments...  At some stage the decision
to allocate a whole block is made and all fragments are moved to one block.
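
The block/fragment arithmetic can be illustrated with a simplified sketch.
It ignores the real allocator's policy for when a tail is promoted to a whole
block; struct toy_layout and toy_layout_for are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct toy_layout {
	uint64_t full_blocks;	/* whole blocks before the tail */
	uint32_t tail_frags;	/* fragments needed for the remainder */
};

/* Split a file size into whole blocks plus a tail counted in fragments. */
static struct toy_layout toy_layout_for(uint64_t size, uint32_t bsize,
					uint32_t fsize)
{
	struct toy_layout l;
	uint64_t tail;

	assert(fsize > 0 && bsize >= fsize && bsize % fsize == 0);
	l.full_blocks = size / bsize;
	tail = size % bsize;
	/* Round the remainder up to a whole number of fragments. */
	l.tail_frags = (uint32_t)((tail + fsize - 1) / fsize);
	return l;
}
```

With 16K blocks and 2K fragments a block holds 8 fragments, so a tail that
needs all 8 occupies as much space as a whole block.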

How this situation was handled before:

  ufs_prepare_write
  ->block_prepare_write
    ->ufs_getfrag_block
      ->...
        ->ufs_new_fragments:

	bh = sb_bread
	bh->b_blocknr = result + i;
	mark_buffer_dirty (bh);

This is the wrong solution, because:

- it didn't take into consideration that there is another cache: the "inode
  page cache"

- because sb_getblk does not use b_blocknr (it uses page->index) to find a
  certain block, this breaks sb_getblk.

How this situation is handled now: we go through all of the "inode page
cache"; if a page is not in the cache we load it into the cache, and change
b_blocknr.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:01 -07:00
Evgeniy Dushistov
c9a27b5dca [PATCH] ufs: right block allocation
* After block allocation, we map it on the same "address" as 8 other
  blocks

* We nullify block several times: once in ufs/block.c and once in
  block_*write_full_page, and use different "caches" for this.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:01 -07:00
Evgeniy Dushistov
2061df0f89 [PATCH] ufs: ufs_trunc_indirect: infinite cycle
Currently, ufs write support has two sets of problems: working with files and
working with directories.

This series of patches should solve the first problem.

This patch is similar to http://lkml.org/lkml/2006/1/17/61 and
complements it.

The situation is the same: in ufs_trunc_(not direct), we read a block and check
whether the count of links to it is equal to one; if so we finish the loop, if
not we continue.  Because the "count of links" is always >=2, this operation
causes an infinite loop and hangs the kernel.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:01 -07:00
Cliff Wickman
552c03483e [PATCH] fs/freevxfs: cleanup of spelling errors
Fix of some spelling errors in fs/freevxfs error messages and comments

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:01 -07:00
Steve French
f90f00a358 [CIFS] Fix compile warning when CONFIG_CIFS_EXPERIMENTAL is off
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-25 15:59:32 +00:00
Steve French
bbe5d235ee Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
2006-06-25 15:57:32 +00:00
Alexey Dobriyan
9bf2aa129a nfs: remove nfs_put_link()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-25 06:39:35 -04:00
Andrew Morton
6ab86aa130 nfs-build-fix-99
fs/built-in.o:(__param+0x20): undefined reference to `nfs_idmap_cache_timeout'
fs/built-in.o:(__param+0x48): undefined reference to `nfs_callback_set_tcpport'

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Cc: Andy Adamson <andros@citi.umich.edu>
Cc: Chuck Lever <cel@netapp.com>
Cc: David Howells <dhowells@redhat.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Manoj Naik <manoj@almaden.ibm.com>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-25 06:38:47 -04:00
Andrew Morton
d75d54147d git-nfs-build-fixes
Fix various problems with nfs4 disabled.  And various other things.

In file included from fs/nfs/inode.c:50:
fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
fs/nfs/internal.h: In function 'nfs4_path':
fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
fs/nfs/inode.c: In function 'init_once':
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'open_states'
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation'
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation_state'
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'rwsem'
distcc[26452] ERROR: compile fs/nfs/inode.c on g5/64 failed
make[1]: *** [fs/nfs/inode.o] Error 1
make: *** [fs/nfs/inode.o] Error 2
make: *** Waiting for unfinished jobs....
In file included from fs/nfs/nfs3xdr.c:26:
fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
fs/nfs/internal.h: In function 'nfs4_path':
fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
distcc[26486] ERROR: compile fs/nfs/nfs3xdr.c on g5/64 failed
make[1]: *** [fs/nfs/nfs3xdr.o] Error 1
make: *** [fs/nfs/nfs3xdr.o] Error 2
In file included from fs/nfs/nfs3proc.c:24:
fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
fs/nfs/internal.h: In function 'nfs4_path':
fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
distcc[26469] ERROR: compile fs/nfs/nfs3proc.c on bix/32 failed
make[1]: *** [fs/nfs/nfs3proc.o] Error 1
make: *** [fs/nfs/nfs3proc.o] Error 2
**FAILED**

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Cc: Andy Adamson <andros@citi.umich.edu>
Cc: Chuck Lever <cel@netapp.com>
Cc: David Howells <dhowells@redhat.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Manoj Naik <manoj@almaden.ibm.com>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-25 06:38:11 -04:00
Trond Myklebust
ccf01ef7aa Merge branch 'odirect'
2006-06-25 06:27:31 -04:00
Andrew Morton
1b77c54ee1 V4L/DVB (3809a): Remove compat stuff for DMX_GET_EVENT
The ioctl was removed by:
V4L/DVB (3727): Remove DMX_GET_EVENT and associated data structures
because the ioctl DMX_GET_EVENT has never been implemented, and also because
scrambling events can't be generated in a useful way by the hardware.

This patch removes the corresponding entry in fs/compat_ioctl.c

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 01:58:10 -03:00
Chuck Lever
82b145c5a5 NFS: alloc nfs_read/write_data as direct I/O is scheduled
Re-arrange the logic in the NFS direct I/O path so that nfs_read/write_data
structs are allocated just before they are scheduled, rather than
allocating them all at once before we start scheduling requests.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:39 -04:00
Chuck Lever
06cf6f2ed0 NFS: Eliminate nfs_get_user_pages()
Neil Brown observed that the kmalloc() in nfs_get_user_pages() is more
likely to fail if the I/O is large enough to require the allocation of more
than a single page to keep track of all the pinned pages in the user's
buffer.

Instead of tracking one large page array per dreq/iocb, track pages per
nfs_read/write_data, just like the cached I/O path does.  An array for
pages is already allocated for us by nfs_readdata_alloc() (and the write
and commit equivalents).

This is also required for adding support for vectored I/O to the NFS direct
I/O path.

The original reason to pin the user buffer and allocate all the NFS data
structures before trying to schedule I/O was to ensure all needed resources
are allocated on the client before starting to send requests.  This reduces
the chance that resource exhaustion on the client will cause a short read
or write.

On the other hand, for an application making very large application I/O
requests, this means that it will be nearly impossible for the application
to make forward progress on a resource-limited client.

Thus, moving the buffer pinning functionality into the I/O scheduling
loops should be good for scalability.  The next patch will do the same for
NFS data structure allocation.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:39 -04:00
Chuck Lever
9c93ab7dff NFS: refactor nfs_direct_free_user_pages
Clean-up and fix a minor bug: the logic was dirtying page cache pages on
both read and write operations.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:39 -04:00
Chuck Lever
51a7bc6cae NFS: remove user_addr, user_count, and pos from nfs_direct_req
Make the user_addr, user_count, and pos parameters explicit to the
scheduler routines, and remove the fields from nfs_direct_req.  The
iovec API will be passing in a series of these, not just one set.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:39 -04:00
Chuck Lever
fedb595c66 NFS: "open code" the NFS direct write rescheduler
An NFSv3/v4 client must reschedule on-the-wire writes if the writes are
UNSTABLE, and the server reboots before the client can complete a
subsequent COMMIT request.

To support direct asynchronous scatter-gather writes, the write
rescheduler in fs/nfs/direct.c must not depend on the I/O parameters
in the controlling nfs_direct_req structure.  iovecs can be somewhat
arbitrarily complex, so there could be an unbounded amount of information
to save for a rarely encountered requirement.

Refactor the direct write rescheduler so it uses information from each
nfs_write_data structure to reschedule writes, instead of caching that
information in the controlling nfs_direct_req structure.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:38 -04:00
Chuck Lever
b1c5921c5b NFS: Separate functions for counting outstanding NFS direct I/Os
Factor out the logic that increments and decrements the outstanding I/O
count.  This will be a commonly used bit of code in upcoming patches.
Also make this an atomic_t again, since it will be very often manipulated
outside dreq->spin lock.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:38 -04:00
Trond Myklebust
816724e65c Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	fs/nfs/inode.c
	fs/super.c

Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
'VFS: Permit filesystem to override root dentry on mount'
2006-06-24 13:07:53 -04:00
Jens Axboe
9e94cd4fd1 [PATCH] splice: retrieve mapping after locking the page
Otherwise we could be racing with truncate/mapping removal.

Problem found/fixed by Nick Piggin <npiggin@suse.de>, logic rewritten
by me.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:39 +02:00
Jens Axboe
b31dc66a54 [PATCH] Kill PF_SYNCWRITE flag
A process flag to indicate whether we are doing sync io is incredibly
ugly. It also causes performance problems when one does a lot of async
io and then proceeds to sync it. Part of the io will go out as async,
and the other part as sync. This causes a disconnect between the
previously submitted io and the synced io. For io schedulers such as CFQ,
this will cause us lost merges and suboptimal behaviour in scheduling.

Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
the O_DIRECT path just directly indicate that the writes are sync
by using WRITE_SYNC instead.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:39 +02:00
Mike Miller
98bd34eaf1 [PATCH] make kernel warn about incorrectly sized partitions
Sometimes partitions claim to be larger than the reported capacity of a
disk device.  This patch makes the kernel warn about those partitions.
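
The check can be sketched in userspace as follows. The function name, its
parameters and the message text are hypothetical and do not reproduce the
kernel's actual code or log output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Warn when a partition claims sectors beyond the reported capacity.
 * Returns 1 if a warning was printed, 0 otherwise.  The partition is
 * still considered usable either way.
 */
static int toy_check_partition(int partno, uint64_t start, uint64_t nr_sects,
			       uint64_t capacity)
{
	/* Compare without computing start + nr_sects, which could overflow. */
	if (start > capacity || nr_sects > capacity - start) {
		fprintf(stderr, "partition %d exceeds device capacity\n",
			partno);
		return 1;
	}
	return 0;
}
```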

We still permit these partitions to be used.  Quoting Andries Brouwer
<Andries.Brouwer@cwi.nl>:

 Case 1: The kernel is mistaken about the size of the disk.  (There are
 commands to clip a disk to a certain capacity, there are jumpers to tell a
 disk that it should report a certain capacity etc.  Usually this is because
 of BIOS bugs.  In bad cases the machine will crash in the BIOS and hence fail
 to boot if the disk reports full capacity.) In such cases actually accessing
 the blocks of the partition may work fine, or may work fine after running an
 unclip utility.  I wrote "setmax" some years ago precisely for this reason.

 Case 2: There was a messy partition table (maybe just a rounding error) but
 the actual filesystem on the partition is contained in the physical disk.
 Now using the filesystem goes without problem.

 Case 3: Both partition and filesystem extend beyond the end of the disk.  In
 forensic or debugging situations one often uses a copy of the start of a
 disk.  Now access beyond the end gives an expected I/O error.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Stephen Cameron <steve.cameron@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:09 -07:00
Jan Kara
78ce89c92b [PATCH] JBD: split checkpoint lists
Split the checkpoint list of the transaction into two lists.  In the first
list we keep the buffers that need to be submitted for IO.  In the second
list are kept buffers that were already submitted and we just have to wait
for the IO to complete.  This should simplify the handling of checkpoint
lists a bit and can eventually also be a performance gain.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:08 -07:00
Oleg Nesterov
626ab0e69d [PATCH] list: use list_replace_init() instead of list_splice_init()
list_splice_init(list, head) does unneeded work if it is known that
list_empty(head) == 1.  We can use list_replace_init() instead.
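
A userspace sketch of the replace-and-reinitialise operation on a circular
doubly linked list. The toy_* names are hypothetical; this is a simplified
model, not the kernel's list.h implementation.

```c
#include <assert.h>
#include <stddef.h>

struct toy_list {
	struct toy_list *next, *prev;
};

static void toy_init(struct toy_list *h)
{
	h->next = h;
	h->prev = h;
}

static int toy_empty(const struct toy_list *h)
{
	return h->next == h;
}

static void toy_add_tail(struct toy_list *n, struct toy_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/*
 * Hand every entry of `old` over to `new_head` and leave `old` empty.
 * The previous contents of `new_head` are ignored, so this is only
 * correct when `new_head` is known to be empty (or uninitialised).
 */
static void toy_replace_init(struct toy_list *old, struct toy_list *new_head)
{
	assert(old != NULL && new_head != NULL && old != new_head);
	new_head->next = old->next;
	new_head->next->prev = new_head;
	new_head->prev = old->prev;
	new_head->prev->next = new_head;
	toy_init(old);
}
```

Because the destination is known to be empty, there are no existing
neighbours to stitch in, which is the work a splice would otherwise do.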

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:07 -07:00
Mingming Cao
0216bfcffe [PATCH] percpu counter data type changes to support more than 2**31 ext3 free blocks counter
The percpu counter data types are changed in this set of patches to support
more users like ext3 that need more than 32 bits to store the free blocks
total in the filesystem.

- Generic percpu counter data type changes.  The sizes of the global counter
  and local counter are explicitly specified using s64 and s32.  The global
  counter is changed from long to s64, while the local counter is changed from
  long to s32, so we can avoid doing 64-bit updates in most cases.

- Users of the percpu counters are updated to make use of the new
  percpu_counter_init() routine now taking an additional parameter to allow
  users to pass the initial value of the global counter.
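
The split between a wide global counter and narrow local counters can be
sketched as below. This is a single-threaded userspace model with no locking;
the toy_* names, TOY_NR_CPUS and TOY_BATCH are hypothetical and are not the
kernel's percpu_counter API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 4
#define TOY_BATCH   32

struct toy_percpu_counter {
	int64_t count;			/* global value, 64 bits wide */
	int32_t local[TOY_NR_CPUS];	/* small per-CPU deltas */
};

/* The initial value of the global counter is passed in by the caller. */
static void toy_counter_init(struct toy_percpu_counter *c, int64_t initial)
{
	int i;

	c->count = initial;
	for (i = 0; i < TOY_NR_CPUS; i++)
		c->local[i] = 0;
}

static void toy_counter_add(struct toy_percpu_counter *c, int cpu,
			    int32_t amount)
{
	int64_t delta;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	delta = (int64_t)c->local[cpu] + amount;
	if (delta >= TOY_BATCH || delta <= -TOY_BATCH) {
		c->count += delta;	/* rare: fold into the 64-bit global */
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = (int32_t)delta;	/* common: 32-bit update only */
	}
}

/* Exact value: the global count plus every pending local delta. */
static int64_t toy_counter_sum(const struct toy_percpu_counter *c)
{
	int64_t sum = c->count;
	int i;

	for (i = 0; i < TOY_NR_CPUS; i++)
		sum += c->local[i];
	return sum;
}
```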

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:06 -07:00
Jesper Juhl
785d55708c [PATCH] binfmt_elf: remove more casts
Remove redundant casts from NEW_AUX_ENT() arguments in fs/binfmt_elf.c

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:05 -07:00
Jesper Juhl
f4e5cc2c44 [PATCH] binfmt_elf: CodingStyle cleanup and remove some pointless casts
Do a CodingStyle cleanup of fs/binfmt_elf.c and also remove some pointless
casts of kmalloc() return values in the same file.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:05 -07:00
Andrew Morton
e6022603b9 [PATCH] ext3_clear_inode(): avoid kfree(NULL)
Steven Rostedt <rostedt@goodmis.org> points out that `rsv' here is usually
NULL, so we should avoid calling kfree().

Also, fix up some nearby whitespace damage.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:05 -07:00
Andrew Morton
304c4c841a [PATCH] jbd: avoid kfree(NULL)
There are a couple of places where JBD has to check to see whether an unneeded
memory allocation was performed.  Usually it _was_ needed, so we end up
calling kfree(NULL).  We can micro-optimise that by checking the pointer
before calling kfree().

Thanks to Steven Rostedt <rostedt@goodmis.org> for identifying this.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:05 -07:00
Jan Kara
9ada734098 [PATCH] jbd: fix BUG in journal_commit_transaction()
Fix possible assertion failure in journal_commit_transaction() on
jh->b_next_transaction == NULL (when we are processing BJ_Forget list and
buffer is not jbddirty).

!jbddirty buffers can be placed on BJ_Forget list for example by
journal_forget() or by __dispose_buffer() - generally such buffer means
that it has been freed by this transaction.

Freed buffers should not be reallocated until the transaction has committed
(that's why we have the assertion there) but they *can* be reallocated when
the transaction has already been committed to disk and we are just
processing the BJ_Forget list (as soon as we remove b_committed_data from
the bitmap bh, ext3 will be able to reallocate buffers freed by the
committing transaction).  So we also have to account for the case that the
buffer has been reallocated and b_next_transaction has been already set.

And one more subtle point: it can happen that we manage to reallocate the
buffer and also mark it jbddirty.  Then we also add the freed buffer to the
checkpoint list of the committing transaction.  But that should do no harm.

Non-jbddirty buffers should be filed to BJ_Reserved and not BJ_Metadata
list.  It can actually happen that we refile such buffers during the commit
phase when we reallocate in the running transaction blocks deleted in
committing transaction (and that can happen if the committing transaction
already wrote all the data and is just cleaning up BJ_Forget list).

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:04 -07:00
Vadim Lobanov
4a4b69f79b [PATCH] Poll cleanups/microoptimizations
The "count" and "pt" variables are declared and modified by do_poll(), as
well as accessed and written indirectly in the do_pollfd() subroutine.

This patch pulls all handling of these variables into the do_poll()
function, thereby eliminating the odd use of indirection in do_pollfd().
This is done by pulling the "struct pollfd" traversal loop from do_pollfd()
into its only caller do_poll().  As an added bonus, the patch saves a few
clock cycles, and also adds comments to make the code easier to follow.

Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:03 -07:00
Adrian Bunk
2da1326463 [PATCH] fs/fat/misc.c: unexport fat_sync_bhs
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:03 -07:00
Adrian Bunk
b0904e147f [PATCH] fs/locks.c: make posix_locks_deadlock() static
We can now make posix_locks_deadlock() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:03 -07:00
Miklos Szeredi
75e1fcc0b1 [PATCH] vfs: add lock owner argument to flush operation
Pass the POSIX lock owner ID to the flush operation.

This is useful for filesystems which don't want to store any locking state
in inode->i_flock but want to handle locking/unlocking POSIX locks
internally.  FUSE is one such filesystem but I think it is possible that some
network filesystems would need this also.

Also add a flag to indicate that a POSIX locking request was generated by
close(), so filesystems using the above feature won't send an extra locking
request in this case.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Miklos Szeredi
ff7b86b820 [PATCH] locks: clean up locks_remove_posix()
locks_remove_posix() can use posix_lock_file() instead of doing the lock
removal by hand.  posix_lock_file() now does exactly the same.

The comment about pids no longer applies, posix_lock_file() takes only the
owner into account.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Miklos Szeredi
39005d022a [PATCH] locks: don't do unnecessary allocations
posix_lock_file() always allocates new locks in advance, even if it's easy to
determine that no allocations will be needed.

Optimize these cases:

 - FL_ACCESS flag is set

 - Unlocking the whole range

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Miklos Szeredi
0d9a490abe [PATCH] locks: don't unnecessarily fail posix lock operations
posix_lock_file() was too cautious, failing operations on OOM, even if they
didn't actually require an allocation.

This has the disadvantage that a failing unlock on process exit could lead to
a memory leak.  There are two possibilities for this:

- filesystem implements .lock() and calls back to posix_lock_file().  On
cleanup of files_struct locks_remove_posix() is called which should remove all
locks belonging to files_struct.  However if filesystem calls
posix_lock_file() which fails, then those locks will never be freed.

- if a file is closed while a lock is blocked, then after acquiring
fcntl_setlk() will undo the lock.  But this unlock itself might fail on OOM,
again possibly leaking the lock.

The solution is to move the checking of the allocations until after it is sure
that they will be needed.  This will solve the above problem since unlock will
always succeed unless it splits an existing region.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Pekka Enberg
090d2b185d [PATCH] read_mapping_page for address space
Add read_mapping_page() which is used for callers that pass
mapping->a_ops->readpage as the filler for read_cache_page.  This removes
some duplication from filesystem code.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Al Viro
0c426f26cc [PATCH] ext2 XIP won't build without MMU
Disable Ext2 XIP if the kernel is configured in no-MMU mode as the former
won't build.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:55 -07:00
Al Viro
530018bf3d [PATCH] frv: binfmt_elf_fdpic __user annotations
Add __user annotations to binfmt_elf_fdpic.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:54 -07:00
James Morris
03e6806063 [PATCH] lsm: add task_setioprio hook
Implement an LSM hook for setting a task's IO priority, similar to the hook
for setting a task's nice value.

A previous version of this LSM hook was included in an older version of
multiadm by Jan Engelhardt, although I don't recall it being submitted
upstream.

Also included is the corresponding SELinux hook, which re-uses the setsched
permission in the process class.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
OGAWA Hirofumi
111ebb6e6f [PATCH] writeback: fix range handling
When a writeback_control's `start' and `end' fields are used to
indicate a one-byte-range starting at file offset zero, the required
values of .start=0,.end=0 mean that the ->writepages() implementation
has no way of telling that it is being asked to perform a range
request.  Because we're currently overloading (start == 0 && end == 0)
to mean "this is not a write-a-range request".

To make all this sane, the patch changes range of writeback_control.

On the caller's side: whenever it calls ->writepages() to write pages, it
always sets a range (range_start/range_end or range_cyclic).

And if range_cyclic is true, ->writepages() thinks the range is
cyclic, otherwise it just uses range_start and range_end.

This patch does the following:

    - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h
      -1 is usually ok for range_end (type is long long). But, if someone did,

		range_end += val;		range_end is "val - 1"
		u64val = range_end >> bits;	u64val is "~(0ULL)"

      or something, they are wrong. So, this adds LLONG_MAX to avoid nasty
      things, and uses LLONG_MAX for range_end.

    - All callers of ->writepages() set range_start/end or range_cyclic.

    - Fix updates of ->writeback_index, which already looked a bit strange.
      If a scan starts at 0 and is ended by the nr_to_write check, saving
      that last index may reduce the chance of scanning the end of the
      file.  So, this updates ->writeback_index only if range_cyclic is
      true or the whole file is scanned.
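
As a rough illustration of the range convention described above, here is a
userspace sketch, not the kernel code.  struct wbc_model and the wbc_*()
helpers are hypothetical names made up for this example; only the
range_start, range_end and range_cyclic field names and the use of LLONG_MAX
come from the description.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace model of the range fields; this is not the kernel struct. */
struct wbc_model {
	long long range_start;	/* first byte offset to write */
	long long range_end;	/* last byte offset to write, inclusive */
	bool range_cyclic;	/* true: the explicit range is not used */
};

/* Whole-file request: LLONG_MAX, rather than -1, is the upper bound. */
static struct wbc_model wbc_whole_file(void)
{
	struct wbc_model wbc = {
		.range_start = 0,
		.range_end = LLONG_MAX,
		.range_cyclic = false,
	};
	return wbc;
}

/*
 * One-byte range at file offset zero.  Under the old convention,
 * start == 0 && end == 0 was read as "not a range request", so this
 * request could not be expressed.
 */
static struct wbc_model wbc_first_byte(void)
{
	struct wbc_model wbc = {
		.range_start = 0,
		.range_end = 0,
		.range_cyclic = false,
	};
	return wbc;
}

/* Does the request cover byte offset pos? */
static bool wbc_covers(const struct wbc_model *wbc, long long pos)
{
	assert(wbc->range_start <= wbc->range_end);
	if (wbc->range_cyclic)
		return true;
	return pos >= wbc->range_start && pos <= wbc->range_end;
}
```

With an explicit range always set, a one-byte request at offset zero and a
whole-file request are no longer indistinguishable.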

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:49 -07:00
Chen, Kenneth W
a43a8c39bb [PATCH] tightening hugetlb strict accounting
Current hugetlb strict accounting for shared mappings always assumes the
mapping starts at file offset zero and reserves pages between zero and the
size of the file.  This assumption often reserves (or locks down) a lot more
pages than necessary if the application maps at a non-zero file offset.
libhugetlbfs is one example that requires proper reservation for shared
mappings that start at a non-zero offset.

This patch extends the reservation and hugetlb strict accounting to support
any arbitrary pair of (offset, len), resulting in a much more robust and
accurate scheme.  More importantly, it won't lock down any hugetlb pages
outside file mapping.
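
The difference between the two schemes can be sketched with simple
arithmetic.  This is an illustration only: the 2 MiB huge page size and the
function names are assumptions made for the example, and the sketch ignores
any sharing between overlapping mappings of the same file.

```c
#include <assert.h>

#define HPAGE_SHIFT_MODEL 21	/* assumption: 2 MiB huge pages */
#define HPAGE_SIZE_MODEL  (1UL << HPAGE_SHIFT_MODEL)

/* Old scheme: the mapping is assumed to start at file offset zero. */
static unsigned long reserve_from_zero(unsigned long offset, unsigned long len)
{
	assert(offset % HPAGE_SIZE_MODEL == 0);
	assert(len % HPAGE_SIZE_MODEL == 0);
	return (offset + len) >> HPAGE_SHIFT_MODEL;
}

/* New scheme: only the mapped (offset, len) range is counted. */
static unsigned long reserve_range(unsigned long offset, unsigned long len)
{
	assert(offset % HPAGE_SIZE_MODEL == 0);
	assert(len % HPAGE_SIZE_MODEL == 0);
	return len >> HPAGE_SHIFT_MODEL;
}
```

For a two-page mapping at an offset of eight huge pages, the first function
counts ten pages and the second counts two.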

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:48 -07:00
KAMEZAWA Hiroyuki
6f0419e06a [PATCH] for_each_possible_cpu: xfs
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu in xfs.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
David Howells
d6938d1b27 [PATCH] XFS: Use the dentry passed to statfs() to limit the scope of the results
Enable XFS to limit the statfs() results to the project quota covering the
dentry used as a base for the call.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
David Howells
726c334223 [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry
Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch.  That reduced the significance of
sb->s_root, allowing NFS to place a fake root there.  However, NFS does
require a dentry to use as a target for the statfs operation.  This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
David Howells
454e2398be [PATCH] VFS: Permit filesystem to override root dentry on mount
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience functions is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing is currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.
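
The two ways of filling in the vfsmount described above can be sketched in
userspace as follows.  This is a model under stated assumptions, not the
kernel implementation: the *_model types and functions are hypothetical,
reference counting is left out, and only the mnt_sb, mnt_root and s_root
field names and the "always returns 0" behaviour are taken from the
description.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel objects named in the description. */
struct dentry_model { const char *name; };
struct super_block_model { struct dentry_model *s_root; };
struct vfsmount_model {
	struct super_block_model *mnt_sb;
	struct dentry_model *mnt_root;
};

/* Default behaviour: root the mount at the superblock's s_root. */
static int simple_set_mnt_model(struct vfsmount_model *mnt,
				struct super_block_model *sb)
{
	assert(mnt != NULL && sb != NULL);
	mnt->mnt_sb = sb;
	mnt->mnt_root = sb->s_root;
	return 0;	/* always 0, so it can be tail-called */
}

/* Hypothetical get_sb() of an ordinary filesystem "foo". */
static int foo_get_sb_model(struct super_block_model *sb,
			    struct vfsmount_model *mnt)
{
	return simple_set_mnt_model(mnt, sb);
}

/*
 * Hypothetical get_sb() of a filesystem that shares one superblock
 * between several mounts: mnt_sb and mnt_root are set directly, and
 * the root need not be sb->s_root.
 */
static int shared_get_sb_model(struct super_block_model *sb,
			       struct dentry_model *subtree_root,
			       struct vfsmount_model *mnt)
{
	assert(mnt != NULL && sb != NULL && subtree_root != NULL);
	mnt->mnt_sb = sb;
	mnt->mnt_root = subtree_root;
	return 0;
}
```

In the shared case two mounts can point at the same superblock while having
different roots, which is the situation the dcache cleanup change above has
to cope with.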

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
Steve French
189acaaef8 [CIFS] Enable sec flags on mount for cifs (part one)
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-23 02:33:48 +00:00
Andrew Morton
d702ccb342 [PATCH] prune_one_dentry() tweaks
- Add description of d_lock handling to comments over prune_one_dentry().

- It has three callsites - uninline it, saving 200 bytes of text.

Cc: Jan Blunck <jblunck@suse.de>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
NeilBrown
0feae5c47a [PATCH] Fix dcache race during umount
The race is that the shrink_dcache_memory shrinker could get called while a
filesystem is being unmounted, and could try to prune a dentry belonging to
that filesystem.

If it does, then it will call in to iput on the inode while the dentry is
no longer able to be found by the umounting process.  If iput takes a
while, generic_shutdown_super could get all the way through
shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
ever waiting on this particular inode.

Eventually the superblock gets freed anyway and if the iput tried to touch
it (which some filesystems certainly do), it will lose.  The promised
"Self-destruct in 5 seconds" doesn't lead to a nice day.

The race is closed by holding s_umount while calling prune_one_dentry on
someone else's dentry.  As a down_read_trylock is used,
shrink_dcache_memory will no longer try to prune the dentry of a filesystem
that is being unmounted, and unmount will not be able to start until any
such active prune_one_dentry completes.
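
The exclusion described above can be modelled in userspace roughly as
follows.  This is a sketch only: struct rwsem_model and the *_model()
functions are hypothetical stand-ins built on C11 atomics, whereas the
kernel uses down_read_trylock() on the superblock's s_umount semaphore and
unmount takes it for writing in the usual blocking way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* state: -1 = held for writing (unmount running), n >= 0 = n readers */
struct rwsem_model { atomic_int state; };

/* Take a read hold unless a writer owns the lock; never blocks. */
static bool read_trylock_model(struct rwsem_model *sem)
{
	int cur = atomic_load(&sem->state);

	while (cur >= 0) {
		/* On failure cur is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&sem->state, &cur, cur + 1))
			return true;
	}
	return false;
}

static void read_unlock_model(struct rwsem_model *sem)
{
	int prev = atomic_fetch_sub(&sem->state, 1);

	assert(prev > 0);
	(void)prev;
}

/* Stand-in for unmount starting: succeeds only with no readers. */
static bool write_trylock_model(struct rwsem_model *sem)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&sem->state, &expected, -1);
}

/* Prune one dentry of another filesystem, unless it is being unmounted. */
static bool try_prune_model(struct rwsem_model *s_umount, int *pruned)
{
	if (!read_trylock_model(s_umount))
		return false;	/* filesystem is going away: skip it */
	(*pruned)++;		/* stands in for prune_one_dentry() */
	read_unlock_model(s_umount);
	return true;
}
```

Once the write side is held, try_prune_model() refuses to touch the
dentry, and the write side cannot be taken while a prune is in progress.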

This requires that prune_dcache *knows* which filesystem (if any) it is
doing the prune on behalf of so that it can be careful of other
filesystems.  shrink_dcache_memory isn't called on behalf of any
filesystem, and so is careful of everything.

shrink_dcache_anon is now passed a super_block rather than the s_anon list
out of the superblock, so it can get the s_anon list itself, and can pass
the superblock down to prune_dcache.

If prune_dcache finds a dentry that it cannot free, it leaves it where it
is (at the tail of the list) and exits, on the assumption that some other
thread will be removing that dentry soon.  To try to make sure that some
work gets done, a limited number of dentries which are untouchable are
skipped over while choosing the dentry to work on.

I believe this race was first found by Kirill Korotaev.

Cc: Jan Blunck <jblunck@suse.de>
Acked-by: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
Miklos Szeredi
c89681ed7d [PATCH] remove steal_locks()
This patch removes the steal_locks() function.

steal_locks() doesn't work correctly with any filesystem that does its own
lock management, including NFS, CIFS, etc.

In addition it has weird semantics on local filesystems in case tasks
sharing file-descriptor tables are doing POSIX locking operations in
parallel to execve().

The steal_locks() function has an effect on applications doing:

clone(CLONE_FILES)
  /* in child */
  lock
  execve
  lock

POSIX locks acquired before execve (by "child", "parent" or any further
task sharing files_struct) will after the execve be owned exclusively by
"child".

According to Chris Wright some LSB/LTP kind of suite triggers without the
stealing behavior, but there's no known real-world application that would
also fail.

Apps using NPTL are not affected, since all other threads are killed before
execve.

Apps using LinuxThreads are only affected if they

  - have multiple threads during exec (LinuxThreads doesn't kill other
    threads, the app may do it with pthread_kill_other_threads_np())
  - rely on POSIX locks being inherited across exec

Both conditions are documented, but not their interaction.

Apps using clone() natively are affected if they

  - use clone(CLONE_FILES)
  - rely on POSIX locks being inherited across exec

The above scenarios are unlikely, but possible.

If the patch is vetoed, there's a plan B, that involves mostly keeping the
weird stealing semantics, but changing the way lock ownership is handled so
that network and local filesystems work consistently.

That would add more complexity though, so this solution seems to be
preferred by most people.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
OGAWA Hirofumi
09d967c6f3 [PATCH] Fix a race condition between ->i_mapping and iput()
This race caused an oops, and can be reproduced with the following:

    while true; do
	dd if=/dev/zero of=/dev/.static/dev/hdg1 bs=512 count=1000 & sync
    done

This race condition was between __sync_single_inode() and iput().

          cpu0 (fs's inode)                 cpu1 (bdev's inode)
          -----------------                 -------------------
                                       close("/dev/hda2")
                                       [...]
__sync_single_inode()
   /* copy the bdev's ->i_mapping */
   mapping = inode->i_mapping;

                                       generic_forget_inode()
                                          bdev_clear_inode()
					     /* restore the fs's ->i_mapping */
				             inode->i_mapping = &inode->i_data;
				          /* bdev's inode was freed */
                                          destroy_inode(inode);

   if (wait) {
      /* dereference a freed bdev's mapping->host */
      filemap_fdatawait(mapping);  /* Oops */

Since __sync_single_inode() only takes a ref-count on the fs's inode,
another process can close() and free the bdev's inode while the fs's inode
is being written.  So __sync_single_inode() accesses the freed ->i_mapping
and oopses.

This patch takes a ref-count on the bdev's inode for the fs's inode before
setting a ->i_mapping, and the clear_inode() of the fs's inode does iput() on
the bdev's inode.  So if the fs's inode is still living, bdev's inode
shouldn't be freed.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
Anton Altaparmakov
f893afbe12 [PATCH] NTFS: Critical bug fix (affects MIPS and possibly others)
Many thanks to Pauline Ng for the detailed bug report and analysis!

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:55 -07:00
Linus Torvalds
52ab3f3dc7 Merge git://oss.sgi.com:8090/xfs-2.6
* git://oss.sgi.com:8090/xfs-2.6: (43 commits)
  [XFS] Remove files from the build that are now unused.
  [XFS] Fix a Makefile issue related to exports.o handling.
  [XFS] Remove version 1 directory code. Never functioned on Linux, just
  [XFS] Map EFSCORRUPTED to an actual error code, not just a made up one
  [XFS] Kill direct access to ->count in valusema(); all we ever use it for
  [XFS] Remove unneeded conditional code on NFS export interface related
  [XFS] Remove an incorrect use of unlikely() on a relatively likely code
  [XFS] Push some common code out of write path into core XFS code for
  [XFS] Remove unnecessary local from open_exec dmapi path.
  [XFS] Minor XFS documentation updates.
  [XFS] Fix broken const use inside local suffix_strtoul routine.
  [XFS] Fix nused counter.  It's currently getting set to -1 rather than
  [XFS] Fix mismerge of the fs_writable cleanup patch causing a freeze/thaw
  [XFS] Fix up debug code so that bulkstat wont generate thousands of
  [XFS] Remove unused parameter from di2xflags routine.
  [XFS] Cleanup a missed porting conversion, and freezing.
  [XFS] Resolve a namespace collision on remaining vtypes for FreeBSD
  [XFS] Resolve a namespace collision on vnode/vnodeops for FreeBSD porters.
  [XFS] Resolve a namespace collision on vfs/vfsops for FreeBSD porters.
  [XFS] statvfs component of directory/project quota support, code
  ...
2006-06-21 18:10:19 -07:00
Kay Sievers
b9d9c82b4d [PATCH] Driver core: add generic "subsystem" link to all devices
Like the SUBSYSTEM= key we find in the environment of the uevent, this
creates a generic "subsystem" link in sysfs for every device. Userspace
usually doesn't care at all if it's a "class" or a "bus" device. This
provides a unified way to determine the subsystem of a device, regardless
of the way the driver core has created it.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:49 -07:00
Trond Myklebust
70ac4385a1 Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	include/linux/nfs_fs.h

Fixed up conflict with kernel header updates.
2006-06-20 20:46:21 -04:00
Linus Torvalds
d9eaec9e29 Merge branch 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (25 commits)
  [PATCH] make set_loginuid obey audit_enabled
  [PATCH] log more info for directory entry change events
  [PATCH] fix AUDIT_FILTER_PREPEND handling
  [PATCH] validate rule fields' types
  [PATCH] audit: path-based rules
  [PATCH] Audit of POSIX Message Queue Syscalls v.2
  [PATCH] fix se_sen audit filter
  [PATCH] deprecate AUDIT_POSSBILE
  [PATCH] inline more audit helpers
  [PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated array
  [PATCH] update of IPC audit record cleanup
  [PATCH] minor audit updates
  [PATCH] fix audit_krule_to_{rule,data} return values
  [PATCH] add filtering by ppid
  [PATCH] log ppid
  [PATCH] collect sid of those who send signals to auditd
  [PATCH] execve argument logging
  [PATCH] fix deadlocks in AUDIT_LIST/AUDIT_LIST_RULES
  [PATCH] audit_panic() is audit-internal
  [PATCH] inotify (5/5): update kernel documentation
  ...

Manual fixup of conflict in include/linux/inotify.h
2006-06-20 15:37:56 -07:00
Linus Torvalds
2edc322d42 Merge git://git.infradead.org/~dwmw2/rbtree-2.6
* git://git.infradead.org/~dwmw2/rbtree-2.6:
  [RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
  Update UML kernel/physmem.c to use rb_parent() accessor macro
  [RBTREE] Update hrtimers to use rb_parent() accessor macro.
  [RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
  [RBTREE] Merge colour and parent fields of struct rb_node.
  [RBTREE] Remove dead code in rb_erase()
  [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
  [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
  [RBTREE] Update key.c to use rb_parent() accessor macro.
  [RBTREE] Update ext3 to use rb_parent() accessor macro.
  [RBTREE] Change rbtree off-tree marking in I/O schedulers.
  [RBTREE] Add accessor macros for colour and parent fields of rb_node
2006-06-20 14:51:22 -07:00
Linus Torvalds
be967b7e2f Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (199 commits)
  [MTD] NAND: Fix breakage all over the place
  [PATCH] NAND: fix remaining OOB length calculation
  [MTD] NAND Fixup NDFC merge brokeness
  [MTD NAND] S3C2410 driver cleanup
  [MTD NAND] s3c24x0 board: Fix clock handling, ensure proper initialisation.
  [JFFS2] Check CRC32 on dirent and data nodes each time they're read
  [JFFS2] When retiring nextblock, allocate a node_ref for the wasted space
  [JFFS2] Mark XATTR support as experimental, for now
  [JFFS2] Don't trust node headers before the CRC is checked.
  [MTD] Restore MTD_ROM and MTD_RAM types
  [MTD] assume mtd->writesize is 1 for NOR flashes
  [MTD NAND] Fix s3c2410 NAND driver so it at least _looks_ like it compiles
  [MTD] Prepare physmap for 64-bit-resources
  [JFFS2] Fix more breakage caused by janitorial meddling.
  [JFFS2] Remove stray __exit from jffs2_compressors_exit()
  [MTD] Allow alternate JFFS2 mount variant for root filesystem.
  [MTD] Disconnect struct mtd_info from ABI
  [MTD] replace MTD_RAM with MTD_GENERIC_TYPE
  [MTD] replace MTD_ROM with MTD_GENERIC_TYPE
  [MTD] remove a forgotten MTD_XIP
  ...
2006-06-20 14:50:31 -07:00
Steve French
75ba632a01 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2006-06-20 20:36:38 +00:00
Trond Myklebust
d59bf96cdd Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ 2006-06-20 08:59:45 -04:00
Amy Griffis
9c937dcc71 [PATCH] log more info for directory entry change events
When an audit event involves changes to a directory entry, include
a PATH record for the directory itself.  A few other notable changes:

    - fixed audit_inode_child() hooks in fsnotify_move()
    - removed unused flags arg from audit_inode()
    - added audit log routines for logging a portion of a string

Here's some sample output.

before patch:
type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149821605.320:26):  cwd="/root"
type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

after patch:
type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149822032.332:24):  cwd="/root"
type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0
type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:28 -04:00
Al Viro
e018290929 [PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated array
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:24 -04:00
Al Viro
473ae30bc7 [PATCH] execve argument logging
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:21 -04:00
Amy Griffis
3ca10067f7 [PATCH] inotify (4/5): allow watch removal from event handler
Allow callers to remove watches from their event handler via
inotify_remove_watch_locked().  This functionality can be used to
achieve IN_ONESHOT-like functionality for a subset of events in the
mask.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:19 -04:00
Amy Griffis
a9dc971d3f [PATCH] inotify (3/5): add interfaces to kernel API
Add inotify_init_watch() so caller can use inotify_watch refcounts
before calling inotify_add_watch().

Add inotify_find_watch() to find an existing watch for an (ih,inode)
pair.  This is similar to inotify_find_update_watch(), but does not
update the watch's mask if one is found.

Add inotify_rm_watch() to remove a watch via the watch pointer instead
of the watch descriptor.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:18 -04:00
Amy Griffis
7c29772288 [PATCH] inotify (2/5): add name's inode to event handler
When an inotify event includes a dentry name, also include the inode
associated with that name.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:18 -04:00
Amy Griffis
2d9048e201 [PATCH] inotify (1/5): split kernel API from userspace support
The following series of patches introduces a kernel API for inotify,
making it possible for kernel modules to benefit from inotify's
mechanism for watching inodes.  With these patches, inotify will
maintain for each caller a list of watches (via an embedded struct
inotify_watch), where each inotify_watch is associated with a
corresponding struct inode.  The caller registers an event handler and
specifies for which filesystem events their event handler should be
called per inotify_watch.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:17 -04:00
Nathan Scott
98174e4697 Merge HEAD from ../linux-2.6 2006-06-20 14:56:23 +10:00
Nathan Scott
d8ce753241 [XFS] Remove files from the build that are now unused.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-20 14:53:51 +10:00
Nathan Scott
d7b849da47 [XFS] Fix a Makefile issue related to exports.o handling.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-20 14:01:29 +10:00
Nathan Scott
f6c2d1fa63 [XFS] Remove version 1 directory code. Never functioned on Linux, just
pure bloat.

SGI-PV: 952969
SGI-Modid: xfs-linux-melb:xfs-kern:26251a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-20 13:04:51 +10:00
Nathan Scott
da2f4d679c [XFS] Map EFSCORRUPTED to an actual error code, not just a made up one
(990).  Turns out some ye-olde unices used EUCLEAN as
Filesystem-needs-cleaning, so now we use that too.

SGI-PV: 953954
SGI-Modid: xfs-linux-melb:xfs-kern:26286a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-20 13:01:38 +10:00
Al Viro
0d8fee3270 [XFS] Kill direct access to ->count in valusema(); all we ever use it for
is check if semaphore is actually locked, which can be trivially done in
portable way. Code gets more readable, while we are at it...

SGI-PV: 953915
SGI-Modid: xfs-linux-melb:xfs-kern:26274a

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-19 08:41:30 +10:00
Nathan Scott
a805bad5da [XFS] Remove unneeded conditional code on NFS export interface related
code paths.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26250a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-19 08:40:27 +10:00
Nathan Scott
6fe90e6d14 [XFS] Remove an incorrect use of unlikely() on a relatively likely code
path.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26249a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-19 08:40:12 +10:00
Nathan Scott
1e69dd0eb3 [XFS] Push some common code out of write path into core XFS code for
sharing.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26248a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-19 08:39:53 +10:00
Nathan Scott
1d47bec290 [XFS] Remove unnecessary local from open_exec dmapi path.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26247a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-19 08:39:16 +10:00
David Woodhouse
1046d88001 [JFFS2] Check CRC32 on dirent and data nodes each time they're read
Also, make sure dirents are marked REF_UNCHECKED when we 'discover' them
through eraseblock summary.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 22:44:21 +01:00
David Woodhouse
fc6612f627 [JFFS2] When retiring nextblock, allocate a node_ref for the wasted space
Failing to do so makes the calculated length of the last node incorrect,
when we're not using eraseblock summaries.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 18:39:38 +01:00
David Woodhouse
2ba72cb754 [JFFS2] Mark XATTR support as experimental, for now
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 10:22:40 +01:00
David Woodhouse
3877f0b6c9 [JFFS2] Don't trust node headers before the CRC is checked.
Especially when summary code is used, we can have in-memory data
structures referencing certain nodes without them actually being readable
on the flash. Discard the nodes gracefully in that case, rather than
triggering a BUG().

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 00:05:26 +01:00
Jens Axboe
991721572e [PATCH] Fix missing ret assignment in __bio_map_user() error path
If get_user_pages() returns fewer pages than we asked for, we jump
to out_unmap which will return ERR_PTR(ret).  But ret can contain a
positive number just smaller than local_nr_pages, so be sure to set it
to -EFAULT always.
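
The error path problem can be shown with a small userspace model.  The
function names below are hypothetical, and the pinned argument stands in
for the return value of get_user_pages(); the real function returns
ERR_PTR(ret), which the model reduces to returning ret itself.

```c
#include <assert.h>
#include <errno.h>

/*
 * Buggy shape: on a short pin, ret still holds the positive page
 * count, so the caller gets a value that is not an error code.
 */
static int map_pages_buggy_model(int pinned, int wanted)
{
	int ret = pinned;

	assert(wanted > 0);
	if (ret < wanted)
		return ret;	/* may be a positive number */
	return 0;
}

/* Fixed shape: a short pin is always reported as -EFAULT. */
static int map_pages_fixed_model(int pinned, int wanted)
{
	int ret = pinned;

	assert(wanted > 0);
	if (ret < wanted) {
		ret = -EFAULT;
		return ret;
	}
	return 0;
}
```

With 3 of 8 pages pinned, the first version returns 3 and the second
returns -EFAULT.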

Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp>

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17 10:52:12 -07:00
Kirill Korotaev
9cedc194a7 [PATCH] Return error in case flock_lock_file failure
If flock_lock_file() failed to allocate flock with locks_alloc_lock()
then "error = 0" is returned.  A non-zero error needs to be returned instead.
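
A minimal userspace sketch of the corrected control flow is shown here.
Everything in it is a model: the type, the function names and the
fail_alloc switch are made up for illustration, and the choice of -ENOLCK
as the error code is an assumption, since the description only asks for a
non-zero value.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum lock_type_model { LOCK_SH_MODEL, LOCK_EX_MODEL };

struct lock_model { enum lock_type_model type; };

/* Allocation helper; fail_alloc simulates running out of memory. */
static struct lock_model *alloc_lock_model(int fail_alloc)
{
	if (fail_alloc)
		return NULL;
	return malloc(sizeof(struct lock_model));
}

static int flock_lock_model(enum lock_type_model type, int fail_alloc)
{
	struct lock_model *new_fl;
	int error = 0;

	assert(type == LOCK_SH_MODEL || type == LOCK_EX_MODEL);

	new_fl = alloc_lock_model(fail_alloc);
	if (new_fl == NULL) {
		/* Without this assignment the failure is reported as 0. */
		error = -ENOLCK;	/* assumed error code */
		goto out;
	}

	new_fl->type = type;	/* the lock would be inserted here */
	free(new_fl);
out:
	return error;
}
```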

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-14 08:59:44 -07:00
Pavel Machek
0fd1ffe063 [CIFS] Fix suspend/resume problem which causes EIO on subsequent access to
the mount.

Signed-off-by: Pavel Machek <pavel@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-13 21:31:39 +00:00
Nathan Scott
d7ede1aa5d [XFS] Minor XFS documentation updates.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-13 16:28:11 +10:00
Steve French
6344a423e5 [CIFS] fix minor compile warning when config_cifs_weak_security is off
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-12 04:18:35 +00:00
David Woodhouse
4ed0156f77 [JFFS2] Fix more breakage caused by janitorial meddling.
jffs2_zlib_exit() and free_workspaces() shouldn't be marked __exit because
they get called in the error case from the init functions.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-09 15:06:42 +01:00
Trond Myklebust
28df955a2a NLM: Fix reclaim races
Currently it is possible for a task to remove its locks at the same time as
the NLM recovery thread is trying to recover them. This quickly leads to an
Oops.
Protect the locks using an rw semaphore while they are being recovered.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:40:27 -04:00
Trond Myklebust
5046791417 NLM: sem to mutex conversion
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:40:24 -04:00
Trond Myklebust
81039f1f20 NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:34 -04:00
David Howells
f7b422b17e NFS: Split fs/nfs/inode.c
As fs/nfs/inode.c is rather large, heterogeneous and unwieldy, the attached
patch splits it up into a number of files:

 (*) fs/nfs/inode.c

     Strictly inode specific functions.

 (*) fs/nfs/super.c

     Superblock management functions for NFS and NFS4, normal access, clones
     and referrals.  The NFS4 superblock functions _could_ move out into a
     separate conditionally compiled file, but it's probably not worth it as
     there're so many common bits.

 (*) fs/nfs/namespace.c

     Some namespace-specific functions have been moved here.

 (*) fs/nfs/nfs4namespace.c

     NFS4-specific namespace functions (this could be merged into the previous
     file).  This file is conditionally compiled.

 (*) fs/nfs/internal.h

     Inter-file declarations, plus a few simple utility functions moved from
     fs/nfs/inode.c.

     Additionally, all the in-.c-file externs have been moved here, and those
     files they were moved from now include this file.

For the most part, the functions have not been changed, only some multiplexor
functions have changed significantly.

I've also:

 (*) Added some extra banner comments above some functions.

 (*) Rearranged the function order within the files to be more logical and
     better grouped (IMO), though someone may prefer a different order.

 (*) Reduced the number of #ifdefs in .c files.

 (*) Added missing __init and __exit directives.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-06-09 09:34:33 -04:00
Trond Myklebust
4e5ccf60c5 NFS: Fix typo in nfs_do_clone_mount()
Doh!

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:32 -04:00
Trond Myklebust
860de07139 NFS: Fix compile errors introduced by referrals patches
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:31 -04:00
Trond Myklebust
87e4ba1a62 NFSv4: Ensure that referral mounts bind to a reserved port
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:30 -04:00
Andy Adamson
33a43f2802 NFSv4: A root pathname is sent as a zero component4
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:30 -04:00
Manoj Naik
6b97fd3da1 NFSv4: Follow a referral
Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:29 -04:00
Manoj Naik
9cdb3883c3 NFSv4: Ensure client submounts when following a referral
Set up mountpoint when hitting a referral on moved error by getting
fs_locations.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:28 -04:00
Manoj Naik
61f5164cab NFS: Expand clone mounts to include other servers
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:27 -04:00
Manoj Naik
c818ba43f9 NFSv4: Create NFSv4 transport and client
Move existing code into a separate function so that it can be also used by
referral code.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:26 -04:00
Manoj Naik
830b8e33fe NFSv4: Define an fs_locations bitmap
This is similar to the getattr bitmap, but includes fs_locations and
mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations
requests.
Note: We can probably do better by requesting locations as part of fsinfo
itself.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:25 -04:00
Manoj Naik
361e624f6d NFSv4: GETATTR attributes on referral
Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be
requested in a GETATTR on referrals.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:24 -04:00
Manoj Naik
99baf625d3 NFSv4: Decode mounted_on_fileid attribute in getattr.
It is ignored if fileid is also requested. This will be used on referrals
(fs_locations).

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:24 -04:00
Manoj Naik
7aaa0b3bd4 NFSv4: convert fs-locations-components to conform to RFC3530
Use component4-style formats for decoding list of servers and pathnames in
fs_locations.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:23 -04:00
Trond Myklebust
683b57b435 NFSv4: Implement the fs_locations function call
NFSv4 allows for the fact that filesystems may be replicated across
several servers or that they may be migrated to a backup server in case of
failure of the primary server.
fs_locations is an NFSv4 operation for retrieving information about the
location of migrated and/or replicated filesystems.

Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:22 -04:00
Trond Myklebust
51d8fa6a10 NFS: Add timeout to submounts
Make automounted partitions expire using the mark_mounts_for_expiry()
function. The timeout is controlled via a sysctl.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:20 -04:00
Trond Myklebust
55a975937d NFS: Ensure the client submounts, when it crosses a server mountpoint.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:19 -04:00
Trond Myklebust
8b4bdcf899 NFS: Store the file system "fsid" value in the NFS super block.
This should enable us to detect if we are crossing a mountpoint in the
case where the server is exporting "nohide" mounts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:19 -04:00
Trond Myklebust
8b512d9a88 VFS: Remove dependency of ->umount_begin() call on MNT_FORCE
Allow filesystems to decide to perform pre-umount processing whether or not
MNT_FORCE is set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:18 -04:00
Trond Myklebust
5528f911b4 VFS: Add shrink_submounts()
Allow a submount to be marked as being 'shrinkable' by means of the
vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which
attempts to recursively unmount these submounts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:17 -04:00
Trond Myklebust
1f5ce9e93a VFS: Unexport do_kern_mount() and clean up simple_pin_fs()
Replace all module uses with the new vfs_kern_mount() interface, and fix up
simple_pin_fs().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:16 -04:00
Trond Myklebust
bb4a58bf46 VFS: Add GPL_EXPORTED function vfs_kern_mount()
do_kern_mount() does not allow the kernel to use private mount interfaces
without exposing the same interfaces to userland. The problem is that the
filesystem is referenced by name, thus meaning that it and its mount
interface must be registered in the global filesystem list.

vfs_kern_mount() passes the struct file_system_type as an explicit
parameter in order to overcome this limitation.
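
The difference between the two interfaces can be illustrated with a toy registry. All names here (demo_fs_type, demo_register, demo_mount_by_name, demo_mount_type) are hypothetical and only model the idea; they are not the VFS API.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_fs_type {
    const char *name;
    int (*mount)(void);
};

/* Toy global filesystem list. */
static struct demo_fs_type *demo_registry[4];
static size_t demo_nregistered;

static int demo_register(struct demo_fs_type *type)
{
    if (demo_nregistered == sizeof(demo_registry) / sizeof(demo_registry[0]))
        return -ENOSPC;
    demo_registry[demo_nregistered++] = type;
    return 0;
}

/* By-name mount: only types on the global list can be reached. */
static int demo_mount_by_name(const char *name)
{
    size_t i;

    for (i = 0; i < demo_nregistered; i++)
        if (strcmp(demo_registry[i]->name, name) == 0)
            return demo_registry[i]->mount();
    return -ENODEV;
}

/* Explicit-type mount: the caller hands over the type, no lookup needed. */
static int demo_mount_type(struct demo_fs_type *type)
{
    return type->mount();
}

static int demo_private_mount(void)
{
    return 0;
}

static struct demo_fs_type demo_private_fs = { "demo_private", demo_private_mount };
```

In this model a private type can be mounted through demo_mount_type() without ever appearing on the list that by-name lookups search.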

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:15 -04:00
Trond Myklebust
da6d503aa0 NFS: Remove nfs_delete_inode()
Now that we have a real nfs_invalidate_page() to ensure that
truncate_inode_pages() does the right thing when there are pending dirty
pages, we can get rid of nfs_delete_inode().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:14 -04:00
Trond Myklebust
d2ccddf042 NFS: Flesh out nfs_invalidate_page()
In the case of a call to truncate_inode_pages(), we should really try to
cancel any pending writes on the page.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:14 -04:00
J. Bruce Fields
c04871e634 NFSv4: remove obviously bogus comparison from decode_getacl
We just set *acl_len to zero, and attrlen is unsigned, so this comparison
is clearly bogus.  I have no idea what I was thinking.

Fixes a bug that caused getacl to fail over krb5p.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:13 -04:00
Alexey Dobriyan
3873bc50e2 NFSv4: really return status from decode_recall_args()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:12 -04:00
Andreas Gruenbacher
4814f56d19 NFSv3: Client-side nfsacl caching fix
Fix two errors in the client-side acl cache: First, when nfs3_proc_getacl
requests only the default acl of a file and the access acl is not cached
already, a NULL access acl entry is cached instead of ERR_PTR(-EAGAIN)
("not cached").

Second, update the cached acls in nfs3_proc_setacls: nfs_refresh_inode does
not always invalidate the cached acls, and when it does not, the cached acls
get out of sync.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:11 -04:00
Trond Myklebust
1842bfb447 NFS: Fix up inode revalidation accounting
Currently, we are accounting for all calls to nfs_revalidate_inode(), but not
to nfs_revalidate_mapping(), or nfs_lookup_verify_inode(), etc...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:10 -04:00
Trond Myklebust
44b11874ff NFS: Separate metadata and page cache revalidation mechanisms
Separate out the function of revalidating the inode metadata, and
revalidating the mapping. The former may be called by lookup(),
and only really needs to check that permissions, ctime, etc haven't changed
whereas the latter need only be done when we want to read data from the page
cache, and may need to sync and then invalidate the mapping.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:09 -04:00
Trond Myklebust
38478b24e3 NFS: More page cache revalidation fixups
Whenever the directory changes, we want to make sure that we always
invalidate its page cache. Fix up update_changeattr() and
nfs_mark_for_revalidate() so that they do so.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:09 -04:00
Trond Myklebust
f1bb0b92ba NFS: Fix page cache revalidation
Fix up a bug in the handling of NFS_INO_REVAL_PAGECACHE: make sure that
nfs_update_inode() clears it when we're sure we're not racing with other
updates.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:08 -04:00
Chuck Lever
0d0b5cb36f NFS: Optimize allocation of nfs_read/write_data structures
Clean up use of page_array, and fix an off-by-one error noticed by Tom
Talpey which causes kmalloc calls in cases where using the page_array
is sufficient.

Test plan:
Normal client functional testing with r/wsize=32768.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:07 -04:00
Trond Myklebust
73a3d07c10 NFS: Clean up inode metadata updates
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:04 -04:00
Trond Myklebust
9d1e923222 NFSv4: Some NFSv4 servers have broken behaviour for the change attribute
The Linux NFSv4 server violates RFC3530 in that the change attribute is not
guaranteed to be updated for every change to the inode. Our optimisation
for checking whether or not the inode metadata has changed is broken
too. Grr....

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:04 -04:00
Trond Myklebust
1de3fc12ea NFS: Clean up and fix page zeroing when we have short reads
The code that is supposed to zero the uninitialised partial pages when the
server returns a short read is currently broken: it looks at the nfs_page
wb_pgbase and wb_bytes fields instead of the equivalent nfs_read_data
values when deciding where to start truncating the page.

Also ensure that we are more careful about setting PG_uptodate
before retrying a short read: the retry will change the nfs_read_data
args.pgbase and args.count.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:03 -04:00
Nathan Scott
b190f1138b [XFS] Fix broken const use inside local suffix_strtoul routine.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26201a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:13:15 +10:00
Mandy Kirkconnell
477829ef2e [XFS] Fix nused counter. It's currently getting set to -1 rather than
getting decremented by 1.  Since nused never reaches 0, the "if
(!free->hdr.nused)" check in xfs_dir2_leafn_remove() fails every time and
xfs_dir2_shrink_inode() doesn't get called when it should.  This causes
extra blocks to be left on an empty directory and the directory is unable
to be converted back to inline extent mode.
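
The effect of assigning -1 instead of subtracting 1 can be shown with a small stand-alone sketch. The struct and helper names are hypothetical; only the "!nused" emptiness test mirrors the check quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the free block header. */
struct demo_free_hdr {
    int32_t nused;
};

/* Buggy form: stores -1 instead of subtracting 1. */
static void demo_remove_buggy(struct demo_free_hdr *hdr)
{
    hdr->nused = -1;
}

/* Intended form: one fewer used entry. */
static void demo_remove_fixed(struct demo_free_hdr *hdr)
{
    assert(hdr->nused > 0);
    hdr->nused -= 1;
}

/* Mirrors the "if (!free->hdr.nused)" test that gates the shrink. */
static int demo_block_is_empty(const struct demo_free_hdr *hdr)
{
    return hdr->nused == 0;
}
```

With one entry left, the buggy form leaves nused at -1, so the emptiness test never succeeds and the shrink path is skipped.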

SGI-PV: 951958
SGI-Modid: xfs-linux-melb:xfs-kern:211382a

Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:13:04 +10:00
Nathan Scott
421ad13458 [XFS] Fix mismerge of the fs_writable cleanup patch causing a freeze/thaw
test hang.

SGI-PV: 953563
SGI-Modid: xfs-linux-melb:xfs-kern:26182a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:12:46 +10:00
Nathan Scott
4d1a2ed3d8 [XFS] Fix up debug code so that bulkstat won't generate thousands of
fsstress warnings.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26111a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:12:28 +10:00
Nathan Scott
a916e2bd15 [XFS] Remove unused parameter from di2xflags routine.
SGI-PV: 904192
SGI-Modid: xfs-linux-melb:xfs-kern:26110a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:12:17 +10:00
Nathan Scott
34327e1384 [XFS] Cleanup a missed porting conversion, and freezing.
SGI-PV: 953338
SGI-Modid: xfs-linux-melb:xfs-kern:26109a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:11:55 +10:00
Nathan Scott
8285fb58e7 [XFS] Resolve a namespace collision on remaining vtypes for FreeBSD
porters.

SGI-PV: 953338
SGI-Modid: xfs-linux-melb:xfs-kern:26108a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:07:12 +10:00
Nathan Scott
67fcaa73ad [XFS] Resolve a namespace collision on vnode/vnodeops for FreeBSD porters.
SGI-PV: 953338
SGI-Modid: xfs-linux-melb:xfs-kern:26107a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 17:00:52 +10:00
Nathan Scott
b83bd13881 [XFS] Resolve a namespace collision on vfs/vfsops for FreeBSD porters.
SGI-PV: 9533338
SGI-Modid: xfs-linux-melb:xfs-kern:26106a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 16:48:30 +10:00
Nathan Scott
932f2c3231 [XFS] statvfs component of directory/project quota support, code
originally by Glen.

SGI-PV: 932952
SGI-Modid: xfs-linux-melb:xfs-kern:26105a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 15:29:58 +10:00
Nathan Scott
b65745205f [XFS] Portability changes: remove prdev, stick to one diagnostic
interface.

SGI-PV: 953338
SGI-Modid: xfs-linux-melb:xfs-kern:26103a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 15:29:40 +10:00
Nathan Scott
9c48876a05 [XFS] Remove dead code from some bulkstat paths.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26102a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 15:29:22 +10:00
Nathan Scott
ad723875ac [XFS] Fix a typo in a header file comment.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26101a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 15:29:12 +10:00
Nathan Scott
7d4fb40ad7 [XFS] Start writeout earlier (on last close) in the case where we have a
truncate down followed by delayed allocation (buffered writes) - worst
case scenario for the notorious NULL files problem.  This reduces the
window where we are exposed to that problem significantly.

SGI-PV: 917976
SGI-Modid: xfs-linux-melb:xfs-kern:26100a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 15:27:16 +10:00
Nathan Scott
59c1b082f5 [XFS] Make the pflags test/set wrappers more legible for us mere humans.
SGI-PV: 953338
SGI-Modid: xfs-linux-melb:xfs-kern:26099a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:59:13 +10:00
Nathan Scott
e109007461 [XFS] Fix a buffer refcount leak in dir2 code on a forced shutdown.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26097a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:58:48 +10:00
Nathan Scott
7d04a335b6 [XFS] Shutdown the filesystem if all device paths have gone. Made
shutdown vop flags consistent with sync vop flags declarations too.

SGI-PV: 939911
SGI-Modid: xfs-linux-melb:xfs-kern:26096a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:58:38 +10:00
Nathan Scott
b76963fac4 [XFS] getattr can return an error code, so propagate any from lower
layers.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26095a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:58:20 +10:00
Nathan Scott
3d80ede479 [XFS] Drop use of m_writeio_blocks when zeroing, it's not meaningful
anymore here.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26094a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:57:30 +10:00
Ingo Molnar
72c93bcc63 [XFS] lock validator: lockdep: small xfs init_rwsem() cleanup
init_rwsem() has no return value.  This is not a problem if init_rwsem()
is a function, but it's a problem if it's a do { ...  } while (0) macro. 
(which lockdep introduces) 
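
Why a statement-style macro cannot be used where a value is expected can be sketched as follows. demo_init_rwsem() is a hypothetical macro written in the do { ... } while (0) shape; it is not the kernel's definition.

```c
struct demo_rwsem {
    int count;
};

/* Hypothetical statement-style initialiser. */
#define demo_init_rwsem(sem) do { (sem)->count = 0; } while (0)

struct demo_lock {
    struct demo_rwsem sem;
};

static void demo_lock_init(struct demo_lock *lock)
{
    /* Fine: the macro is used as a plain statement. */
    demo_init_rwsem(&lock->sem);

    /*
     * Neither of these would compile, because a do/while statement is
     * not an expression and has no value:
     *
     *     return demo_init_rwsem(&lock->sem);
     *     x = (lock->flag = 0, demo_init_rwsem(&lock->sem));
     */
}
```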

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26082a

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:57:01 +10:00
Tim Shimmin
87c199c2a7 [XFS] Overzealous endian conversion: we endian converted the
logged version of di_next_unlinked, which is actually always stored in the
correct ondisk format. This was pointed out to us by Shailendra Tripathi
and is evident in XFS QA test 121.

SGI-PV: 953263
SGI-Modid: xfs-linux-melb:xfs-kern:26044a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:56:16 +10:00
David Chinner
714250879e [XFS] Stop a BUG from occurring in generic_delete_inode by preventing
transaction completion from marking the inode dirty while it is being
cleaned up on its way out of the system.

SGI-PV: 952967
SGI-Modid: xfs-linux-melb:xfs-kern:26040a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:55:52 +10:00
Tim Shimmin
6d192a9b82 [XFS] Inode items and EFI/EFDs have different ondisk formats for 32bit and
64bit kernels. Allow recovery to handle both versions and do the necessary
decoding.

SGI-PV: 952214
SGI-Modid: xfs-linux-melb:xfs-kern:26011a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:55:38 +10:00
Yingping Lu
d210a28cd8 [XFS] When allocating file system blocks and freeing extents, the
transaction for each such operation may lock several AGF buffers. The
extent-freeing function sorts the extents by AGF number before entering
the transaction, but when file system space is very limited, allocation
tries every AGF in turn to find space. This can cause out-of-order locking
and therefore deadlock. This fix mitigates the space shortage during
allocation by setting aside a few blocks without reservation, and avoids
deadlock by maintaining ascending order of AGF locking.
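
The ordering rule can be sketched independently of XFS. The helpers below are hypothetical: one sorts the allocation group numbers so they are taken lowest first, the other expresses the rule that only a higher-numbered group may be locked once a lower one is held.

```c
#include <stdlib.h>

static int demo_cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Sort the group numbers so locks are always requested lowest first. */
static void demo_sort_ag(int *ag, size_t n)
{
    qsort(ag, n, sizeof(*ag), demo_cmp_int);
}

/*
 * Ordering rule: with group `highest_held` already locked, only a
 * strictly higher group may be locked next.  Two tasks that both follow
 * this rule cannot wait on each other in a cycle.
 */
static int demo_may_lock(int highest_held, int candidate)
{
    return candidate > highest_held;
}
```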

SGI-PV: 947395
SGI-Modid: xfs-linux-melb:xfs-kern:210801a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:55:18 +10:00
Barry Naujok
d3446eac3f [XFS] Add defragmentation exclusion support
SGI-PV: 953061
SGI-Modid: xfs-linux-melb:xfs-kern:25986a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:54:19 +10:00
Nathan Scott
fbc1462bcb [XFS] Fix a noatime regression related to updating inode atime field on
mmap only.

SGI-PV: 952736
SGI-Modid: xfs-linux-melb:xfs-kern:25922a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:52:13 +10:00
Nathan Scott
ba0b92d671 [XFS] Fix a comment typo, originally noticed by Ming Zhang.
SGI-PV: 907752
SGI-Modid: xfs-linux-melb:xfs-kern:25921a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:52:00 +10:00
Mandy Kirkconnell
fe6c1e7240 [XFS] Fix size argument in kmem_free().
SGI-PV: 952291
SGI-Modid: xfs-linux-melb:xfs-kern:209807a

Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:51:25 +10:00
Olaf Weber
3f368a0d58 [XFS] Originally the ATTR_DMI flag also had the functionality of the
ATTR_NOLOCK flag, but this was split off some time ago, as ATTR_DMI needed
to be used separately.  Two asserts were added to guard correctness of the
code during the transition.  These are no longer required.

SGI-PV: 952145
SGI-Modid: xfs-linux-melb:xfs-kern:209633a

Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:51:11 +10:00
Christoph Hellwig
1d8daf06f6 [XFS] endianness annotations for xfs_dir_leaf_entry_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25808a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:50:37 +10:00
Christoph Hellwig
8034fff39b [XFS] endianness annotations for xfs_dir_leaf_hdr_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25807a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:50:24 +10:00
Christoph Hellwig
ff9901c1e7 [XFS] endianness annotations for xfs_dir2_data_entry_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25806a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:48:37 +10:00
Olaf Weber
3e57ecf640 [XFS] Add parameters to xfs_bmapi() and xfs_bunmapi() to have them report
the range spanned by modifications to the in-core extent map.  Add
XFS_BUNMAPI() and XFS_SWAP_EXTENTS() macros that call xfs_bunmapi() and
xfs_swap_extents() via the ioops vector. Change all calls that may modify
the in-core extent map for the data fork to go through the ioops vector. 
This allows a cache of extent map data to be kept in sync.

SGI-PV: 947615
SGI-Modid: xfs-linux-melb:xfs-kern:209226a

Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-06-09 14:48:12 +10:00
Jens Axboe
71601e2b33 [PATCH] debugfs inode leak
Looking at the reiser4 crash, I found a leak in debugfs. In
debugfs_mknod(), we create the inode before checking if the dentry
already has one attached. We don't free it if that is the case.
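
The leak follows from the order of the two steps. A user-space sketch, with hypothetical demo_* names and a counter standing in for outstanding inodes, might look like this; the error codes are assumptions for illustration.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_inode { int mode; };
struct demo_dentry { struct demo_inode *inode; };

static int demo_live_inodes;    /* allocations not yet freed */

static struct demo_inode *demo_get_inode(int mode)
{
    struct demo_inode *inode = malloc(sizeof(*inode));

    if (inode) {
        inode->mode = mode;
        demo_live_inodes++;
    }
    return inode;
}

/* Leaky ordering: the inode is created before the dentry is checked. */
static int demo_mknod_leaky(struct demo_dentry *dentry, int mode)
{
    struct demo_inode *inode = demo_get_inode(mode);

    if (dentry->inode)
        return -EEXIST;     /* the new inode is never freed on this path */
    if (!inode)
        return -ENOMEM;
    dentry->inode = inode;
    return 0;
}

/* Safer ordering: check first, allocate only when it will be used. */
static int demo_mknod_checked(struct demo_dentry *dentry, int mode)
{
    struct demo_inode *inode;

    if (dentry->inode)
        return -EEXIST;
    inode = demo_get_inode(mode);
    if (!inode)
        return -ENOMEM;
    dentry->inode = inode;
    return 0;
}
```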

These bugs happen quite often; I'm starting to think we should disallow
such coding in CodingStyle.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-08 15:14:24 -07:00
Steve French
1717ffc588 [CIFS] NTLMv2 support part 5
NTLMv2 authentication (stronger authentication than default NTLM) which
many servers support now works.  There was a problem with the construction
of the security blob in the older code.  Currently requires
	/proc/fs/cifs/Experimental to be set to 2
and
	/proc/fs/cifs/SecurityFlags to be set to 0x4004 (to require using
	NTLMv2 instead of default of NTLM)

Next we will check signing to make sure optional NTLMv2 packet signing also
works.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-08 05:41:32 +00:00
Steve French
f3ffb68144 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2006-06-07 02:40:03 +00:00
Steve French
5bafd76593 [CIFS] Add support for readdir to legacy servers
Fixes oops to OS/2 on ls and removes redundant NTCreateX calls to servers
which do not support NT SMBs.  Key operations to OS/2 work.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-07 00:18:43 +00:00
Steve French
a8ee03441f [CIFS] NTLMv2 support part 4
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-05 23:34:19 +00:00
Trond Myklebust
6d09bb627d [PATCH] fs/namei.c: Call to file_permission() under a spinlock in do_lookup_path()
From: Trond Myklebust <Trond.Myklebust@netapp.com>

We're presently running lock_kernel() under fs_lock via nfs's ->permission
handler.  That's a ranking bug and sometimes a sleep-in-spinlock bug.  This
problem was introduced in the openat() patchset.

We should not need to hold the current->fs->lock for a codepath that doesn't
use current->fs.

[vsu@altlinux.ru: fix error path]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-05 12:29:16 -07:00
Steve French
6d027cfdb1 [CIFS] NTLMv2 support part 3
Response struct filled in exactly for 16 byte hash which we need to check
more to make sure it works.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-05 16:26:05 +00:00
Evgeniy Dushistov
48ce8b056c JFS: commit_mutex cleanups
Looking at the code, I see that:
1) locks weren't released in the opposite order to that in which they were taken
2) in jfs_rename we lock new_ip, and in the error path we didn't unlock it
3) there is a strange expression: "! !"

Maybe this is worth fixing?

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2006-06-05 08:21:03 -05:00
Steve French
f64b23ae4a [CIFS] NTLMv2 support part 2
Still need to fill in response structure and check that hash works

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-05 05:27:37 +00:00
Steve French
9312f6754d [CIFS] Fix mask so can set new cifs security flags properly
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-04 22:21:07 +00:00
Steve French
254e55ed03 [CIFS] Support for older servers which require plaintext passwords - part 2
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-04 05:53:15 +00:00
David Woodhouse
3bcc86f507 [JFFS2] Remove stray __exit from jffs2_compressors_exit()
It's used from the initfunc in case of failure too. We could actually do
with an '__initexit' for this kind of thing -- when built in to the
kernel, it could do with being dropped with the init text. We _could_
actually just use __init for it, but that would break if/when we start
dropping init text from modules. So let's just leave it as it was for now,
and mutter a little more about random 'janitorial' fixes from people who
aren't paying attention to what they're doing.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-03 00:25:50 +01:00
Steve French
bdc4bf6e8a [CIFS] Support for older servers which require plaintext passwords
disabled by default, but can be enabled via proc for servers which
require such support.  Also includes support for setting security
flags for cifs.  See fs/cifs/README

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-02 22:57:13 +00:00
Steve French
43411d699e [CIFS] Fix mapping of old SMB return code Invalid Net Name so it is
recognized on mount

the old mapping of this was to ENODEV (instead of ENXIO) - but
ENODEV is what mount returns when the cifs driver will not load
so change this to map to ENXIO (which was what the equivalent
condition returned for mapping errors from more modern servers)

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-02 18:17:11 +00:00
Steve French
7a0d223176 [CIFS] Missing brace 2006-06-01 19:44:37 +00:00
Dave Kleikamp
273d81d6ad [CIFS] Do not overwrite aops
cifs should not be overwriting an element of the aops structure, since the
structure is shared by all cifs inodes.  Instead define a separate aops
structure to suit each purpose.
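
The idea of choosing between separate constant operation tables, rather than patching a shared one, can be sketched as below. The struct layout, the two table names and the small_buffers switch are hypothetical and only illustrate the approach.

```c
#include <stddef.h>

struct demo_aops {
    int (*readpage)(void);
    int (*readpages)(void);     /* NULL when multi-page reads are unsupported */
};

static int demo_readpage(void)  { return 1; }
static int demo_readpages(void) { return 2; }

/* Two read-only tables, each complete for its purpose. */
static const struct demo_aops demo_aops_full = {
    .readpage  = demo_readpage,
    .readpages = demo_readpages,
};

static const struct demo_aops demo_aops_small = {
    .readpage  = demo_readpage,
    .readpages = NULL,
};

struct demo_inode {
    const struct demo_aops *aops;
};

/* Pick a table per inode; nothing shared is ever written to. */
static void demo_set_aops(struct demo_inode *inode, int small_buffers)
{
    inode->aops = small_buffers ? &demo_aops_small : &demo_aops_full;
}
```

Because the tables are const and selected by pointer, configuring one inode cannot change the behaviour of another.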

I also took the liberty of replacing a hard-coded 4096 with PAGE_CACHE_SIZE

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-06-01 19:41:23 +00:00
Steve French
3856a9d443 [CIFS] Fix minor build breaks due to cifs kconfig issues
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-01 19:38:46 +00:00
Steve French
7c7b25bc8e [CIFS] Support for setting up SMB sessions to legacy lanman servers part 2 2006-06-01 19:20:10 +00:00
Steve French
9c53588ec9 [CIFS] Missing include shows up on some architectures
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-06-01 05:09:10 +00:00
Andrew Morton
6855a3a6c3 [PATCH] ext3 resize: fix double unlock_super()
From: Andrew Morton <akpm@osdl.org>

Spotted by Jan Capek <jca@sysgo.com>

Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Andreas Dilger <adilger@clusterfs.com>
Cc: Jan Capek <jca@sysgo.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-31 16:27:10 -07:00
Steve French
3979877e56 [CIFS] Support for setting up SMB sessions to legacy lanman servers 2006-05-31 22:40:51 +00:00
Steve French
26a21b980b [CIFS] Cleanup extra whitespace in dmesg logging. Update cifs change log 2006-05-31 18:05:34 +00:00
Steve French
55aa2e097d [CIFS] Pass truncate open flag through on file open in case setattr fails
on set size to zero.

Signed-off-by: Sebastian Voitzsch <sebastoam/vpotzscj@web.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:09:31 +00:00
Steve French
08775834c4 [CIFS] Fix typos in previous fix
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:08:26 +00:00
Steve French
cec6815a12 [CIFS] endian fix for new POSIX byte range lock support
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:07:17 +00:00
Steve French
a424f8bfcb [CIFS] fix memory leak in cifs session info struct on reconnect
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:06:04 +00:00
Steve French
c01f36a896 [CIFS] ACPI suspend oops
Wasn't able to reproduce a hard hang, but was able to get an oops when
suspending the machine during a copy to the cifs mount.  This led to some
things hanging, including a "sync".  Also got I/O errors when trying to
access the mount afterwards (even when the oops didn't occur), and had
to unmount and remount in order to access the filesystem.

This patch fixed the oops.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:05:10 +00:00
Steve French
a878fb2218 [CIFS] Do not limit the length of share names (was 100 for whole UNC name)
during mount. Especially important for some non-Western languages.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:04:19 +00:00
Steve French
fc94cdb944 [CIFS] Fix new POSIX Locking for setting lock_type correctly on unlock
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-05-30 18:03:32 +00:00
David Woodhouse
098a19811b [JFFS2] Preallocate node refs for cleanmarker in summary scan
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-30 09:00:14 +01:00
David Woodhouse
13ba42df4a [JFFS2] Fix calculation of potential summary marker offset on NOR flash.
Helps if we look _inside_ the buffer, rather than adding jeb->offset to
it. Doh.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-30 08:59:34 +01:00
Thomas Gleixner
9a1fcdfd4b [MTD] NAND Signal that a bitflip was corrected by ECC
Return -EUCLEAN on read when a bitflip was detected and corrected, so the
clients can react and eventually copy the affected block to a spare one.
Make all in kernel users aware of the change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:51 +02:00
Thomas Gleixner
8593fbc68b [MTD] Rework the out of band handling completely
Hopefully the last iteration on this!

The handling of out of band data on NAND was accompanied by tons of fruitless
discussions and halfarsed patches to make it work for a particular
problem. Sufficiently annoyed by all those "I know it better" mails and the
reasonable amount of discarded "it solves my problem" patches, I finally decided
to go for the big rework. After removing the _ecc variants of mtd read/write
functions the solution to satisfy the various requirements was to refactor the
read/write _oob functions in mtd.

The major change is that read/write_oob now takes a pointer to an operation
descriptor structure "struct mtd_oob_ops" instead of having a function with at
least seven arguments.

read/write_oob, which should probably be renamed to a more descriptive name, can do
the following tasks:

- read/write out of band data
- read/write data content and out of band data
- read/write raw data content and out of band data (ecc disabled)

struct mtd_oob_ops has a mode field, which determines the oob handling mode.

Aside from the MTD_OOB_RAW mode, which is intended to be especially for
diagnostic purposes and some internal functions e.g. bad block table creation,
the other two modes are for mtd clients:

MTD_OOB_PLACE puts/gets the given oob data exactly to/from the place which is
described by the ooboffs and ooblen fields of the mtd_oob_ops structure. It's
up to the caller to make sure that the byte positions are not used by the ECC
placement algorithms.

MTD_OOB_AUTO puts/gets the given oob data automatically to/from the places in
the out of band area which are described by the oobfree tuples in the ecclayout
data structure which is associated with the device.

The decision whether data plus oob or oob only handling is done depends on the
setting of the datbuf member of the data structure. When datbuf == NULL then
the internal read/write_oob functions are selected, otherwise the read/write
data routines are invoked.
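
As an illustration only (this is not code from the patch), the datbuf rule
above can be sketched in userspace C. The struct is a simplified stand-in:
the names mode, datbuf, ooboffs and ooblen come from the text, while oobbuf,
the enum values and the helper oob_select_path() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three modes named in the text; the numeric values are arbitrary here. */
enum oob_mode_sketch {
	MTD_OOB_PLACE,	/* caller chooses the exact OOB byte positions */
	MTD_OOB_AUTO,	/* positions are taken from the oobfree tuples */
	MTD_OOB_RAW	/* ECC disabled; diagnostics and internal use */
};

/* Simplified stand-in for struct mtd_oob_ops (the field set is an assumption). */
struct oob_ops_sketch {
	enum oob_mode_sketch mode;
	uint8_t *datbuf;	/* NULL selects the OOB-only path */
	uint8_t *oobbuf;	/* hypothetical: buffer holding the OOB bytes */
	size_t ooboffs;
	size_t ooblen;
};

enum oob_path { PATH_OOB_ONLY, PATH_DATA_AND_OOB };

/* Hypothetical helper: report which internal routine would be selected. */
static enum oob_path oob_select_path(const struct oob_ops_sketch *ops)
{
	assert(ops != NULL);
	return ops->datbuf == NULL ? PATH_OOB_ONLY : PATH_DATA_AND_OOB;
}
```

Passing one descriptor rather than seven or more arguments means a caller
changes paths by setting or clearing a single field.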

Tested on a few platforms with all variants. Please be aware of possible
regressions for your particular device / application scenario.

Disclaimer: Any whining will be ignored from those who just contributed "hot
air blurb" and never sat down to tackle the underlying problem of the mess in
the NAND driver grown over time and the big chunk of work to fix up the
existing users. The problem was not the holiness of the existing MTD
interfaces. The problem was the lack of time to go for the big overhaul. It's
easy to add more mess to the existing one, but it takes a lot of effort to go
for a real solution.

Improvements and bugfixes are welcome!

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:51 +02:00
Thomas Gleixner
f4a43cfcec [MTD] Remove silly MTD_WRITE/READ macros
Most of those macros are unused and the used ones just obfuscate
the code. Remove them and fixup all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:50 +02:00
Thomas Gleixner
5bd34c091a [MTD] NAND Replace oobinfo by ecclayout
The nand_oobinfo structure is not fitting the newer error correction
demands anymore. Replace it by struct nand_ecclayout and fixup the users
all over the place. Keep the nand_oobinfo based ioctl for user space
compatibility reasons.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:50 +02:00
Thomas Gleixner
ff268fb879 [MTD] NAND Consolidate oobinfo handling
The info structure for out of band data was copied into
the mtd structure. Make it a pointer and remove the ability
to set it from userspace. The position of ecc bytes is
defined by the hardware and should not be changed by software.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:49 +02:00
David Woodhouse
a6a8bef722 [JFFS2] Preallocate raw_node_refs in a couple of missing places in scan
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-29 00:41:11 +01:00
David Woodhouse
2ebf09c249 [JFFS2] Fix oops when marking space dirty in scan, but no previous node exists.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-28 22:13:25 +01:00
David Woodhouse
ddc58bd65e [JFFS2] Fix wbuf recovery of f->metadata->raw node.
A data node might not be in the fraglist; it could be f->metadata.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-27 13:15:16 +01:00
David Woodhouse
9bfeb691e7 [JFFS2] Switch to using an array of jffs2_raw_node_refs instead of a list.
This allows us to drop another pointer from the struct jffs2_raw_node_ref,
shrinking it to 8 bytes on 32-bit machines (if the TEST_TOTLEN paranoia
check is turned off, which will be committed soon).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-26 21:19:05 +01:00
Florin Malita
3ac8141366 [PATCH] affs: possible null pointer dereference in affs_rename()
If affs_bread() fails, the exit path calls mark_buffer_dirty_inode() with a
NULL argument.

Coverity CID: 312.

Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-26 11:55:46 -07:00
David Woodhouse
89291a9d5b [JFFS2] Fix 64-bit size_t problems in XATTR code.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-25 13:30:24 +01:00
David Woodhouse
8b9e9fe8c6 [JFFS2] Fix and improve debugging output during scan.
Print wasted_size in scanned eraseblocks, print range correctly for
summary dirent and inode entries.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-25 01:53:09 +01:00
David Woodhouse
046b8b9808 [JFFS2] Add 'jeb' argument to jffs2_prealloc_raw_node_refs()
Preallocation of refs is shortly going to be a per-eraseblock thing,
rather than per-filesystem. Add the required argument to the function.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-25 01:50:35 +01:00
David Woodhouse
f61579c337 [JFFS2] Correctly handle wasted space before summary node.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-25 01:42:40 +01:00
David Woodhouse
c38c1b613d [JFFS2] jffs2_free_all_node_refs() doesn't free them all. Rename it.
... to jffs2_free_jeb_node_refs() since that's what it does.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-25 01:38:27 +01:00
David Woodhouse
f560928baa [JFFS2] Allocate node_ref for wasted space when skipping to page boundary
One more place where we were changing the accounting info without
actually allocating a ref for the lost space...

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-25 01:37:28 +01:00
David Woodhouse
c7c16c8e76 [JFFS2] Revert Artem's Bunkage in debug messages.
Random unthinking 'cleanup' caused debug messages like this:
   Obsoleting node at 0x0006daf4 of len 0x3a4: <7>Dirtying

If messages are continuation of an existing line, they don't need
to be prefixed with KERN_DEBUG.

THINK. Or you will be replaced by a small shell script.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 14:24:02 +01:00
Dave Kleikamp
b964638ffd JFS: Fix multiple errors in metapage_releasepage
It looks like metapage_releasepage was making an invalid assumption that
the releasepage method would not be called on a dirty page.  Instead of
issuing a warning and releasing the metapage, it should return 0, indicating
that the private data for the page cannot be released.

I also realized that metapage_releasepage had the return code all wrong.  If
it is successful in releasing the private data, it should return 1, otherwise
it needs to return 0.

Lastly, there is no need to call wait_on_page_writeback, since
try_to_release_page will not call us with a page in writeback state.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2006-05-24 07:43:38 -05:00
David Woodhouse
0305c8659f Merge branch 'master' of git://git.infradead.org/~gleixner/mtd-nand-2.6.git 2006-05-24 10:01:43 +01:00
David Woodhouse
99988f7bbd [JFFS2] Introduce ref_next() macro for finding next physical node
Another part of the preparation for switching to an array...

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 09:04:17 +01:00
David Woodhouse
2f785402f3 [JFFS2] Reduce visibility of raw_node_ref to upper layers of JFFS2 code.
As the first step towards eliminating the ref->next_phys member and saving
memory by using an _array_ of struct jffs2_raw_node_ref per eraseblock,
stop the write functions from allocating their own refs; have them just
_reserve_ the appropriate number instead. Then jffs2_link_node_ref() can
just fill them in.

Use a linked list of pre-allocated refs in the superblock, for now. Once
we switch to an array, it'll just be a case of extending that array.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 02:04:45 +01:00
NeilBrown
a2eb0c101d [PATCH] md: Make sure bi_max_vecs is set properly in bio_split
Else a subsequent bio_clone might make a mess.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: "Don Dupuis" <dondster@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-23 10:35:31 -07:00
NeilBrown
f2d395865f [PATCH] knfsd: Fix two problems that can cause rmmod nfsd to die
Both cause the 'entries' count in the export cache to be non-zero at module
removal time, so unregistering that cache fails and results in an oops.

1/ exp_pseudoroot (used for NFSv4 only) leaks a reference to an export
   entry.
2/ sunrpc_cache_update doesn't increment the entries count when it adds
   an entry.

Thanks to "david m.  richter" <richterd@citi.umich.edu> for triggering the
problem and finding one of the bugs.

Cc: "david m. richter" <richterd@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-23 10:35:31 -07:00
Thomas Gleixner
9223a456da [MTD] Remove read/write _ecc variants
MTD clients are agnostic of FLASH which needs ECC support.
Remove the functions and fixup the callers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 17:21:03 +02:00
Thomas Gleixner
4cbb9b80e1 Merge branch 'master' of /home/tglx/work/kernel/git/mtd-2.6/
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 12:37:31 +02:00
Thomas Gleixner
dcb0932884 [JFFS2] Simplify writebuffer handling
The writev based write buffer implementation was far too complex as
in most use cases the write buffer had to be handled anyway.
Simplify the write buffer handling and use mtd->write instead.

From extensive testing no performance impact has been noted.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 11:49:14 +02:00
David Woodhouse
9fe4854cd1 [JFFS2] Remove flash offset argument from various functions.
We don't need the upper layers to deal with the physical offset. It's
_always_ c->nextblock->offset + c->sector_size - c->nextblock->free_size
so we might as well just let the actual write functions deal with that.
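
The offset formula quoted above can be checked with a small standalone
sketch. The two structs are cut-down stand-ins holding only the fields the
formula names, and next_write_offset() is a hypothetical helper, not a JFFS2
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an eraseblock: start offset and bytes still free at its end. */
struct eraseblock_sketch {
	uint32_t offset;
	uint32_t free_size;
};

/* Stand-in for the superblock info: eraseblock size and current block. */
struct sb_sketch {
	uint32_t sector_size;
	struct eraseblock_sketch *nextblock;
};

/* Hypothetical helper: where the next node would land on flash. */
static uint32_t next_write_offset(const struct sb_sketch *c)
{
	assert(c != NULL && c->nextblock != NULL);
	assert(c->nextblock->free_size <= c->sector_size);
	return c->nextblock->offset + c->sector_size - c->nextblock->free_size;
}
```

For a block starting at 0x20000 with a sector size of 0x10000 and 0x4000
bytes free, the next write goes to 0x2c000. Since that value follows from
state the write path already holds, callers have no need to pass it in.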

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-23 00:38:06 +01:00
Joern Engel
5fa433942b [MTD] Introduce MTD_BIT_WRITEABLE
o Add a flag MTD_BIT_WRITEABLE for devices that allow single bits to be
  cleared.
o Replace MTD_PROGRAM_REGIONS with a cleared MTD_BIT_WRITEABLE flag for
  STMicro and Intel Sibley flashes with internal ECC.  Those flashes
  disallow clearing of single bits, unlike regular NOR flashes, so the
  new flag models their behaviour better.
o Remove MTD_ECC.  After the STMicro/Sibley merge, this flag is only set
  and never checked.

Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
2006-05-22 23:18:29 +02:00
Joern Engel
c8b229de2b [MTD] Merge STMicro NOR_ECC code with Intel Sibley code
In 2002, STMicro started producing NOR flashes with internal ECC protection
for small blocks (8 or 16 bytes).  Support for those flashes was added by me.
In 2005, Intel Sibley flashes copied this strategy and Nico added support for
those.  Merge the code for both.

Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
2006-05-22 23:18:12 +02:00
Joern Engel
28318776a8 [MTD] Introduce writesize
At least two flashes exist that have the concept of a minimum write unit,
similar to NAND pages, but no other NAND characteristics.  Therefore, rename
the minimum write unit to "writesize" for all flashes, including NAND.

Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
2006-05-22 23:18:05 +02:00
David Woodhouse
987d47b71a [JFFS2] Put list of nodes in common part of ic/x_ref/x_datum structure
We'll be using a proper list of nodes in the jffs2_xattr_datum and
jffs2_xattr_ref structures, because the existing code to overwrite
them is just broken. Put it in the common part at the front of the
structure which is shared with the jffs2_inode_cache, so that the
jffs2_link_node_ref() function can do the right thing.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-22 16:32:05 +01:00
David Woodhouse
0eac940b8a [JFFS2] Add some preemptive BUG checks for XATTR code
In a couple of places, we assume that what's at the end of the
->next_in_ino list is a struct jffs2_inode_cache. Let's check
for that, since we expect it to change soon.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-22 16:29:23 +01:00
David Woodhouse
fcb7578719 [JFFS2] Extend jffs2_link_node_ref() to link into per-inode list too.
Let's avoid the potential for forgetting to set ref->next_in_ino, by doing
it within jffs2_link_node_ref() instead.

This highlights the ugliness of what we're currently doing with
xattr_datum and xattr_ref structures -- we should find a nicer way of
dealing with that.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-22 15:23:10 +01:00
David Woodhouse
a1b563d652 [JFFS2] Initialise ref->next_in_ino when marking dirty space in wbuf flush
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-22 13:55:46 +01:00
David Woodhouse
3b79673cfa [JFFS2] Fix accounting error in jffs2_link_node_ref()
When filing REF_OBSOLETE nodes, we'd add their size to the global
'dirty_size' count, but then to the eraseblock's 'used_size' count.
That's not clever.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-22 12:15:47 +01:00
David Woodhouse
06c6764b58 [JFFS2] Fix dummy jffs2_sum_scan_sumnode() macro for !SUMMARY case.
I added an argument to the real function...

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-22 11:27:14 +01:00
Amy Griffis
d66fd908ac [PATCH] fix NULL dereference in inotify_ignore
Don't reassign to watch.  If idr_find() returns NULL, then
put_inotify_watch() will choke.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:18 -07:00
Amy Griffis
66055a4e73 [PATCH] fix race in inotify_release
While doing some inotify stress testing, I hit the following race.  In
inotify_release(), it's possible for a watch to be removed from the lists
in between dropping dev->mutex and taking inode->inotify_mutex.  The
reference we hold prevents the watch from being freed, but not from being
removed.

Checking the dev's idr mapping will prevent a double list_del of the
same watch.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rml@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:18 -07:00
Andrew Morton
df88912a21 [PATCH] binfmt_flat: don't check for EMFILE
Bernd Schmidt points out that binfmt_flat is now leaving the exec file open
while the application runs.  This offsets all the application's fd numbers.
We should have closed the file within exec(), not at exit()-time.

But there doesn't seem to be a lot of point in doing all this just to avoid
going over RLIMIT_NOFILE by one fd for a few microseconds.  So take the EMFILE
checking out again.  This will cause binfmt_flat to again fail LTP's
exec-should-return-EMFILE-when-fdtable-is-full test.  That test appears to be
wrong anyway - Open Group specs say nothing about exec() returning EMFILE.

Cc: Bernd Schmidt <bernd.schmidt@analog.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:17 -07:00
Florin Malita
9ccfc29c67 [PATCH] nfsd: sign conversion obscuring errors in nfsd_set_posix_acl()
Assigning the result of posix_acl_to_xattr() to an unsigned data type
(size/size_t) obscures possible errors.
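
A minimal sketch of the bug class, assuming a helper that returns either a
byte count or a negative errno. encode_sketch() is a hypothetical stand-in
for posix_acl_to_xattr(); the rest is illustration, not the nfsd code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: returns a length on success, -EINVAL on failure. */
static int encode_sketch(int fail)
{
	return fail ? -EINVAL : 32;
}

/* Buggy pattern: the negative errno is converted to a huge unsigned value. */
static int looks_like_success_unsigned(int fail)
{
	size_t size = (size_t)encode_sketch(fail);
	return size > 0;	/* true even when encode_sketch() failed */
}

/* Fixed pattern: a signed variable keeps the error visible. */
static int looks_like_success_signed(int fail)
{
	int size = encode_sketch(fail);
	return size > 0;
}
```

With the unsigned variable a failure is indistinguishable from a very large
length, so any later "size < 0" test can never fire.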

Coverity CID: 1206.

Signed-off-by: Florin Malita <fmalita@gmail.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:17 -07:00
Peter Staubach
8c7b389e53 [PATCH] NFS server subtree_check returns dubious value
Address a problem found when a Linux NFS server uses the "subtree_check"
export option.

The "subtree_check" NFS export option was designed to prohibit a client
from using a file handle for which it should not have permission.  The
algorithm used is to ensure that the entire path to the file being
referenced is accessible to the user attempting to use the file handle.  If
some part of the path is not accessible, then the operation is aborted and
the appropriate version of ESTALE is returned to the NFS client.

The error, ESTALE, is unfortunate in that it causes NFS clients to make
certain assumptions about the continued existence of the file.  They assume
that the file no longer exists and refuse to attempt to access it again.
In this case, the file really does exist, but access was denied by the
server for a particular user.

A better error to return would be an EACCES sort of error.  This would
inform the client that the particular operation that it was attempting was
not allowed, without the nasty side effects of the ESTALE error.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:16 -07:00
Lin Feng Shen
d64b1c878f [PATCH] NFS: fix error handling on access_ok in compat_sys_nfsservctl
Functions compat_nfs_svc_trans, compat_nfs_clnt_trans,
compat_nfs_exp_trans, compat_nfs_getfd_trans and compat_nfs_getfs_trans,
which are called by compat_sys_nfsservctl(fs/compat.c), don't handle the
return value of access_ok properly.  access_ok returns 1 when the addr is
valid, and 0 when it's not, but these functions have the reversed
understanding.  When the address is valid, they always return -EFAULT to
compat_sys_nfsservctl.
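
The reversed test can be shown with a standalone sketch. access_ok_sketch()
is a hypothetical stand-in that only mimics the return convention described
above (1 for a valid address, 0 otherwise), and the two trans functions are
illustrations rather than the code in fs/compat.c.

```c
#include <errno.h>

/* Stand-in for access_ok(): 1 if the address is valid, 0 if it is not. */
static int access_ok_sketch(int addr_is_valid)
{
	return addr_is_valid ? 1 : 0;
}

/* Reversed check as described: a valid address is rejected. */
static int trans_reversed(int addr_is_valid)
{
	if (access_ok_sketch(addr_is_valid))
		return -EFAULT;
	return 0;
}

/* Corrected check: only an invalid address yields -EFAULT. */
static int trans_fixed(int addr_is_valid)
{
	if (!access_ok_sketch(addr_is_valid))
		return -EFAULT;
	return 0;
}
```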

An example is to run /usr/sbin/rpc.nfsd(32bit program on Power5).  It
doesn't function as expected.  strace shows that nfsservctl returns
-EFAULT.

The patch fixes this by correcting the error handling on the return value
of access_ok in the five functions.

Signed-off-by: Lin Feng Shen <shenlinf@cn.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:16 -07:00
David Woodhouse
ca89a517fa [JFFS2] Finally eliminate __totlen field from struct jffs2_raw_node_ref
Well, almost. We'll actually keep a 'TEST_TOTLEN' macro set for now, and keep
doing some paranoia checks to make sure it's all working correctly. But if
TEST_TOTLEN is unset, the size of struct jffs2_raw_node_ref drops from 16
bytes to 12 on 32-bit machines. That's a saving of about half a megabyte of
memory on the OLPC prototype board, with 125K or so nodes in its 512MiB of
flash.
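
The saving quoted above is simple arithmetic: roughly 125,000 refs, each 4
bytes smaller. A one-function sketch (ref_bytes_saved() is a hypothetical
helper, and the node count is the approximate figure from the text):

```c
#include <stdint.h>

/* Bytes saved when every ref shrinks from old_size to new_size bytes. */
static uint64_t ref_bytes_saved(uint64_t nodes, uint32_t old_size,
				uint32_t new_size)
{
	return nodes * (uint64_t)(old_size - new_size);
}
```

ref_bytes_saved(125000, 16, 12) gives 500,000 bytes, which matches the
"about half a megabyte" figure.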

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 13:29:11 +01:00
David Woodhouse
010b06d6d0 [JFFS2] Locking issues in summary write code.
We can't use jffs2_scan_dirty_space() because it doesn't do any locking; it's
only for use at scan time -- hence the 'scan' in the name.

Also, don't allocate refs while we have c->erase_completion_lock held.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 13:15:59 +01:00
David Woodhouse
9167e0f811 [JFFS2] Remove stray kfree of summary info in XATTR code.
We don't allocate this locally any more -- it's given to us and owned by
our caller. Also improve the debug messages a little.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 13:13:45 +01:00
David Woodhouse
0bcc099d6d [JFFS2] File node reference for wasted space when flushing wbuf
Next step in ongoing campaign to file a struct jffs2_raw_node_ref for every
piece of dirty space in the system, so that __totlen can be killed off....

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 13:00:54 +01:00
David Woodhouse
b64335f2b7 [JFFS2] Add length argument to jffs2_add_physical_node_ref()
If __totlen is going away, we need to pass the length in separately.
Also stop callers from needlessly setting ref->next_phys to NULL,
since that's done for them... and since that'll also be going away soon.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 04:36:45 +01:00
David Woodhouse
49f11d4075 [JFFS2] Mark gaps in summary list as dirty space
Make sure we allocate a ref for any dirty space which exists between nodes
which we find in an eraseblock summary.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 04:00:01 +01:00
David Woodhouse
25090a6b23 [JFFS2] Discard remaining free space when filing a dirty block in scan.
The incoming ref_totlen() calculation is going to rely on the existence
of nodes which cover all dirty space. We can't just tweak the accounting
data any more; we have to call jffs2_scan_dirty_space() to do it.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 03:57:56 +01:00
David Woodhouse
68270995f2 [JFFS2] Introduce jffs2_scan_dirty_space() function.
To eliminate the __totlen field from struct jffs2_raw_node_ref, we need
to allocate nodes for dirty space instead of just tweaking the accounting
data. Introduce jffs2_scan_dirty_space() in preparation for that.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 03:46:05 +01:00
David Woodhouse
7807ef7ba2 [JFFS2] Fix summary handling of unknown but compatible nodes.
For RWCOMPAT and ROCOMPAT nodes, we should still allow the mount to
succeed. Just abandon the summary and fall through to the full scan.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 03:45:27 +01:00
David Woodhouse
3560160aa2 [JFFS2] Fix memory leak in scan code; improve comments.
If we had to allocate extra space for the summary node, we weren't
correctly freeing it when jffs2_sum_scan_sumnode() returned nonzero --
which is both the success and the failure case. Only when it returned
zero, which means fall through to the full scan, were we correctly freeing
the buffer.

Document the meaning of those return codes while we're at it.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 01:28:05 +01:00
David Woodhouse
6171586a7a [JFFS2] Correct handling of JFFS2_FEATURE_RWCOMPAT_COPY nodes.
We should preserve these when we come to garbage collect them, not let
them get erased. Use jffs2_garbage_collect_pristine() for this, and make
sure the summary code copes -- just refrain from writing a summary for any
block which contains a node we don't understand.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-21 00:02:06 +01:00
David Woodhouse
fb9fbbcc93 [JFFS2] Correct accounting of erroneous cleanmarkers and failed summaries.
It should all be counted as dirty space, not wasted and _definitely_ not
unchecked.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-20 20:08:42 +01:00
David Woodhouse
f1f9671bd8 [JFFS2] Introduce jffs2_link_node_ref() function to reduce code duplication
The same sequence of code was repeated in many places, to add a new
struct jffs2_raw_node_ref to an eraseblock and adjust the space accounting
accordingly. Move it out-of-line.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-20 19:45:26 +01:00
David Woodhouse
0cfc7da3ff Merge git://git.infradead.org/jffs2-xattr-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-20 17:27:32 +01:00
David Woodhouse
1417fc44ee [JFFS2] Reduce calls to ref_totlen() in jffs2_mark_node_obsolete()
We were calling ref_totlen() 18 times. Even before that becomes a real
function rather than just a dereference, apparently some compilers still
suck anyway. It'll _certainly_ suck after ref_totlen() becomes more
complicated, so calculate it once and don't rely on CSE.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-20 16:20:19 +01:00
David Woodhouse
9641b784ff [JFFS2] Optimise reading of eraseblock summary nodes
This improves the time to mount 512MiB of NAND flash on my OLPC prototype
by about 4%. We used to read the last page of the eraseblock twice -- once
to find the offset of the summary node, and again to actually _read_ the
summary node. Now we read the last page only once, and read more only if
we need to.

We also don't allocate a new buffer just for the summary code -- we use
the buffer which was already allocated for the scan. Better still, if the
'buffer' for the scan is actually just a pointer directly into NOR flash,
we use that too, avoiding the memcpy() which we used to do.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-20 16:13:34 +01:00
Ferenc Havasi
8e4482fba2 [JFFS2] Remove forgotten summary code
Remove forgotten lines from jffs2_scan_eraseblock() which
were unnecessary and may cause problem in some environments.

Thanks to Alexander Belyakov <alexander.belyakov@intel.com>.

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-19 21:00:36 +01:00
David Woodhouse
aef9ab4784 [JFFS2] Support new device nodes
Device node major/minor numbers are just stored in the payload of a single
data node. Just extend that to 4 bytes and use new_encode_dev() for it.

We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo).
This preserves backwards compatibility with older code as much as
possible. If we do make devices with major or minor numbers above 255, and
then mount the file system with the old code, it'll just read the first
two bytes and get the numbers wrong. If it comes to garbage-collect it,
it'll then write back those wrong numbers. But that's about the best we
can expect.
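
The choice between the 2-byte and 4-byte payload can be sketched in
userspace C. The *_sketch functions are re-implementations written to mirror
the bit layout of the kernel's old_valid_dev(), old_encode_dev() and
new_encode_dev() as I understand it; treat that layout as an assumption.
dev_payload_len() is a hypothetical helper, not a JFFS2 function.

```c
#include <stdint.h>

/* The old 16-bit format only fits major and minor numbers below 256. */
static int old_valid_dev_sketch(uint32_t major, uint32_t minor)
{
	return major < 256 && minor < 256;
}

/* Old layout: major in the high byte, minor in the low byte. */
static uint16_t old_encode_dev_sketch(uint32_t major, uint32_t minor)
{
	return (uint16_t)((major << 8) | minor);
}

/* New layout: low 8 minor bits, then major, then the remaining minor bits. */
static uint32_t new_encode_dev_sketch(uint32_t major, uint32_t minor)
{
	return (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Hypothetical helper: payload size a writer would pick for this device. */
static unsigned int dev_payload_len(uint32_t major, uint32_t minor)
{
	return old_valid_dev_sketch(major, minor) ? 2 : 4;
}
```

For numbers that fit the old format the two encodings give the same value,
so small device numbers stay readable by older code. A minor of 300 forces
the 4-byte form.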

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-19 00:28:49 +01:00
KaiGai Kohei
20a92fc74c Merge git://git.infradead.org/mtd-2.6 2006-05-19 00:43:53 +09:00
Joel Becker
cef0893dcf configfs: Make sure configfs_init() is called before consumers.
configfs_init() needs to be called first to register configfs before any
consumers try to access it.  Move up configfs in fs/Makefile to make
sure it is initialized early.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:51 -07:00
Joel Becker
eed7a0db46 configfs: configfs_mkdir() failed to cleanup linkage.
If configfs_mkdir() errored in certain ways after the parent<->child
linkage was already created, it would not undo the linkage.  Also,
comment the reference counting for clarity.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:51 -07:00
Joel Becker
84efad1a53 configfs: Fix a reference leak in configfs_mkdir().
configfs_mkdir() failed to release the working parent reference in most
exit paths.  Also changed the exit path for readability.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:50 -07:00
Sunil Mushran
afae00ab45 ocfs2: fix gfp mask in some file system paths
We were using GFP_KERNEL in a handful of places which really wanted
GFP_NOFS. Fix this.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:49 -07:00
Mark Fasheh
dd4a2c2bfe ocfs2: Don't populate uptodate cache in ocfs2_force_read_journal()
This greatly reduces the amount of memory used during recovery.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:48 -07:00
Mark Fasheh
c4374f8a60 ocfs2: take meta data lock in ocfs2_file_aio_read()
Temporarily take the meta data lock in ocfs2_file_aio_read() to allow us to
update our inode fields.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:47 -07:00
Mark Fasheh
53013cba41 ocfs2: take data locks around extend
We need to take a data lock around extends to protect the pages that
ocfs2_zero_extend is going to be pulling into the page cache. Otherwise an
extend on one node might populate the page cache with data pages that have
no lock coverage.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-05-17 14:38:47 -07:00
David Woodhouse
c41ff6e5f3 [JFFS2] Fix printk format in jffs2_sum_write_data() error message.
fs/jffs2/summary.c: In function ‘jffs2_sum_write_data’:
fs/jffs2/summary.c:658: warning: format ‘%zd’ expects type ‘signed size_t’, but argument 4 has type ‘uint32_t’

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-16 17:05:33 +01:00
David Brownell
7d2beb1359 [JFFS2] Fix section mismatch warnings in JFFS2.
Mark certain functions with __init and __exit appropriately.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-16 16:08:10 +01:00
David Woodhouse
18594822fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-16 01:19:52 +01:00
Florin Malita
5b5ffbc1e6 [PATCH] jffs2: memory leak in jffs2_scan_medium()
If jffs2_scan_eraseblock() fails and the exit path is taken, 's' is not
being deallocated.

Reported by Coverity, CID: 1258.

Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-15 23:42:31 +01:00
Andrew Morton
194a61b8e0 [PATCH] jffs2 warning fixes
fs/jffs2/nodelist.c: In function `check_node_data':
fs/jffs2/nodelist.c:441: warning: unsigned int format, different type arg (arg 4)
fs/jffs2/nodelist.c:464: warning: int format, different type arg (arg 5)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:58 -07:00
Andrew Morton
eee391a66d [PATCH] revert "vfs: propagate mnt_flags into do_loopback/vfsmount"
Revert commit f6422f17d3, due to

Valdis.Kletnieks@vt.edu wrote:
>
> There seems to have been a bug introduced in this changeset:
>
> Am running 2.6.17-rc3-mm1.  When this changeset is applied, 'mount --bind'
> misbehaves:
>
> > # mkdir /foo
> > # mount -t tmpfs -o rw,nosuid,nodev,noexec,noatime,nodiratime none /foo
> > # mkdir /foo/bar
> > # mount --bind /foo/bar /foo
> > # tail -2 /proc/mounts
> > none /foo tmpfs rw,nosuid,nodev,noexec,noatime,nodiratime 0 0
> > none /foo tmpfs rw 0 0
>
> Reverting this changeset causes both mounts to have the same options.
>
> (Thanks to Stephen Smalley for tracking down the changeset...)
>

Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <Valdis.Kletnieks@vt.edu>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:57 -07:00
Alexey Dobriyan
3835a9bd07 [PATCH] fs/compat.c: fix 'if (a |= b )' typo
Mentioned by Mark Armbrust somewhere on Usenet.
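
For illustration only (not taken from the patch): a minimal user-space sketch of why 'if (a |= b)' is more than a cosmetic typo. The subject does not say which operator was intended, so the sketch assumes a plain '|'; the demo_* names are made up.

```c
#include <stdbool.h>

/*
 * Typo form: '|=' stores the OR back into *flags, and the condition then
 * tests the stored value.  The doubled parentheses only silence the usual
 * compiler warning about an assignment used as a truth value.
 */
static bool demo_test_typo(unsigned int *flags, unsigned int mask)
{
	if ((*flags |= mask))
		return true;
	return false;
}

/* Assumed intended form: '|' tests the combined value and modifies nothing. */
static bool demo_test_fixed(const unsigned int *flags, unsigned int mask)
{
	if (*flags | mask)
		return true;
	return false;
}
```

Both helpers return the same truth value, but only the typo form changes the caller's variable as a side effect.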

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:57 -07:00
Latchesar Ionkov
41e5a6ac80 [PATCH] v9fs: signal handling fixes
Multiple races can happen when v9fs is interrupted by a signal and a Tflush
message is sent to the server.  After v9fs sends Tflush it doesn't wait
until it receives Rflush, and possibly the response to the original
message.  This behavior may confuse v9fs about which fids are allocated by
the file server.

This patch fixes the races and the fid allocation.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Cc: Eric Van Hensbergen <ericvh@hera.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:56 -07:00
Latchesar Ionkov
343f1fe6f2 [PATCH] v9fs: Twalk memory leak
v9fs leaks memory if the file server responds with Rerror to a Twalk
message.  The patch fixes the leak.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Cc: Eric Van Hensbergen <ericvh@hera.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:56 -07:00
Jan Niehusmann
48564e628b [PATCH] smbfs: Fix slab corruption in samba error path
Yesterday, I got the following error with 2.6.16.13 during a file copy from
a smb filesystem over a wireless link.  I guess there was some error on the
wireless link, which in turn caused an error condition for the smb
filesystem.

In the log, smb_file_read reports error=4294966784 (0xfffffe00), which also
shows up in the slab dumps, and also is -ERESTARTSYS.  Error code 27499
corresponds to 0x6b6b, so the rq_errno field seems to be the only one being
set after freeing the slab.

In smb_add_request (which is the only place in smbfs where I found
ERESTARTSYS), I found the following:

        if (!timeleft || signal_pending(current)) {
                /*
                 * On timeout or on interrupt we want to try and remove the
                 * request from the recvq/xmitq.
                 */
                smb_lock_server(server);
                if (!(req->rq_flags & SMB_REQ_RECEIVED)) {
                        list_del_init(&req->rq_queue);
                        smb_rput(req);
                }
                smb_unlock_server(server);
        }
	[...]
        if (signal_pending(current))
                req->rq_errno = -ERESTARTSYS;

I guess that some codepath like smbiod_flush() caused the request to be
removed from the queue, and smb_rput(req) to be called, without
SMB_REQ_RECEIVED being set.  This violates an assumption made by the quoted
code.

Then, the above code calls smb_rput(req) again, the req gets freed, and
req->rq_errno = -ERESTARTSYS writes into the already freed slab.  As
list_del_init doesn't cause an error if called multiple times, that does
cause the observed behaviour (freed slab with rq_errno=-ERESTARTSYS).

If this observation is correct, the following patch should fix it.

I wonder why the smb code uses list_del_init everywhere - using list_del
instead would catch such situations by poisoning the next and prev
pointers.
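
The numbers quoted above can be checked with a small user-space sketch (not part of the patch). It assumes ERESTARTSYS is 512, as in the kernel's errno headers, and that 0x6b is the slab free-poison byte seen in the dumps below; the demo_* names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: ERESTARTSYS is 512 in the kernel's include/linux/errno.h. */
#define DEMO_ERESTARTSYS 512

/* How a negative errno looks once it is printed as an unsigned 32-bit value. */
static uint32_t demo_errno_as_u32(int err)
{
	return (uint32_t)err;
}

/* Two adjacent poison bytes (0x6b each) read back as one 16-bit error code. */
static unsigned int demo_poison_code(void)
{
	const unsigned int poison = 0x6b;

	return (poison << 8) | poison;
}
```

With these, -ERESTARTSYS comes out as 0xfffffe00 (4294966784, stored little-endian as the bytes 00 fe ff ff), and the poison code is 0x6b6b (27499).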

May  4 23:29:21 knautsch kernel: [17180085.456000] ipw2200: Firmware error detected.  Restarting.
May  4 23:29:21 knautsch kernel: [17180085.456000] ipw2200: Sysfs 'error' log captured.
May  4 23:33:02 knautsch kernel: [17180306.316000] ipw2200: Firmware error detected.  Restarting.
May  4 23:33:02 knautsch kernel: [17180306.316000] ipw2200: Sysfs 'error' log already exists.
May  4 23:33:02 knautsch kernel: [17180306.968000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:34:18 knautsch kernel: [17180383.256000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:34:18 knautsch kernel: [17180383.284000] SMB connection re-established (-5)
May  4 23:37:19 knautsch kernel: [17180563.956000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:40:09 knautsch kernel: [17180733.636000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:40:26 knautsch kernel: [17180750.700000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:43:02 knautsch kernel: [17180907.304000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:43:08 knautsch kernel: [17180912.324000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:43:34 knautsch kernel: [17180938.416000] smb_errno: class Unknown, code 27499 from command 0x6b
May  4 23:43:34 knautsch kernel: [17180938.416000] Slab corruption: start=c4ebe09c, len=244
May  4 23:43:34 knautsch kernel: [17180938.416000] Redzone: 0x5a2cf071/0x5a2cf071.
May  4 23:43:34 knautsch kernel: [17180938.416000] Last user: [<e087b903>](smb_rput+0x53/0x90 [smbfs])
May  4 23:43:34 knautsch kernel: [17180938.416000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b
May  4 23:43:34 knautsch kernel: [17180938.416000] 0f0: 00 fe ff ff
May  4 23:43:34 knautsch kernel: [17180938.416000] Next obj: start=c4ebe19c, len=244
May  4 23:43:34 knautsch kernel: [17180938.416000] Redzone: 0x5a2cf071/0x5a2cf071.
May  4 23:43:34 knautsch kernel: [17180938.416000] Last user: [<00000000>](_stext+0x3feffde0/0x30)
May  4 23:43:34 knautsch kernel: [17180938.416000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
May  4 23:43:34 knautsch kernel: [17180938.416000] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
May  4 23:43:34 knautsch kernel: [17180938.460000] SMB connection re-established (-5)
May  4 23:43:42 knautsch kernel: [17180946.292000] ipw2200: Firmware error detected.  Restarting.
May  4 23:43:42 knautsch kernel: [17180946.292000] ipw2200: Sysfs 'error' log already exists.
May  4 23:45:04 knautsch kernel: [17181028.752000] ipw2200: Firmware error detected.  Restarting.
May  4 23:45:04 knautsch kernel: [17181028.752000] ipw2200: Sysfs 'error' log already exists.
May  4 23:45:05 knautsch kernel: [17181029.868000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:45:36 knautsch kernel: [17181060.984000] smb_errno: class Unknown, code 27499 from command 0x6b
May  4 23:45:36 knautsch kernel: [17181060.984000] Slab corruption: start=c4ebe09c, len=244
May  4 23:45:36 knautsch kernel: [17181060.984000] Redzone: 0x5a2cf071/0x5a2cf071.
May  4 23:45:36 knautsch kernel: [17181060.984000] Last user: [<e087b903>](smb_rput+0x53/0x90 [smbfs])
May  4 23:45:36 knautsch kernel: [17181060.984000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b
May  4 23:45:36 knautsch kernel: [17181060.984000] 0f0: 00 fe ff ff
May  4 23:45:36 knautsch kernel: [17181060.984000] Next obj: start=c4ebe19c, len=244
May  4 23:45:36 knautsch kernel: [17181060.984000] Redzone: 0x5a2cf071/0x5a2cf071.
May  4 23:45:36 knautsch kernel: [17181060.984000] Last user: [<00000000>](_stext+0x3feffde0/0x30)
May  4 23:45:36 knautsch kernel: [17181060.984000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
May  4 23:45:36 knautsch kernel: [17181060.984000] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
May  4 23:45:36 knautsch kernel: [17181061.024000] SMB connection re-established (-5)
May  4 23:46:17 knautsch kernel: [17181102.132000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:47:46 knautsch kernel: [17181190.468000] smb_errno: class Unknown, code 27499 from command 0x6b
May  4 23:47:46 knautsch kernel: [17181190.468000] Slab corruption: start=c4ebe09c, len=244
May  4 23:47:46 knautsch kernel: [17181190.468000] Redzone: 0x5a2cf071/0x5a2cf071.
May  4 23:47:46 knautsch kernel: [17181190.468000] Last user: [<e087b903>](smb_rput+0x53/0x90 [smbfs])
May  4 23:47:46 knautsch kernel: [17181190.468000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b
May  4 23:47:46 knautsch kernel: [17181190.468000] 0f0: 00 fe ff ff
May  4 23:47:46 knautsch kernel: [17181190.468000] Next obj: start=c4ebe19c, len=244
May  4 23:47:46 knautsch kernel: [17181190.468000] Redzone: 0x5a2cf071/0x5a2cf071.
May  4 23:47:46 knautsch kernel: [17181190.468000] Last user: [<00000000>](_stext+0x3feffde0/0x30)
May  4 23:47:46 knautsch kernel: [17181190.468000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
May  4 23:47:46 knautsch kernel: [17181190.468000] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
May  4 23:47:46 knautsch kernel: [17181190.492000] SMB connection re-established (-5)
May  4 23:49:20 knautsch kernel: [17181284.828000] smb_file_read: //some_file validation failed, error=4294966784
May  4 23:49:39 knautsch kernel: [17181303.896000] smb_file_read: //some_file validation failed, error=4294966784

Signed-off-by: Jan Niehusmann <jan@gondor.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:56 -07:00
Olaf Kirch
3b7c810827 [PATCH] smbfs chroot issue (CVE-2006-1864)
Mark Moseley reported that a chroot environment on a SMB share can be left
via "cd ..\\".  Similar to CVE-2006-1863 issue with cifs, this fix is for
smbfs.

Steven French <sfrench@us.ibm.com> wrote:

Looks fine to me.  This should catch the slash on lookup or equivalent,
which will be all obvious paths of interest.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:55 -07:00
Ian Kent
a537055395 [PATCH] autofs4: NFY_NONE wait race fix
This patch fixes two problems.

First, the comparison of entries in waitq.c was incorrect.

Second, the NFY_NONE check was incorrect. The test of whether the dentry
is mounted is ineffective; for example, if an expire fails then we could
wait forever on a nonexistent expire. The bug was identified by Jeff
Moyer.

The patch changes autofs4 to wait on expires only, as this is all that's
needed.  If there is no existing wait when autofs4_wait is called with a
type of NFY_NONE, it delays until either a wait appears or the expire flag
is cleared.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:54 -07:00
Adrian Bunk
6aff5cb8ec [PATCH] fs/open.c: unexport sys_openat
Remove the unused EXPORT_SYMBOL_GPL(sys_openat).

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:54 -07:00
Andrew Morton
184f565210 [JFFS2] Fix printk format in some error messages.
fs/jffs2/nodelist.c: In function `check_node_data':
fs/jffs2/nodelist.c:441: warning: unsigned int format, different type arg (arg 4)
fs/jffs2/nodelist.c:464: warning: int format, different type arg (arg 5)

Modified from Andrew's original fix because while his terminal may indeed
only have eighty columns, mine only has _TWENTYFOUR_ lines. So the
cosmetic fluff is perfectly OK out past column 80 where it was -- the
casual reader doesn't _care_ about anything more than the fact that it
goes 'if (foo) JFFS2_WARNING...', and there's no point wasting a whole
line to display the tail end of the printk which nobody actually cares
about.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-15 13:45:58 +01:00
David Woodhouse
3e68fbb59b [JFFS2] Don't pack on-medium structures, because GCC emits crappy code
If we use __attribute__((packed)), GCC will _also_ assume that the
structures aren't sensibly aligned, and it'll emit code to cope with
that instead of straight word load/save. This can be _very_ suboptimal
on architectures like ARM.

Ideally, we want an attribute which just tells GCC not to do any
padding, without the alignment side-effects. In the absence of that,
we'll just drop the 'packed' attribute and hope that everything stays as
it was (which to be fair is fairly much what we expect). And add some
paranoia checks in the initialisation code, which should be optimised
away completely in the normal case.
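
A rough sketch of what such a paranoia check can look like (an illustration, not the code from this commit). The structure below is a hypothetical on-medium header declared without the 'packed' attribute; the expected size of 12 bytes assumes natural alignment of its fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-medium header, declared without __attribute__((packed)). */
struct demo_node_hdr {
	uint16_t magic;
	uint16_t nodetype;
	uint32_t totlen;
	uint32_t hdr_crc;
};

/* Compile-time checks: the build fails if the compiler inserts padding. */
_Static_assert(sizeof(struct demo_node_hdr) == 12,
	       "unexpected padding in demo_node_hdr");
_Static_assert(offsetof(struct demo_node_hdr, totlen) == 4,
	       "totlen is not at offset 4");

/*
 * Run-time variant, closer to a check made in initialisation code.  The
 * comparison is a compile-time constant, so an optimising compiler can
 * drop the failure branch entirely.
 */
static int demo_check_layout(void)
{
	if (sizeof(struct demo_node_hdr) != 12)
		return -1;
	return 0;
}
```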

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-15 00:49:43 +01:00
David Woodhouse
cf5eba5334 [JFFS2] Reduce excessive node count for syslog files.
We currently get fairly poor behaviour with files which get many short
writes, such as system logs. This is because we end up with many tiny
data nodes, and the rbtree gets massive. None of these nodes are
actually obsolete, so they are counted as 'clean' space. Eraseblocks can
be entirely full of these nodes (which are REF_NORMAL instead of
REF_PRISTINE), and still they count entirely towards 'used_size' and the
eraseblocks can sit on the clean_list for a long time without being
picked for GC.

One way to alleviate this in the long term is to account REF_NORMAL
space separately from REF_PRISTINE space, rather than counting them both
towards used_size. Then these eraseblocks can be picked for GC and the
offending nodes will be garbage collected.

The short-term fix, though -- which probably makes sense even if we do
eventually implement the above -- is to merge these nodes as they're
written. When we write the last byte in a page, write the _whole_ page.
This obsoletes the earlier nodes in the page _immediately_ and we don't
even need to wait for the garbage collection to do it.
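
The short-term rule can be sketched as follows. This is a simplified illustration rather than the code in the patch; the page size and the demo_* names are assumptions.

```c
#define DEMO_PAGE_SIZE 4096u	/* assumed page size */

/*
 * Given a write covering bytes [start, end) of one page, return the
 * offset within the page at which the node written to flash should begin.
 */
static unsigned int demo_node_start(unsigned int start, unsigned int end)
{
	/* The write reaches the last byte of the page: write the whole page. */
	if (end == DEMO_PAGE_SIZE)
		return 0;

	/* Otherwise write only the range the caller touched. */
	return start;
}
```

A short append that ends in the middle of a page still produces a small node, but the write that completes the page produces one node covering all of it, which makes the earlier small nodes in that page obsolete.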

Original implementation from Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-14 04:06:24 +01:00
KaiGai Kohei
21b9879bf2 [JFFS2][XATTR] Fix obvious typo
[2/2] jffs2-xattr-v5.2-02-fix_obvious_typo.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:22:29 +09:00
KaiGai Kohei
c8708a9275 [JFFS2][XATTR] Handling the duplicate JFFS2_NODETYPE_XATTR node cases.
When jffs2_sum_process_sum_data() found a JFFS2_NODETYPE_XATTR node
with a duplicate xid and an older version, it returned an error
without handling the node properly.
As a result, mounting the filesystem failed.

This patch fixes the problem. If jffs2_setup_xattr_datum() returns
-EEXIST, the caller marks the node as DIRTY_SPACE().

[1/2] jffs2-xattr-v5.2-01-fix-duplicate-xdatum.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:21:38 +09:00
KaiGai Kohei
dea80134dc [JFFS2][XATTR] remove redundant pointer cast in acl.c
remove redundant pointer cast in acl.c.

[10/10] jffs2-xattr-v5.1-10-remove_pointer_cast.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:20:24 +09:00
KaiGai Kohei
5a14959c07 [JFFS2][XATTR] remove '__KERNEL__' from acl.h
[9/10] jffs2-xattr-v5.1-09-remove__KERNEL__.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:19:36 +09:00
KaiGai Kohei
ee886b5df1 [JFFS2][XATTR] remove senseless comment
remove senseless comment.

[8/10] jffs2-xattr-v5.1-08-remove_senseless_comment.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:19:03 +09:00
KaiGai Kohei
652ecc20d1 [JFFS2][XATTR] Unify each file header part with any jffs2 file.
Unify each file header part with any jffs2 file.

[7/10] jffs2-xattr-v5.1-07-unify_file_header.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:18:27 +09:00
KaiGai Kohei
4470d0409b [JFFS2][XATTR] '#include <linux/list.h>' was added into xattr.h.
'#include <linux/list.h>' was added into xattr.h.
because 'struct list_head' is used in this header file.

[6/10] jffs2-xattr-v5.1-06-add_list.h.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:17:11 +09:00
KaiGai Kohei
084702e001 [JFFS2][XATTR] Remove jffs2_garbage_collect_xattr(c, ic)
Remove jffs2_garbage_collect_xattr(c, ic).
jffs2_garbage_collect_xattr_datum/ref() are called from gc.c directly.

In the original implementation, jffs2_garbage_collect_xattr(c, ic) returns
while holding a spinlock if 'ic' is an inode_cache, but it returns after
releasing the spinlock if 'ic' is an xattr_datum/ref.
This behavior is confusing, so this patch makes the caller manage
locking and unlocking.

[5/10] jffs2-xattr-v5.1-05-update_xattr_gc.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:16:13 +09:00
KaiGai Kohei
8f2b6f49c6 [JFFS2][XATTR] Remove 'struct list_head ilist' from jffs2_inode_cache.
This patch reduces memory usage by 4 bytes per inode_cache.

[4/10] jffs2-xattr-v5.1-04-remove_ilist_from_ic.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:15:07 +09:00
KaiGai Kohei
8b0b339d46 [JFFS2][XATTR] Add a description about c->xattr_sem
Add a description about the c->xattr_sem read/write semaphore
into README.Locking.

[3/10] jffs2-xattr-v5.1-03-append_README.Locking.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:14:14 +09:00
KaiGai Kohei
de1f72fab3 [JFFS2][XATTR] remove typedef from posix_acl related definition.
jffs2_acl_header, jffs2_acl_entry and jffs2_acl_entry_short were redefined
using 'struct' instead of 'typedef' in the kernel implementation.

[1/10] jffs2-xattr-v5.1-01-remove_typedef_kernel.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2006-05-13 15:13:27 +09:00
KaiGai Kohei
aa98d7cf59 [JFFS2][XATTR] XATTR support on JFFS2 (version. 5)
The attached patches provide xattr support, including POSIX ACL and
SELinux support, on JFFS2 (version 5).

There are some significant differences from the previous version, posted
last December.
The biggest change is the addition of EBS (Erase Block Summary) support.
Currently, both the kernel and the usermode utility (sumtool) can recognize
xattr nodes which have the JFFS2_NODETYPE_XATTR/_XREF nodetype.

In addition, some bugs are fixed.
- A potential race condition was fixed.
- An unexpected failure when updating an xattr with the same name/value
  pair was fixed.
- A bug when removing an xattr name/value pair was fixed.

The fundamental structures (such as the two new nodetypes and the exclusion
mechanism using an rwsem) are unchanged, but most of the implementation was
reviewed and updated where necessary.
Especially, we had to change several internal implementations related to
load_xattr_datum() to avoid a potential race condition.

[1/2] xattr_on_jffs2.kernel.version-5.patch
[2/2] xattr_on_jffs2.utils.version-5.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-13 15:09:47 +09:00
Linus Torvalds
032ebf2620 Alternative fix for MMC oops on unmount after removal
Make sure to clear the driverfs_dev pointer when we do del_gendisk() (on
disk removal), so that other users that may still have a ref to the disk
won't try to use the stale pointer.

Also move the KOBJ_REMOVE uevent handler up, so that the uevent still
has access to the driverfs_dev data.

This all should hopefully fix the problems with MMC umounts after device
removals that caused commit 56cf6504fc and
its reversal (1a2acc9e92).

Original problem reported by Todd Blumer and others.

Acked-by: Greg KH <gregkh@suse.de>
Cc: Russell King <rmk+lkml@arm.linux.org.uk>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Erik Mouw <erik@harddisk-recovery.com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: Todd Blumer <todd@sdgsystems.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-12 18:42:09 -07:00
Jesper Juhl
20ffdcb00a [JFFS2] Remove number of pointer dereferences in fs/jffs2/summary.c
Reduce the number of pointer dereferences in fs/jffs2/summary.c.

Benefits:
 - micro speed optimization due to fewer pointer derefs
 - generated code is slightly smaller
 - better readability

(The first two sound like a compiler problem but I'll go with the third. dwmw2).

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-12 11:55:51 +01:00
Domen Puncer
7e59f2ccd7 [JFFS2] Remove obsolete histo.h
This file hasn't actually been used since the very early days of JFFS2
when Arjan was playing with compression methods. It can go now.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-12 11:51:46 +01:00
Linus Torvalds
e515f048c4 Merge git://oss.sgi.com:8090/xfs-2.6
* git://oss.sgi.com:8090/xfs-2.6:
  [XFS] Fix a possible metadata buffer (AGFL) refcount leak when fixing an
  [XFS] Fix a project quota space accounting leak on rename.
  [XFS] Fix a possible forced shutdown due to mishandling write barriers
2006-05-08 17:41:05 -07:00
Trond Myklebust
75dff55af9 [PATCH] fs/locks.c: Fix lease_init
It is insane to be giving lease_init() the task of freeing the lock it is
supposed to initialise, given that the lock is not guaranteed to be
allocated on the stack. This causes lockups in fcntl_setlease().
Problem diagnosed by Daniel Hokka Zakrisson <daniel@hozac.com>

Also fix a slab leak in __setlease() due to an uninitialised return value.
Problem diagnosed by Björn Steinbrink.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 08:07:17 -07:00
Nathan Scott
e63a369001 [XFS] Fix a possible metadata buffer (AGFL) refcount leak when fixing an
AG freelist.

SGI-PV: 952681
SGI-Modid: xfs-linux-melb:xfs-kern:25902a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-05-08 19:51:58 +10:00
Nathan Scott
b1ecdda931 [XFS] Fix a project quota space accounting leak on rename.
SGI-PV: 951636
SGI-Modid: xfs-linux-melb:xfs-kern:25811a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-05-08 19:51:42 +10:00
Nathan Scott
d08d389d5a [XFS] Fix a possible forced shutdown due to mishandling write barriers
with remount,ro.

SGI-PV: 951944
SGI-Modid: xfs-linux-melb:xfs-kern:25742a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-05-08 19:51:28 +10:00
Dmitry Bazhenov
422138dd68 [JFFS2] Fix race in setting file attributes
It seems like there is a potential race in the function jffs2_do_setattr()
in the case when attributes of a symlink are updated. The symlink metadata
is read without having f->sem locked.

The following patch should fix the race.

Signed-off-by: Dmitry Bazhenov <atrey@emcraft.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-05 22:46:49 +01:00
Jens Axboe
98232d504d [PATCH] compat_sys_vmsplice: one-off in UIO_MAXIOV check
nr_segs may not be > UIO_MAXIOV, however it may be equal to. This makes
the behaviour identical to the real sys_vmsplice(). The other foov
syscalls also agree that this is the way to go.
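
The boundary in question, as a minimal sketch. The value 1024 for UIO_MAXIOV, the -EINVAL return and the demo_* names are assumptions made for illustration; this is not the code from the patch.

```c
#define DEMO_UIO_MAXIOV	1024	/* assumed value of UIO_MAXIOV */
#define DEMO_EINVAL	22	/* assumed errno returned on failure */

/* Accept up to and including UIO_MAXIOV segments; reject anything larger. */
static int demo_check_nr_segs(unsigned long nr_segs)
{
	if (nr_segs > DEMO_UIO_MAXIOV)
		return -DEMO_EINVAL;
	return 0;
}
```

A check written with '>=' instead would wrongly reject a call passing exactly UIO_MAXIOV segments.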

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-04 09:13:49 +02:00
Jens Axboe
a0548871ed [PATCH] splice: redo page lookup if add_to_page_cache() returns -EEXIST
This can happen quite easily, if several processes are trying to splice
the same file at the same time. It's not a failure, it just means someone
raced with us in allocating this file page. So just dump the allocated
page and relookup the original.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-04 06:55:12 +02:00
Jens Axboe
76ad4d1110 [PATCH] splice: rename remaining info variables to pipe
Same thing was done in fs/pipe.c and most of fs/splice.c, but we had
a few missing still.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-04 06:55:12 +02:00
Jens Axboe
1432873af7 [PATCH] splice: LRU fixups
Nick says that the current construct isn't safe. This goes back to the
original, but sets PIPE_BUF_FLAG_LRU on user pages as well as they all
seem to be on the LRU in the first place.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-04 06:55:12 +02:00
Jens Axboe
bfc4ee39fd [PATCH] splice: fix unlocking of page on error ->prepare_write()
Looking at generic_file_buffered_write(), we need to unlock_page() if
prepare write fails and it isn't due to racing with truncate().

Also trim the size if ->prepare_write() fails, if we have to.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-04 06:55:12 +02:00
Mingming Cao
5dea5176e5 [PATCH] ext3: multiple block allocate little endian fixes
Some places in the ext3 multiple block allocation code (in 2.6.17-rc3)
don't handle little-endian on-disk values correctly.  This resulted in
*wrong* block numbers being assigned to in-memory block variables and
eventually stored on disk.  The following patch has been verified to fix an
ext3 filesystem failure when running the LTP tests on a 64-bit machine.
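
As a general illustration of the class of bug (not the code touched by this patch): on-disk ext3 fields are little-endian, so they need an explicit conversion before being used as in-memory block numbers. The helper below is a hypothetical, portable stand-in for the kernel's le32_to_cpu().

```c
#include <stdint.h>

/*
 * Decode a little-endian 32-bit on-disk field into a host integer.
 * Building the value from individual bytes gives the same result on
 * little-endian and big-endian hosts.
 */
static uint32_t demo_le32_to_cpu(const unsigned char raw[4])
{
	return (uint32_t)raw[0] |
	       ((uint32_t)raw[1] << 8) |
	       ((uint32_t)raw[2] << 16) |
	       ((uint32_t)raw[3] << 24);
}
```

Reading the raw field without such a conversion happens to work on little-endian hosts and yields a byte-swapped block number on big-endian ones.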

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-03 20:05:41 -07:00
David Woodhouse
edc4ff7c08 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
2006-05-03 13:30:35 +01:00
David Woodhouse
cbb9a56177 Move jffs2_fs_i.h and jffs2_fs_sb.h from include/linux/ to fs/jffs2/
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-03 13:07:27 +01:00
Jens Axboe
330ab71619 [PATCH] vmsplice: restrict stealing a little more
Apply the same rules as the anon pipe pages, only allow stealing
if no one else is using the page.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-02 15:29:57 +02:00
Jens Axboe
a893b99be7 [PATCH] splice: fix page LRU accounting
Currently we rely on the PIPE_BUF_FLAG_LRU flag being set correctly
to know whether we need to fiddle with page LRU state after stealing it,
however for some origins we just don't know if the page is on the LRU
list or not.

So remove PIPE_BUF_FLAG_LRU and do this check/add manually in pipe_to_file()
instead.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-02 15:03:27 +02:00
Jens Axboe
7591489a8f [PATCH] vmsplice: fix badly placed end parenthesis
We need to use the minimum of {len, PAGE_SIZE-off}, not {len, PAGE_SIZE}-off.
The latter doesn't make any sense, and could cause us to attempt negative
length transfers...
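
The difference is easy to see in a small user-space sketch (an illustration with an assumed 4096-byte page and made-up demo_* names, not the kernel code):

```c
#define DEMO_PAGE_SIZE 4096L	/* assumed page size */

static long demo_min(long a, long b)
{
	return a < b ? a : b;
}

/* Correct: clamp the length to what is left of the page after 'off'. */
static long demo_plen_fixed(long len, long off)
{
	return demo_min(len, DEMO_PAGE_SIZE - off);
}

/* Misplaced parenthesis: 'off' is subtracted after taking the minimum. */
static long demo_plen_buggy(long len, long off)
{
	return demo_min(len, DEMO_PAGE_SIZE) - off;
}
```

For len = 100 and off = 4000 the correct form gives 96, while the misplaced parenthesis gives -3900.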

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-02 12:57:18 +02:00
Linus Torvalds
9817d207dc Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] vmsplice: allow user to pass in gift pages
  [PATCH] pipe: enable atomic copying of pipe data to/from user space
  [PATCH] splice: call handle_ra_miss() on failure to lookup page
  [PATCH] Add ->splice_read/splice_write to def_blk_fops
  [PATCH] pipe: introduce ->pin() buffer operation
  [PATCH] splice: fix bugs in pipe_to_file()
  [PATCH] splice: fix bugs with stealing regular pipe pages
2006-05-01 18:33:40 -07:00
Andi Kleen
d261020229 [PATCH] x86_64: Add compat_sys_vmsplice and use it in x86-64
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 18:17:43 -07:00
Jens Axboe
7afa6fd037 [PATCH] vmsplice: allow user to pass in gift pages
If SPLICE_F_GIFT is set, the user is basically giving these pages away to
the kernel. That means we can steal them for e.g. page cache use instead
of copying them.

The data must be properly page aligned and also a multiple of the page size
in length.
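
The alignment requirement can be sketched as a simple predicate. This is an illustration under assumptions (4096-byte pages, made-up demo_* names), not the check from the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size */

/*
 * A gifted buffer qualifies only if it starts on a page boundary and its
 * length is a whole number of pages.
 */
static bool demo_gift_ok(uintptr_t base, size_t len)
{
	if (base & (DEMO_PAGE_SIZE - 1))
		return false;
	if (len & (DEMO_PAGE_SIZE - 1))
		return false;
	return true;
}
```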

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 20:02:33 +02:00
Jens Axboe
f6762b7ad8 [PATCH] pipe: enable atomic copying of pipe data to/from user space
The pipe ->map() method uses kmap() to virtually map the pages, which
is both slow and has known scalability issues on SMP. This patch enables
atomic copying of pipe pages, by pre-faulting data and using kmap_atomic()
instead.

lmbench bw_pipe and lat_pipe measurements agree this is a Good Thing. Here
are results from that on a UP machine with highmem (1.5GiB of RAM), running
first a UP kernel, SMP kernel, and SMP kernel patched.

Vanilla-UP:
Pipe bandwidth: 1622.28 MB/sec
Pipe bandwidth: 1610.59 MB/sec
Pipe bandwidth: 1608.30 MB/sec
Pipe latency: 7.3275 microseconds
Pipe latency: 7.2995 microseconds
Pipe latency: 7.3097 microseconds

Vanilla-SMP:
Pipe bandwidth: 1382.19 MB/sec
Pipe bandwidth: 1317.27 MB/sec
Pipe bandwidth: 1355.61 MB/sec
Pipe latency: 9.6402 microseconds
Pipe latency: 9.6696 microseconds
Pipe latency: 9.6153 microseconds

Patched-SMP:
Pipe bandwidth: 1578.70 MB/sec
Pipe bandwidth: 1579.95 MB/sec
Pipe bandwidth: 1578.63 MB/sec
Pipe latency: 9.1654 microseconds
Pipe latency: 9.2266 microseconds
Pipe latency: 9.1527 microseconds

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 20:02:05 +02:00
Jens Axboe
e27dedd84c [PATCH] splice: call handle_ra_miss() on failure to lookup page
Notify the readahead logic of the missing page. Suggested by
Oleg Nesterov.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:59:54 +02:00
Jens Axboe
7f9c51f0d9 [PATCH] Add ->splice_read/splice_write to def_blk_fops
It can use the generic handlers.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:59:32 +02:00
Jens Axboe
f84d751994 [PATCH] pipe: introduce ->pin() buffer operation
The ->map() function is really expensive on highmem machines right now,
since it has to use the slower kmap() instead of kmap_atomic(). Splice
rarely needs to access the virtual address of a page, so it's a waste
of time doing it.

Introduce ->pin() to take over the responsibility of making sure the
page data is valid. ->map() is then reduced to just kmap(). That way we
can also share most of the pipe buffer ops between pipe.c and splice.c.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:59:03 +02:00
Jens Axboe
0568b409c7 [PATCH] splice: fix bugs in pipe_to_file()
Found by Oleg Nesterov <oleg@tv-sign.ru>, fixed by me.

- Only allow full pages to go to the page cache.
- Check page != buf->page instead of using PIPE_BUF_FLAG_STOLEN.
- Remember to clear 'stolen' if add_to_page_cache() fails.

And as a cleanup on that:

- Make the bottom fall-through logic a little less convoluted. Also make
  the steal path hold an extra reference to the page, so we don't have
  to differentiate between stolen and non-stolen at the end.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:50:48 +02:00
Jens Axboe
46e678c96b [PATCH] splice: fix bugs with stealing regular pipe pages
- Check that page has suitable count for stealing in the regular pipes.
- pipe_to_file() assumes that the page is locked on successful steal, so
  do that in the pipe steal hook
- Missing unlock_page() in add_to_page_cache() failure.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-30 16:36:32 +02:00
Andreas Schwab
2833c28aa0 [PATCH] powerpc: Wire up *at syscalls
Wire up *at syscalls.

This patch has been tested on ppc64 (using glibc's testsuite, both 32bit
and 64bit), and compile-tested for ppc32 (I have currently no ppc32 system
available, but I expect no problems).

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-28 21:04:59 +10:00
Jens Axboe
eb20796bf6 [PATCH] splice: make the read-side do batched page lookups
Use the new find_get_pages_contig() to potentially look up the entire
splice range in one single call. This speeds up generic_file_splice_read()
quite a bit.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-27 11:05:22 +02:00
Jens Axboe
eb645a24de [PATCH] splice: switch to using page_cache_readahead()
Avoids doing useless work, when the file is fully cached.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-27 08:59:48 +02:00
James Morris
e7edf9cded [PATCH] LSM: add missing hook to do_compat_readv_writev()
This patch addresses a flaw in LSM, where there is no mediation of readv()
and writev() for 32-bit compatible apps using a 64-bit kernel.

This bug was discovered and fixed initially in the native readv/writev
code [1], but was not fixed in the compat code.  Thanks to Al for spotting
this one.

  [1] http://lwn.net/Articles/154282/

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
a090d9132c [PATCH] protect ext3 ioctl modifying append_only, immutable, etc. with i_mutex
All modifications of ->i_flags in inodes that might be visible to
somebody else must be under ->i_mutex.  That patch fixes ext3 ioctl()
setting S_APPEND and friends.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
de0bb97aff [PATCH] forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)
sbi->s_group_desc is an array of pointers to buffer_head.  memcpy() of
buffer size from address of buffer_head is a bad idea - it will generate
junk in any case, may oops if buffer_head is close to the end of slab
page and next page is not mapped and isn't what was intended there.
IOW, ->b_data is missing in that call.  Fortunately, result doesn't go
into the primary on-disk data structures, so only backup ones get crap
written to them; that had allowed this bug to remain unnoticed until
now.
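
The missing ->b_data can be shown with a user-space mock. This is only a
sketch: the struct below is a one-field stand-in for the kernel's
struct buffer_head, and copy_block() is a hypothetical helper name, not
the function touched in ext3/resize.c.

```c
#include <string.h>

/* User-space mock: only the one field that matters here. The real
 * struct buffer_head has many more members. */
struct buffer_head {
    char *b_data;   /* pointer to the block's data */
};

/* Copy one block's worth of data out of a buffer_head.
 * The bug was the equivalent of memcpy(dst, bh, size): that copies the
 * bytes of the struct itself and whatever happens to follow it in
 * memory, instead of the block contents. */
void copy_block(void *dst, const struct buffer_head *bh, size_t size)
{
    memcpy(dst, bh->b_data, size);
}
```

With the mock, copying from bh->b_data yields the block contents, while
copying from bh would yield the pointer value followed by unrelated bytes.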

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Linus Torvalds
7b97ebfb93 Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: add ->splice_write support for /dev/null
  [PATCH] splice: rearrange moving to/from pipe helpers
  [PATCH] Add support for the sys_vmsplice syscall
  [PATCH] splice: fix offset problems
  [PATCH] splice: fix min() warning
2006-04-26 07:47:55 -07:00
Jens Axboe
00522fb41a [PATCH] splice: rearrange moving to/from pipe helpers
We need these for people writing their own ->splice_read/write hooks.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 14:39:29 +02:00
Jens Axboe
912d35f867 [PATCH] Add support for the sys_vmsplice syscall
sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice()
moves data to a pipe, with the input being a user address range instead.

This uses an approach suggested by Linus, where we can hold partial ranges
inside the pages[] map. Hopefully this will be useful for network
receive support as well.
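
From user space the call looks roughly like this. A minimal sketch,
assuming Linux with glibc, which exposes the syscall as vmsplice(2);
push_user_range() is a hypothetical helper name, not part of the patch.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: feed a user address range into the write end of
 * a pipe with vmsplice(2). Returns the number of bytes accepted, or -1
 * with errno set. */
ssize_t push_user_range(int pipe_wr, const void *buf, size_t len)
{
    struct iovec iov = {
        .iov_base = (void *)buf,   /* struct iovec has no const variant */
        .iov_len  = len,
    };
    return vmsplice(pipe_wr, &iov, 1, 0);
}
```

The reader side needs nothing special: a plain read() on the other end of
the pipe returns the bytes. The caller should probably leave the buffer
alone until the data has been consumed, since the pipe may reference the
user pages rather than hold a copy.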

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:59:21 +02:00
Miklos Szeredi
8aa09a50b5 [fuse] fix race between checking and setting file->private_data
BKL does not protect against races if the task may sleep between
checking and setting a value.  So move checking of file->private_data
near to setting it in fuse_fill_super().

Found by Al Viro.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:49:16 +02:00
Miklos Szeredi
6dbbcb1205 [fuse] fix deadlock between fuse_put_super() and request_end(), try #2
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock.  Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The solution is to move the fput() outside the region protected by
sbput_sem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:49:06 +02:00
Miklos Szeredi
5a5fb1ea74 Revert "[fuse] fix deadlock between fuse_put_super() and request_end()"
This reverts 73ce8355c2 commit.

It was wrong, because it didn't take into account the requirement,
that iput() for background requests must be performed synchronously
with ->put_super(), otherwise active inodes may remain after unmount.

The right solution is to keep the sbput_sem and perform iput() within
the locked region, but move fput() outside sbput_sem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:48:55 +02:00
Jens Axboe
016b661e2f [PATCH] splice: fix offset problems
Make the move_from_pipe() actors return number of bytes processed, then
move_from_pipe() can decide more cleverly when to move on to the next
buffer.

This fixes problems with pipe offset and differing file offset.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:33:34 +02:00
Andrew Morton
ba5f5d90c4 [PATCH] splice: fix min() warning
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:33:34 +02:00
Steve French
301dc3e6f6 [CIFS] Fix compile error when CONFIG_CIFS_EXPERIMENTAL is undefined
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-24 16:24:54 +00:00
Linus Torvalds
41bc3982b9 Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6-stable
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6-stable:
  [CIFS] Fix typo in previous
  [CIFS] Readdir fixes to allow search to start at arbitrary position
  [CIFS] Use the kthread_ API instead of opencoding lots of hairy code for kernel
  [CIFS] Don't allow a backslash in a path component
  [CIFS] [CIFS] Do not take rename sem on most path based calls (during
2006-04-23 09:38:09 -07:00
Steve French
b66ac3ea21 [CIFS] Fix typo in previous
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-23 01:54:50 +00:00
Jan Kara
b9251b823b [PATCH] Fix reiserfs deadlock
reiserfs_cache_default_acl() should return whether we successfully found
the acl or not.  We have to return correct value even if reiserfs_get_acl()
returns error code and not just 0.  Otherwise callers such as
reiserfs_mkdir() can unnecessarily lock the xattrs and later functions such
as reiserfs_new_inode() fail to notice that we have already taken the lock
and try to take it again with obvious consequences.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-22 09:19:53 -07:00
Steve French
60808233f3 [CIFS] Readdir fixes to allow search to start at arbitrary position
in directory

Also includes first part of fix to compensate for servers which forget
to return . and .. as well as updates to changelog and cifs readme.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-22 15:53:05 +00:00
Steve French
45af7a0f2e [CIFS] Use the kthread_ API instead of opencoding lots of hairy code for kernel
thread creation and teardown.

It does not move the cifsd thread handling to kthread due to problems
found in testing with wakeup of threads blocked in the socket peek api,
but the other cifs kernel threads now use kthread.
Also cleanup cifs_init to properly unwind when thread creation fails.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-21 22:52:25 +00:00
Steve French
296034f7de [CIFS] Don't allow a backslash in a path component
Unless Posix paths have been negotiated, the backslash, "\", is not a valid
character in a path component.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steve French  <sfrench@us.ibm.com>
2006-04-21 18:18:37 +00:00
Steve French
0bd4fa977f [CIFS] [CIFS] Do not take rename sem on most path based calls (during
building of full path) to avoid rename/readdir hang

Reported by Alan Tyson

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-21 18:17:42 +00:00
David Woodhouse
21f1d5fc59 [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-21 13:17:57 +01:00
David Woodhouse
c569882b2e [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-21 13:17:24 +01:00
David Woodhouse
52b5108ca7 [RBTREE] Update ext3 to use rb_parent() accessor macro.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-21 13:15:57 +01:00
Jens Axboe
82aa5d6183 [PATCH] splice: fix smaller sized splice reads
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-20 13:05:48 +02:00
Linus Torvalds
949b211235 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
  SUNRPC: Dead code in net/sunrpc/auth_gss/auth_gss.c
  NFS: remove needless check in nfs_opendir()
  NFS: nfs_show_stats; for_each_possible_cpu(), not NR_CPUS
  NFS: make 2 functions static
  NFS,SUNRPC: Fix compiler warnings if CONFIG_PROC_FS & CONFIG_SYSCTL are unset
  NFS: fix PROC_FS=n compile error
  VFS: Fix another open intent Oops
  RPCSEC_GSS: fix leak in krb5 code caused by superfluous kmalloc
2006-04-19 10:46:59 -07:00
Carsten Otte
7451c4f0ee NFS: remove needless check in nfs_opendir()
Local variable res was initialized to 0 - no check needed here.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 13:06:37 -04:00
John Hawkes
b9d9506d94 NFS: nfs_show_stats; for_each_possible_cpu(), not NR_CPUS
Convert a for-loop that explicitly references "NR_CPUS" into the
potentially more efficient for_each_possible_cpu() construct.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 13:06:20 -04:00
Adrian Bunk
ec535ce154 NFS: make 2 functions static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:47 -04:00
Trond Myklebust
e99170ff3b NFS,SUNRPC: Fix compiler warnings if CONFIG_PROC_FS & CONFIG_SYSCTL are unset
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:47 -04:00
Trond Myklebust
95cf959b24 VFS: Fix another open intent Oops
If the call to nfs_intent_set_file() fails to open a file in
nfs4_proc_create(), we should return an error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:46 -04:00
Linus Torvalds
0efd9323f3 Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fixup writeout path after ->map changes
  [PATCH] splice: offset fixes
  [PATCH] tee: link_pipe() must be careful when dropping one of the pipe locks
  [PATCH] splice: cleanup the SPLICE_F_NONBLOCK handling
  [PATCH] splice: close i_size truncate races on read
2006-04-19 09:25:52 -07:00
Dipankar Sarma
ca99c1da08 [PATCH] Fix file lookup without ref
There are places in the kernel where we look up files in fd tables and
access the file structure without holding references to the file.  So, we
need special care to avoid the race between looking up files in the fd
table and tearing down of the file in another CPU.  Otherwise, one might
see a NULL f_dentry or such torn down version of the file.  This patch
fixes those special places where such a race may happen.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:51 -07:00
Arthur Othieno
dda27d1a55 [PATCH] hugetlbfs: add Kconfig help text
In kernel bugzilla #6248 (http://bugzilla.kernel.org/show_bug.cgi?id=6248),
Adrian Bunk <bunk@stusta.de> notes that CONFIG_HUGETLBFS is missing Kconfig
help text.

Signed-off-by: Arthur Othieno <apgo@patchbomb.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:50 -07:00
Eric W. Biederman
5e85d4abe3 [PATCH] task: Make task list manipulations RCU safe
While we can currently walk through thread groups, process groups, and
sessions with just the rcu_read_lock, this opens the door to walking the
entire task list.

We already have all of the other RCU guarantees so there is no cost in
doing this, this should be enough so that proc can stop taking the
tasklist lock during readdir.

prev_task was killed because it has no users, and using it will miss new
tasks when doing an rcu traversal.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:49 -07:00
Jens Axboe
9e0267c26e [PATCH] splice: fixup writeout path after ->map changes
Since ->map() no longer locks the page, we need to adjust the handling
of those pages (and stealing) a little. This now passes full regressions
again.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:57:31 +02:00
Jens Axboe
a4514ebd8e [PATCH] splice: offset fixes
- We need to adjust *ppos for writes as well.
- Copy back modified offset value if one was passed in, similar to
  what sendfile does.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:57:05 +02:00
Jens Axboe
2a27250e6c [PATCH] tee: link_pipe() must be careful when dropping one of the pipe locks
We need to ensure that we only drop a lock that is ordered last, to avoid
ABBA deadlocks with competing processes.
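
The ordering rule can be sketched in user space with pthread mutexes.
Ordering by address is an assumption made for illustration, and
lock_pair() and last_of_pair() are hypothetical names, not the helpers
used in link_pipe().

```c
#include <pthread.h>
#include <stdint.h>

/* Take two mutexes in a fixed global order (lowest address first), so
 * two threads locking the same pair can never each hold one while
 * waiting for the other. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    if (b != a)
        pthread_mutex_lock(b);
}

/* Return whichever of the pair is ordered last. Only this one may be
 * dropped and retaken while the other stays held: retaking the first
 * lock while holding the second would invert the order. */
pthread_mutex_t *last_of_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    return (uintptr_t)a > (uintptr_t)b ? a : b;
}
```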

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:56:40 +02:00
Jens Axboe
c4f895cbe1 [PATCH] splice: cleanup the SPLICE_F_NONBLOCK handling
- generic_file_splice_read() more readable and correct
- Don't bail on page allocation with NONBLOCK set, just don't allow
  direct blocking on IO (eg lock_page).

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:56:12 +02:00
Jens Axboe
91ad66ef44 [PATCH] splice: close i_size truncate races on read
We need to check i_size after doing a blocking readpage.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:55:10 +02:00
Linus Torvalds
385910f2b2 x86: be careful about tailcall breakage for sys_open[at] too
Came up through a quick grep for other cases similar to the ftruncate()
one in commit 0a489cb3b6.

Also, add a comment, so that people who read the code understand why we
do what looks like a no-op.

(Again, this won't actually matter to any sane user, since libc will
save and restore the register gcc stomps on, but it's still wrong to
stomp on it)

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-18 13:22:59 -07:00
Linus Torvalds
0a489cb3b6 x86: don't allow tail-calls in sys_ftruncate[64]()
Gcc thinks it owns the incoming argument stack, but that's not true for
"asmlinkage" functions, and it corrupts the caller-set-up argument stack
when it pushes the third argument onto the stack.  Which can result in
%ebx getting corrupted in user space.

Now, normally nobody sane would ever notice, since libc will save and
restore %ebx anyway over the system call, but it's still wrong.

I'd much rather have "asmlinkage" tell gcc directly that it doesn't own
the stack, but no such attribute exists, so we're stuck with our hacky
manual "prevent_tail_call()" macro once more (we've had the same issue
before with sys_waitpid() and sys_wait4()).

Thanks to Hans-Werner Hilse <hilse@sub.uni-goettingen.de> for reporting
the issue and testing the fix.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-18 13:02:48 -07:00
Richard Purdie
373d5e7183 JFFS2: Return an error for long filenames
Return an error if a name is too long for JFFS2 rather than
corrupting data.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2006-04-18 02:05:46 +01:00
Ananiev, Leonid I
75616cf985 [PATCH] ext3: Fix missed mutex unlock
Add the missed unlock_super() call in the error condition code path.

Signed-off-by: Leonid Ananiev <leonid.i.ananiev@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-17 14:24:57 -07:00
Stephen Rothwell
2436f039d2 [PATCH] Fix block device symlink name
As noted further on in this file, some block devices have a / in their
name, so fix the "block:..." symlink name the same as the /sys/block name.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-17 14:24:57 -07:00
David Woodhouse
94171db1d2 Merge with git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
2006-04-17 15:35:18 +01:00
David Woodhouse
d96fb997c6 [JFFS2] Fix race in post-mount node checking
For a while now, we've postponed CRC-checking of data nodes to be done
by the GC thread, instead of being done while the user is waiting for
mount to finish. The GC thread would iterate through all the inodes on
the system and check each of their data nodes. It would skip over inodes
which had already been used or were already being read in by
read_inode(), because their data nodes would have been examined anyway.

However, we could sometimes reach the end of the for-each-inode loop and
still have some unchecked space left, if an inode we'd skipped was
_still_ in the process of being read. This fixes that race by actually
waiting for read_inode() to finish rather than just moving on.

Thanks to Ladislav Michl for coming up with a reproducible test case and
helping to track it down.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-17 00:19:48 +01:00
Kay Sievers
d4d7e5dffc [PATCH] BLOCK: delay all uevents until partition table is scanned
[BLOCK] delay all uevents until partition table is scanned

Here we delay the announcement of all block device events until the
disk's partition table is scanned and all partition devices are already
created and sysfs is populated.

We have a bunch of old bugs for removable storage handling where we
probe successfully for a filesystem on the raw disk, but at the
same time the kernel recognizes a partition table and creates partition
devices.
Currently there is no sane way to tell if partitions will show up or not
at the time the disk device is announced to userspace. With the delayed
events we can simply skip any probe for a filesystem on the raw disk when
we find already present partitions.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:24 -07:00
NeilBrown
4508a7a734 [PATCH] sysfs: Allow sysfs attribute files to be pollable
It works like this:
  Open the file
  Read all the contents.
  Call poll requesting POLLERR or POLLPRI (so select/exceptfds works)
  When poll returns,
     close the file and go to top of loop.
   or lseek to start of file and go back to the 'read'.
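
One pass of the loop above might look like this from user space. A
minimal sketch using the close-and-reopen variant; read_then_wait() is a
hypothetical helper name, and a single read() is assumed to be enough
because attribute values are small.

```c
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the current value of an attribute file into
 * buf, then wait up to timeout_ms for an event on it.
 * Returns poll()'s result (>0 event, 0 timeout) or -1 on error;
 * *nread receives the number of bytes read. */
int read_then_wait(const char *path, char *buf, size_t len,
                   ssize_t *nread, int timeout_ms)
{
    /* Open the file. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Read all the contents. */
    *nread = read(fd, buf, len);
    if (*nread < 0) {
        close(fd);
        return -1;
    }

    /* Call poll requesting POLLERR or POLLPRI. */
    struct pollfd pfd = { .fd = fd, .events = POLLERR | POLLPRI };
    int rc = poll(&pfd, 1, timeout_ms);

    /* Close the file; the caller goes back to the top of its loop. */
    close(fd);
    return rc;
}
```

A caller would run this in a loop, acting on the new contents each time
the function returns a positive value.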

Events are signaled by an object manager calling
   sysfs_notify(kobj, dir, attr);

If the dir is non-NULL, it is used to find a subdirectory which
contains the attribute (presumably created by sysfs_create_group).

This has a cost of one int per attribute, one wait_queue_head_t per kobject,
one int per open file.

The name "sysfs_notify" may be confused with the inotify
functionality.  Maybe it would be nice to support inotify for sysfs
attributes as well?

This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
to be pollable

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:24 -07:00
Linus Torvalds
9a7e9f1c60 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mszeredi/fuse:
  [fuse] Direct I/O  should not use fuse_reset_request
  [fuse] Don't init request twice
  [fuse] Fix accounting the number of waiting requests
  [fuse] fix deadlock between fuse_put_super() and request_end()
2006-04-14 09:11:34 -07:00
Linus Torvalds
9ca686626c Merge branch 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: add support for sys_tee()
  [PATCH] splice: pass offset around for ->splice_read() and ->splice_write()
2006-04-14 09:02:07 -07:00
Eric W. Biederman
c06511d12d [PATCH] de_thread: Don't change our parents and ptrace flags.
This is two distinct changes.
 - Not changing our real parents.
 - Not changing our ptrace parents.

Not changing our real parents is trivially correct because both tasks
have the same real parents as they are part of a thread group.  Now that
we demote the leader to a thread there is no longer any reason to change
its parentage.

Not changing our ptrace parents is a user visible change if someone
looks hard enough.  I don't think user space applications will care or
even notice.

In the practical and I think common case a debugger will have attached
to all of the threads using the same ptrace flags.  From my quick skim
of strace and gdb that appears to be the case.  Which if true means
debuggers will not notice a change.

Before this point we have already generated a ptrace event in do_exit
that reports the leader's pid has died so de_thread is visible to a
debugger.  Which means attempting to hide this case by copying flags
around appears excessive.

By not doing anything it avoids all of the weird locking issues between
de_thread and ptrace attach, and removes one case from consideration for
fixing the ptrace locking.

This only addresses Oleg's first concern with ptrace_attach, that of the
problems caused by reparenting.  Oleg's second concern is essentially a
race between ptrace_attach and release_task that causes an oops when we
get to force_sig_specific.  There is nothing special about de_thread
with respect to that race.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-14 08:49:19 -07:00
Randy Dunlap
fb6a82c94a [PATCH] jffs2: fix printk warnings
Fix printk format warnings in jffs2.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-11 20:12:10 -04:00
Miklos Szeredi
56cf34ff07 [fuse] Direct I/O should not use fuse_reset_request
It's cleaner to allocate a new request, otherwise the uid/gid/pid
fields of the request won't be filled in.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:51 +02:00
Miklos Szeredi
4858cae4f0 [fuse] Don't init request twice
Request is already initialized in fuse_request_alloc() so no need to
do it again in fuse_get_req().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:38 +02:00
Miklos Szeredi
9bc5dddad1 [fuse] Fix accounting the number of waiting requests
Properly accounting the number of waiting requests was forgotten in
"clean up request accounting" patch.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:09 +02:00
Miklos Szeredi
73ce8355c2 [fuse] fix deadlock between fuse_put_super() and request_end()
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock.  Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The chosen solution is to get rid of sbput_sem, and instead use the
spinlock to ensure the referenced inodes/file are released only once.
Since the actual release may sleep, defer these outside the locked
region, but using local variables instead of the structure members.

This is a much more robust solution.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:14:26 +02:00
Jens Axboe
70524490ee [PATCH] splice: add support for sys_tee()
Basically an in-kernel implementation of tee, which uses splice and the
pipe buffers as an intelligent way to pass data around by reference.

Where the user space tee consumes the input and produces a stdout and
file output, this syscall merely duplicates the data inside a pipe to
another pipe. No data is copied, the output just grabs a reference to the
input pipe data.
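
Seen from user space, the difference is that the input pipe still holds
its data afterwards. A minimal sketch, assuming Linux with glibc, which
exposes the syscall as tee(2); dup_pipe_data() is a hypothetical helper
name.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: duplicate whatever is queued in the pipe read
 * end in_rd into the pipe write end out_wr, without consuming it.
 * Returns the number of bytes duplicated, or -1 with errno set. */
ssize_t dup_pipe_data(int in_rd, int out_wr)
{
    return tee(in_rd, out_wr, INT_MAX, SPLICE_F_NONBLOCK);
}
```

After a successful call, reading from either pipe returns the same bytes.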

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 15:51:17 +02:00
Jens Axboe
cbb7e577e7 [PATCH] splice: pass offset around for ->splice_read() and ->splice_write()
We need not use ->f_pos as the offset for the file input/output. If the
user passed an offset pointer in through sys_splice(), just use that and
leave ->f_pos alone.
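
The user-visible effect can be sketched like this, assuming Linux with
glibc's splice(2) wrapper; splice_from_offset() is a hypothetical helper
name.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: splice len bytes from file_fd, starting at byte
 * start, into the write end of a pipe. Because an explicit offset
 * pointer is passed, the file's own position is not used or changed. */
ssize_t splice_from_offset(int file_fd, off_t start, int pipe_wr, size_t len)
{
    loff_t off = start;
    return splice(file_fd, &off, pipe_wr, NULL, len, 0);
}
```

Passing NULL instead of &off would make the call start at the current
file position and advance it, like read().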

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 15:47:07 +02:00
Linus Torvalds
88dd9c16ce Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] vfs: add splice_write and splice_read to documentation
  [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
  [PATCH] splice: warning fix
  [PATCH] another round of fs/pipe.c cleanups
  [PATCH] splice: comment styles
  [PATCH] splice: add Ingo as addition copyright holder
  [PATCH] splice: unlikely() optimizations
  [PATCH] splice: speedups and optimizations
  [PATCH] pipe.c/fifo.c code cleanups
  [PATCH] get rid of the PIPE_*() macros
  [PATCH] splice: speedup __generic_file_splice_read
  [PATCH] splice: add direct fd <-> fd splicing support
  [PATCH] splice: add optional input and output offsets
  [PATCH] introduce a "kernel-internal pipe object" abstraction
  [PATCH] splice: be smarter about calling do_page_cache_readahead()
  [PATCH] splice: optimize the splice buffer mapping
  [PATCH] splice: cleanup __generic_file_splice_read()
  [PATCH] splice: only call wake_up_interruptible() when we really have to
  [PATCH] splice: potential !page dereference
  [PATCH] splice: mark the io page as accessed
2006-04-11 06:34:02 -07:00
NeilBrown
358dd55aa3 [PATCH] knfsd: nfsd4: grant delegations more frequently
Keep unused openowners around for at least one lease period, to avoid the need
for as many open confirmations and to allow handing out more delegations.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:53 -07:00
NeilBrown
ef0f3390eb [PATCH] knfsd: nfsd4: limit number of delegations handed out.
It's very easy for the server to DOS itself by just giving out too many
delegations.

For now we just solve the problem with a dumb hard limit.  Eventually we'll
want a smarter policy.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:53 -07:00
NeilBrown
4e2fd495b5 [PATCH] knfsd: nfsd4: add missing rpciod_down()
We should be shutting down rpciod for the callback channel when we shut down
the server.

Also note that we do rpciod_up() and create the callback client *before*
setting cb_set--the cb_set only determines whether the initial null was
successful.  So cb_set is not a reliable determiner of whether we need to clean
up, only cb_client is.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:53 -07:00
NeilBrown
541e0e0981 [PATCH] knfsd: nfsd4: nfsd4_probe_callback cleanup
Some obvious cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:53 -07:00
NeilBrown
5e8d5c2948 [PATCH] knfsd: nfsd4: fix laundromat shutdown race
We need to make sure the laundromat work doesn't reschedule itself just when
we try to cancel it.  Also, we shouldn't be waiting for it to finish running
while holding the state lock, as that's a potential deadlock.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
bb6e8a9f40 [PATCH] knfsd: nfsd4: fix corruption on readdir encoding with 64k pages
Fix corruption on readdir encoding with 64k pages.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
6ed6decccf [PATCH] knfsd: nfsd4: fix corruption of returned data when using 64k pages
In v4 we grab an extra page just for the padding of returned data.  The
formula that the rpc server uses to allocate pages for the response doesn't
take into account this extra page.

Instead of adjusting those formulae, we adopt the same solution as v2 and v3,
and put the "tail" data in the same page as the "head" data.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
f0e2993e9e [PATCH] knfsd: nfsd4: remove nfsd_setuser from putrootfh
Since nfsd_setuser() is already called from any operation that uses the
current filehandle (because it's called from fh_verify), there's no reason to
call it from putrootfh.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
54cceebb67 [PATCH] knfsd: nfsd: nfsd_setuser doesn't really need to modify rqstp->rq_cred.
In addition to setting the process's filesystem ids, nfsd_setuser also
modifies the value of the rq_cred which stores the ids that originally came
from the rpc call, for example to reflect root squashing.

There's no real reason to do that--the only case where rqstp->rq_cred is
actually used later on is in the NFSv4 SETCLIENTID/SETCLIENTID_CONFIRM
operations, and there the results are the opposite of what we want--those two
operations don't deal with the filesystem at all, they only record the
credentials used with the rpc call for later reference (so that we may require
the same credentials be used on later operations), and the credentials
shouldn't vary just because there was or wasn't a previous operation in the
compound that referred to some export

This fixes a bug which caused mounts from Solaris clients to fail.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
cd15654963 [PATCH] knfsd: nfsd: oops exporting nonexistent directory
Export a directory that does not exist:
	exportfs -orw,fsid=0,insecure,no_subtree_check client:/home/NFS4

Try to mount from client with nfs4. Mount hangs (I'm not sure why -
that's another issue).

While client is hung, back on server

	mkdir /home/NFS4

The server panics in dput.  I traced the problem back to svc_export_parse()
calling path_release() even though path_lookup() failed (it happens to fill in
the nameidata structure with a negative dentry - so the test after out:
succeeds).

After patching and recreating the problem, the client mount still takes some
time before finally exiting with a message "couldn't read superblock".

Here is a simple patch to resolve this issue:

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
b5872b0dcc [PATCH] knfsd: nfsd4: fix acl xattr length return
We should be using the length from the second vfs_getxattr, in case it
changed.  (Note: there's still a small race here; we could end up returning
-ENOMEM if the length increased between the first and second call.  I don't
know whether it's worth spending a lot of effort to fix that.)
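
The same two-call pattern exists in user space, which makes the point
easy to sketch. This is not the nfsd code: it uses getxattr(2) rather
than vfs_getxattr(), and read_xattr() is a hypothetical helper name.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper: fetch an extended attribute into a malloc'd
 * buffer. Returns NULL on failure. On success *len is the length
 * reported by the second call, not the estimate from the first. */
void *read_xattr(const char *path, const char *name, ssize_t *len)
{
    /* First call: no buffer, just ask how much space is needed. */
    ssize_t guess = getxattr(path, name, NULL, 0);
    if (guess < 0)
        return NULL;

    void *buf = malloc(guess ? (size_t)guess : 1);
    if (!buf)
        return NULL;

    /* Second call: the value may have changed size in the meantime,
     * so this return value is the one to trust. */
    *len = getxattr(path, name, buf, (size_t)guess);
    if (*len < 0) {   /* for example the value grew between the calls */
        free(buf);
        return NULL;
    }
    return buf;
}
```

If the first call overestimates, as XFS appears to, the caller still gets
the true length from the second call.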

This makes XFS ACLs usable on NFS exports, which they currently aren't, since
XFS appears to be returning a too-large value for vfs_getxattr() when it's
passed a NULL buffer.  So there's probably an XFS bug here too, though since
getxattr with a NULL buffer is usually used to decide how much memory to
allocate, it may be a fairly harmless bug in most cases.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
NeilBrown
b905b7b0a0 [PATCH] knfsd: nfsd4: better nfs4acl errors
We're returning -1 in a few places in the NFSv4<->POSIX acl translation code
where we could return a reasonable error.

Also allows some minor simplification elsewhere.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
NeilBrown
249920527f [PATCH] knfsd: nfsd4: Wrong error handling in nfs4acl
this fixes coverity id #3.  Coverity detected dead code, since the == -1
comparison only returns 0 or 1 to error.  Therefore the if ( error < 0 )
statement was always false.  Seems that this was an if( error = nfs4...  )
statement some time ago, which got broken during cleanup.
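
The precedence problem can be reproduced in a few lines of plain C.  The
function names below are hypothetical and the "fixed" form is only one
plausible reading of the intended code; the point is that == binds tighter
than =, so the assigned value is never negative.

```c
/* Hypothetical helper that reports failure as a negative value. */
int convert_acl(int fail)
{
	return fail ? -1 : 0;
}

/* Broken form: error receives the result of the comparison, which is
 * only ever 0 or 1, so the (error < 0) branch is dead code. */
int broken_check(int fail)
{
	int error;

	error = convert_acl(fail) == -1;
	if (error < 0)
		return -1;      /* never reached */
	return 0;
}

/* Assumed intent: assign first, then test the assigned value. */
int fixed_check(int fail)
{
	int error;

	error = convert_acl(fail);
	if (error < 0)
		return error;
	return 0;
}
```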

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
Adrian Bunk
e465a77f94 [PATCH] fs/nfsd/nfs4state.c: make a struct static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Cc: Andy Adamson <andros@citi.umich.edu>
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
NeilBrown
d5b9026a67 [PATCH] knfsd: locks: flag NFSv4-owned locks
Use the fl_lmops field to identify which locks are ours, instead of trying to
look them up in our private hash.  This is safer and more efficient.

Earlier versions of this patch used a lock flag instead, but Trond pointed out
that adding a new flag for each lock manager wasn't going to scale well, and
suggested this approach instead; a separate patch converts lockd to using
fl_lmops in the same way.

In the NFSv4 case this looks like a bit of a hack, since the NFSv4 server
isn't currently actually defining a lock_manager_operations struct, so we end
up defining one *just* to serve as a cookie to identify our locks.

But it works, and we actually do expect to start using the
lock_manager_operations at some point anyway.
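
As a rough sketch of the "ops pointer as cookie" idea, assuming heavily
simplified stand-ins for the kernel structures (only the fl_lmops field name
comes from the text above; nfsd_lock_ops_sketch and lock_is_ours() are
invented for the example):

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types. */
struct lock_manager_operations {
	int unused;                      /* no callbacks needed for a cookie */
};

struct file_lock {
	const struct lock_manager_operations *fl_lmops;
};

/* Defined only so that its address can identify this lock manager. */
const struct lock_manager_operations nfsd_lock_ops_sketch = { 0 };

/* Ownership test by pointer identity: no hash lookup required. */
int lock_is_ours(const struct file_lock *fl)
{
	return fl->fl_lmops == &nfsd_lock_ops_sketch;
}
```

Each lock manager gets a distinct address for free, which is why this scales
better than reserving one flag bit per manager.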

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
NeilBrown
7775f4c85d [PATCH] knfsd: Correct reserved reply space for read requests.
NFSd makes sure there is enough space to hold the maximum possible reply
before accepting a request.  The unit for this maximum is (4-byte) words.
However in three places, particularly for read requests, the number given is
a number of bytes.

This means too much space is reserved which is slightly wasteful.
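
A small sketch of the unit mix-up, with hypothetical helper names
(reserve_words() and bytes_to_words() are not nfsd functions): passing a byte
count where a word count is expected reserves four times too much.

```c
#include <stddef.h>

#define WORD_BYTES 4   /* one reply word is 4 bytes */

/* Hypothetical reservation helper: the argument is a count of words,
 * the return value is the number of bytes actually set aside. */
size_t reserve_words(size_t words)
{
	return words * WORD_BYTES;
}

/* Convert a byte count to words, rounding up, before reserving. */
size_t bytes_to_words(size_t bytes)
{
	return (bytes + WORD_BYTES - 1) / WORD_BYTES;
}
```

With a 1024-byte payload, reserve_words(1024) sets aside 4096 bytes, while
reserve_words(bytes_to_words(1024)) sets aside the intended 1024.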

This is the sort of patch that could uncover a deeper bug, and it is not
critical, so it would be best for it to spend a while in -mm before going
in to mainline.

(akpm: target 2.6.17-rc2, 2.6.16.3 (approx))

Discovered-by: "Eivind  Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
Miklos Szeredi
08a53cdce6 [PATCH] fuse: account background requests
The previous patch removed limiting the number of outstanding requests.  This
patch adds a much simpler limiting, that is also compatible with file locking
operations.

A task may have at most one synchronous request allocated.  So these requests
need not be otherwise limited.

However the number of background requests (release, forget, asynchronous
reads, interrupted requests) can grow indefinitely.  This can be used by a
malicious user to cause FUSE to allocate arbitrary amounts of unswappable
kernel memory, denying service.

For this reason add a limit for the number of background requests, and block
allocations of new requests until the number goes below the limit.

Also use this mechanism to block all requests until the INIT reply is
received.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:49 -07:00
Miklos Szeredi
ce1d5a491f [PATCH] fuse: clean up request accounting
FUSE allocated most requests from a fixed size pool filled at mount time.
However in some cases (release/forget) non-pool requests were used.  File
locking operations aren't well served by the request pool, since they may
block indefinitely, thus exhausting the pool.

This patch removes the request pool and always allocates requests on demand.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:49 -07:00
Miklos Szeredi
a87046d822 [PATCH] fuse: consolidate device errors
Return consistent error values for the case when the opened device file has no
mount associated yet.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Miklos Szeredi
d713311464 [PATCH] fuse: use a per-mount spinlock
Remove the global spinlock in favor of a per-mount one.

This patch is basically find & replace.  The difficult part has already been
done by the previous patch.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Miklos Szeredi
0720b31597 [PATCH] fuse: simplify locking
This is in preparation for removing the global spinlock in favor of a
per-mount one.

The only critical part is the interaction between fuse_dev_release() and
fuse_fill_super(): fuse_dev_release() must see the assignment to
file->private_data, otherwise it will leak the reference to fuse_conn.

This is ensured by the fput() operation, which will synchronize the assignment
with other CPUs that may do a final fput() soon after this.

Also redundant locking is removed from fuse_fill_super(), where exclusion is
already ensured by the BKL held for this function by the VFS.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Jeff Dike
e5ac1d1e70 [PATCH] fuse: add O_NONBLOCK support to FUSE device
I don't like duplicating the connected and list_empty tests in fuse_dev_readv,
but this seemed cleaner than adding the f_flags test to request_wait.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Jeff Dike
385a17bfc3 [PATCH] fuse: add O_ASYNC support to FUSE device
This adds asynchronous notification to FUSE - a FUSE server can request
O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input
available.

One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested,
does no locking, unlike the other methods.  I think it's unnecessary, as the
fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync,
which provide their own locking.  It would also be wrong to use the fuse_lock,
as it's a spin lock and fasync_helper can sleep.  My one concern with this is
the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a
reference on the file struct, so this seems not to be a problem.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:48 -07:00
Miklos Szeredi
7025d9ad10 [PATCH] fuse: fix fuse_dev_poll() return value
fuse_dev_poll() returned an error value instead of a poll mask.  Luckily (or
unluckily) -ENODEV does contain the POLLERR bit.

There's also a race if filesystem is unmounted between fuse_get_conn() and
spin_lock(), in which case this event will be missed by poll().
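
The "luckily (or unluckily)" remark can be checked from userspace.  A small
sketch, assuming Linux values for the constants (ENODEV is 19 and POLLERR is
0x008 there); the function names are invented for the example:

```c
#include <errno.h>
#include <poll.h>

/* Returns nonzero if -ENODEV, misread as a poll mask, has POLLERR set. */
int enodev_looks_like_pollerr(void)
{
	unsigned int mask = (unsigned int)-ENODEV;

	return (mask & POLLERR) != 0;
}

/* What a poll handler should return instead: an explicit event mask. */
unsigned int disconnected_poll_mask(void)
{
	return POLLERR;
}
```

The negative errno also sets many unrelated bits, so callers would see other
spurious events alongside POLLERR.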

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:47 -07:00
Miklos Szeredi
d3406ffa4a [PATCH] fuse: fix oops in fuse_send_readpages()
During heavy parallel filesystem activity it was possible to Oops the kernel.
The reason is that read_cache_pages() could skip pages which have already been
inserted into the cache by another task.  Occasionally this may result in zero
pages actually being sent, while fuse_send_readpages() relies on at least one
page being in the request.

So check this corner case and just free the request instead of trying to send
it.

Reported and tested by Konstantin Isakov.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:47 -07:00
Ananiev, Leonid I
389ed39b97 [PATCH] ext3: Fix missed mutex unlock
A missing unlock_super() call is added in the error-condition code path.

Signed-off-by: Leonid Ananiev <leonid.i.ananiev@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:46 -07:00
Arnd Bergmann
091e881d0e [PATCH] inotify: check for NULL inode in inotify_d_instantiate
The spufs file system creates files in a directory before instantiating the
directory itself, which causes a NULL pointer access in
inotify_d_instantiate since c32ccd87bf.

I'd like to keep this behavior since it means that the user will not have
access to files in the directory before I know that I succeed in creating
everything in it.  This patch adds a simple check for the inode to keep
that working.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:45 -07:00
Vivek Goyal
68250ba5df [PATCH] kdump: enable CONFIG_PROC_VMCORE by default
Everybody seems to be using /proc/vmcore as a method to access the kernel
crash dump.  Hence probably it makes sense to enable CONFIG_PROC_VMCORE by
default if CONFIG_CRASH_DUMP is selected.  This makes kdump configuration
even easier for a user.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:45 -07:00
Roland McGrath
f5e902817f [PATCH] process accounting: take original leader's start_time in non-leader exec
The only record we have of the real-time age of a process, regardless of
execs it's done, is start_time.  When a non-leader thread execs, the
original start_time of the process is lost.  Things looking at the
real-time age of the process are fooled, for example the process accounting
record when the process finally dies.  This change makes the oldest
start_time stick around with the process after a non-leader exec.  This way
the association between PID and start_time is kept constant, which seems
correct to me.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:42 -07:00
Davide Libenzi
2395140ee2 [PATCH] uniform POLLRDHUP handling between epoll and poll/select
As reported by Michael Kerrisk, POLLRDHUP handling was not consistent
between epoll and poll/select, since in epoll it was unmaskeable.  This
patch brings uniformity in POLLRDHUP handling.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:42 -07:00
Vivek Goyal
80e8ff6341 [PATCH] kdump proc vmcore size overflow fix
A couple of /proc/vmcore data structures overflow on 32-bit systems with
more than 4G of memory.  This patch fixes those.

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:42 -07:00
Mitchell Blank Jr
b04eb6aa08 [PATCH] select: don't overflow if (SELECT_STACK_ALLOC % sizeof(long) != 0)
If SELECT_STACK_ALLOC is not a multiple of sizeof(long) then stack_fds[]
would be shorter than SELECT_STACK_ALLOC bytes and could overflow later in
the function.  Fixed by simply rearranging the test later to work on
sizeof(stack_fds).  Currently SELECT_STACK_ALLOC is 256 so this doesn't
happen, but it's nasty to have things like this hidden in the code.  What
if later someone decides to change SELECT_STACK_ALLOC to 300?
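
The arithmetic behind this can be sketched as follows, assuming the array is
declared as long stack_fds[SELECT_STACK_ALLOC / sizeof(long)] (inferred from
the description above, not quoted from fs/select.c).  The helper names are
made up for the example.

```c
#include <stddef.h>

/* Bytes really available in: long stack_fds[alloc / sizeof(long)]; */
size_t stack_fds_bytes(size_t alloc)
{
	return (alloc / sizeof(long)) * sizeof(long);
}

/* Nonzero when comparing against alloc instead of sizeof(stack_fds)
 * would let a caller write past the end of the array. */
int alloc_overstates_array(size_t alloc)
{
	return stack_fds_bytes(alloc) < alloc;
}
```

With an 8-byte long, a value of 300 yields 37 elements, i.e. 296 bytes, so a
size test against 300 would allow a 4-byte overrun; 256 divides evenly and is
safe.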

Signed-off-by: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:41 -07:00
Eric Van Hensbergen
00fbc6dfe7 [PATCH] 9p: handle sget() failure
Handle a failing sget() in v9fs_get_sb().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:41 -07:00
Herbert Poetzl
f6422f17d3 [PATCH] vfs: propagate mnt_flags into do_loopback/vfsmount
The mnt_flags are propagated into do_loopback(), so that they can be stored
with the vfsmount.

Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:41 -07:00
Andrew Morton
5246d05031 [PATCH] sync_file_range(): use unsigned for flags
Ulrich suggested that the `flags' arg to sync_file_range() become unsigned.

Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:40 -07:00
Jeff Dike
7b04d7170e [PATCH] Add GFP_NOWAIT
Introduce GFP_NOWAIT, as an alias for GFP_ATOMIC & ~__GFP_HIGH.

This also changes XFS, which is the only in-tree user of this idiom that I
could find.  The XFS piece is compile-tested only.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Acked-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:35 -07:00
Andrew Morton
29ff2db551 [PATCH] select() warning fixes
fs/select.c: In function `core_sys_select':
fs/select.c:339: warning: assignment from incompatible pointer type
fs/select.c:376: warning: comparison of distinct pointer types lacks a cast

By using a void* we can remove lots of casts rather than adding more.

Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:30 -07:00
Ingo Molnar
341b446bc5 [PATCH] another round of fs/pipe.c cleanups
make pipe.c a bit more readable and hackable.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:57:45 +02:00
Ingo Molnar
73d62d83ec [PATCH] splice: comment styles
 - capitalize consistently
 - end sentences in one way or another
 - update comment text to match the implementation

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:57:21 +02:00
Jens Axboe
c2058e0611 [PATCH] splice: add Ingo as additional copyright holder
The comment is also somewhat out of date, correct that as well.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:56:34 +02:00
Jens Axboe
49570e9b29 [PATCH] splice: unlikely() optimizations
Also corrects a few comments. Patch mainly from Ingo, changes by me.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:56:09 +02:00
Jens Axboe
6f767b0425 [PATCH] splice: speedups and optimizations
- Kill the local variables that cache ->nrbufs, they just take up space.

- Only set do_wakeup for a real pipe. This is a big win for direct splicing.

- Kill i_mutex lock around ->f_pos update, regular io paths don't do this
  either.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:53:56 +02:00
Ingo Molnar
923f4f2394 [PATCH] pipe.c/fifo.c code cleanups
more code cleanups after the macro conversion:

 - standardize on 'struct pipe_inode_info *pipe' variable names
 - introduce 'pipe' temporaries to reduce mass inode->i_pipe dereferencing

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:53:33 +02:00
Ingo Molnar
9aeedfc471 [PATCH] get rid of the PIPE_*() macros
get rid of the PIPE_*() macros. Scripted transformation.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:53:10 +02:00
Jens Axboe
7480a90435 [PATCH] splice: speedup __generic_file_splice_read
Using find_get_page() is a lot faster than find_or_create_page(). This
gets splice a lot closer to sendfile() for fd -> socket transfers.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:52:47 +02:00
Jens Axboe
b92ce55893 [PATCH] splice: add direct fd <-> fd splicing support
It's more efficient for sendfile() emulation. Basically we cache an
internal private pipe and just use that as the intermediate area for
pages. Direct splicing is not available from sys_splice(), it is only
meant to be used for sendfile() emulation.

Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at
exit for the normal fast path.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:52:07 +02:00
Nathan Scott
019ff2d57b [XFS] Fix a problem in aligning inode allocations to stripe unit
boundaries.

SGI-PV: 951862
SGI-Modid: xfs-linux-melb:xfs-kern:25726a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:45:05 +10:00
Nathan Scott
8c0b5113a5 [XFS] Fix utime(2) in the case that no times parameter was passed in.
SGI-PV: 949858
SGI-Modid: xfs-linux-melb:xfs-kern:25717a

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:12:45 +10:00
David Chinner
58829e490e [XFS] Fix an inode use-after-free during an unpin. When reclaiming inodes
that have been unlinked, we may need to execute transactions during
reclaim. By the time the transaction has hit the disk, the linux inode and
xfs vnode may already have been freed so we can't reference them safely.
Use the known xfs inode state to determine if it is safe to reference the
vnode and linux inode during the unpin operation.

SGI-PV: 946321
SGI-Modid: xfs-linux-melb:xfs-kern:25687a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:11:20 +10:00
David Chinner
1fc5d959d8 [XFS] Fix inode reclaim scalability regression. When a filesystem has
millions of inodes cached and has sparse cluster population, removing
inodes from the cluster hash consumes excessive amounts of CPU time.
Reduce the CPU cost by making removal O(1) via use of a doubly linked list
for the hash chains.
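
Why a back link makes removal O(1) can be shown with a generic chain.  This
is not the XFS data structure; the pointer-to-pointer variant and all names
below are assumptions chosen for the illustration.

```c
#include <stddef.h>

/* Illustrative chain node. */
struct chain_node {
	struct chain_node *next;
	struct chain_node **pprev;   /* address of the pointer that points here */
};

void chain_add_head(struct chain_node **head, struct chain_node *n)
{
	n->next = *head;
	n->pprev = head;
	if (*head)
		(*head)->pprev = &n->next;
	*head = n;
}

/* O(1): no walk from the head is needed to find the predecessor. */
void chain_del(struct chain_node *n)
{
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}
```

With a singly linked chain the same removal has to scan from the head, which
is what becomes expensive when chains are long.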

SGI-PV: 951551
SGI-Modid: xfs-linux-melb:xfs-kern:25683a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:11:12 +10:00
Nathan Scott
8272145c05 [XFS] Fix a writepage regression where we accidentally stopped honouring
nonblock mode with the new IO path code (since 2.6.16).

SGI-PV: 951662
SGI-Modid: xfs-linux-melb:xfs-kern:25676a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:10:55 +10:00
Nathan Scott
e50bd16fe4 [XFS] Fix superblock validation regression for the zero imaxpct case.
Thanks to kjamieson for noticing.

SGI-PV: 951661
SGI-Modid: xfs-linux-melb:xfs-kern:25675a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:10:45 +10:00
Linus Torvalds
e38d557896 Merge branch 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2
* 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2:
  [PATCH] CONFIGFS_FS must depend on SYSFS
  [PATCH] Bogus NULL pointer check in fs/configfs/dir.c
  ocfs2: Better I/O error handling in heartbeat
  ocfs2: test and set teardown flag early in user_dlm_destroy_lock()
  ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()
  ocfs2: catch an invalid ast case in dlmfs
  ocfs2: remove an overly aggressive BUG() in dlmfs
  ocfs2: multi node truncate fix
2006-04-10 16:44:09 -07:00
Eric W. Biederman
de12a7878c [PATCH] de_thread: Don't confuse users do_each_thread.
Oleg Nesterov spotted two interesting bugs with the current de_thread
code.  The simplest is a long standing double decrement of
__get_cpu_var(process_counts) in __unhash_process.  Caused by
two processes exiting when only one was created.

The other is that since we no longer detach from the thread_group list
it is possible for do_each_thread when run under the tasklist_lock to
see the same task_struct twice.  Once on the task list as a
thread_group_leader, and once on the thread list of another
thread.

The double appearance in do_each_thread can cause a double increment
of mm_core_waiters in zap_threads resulting in problems later on in
coredump_wait.

To remedy those two problems this patch takes the simple approach
of changing the old thread group leader into a child thread.
The only routine in release_task that cares is __unhash_process,
and it can be trivially seen that we handle cleaning up a
thread group leader properly.

Since de_thread doesn't change the pid of the exiting leader process
and instead shares it with the new leader process, I change
thread_group_leader to recognize group leadership based on the
group_leader field and not based on pids.  This should also be
slightly cheaper than the existing thread_group_leader macro.

I performed a quick audit and I couldn't see any user of
thread_group_leader that cared about the difference.
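
A minimal sketch of the identity-based test, using a cut-down stand-in for
task_struct (struct task_sketch and the function name are invented here):

```c
/* Minimal stand-in for task_struct, keeping only the fields involved. */
struct task_sketch {
	int pid;
	struct task_sketch *group_leader;
};

/* Leadership by identity: a task leads its group iff it is its own leader. */
int thread_group_leader_sketch(const struct task_sketch *p)
{
	return p == p->group_leader;
}
```

Two tasks that temporarily share a pid, as the old and new leader do here,
are still told apart because only one of them points at itself.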

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-10 16:36:50 -07:00
Adrian Bunk
65714b9184 [PATCH] CONFIGFS_FS must depend on SYSFS
This patch fixes a compile error with CONFIG_SYSFS=n.

Configfs is creating, as a matter of policy, the /sys/kernel/config
mountpoint.  This means it requires CONFIG_SYSFS.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-10 11:17:21 -07:00
Eric Sesterhenn
cbca692c24 [PATCH] Bogus NULL pointer check in fs/configfs/dir.c
We check the "group" pointer after we dereference it.  This check is
bogus, as it cannot be NULL coming in.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-10 11:16:17 -07:00
Ingo Molnar
529565dcb1 [PATCH] splice: add optional input and output offsets
add optional input and output offsets to sys_splice(), for seekable file
descriptors:

 asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
                            int fd_out, loff_t __user *off_out,
                            size_t len, unsigned int flags);

semantics are straightforward: f_pos will be updated with the offset
provided by user-space, before the splice transfer is about to begin.
Providing a NULL offset pointer means the existing f_pos will be used
(and updated in situ).  Providing an offset for a pipe results in
-ESPIPE. Providing an invalid offset pointer results in -EFAULT.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 15:18:58 +02:00
Ingo Molnar
3a326a2ce8 [PATCH] introduce a "kernel-internal pipe object" abstraction
separate out the 'internal pipe object' abstraction, and make it
usable to splice. This cleans up and fixes several aspects of the
internal splice APIs and the pipe code:

 - pipes: the allocation and freeing of pipe_inode_info is now more symmetric
   and more streamlined with existing kernel practices.

 - splice: small micro-optimization: less pointer dereferencing in splice
   methods

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Update XFS for the ->splice_read/->splice_write changes.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 15:18:35 +02:00
Jens Axboe
0b749ce380 [PATCH] splice: be smarter about calling do_page_cache_readahead()
We don't want to call into the read-ahead logic unless we are at the
start of a page, _or_ we have multiple pages to read.
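
The condition reads roughly like the predicate below.  This is a sketch only:
the 4096-byte page size, the parameter names and want_readahead() itself are
assumptions, not the code in fs/splice.c.

```c
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096UL   /* assumed page size for the example */

/* Nonzero when read-ahead is worth invoking for this request. */
int want_readahead(unsigned long pos, size_t nr_pages)
{
	int at_page_start = (pos % PAGE_SIZE_SKETCH) == 0;

	return at_page_start || nr_pages > 1;
}
```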

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:05:04 +02:00
Jens Axboe
49d0b21be2 [PATCH] splice: optimize the splice buffer mapping
We don't really need to lock down the pages, just make sure they
are uptodate.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:04:41 +02:00
Jens Axboe
16c523ddab [PATCH] splice: cleanup __generic_file_splice_read()
The whole shadow/pages logic got overly complex, and this simpler
approach is actually faster in testing.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:03:58 +02:00
Jens Axboe
c0bd1f650b [PATCH] splice: only call wake_up_interruptible() when we really have to
__wake_up_common() is pretty heavy in the kernel profiles, this brings
it down to a more acceptable level.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:03:32 +02:00
Dave Jones
9aefe431f5 [PATCH] splice: potential !page dereference
We can get to out: with a NULL page, which we probably
don't want to be calling page_cache_release() on.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:02:40 +02:00
Jens Axboe
c7f21e4f5a [PATCH] splice: mark the io page as accessed
We should do that, since we do the LRU manipulation ourselves now. Suggested
by Nick Piggin.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:01:01 +02:00
Mark Fasheh
a9e2ae3917 ocfs2: Better I/O error handling in heartbeat
Propagate errors received in o2hb_bio_end_io() back to the heartbeat thread
so it can skip re-arming the timer.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 18:03:09 -07:00
Mark Fasheh
2cd9888590 ocfs2: test and set teardown flag early in user_dlm_destroy_lock()
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:39:43 -07:00
Mark Fasheh
f43e6918c0 ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()
Remove the code which attempted to catch it via dlmunlock() return status -
this never happens there.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:37:52 -07:00
Mark Fasheh
cc6eb72595 ocfs2: catch an invalid ast case in dlmfs
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:36:16 -07:00
Mark Fasheh
1f7bc828e3 ocfs2: remove an overly aggressive BUG() in dlmfs
Don't BUG() user_dlm_unblock_lock() on the absence of the USER_LOCK_BLOCKED
flag - this turns out to be a valid case. Make some of the related BUG()
statements print more useful information.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:27:43 -07:00
Mark Fasheh
ab0920ce7e ocfs2: multi node truncate fix
Fix ocfs2_truncate_file() so that it forces a truncate_inode_pages() on all
interested nodes in all cases of a truncate(), not just allocation change.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 16:47:24 -07:00
Linus Torvalds
d69636157a Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fix page stealing LRU handling.
  [PATCH] splice: page stealing needs to wait_on_page_writeback()
  [PATCH] splice: export generic_splice_sendpage
  [PATCH] splice: add a SPLICE_F_MORE flag
  [PATCH] splice: add comments documenting more of the code
  [PATCH] splice: improve writeback and clean up page stealing
  [PATCH] splice: fix shadow[] filling logic
2006-04-02 14:22:06 -07:00
Jens Axboe
3e7ee3e7b3 [PATCH] splice: fix page stealing LRU handling.
Originally from Nick Piggin, just adapted to the newer branch.

You can't check PageLRU without holding zone->lru_lock.  The page
release code can get away with it only because the page refcount is 0 at
that point. Also, you can't reliably remove pages from the LRU unless
the refcount is 0. Ever.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:11:04 +02:00
Jens Axboe
ad8d6f0a78 [PATCH] splice: page stealing needs to wait_on_page_writeback()
Thanks to Andrew for the good explanation of why this is so. akpm writes:

If a page is under writeback and we remove it from pagecache, it's still
going to get written to disk.  But the VFS no longer knows about that page,
nor that this page is about to modify disk blocks.

So there might be scenarios in which those
blocks-which-are-about-to-be-written-to get reused for something else.
When writeback completes, it'll scribble on those blocks.

This won't happen in ext2/ext3-style filesystems in normal mode because the
page has buffers and try_to_release_page() will fail.

But ext2 in nobh mode doesn't attach buffers at all - it just sticks the
page in a BIO, finds some new blocks, points the BIO at those blocks and
lets it rip.

While that write IO's in flight, someone could truncate the file.  Truncate
won't block on the writeout because the page isn't in pagecache any more.
So truncate will then free the blocks from the file under the page's feet.
Then something else can reallocate those blocks.  Then write data to them.

Now, the original write completes, corrupting the filesystem.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:10:32 +02:00
Jens Axboe
059a8f3734 [PATCH] splice: export generic_splice_sendpage
Forgot that one, thanks Jeff. Also move the other EXPORT_SYMBOL
to right below the functions.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:06:05 +02:00
Jens Axboe
b2b39fa478 [PATCH] splice: add a SPLICE_F_MORE flag
This lets userspace indicate whether more data will be coming in a
subsequent splice call.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:41 +02:00
Jens Axboe
83f9135bdd [PATCH] splice: add comments documenting more of the code
Hopefully this will make Andrew a little more happy.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:09 +02:00
Jens Axboe
4f6f0bd2ff [PATCH] splice: improve writeback and clean up page stealing
By cleaning up the writeback logic (killing write_one_page() and the manual
set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and
just keep it local in pipe_to_file().

This also adds dirty page balancing logic and O_SYNC handling.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:46 +02:00
Jens Axboe
53cd9ae886 [PATCH] splice: fix shadow[] filling logic
Clear the entire range, and don't increment pidx or we keep filling
the same position again and again.

Thanks to KAMEZAWA Hiroyuki.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:21 +02:00
Linus Torvalds
a2308b7f08 Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
  [XFS] Provide XFS support for the splice syscall.
  [XFS] Reenable write barriers by default.
  [XFS] Make project quota enforcement return an error code consistent with
  [XFS] Implement the silent parameter to fill_super, previously ignored.
  [XFS] Cleanup comment to remove reference to obsoleted function
2006-04-02 13:11:25 -07:00
Greg Kroah-Hartman
6e0dd741a8 [PATCH] sysfs: zero terminate sysfs write buffers
No one should be writing a PAGE_SIZE worth of data to a normal sysfs
file, so properly terminate the buffer.
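
The idea can be sketched in userspace as a clamp-and-terminate copy.  The
16-byte buffer stands in for PAGE_SIZE, and fill_write_buffer_sketch() is a
hypothetical name, not the sysfs function.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE_SKETCH 16   /* stands in for PAGE_SIZE */

/* Copy at most BUF_SIZE_SKETCH - 1 bytes and always terminate, so the
 * buffer can be handed to string-parsing code safely. */
size_t fill_write_buffer_sketch(char buf[BUF_SIZE_SKETCH],
				const char *src, size_t count)
{
	if (count >= BUF_SIZE_SKETCH)
		count = BUF_SIZE_SKETCH - 1;
	memcpy(buf, src, count);
	buf[count] = '\0';
	return count;
}
```

A write of exactly the full buffer size is truncated by one byte, which is
acceptable given that no normal write is expected to be that large.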

Thanks to Al Viro for pointing out my stupidity here.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 13:03:31 -07:00
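The idea behind the sysfs commit above (6e0dd741a8) can be sketched in plain C: cap the copy at one byte less than the buffer size so the terminator always fits. This is a minimal sketch, not the kernel code; BUF_SIZE stands in for PAGE_SIZE and fill_write_buffer_sketch is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 16u   /* stand-in for PAGE_SIZE; value chosen for illustration */

/*
 * Hypothetical stand-in for a write handler's copy step: store at most
 * BUF_SIZE - 1 bytes so there is always room for the terminating NUL.
 * Returns the number of bytes stored.
 */
static size_t fill_write_buffer_sketch(char *dst, const char *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    if (count >= BUF_SIZE)
        count = BUF_SIZE - 1;   /* truncate instead of overrunning */
    memcpy(dst, src, count);
    dst[count] = '\0';          /* buffer is always a valid C string */
    return count;
}
```

A store handler that later calls string functions on the buffer can then rely on finding a terminator inside it.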
Linus Torvalds
63589ed078 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
  Documentation: fix minor kernel-doc warnings
  BUG_ON() Conversion in drivers/net/
  BUG_ON() Conversion in drivers/s390/net/lcs.c
  BUG_ON() Conversion in mm/slab.c
  BUG_ON() Conversion in mm/highmem.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/ptrace.c
  BUG_ON() Conversion in ipc/shm.c
  BUG_ON() Conversion in fs/freevxfs/
  BUG_ON() Conversion in fs/udf/
  BUG_ON() Conversion in fs/sysv/
  BUG_ON() Conversion in fs/inode.c
  BUG_ON() Conversion in fs/fcntl.c
  BUG_ON() Conversion in fs/dquot.c
  BUG_ON() Conversion in md/raid10.c
  BUG_ON() Conversion in md/raid6main.c
  BUG_ON() Conversion in md/raid5.c
  Fix minor documentation typo
  BFP->BPF in Documentation/networking/tuntap.txt
  ...
2006-04-02 12:58:45 -07:00
Linus Torvalds
29e350944f splice: add SPLICE_F_NONBLOCK flag
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 12:46:35 -07:00
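A small userspace probe can show what the SPLICE_F_NONBLOCK commit above (29e350944f) means for the pipe side. The sketch splices from an empty pipe whose write end is still open. On current Linux this is expected to fail with EAGAIN rather than block, as described in splice(2). The helper name is hypothetical.

```c
#define _GNU_SOURCE   /* needed for splice() and SPLICE_F_NONBLOCK */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical probe: try to splice from an empty pipe into out_fd with
 * SPLICE_F_NONBLOCK. Returns the errno reported by splice(), or 0 if
 * the call did not fail.
 */
static int splice_from_empty_pipe(int out_fd)
{
    int p[2];
    int err = 0;

    assert(out_fd >= 0);
    if (pipe(p) < 0)
        return errno;
    /* The write end stays open, so without the flag this call would wait. */
    if (splice(p[0], NULL, out_fd, NULL, 4096, SPLICE_F_NONBLOCK) < 0)
        err = errno;
    close(p[0]);
    close(p[1]);
    return err;
}
```

As the commit message notes, the flag covers only the pipe operations. Whether the other descriptor blocks still depends on its own O_NONBLOCK setting.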
Martin Waitz
a580290c3e Documentation: fix minor kernel-doc warnings
This patch updates the comments to match the actual code.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:59:55 +02:00
Eric Sesterhenn
7ec7073809 BUG_ON() Conversion in fs/freevxfs/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:41:02 +02:00
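The BUG_ON() conversions in this series all follow one pattern, which can be sketched in userspace as below. The macros are stand-ins written for this example, not the kernel's definitions, and they assume a GCC or Clang compiler for __builtin_expect. The put_ref_* helpers are hypothetical.

```c
#include <assert.h>   /* only used by callers that check the helpers */
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins, NOT the kernel's definitions. */
#define unlikely(x)  __builtin_expect(!!(x), 0)
#define BUG()        do { \
        fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
        abort(); \
    } while (0)
#define BUG_ON(cond) do { if (unlikely(cond)) BUG(); } while (0)

/* Before: open-coded check, no branch hint. */
static int put_ref_open_coded(int count)
{
    if (count <= 0)
        BUG();
    return count - 1;
}

/* After: same check in one line, with the unlikely() hint built in. */
static int put_ref_bug_on(int count)
{
    BUG_ON(count <= 0);
    return count - 1;
}
```

Both helpers behave the same for valid input. The second form keeps the condition in one place, which is what lets a build option compile such checks out.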
Eric Sesterhenn
2c2111c2bd BUG_ON() Conversion in fs/udf/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:40:13 +02:00
Eric Sesterhenn
d6735bfcc9 BUG_ON() Conversion in fs/sysv/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:39:21 +02:00
Eric Sesterhenn
b7542f8c7e BUG_ON() Conversion in fs/inode.c
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:38:18 +02:00
Eric Sesterhenn
f6298aab2e BUG_ON() Conversion in fs/fcntl.c
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:37:19 +02:00
Eric Sesterhenn
8abf6a4707 BUG_ON() Conversion in fs/dquot.c
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:36:13 +02:00
Adrian Bunk
733f896927 Merge with git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
2006-04-02 10:37:38 +02:00
Linus Torvalds
547a77ae62 Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Fix typo in earlier cifs_unlink change and protect one
  [CIFS] Incorrect signature sent on SMB Read
  [CIFS] Fix unlink oops when indirectly called in rename error path
  [CIFS] Fix two remaining coverity scan tool warnings.
  [CIFS] Set correct lock type on new posix unlock call
  [CIFS] Upate cifs change log
  [CIFS] Fix slow oplock break response when mounts to different
  [CIFS] Workaround various server bugs found in testing at connectathon
  [CIFS] Allow fallback for setting file size to Procom SMB server when
  [CIFS] Make POSIX CIFS Extensions SetFSInfo match exactly what we want
  [CIFS] Move noisy debug message (triggerred by some older servers) from
  [CIFS] Use correct pid on new cifs posix byte range lock call
  [CIFS] Add posix (advisory) byte range locking support to cifs client
  [CIFS] CIFS readdir perf optimizations part 1
  [CIFS] Free small buffers earlier so we exceed the cifs
  [CIFS] Fix large (ie over 64K for MaxCIFSBufSize) buffer case for wrapping
  [CIFS] Convert remaining places in fs/cifs from
  [CIFS] SessionSetup cleanup part 2
  [CIFS] fix compile error (typo) and warning in cifssmb.c
  [CIFS] Cleanup NTLMSSP session setup handling
2006-03-31 21:27:53 -08:00
Eric Sesterhenn
99cee0cd75 BUG_ON() Conversion in fs/sysfs/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:18:38 +02:00
Eric Sesterhenn
5df0d31241 BUG_ON() Conversion in fs/smbfs/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:16:26 +02:00
Eric Sesterhenn
4b4d1cc733 BUG_ON() Conversion in fs/jffs2/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:15:35 +02:00
Eric Sesterhenn
0bf3ba538a BUG_ON() Conversion in fs/hfsplus/
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:14:43 +02:00
Eric Sesterhenn
7dddb12c63 BUG_ON() Conversion in fs/exec.c
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:13:38 +02:00
Eric Sesterhenn
d4569d2e69 BUG_ON() Conversion in fs/direct-io.c
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner and can be optimized away better.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:10:13 +02:00
Steve French
06bcfedd05 [CIFS] Fix typo in earlier cifs_unlink change and protect one
extra path.

Since cifs_unlink can also be called from the rename path and there
was one report of an oops, I am making the extra check for a null inode.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 22:43:50 +00:00
Steve French
e9917a000f [CIFS] Incorrect signature sent on SMB Read
Fixes Samba bug 3621 and kernel.org bug 6147

For servers which require SMB/CIFS packet signing, we were sending the
wrong signature (all zeros) on SMB Read requests.  The new cifs routine
to do signatures across an iovec was not complete, and SMB Read, unlike
the new SMBWrite2, did not fall back to the older routine for calculating
signatures (that is, to SendReceive and the older cifs_sign_smb, instead
of the more efficient SendReceive2 and the disabled cifs_sign_smb2).

This finishes up cifs_sign_smb2/cifs_calc_signature2 so that the callers
of SendReceive2 can get SMB/CIFS packet signatures.

Now that cifs_sign_smb2 is supported, we could start using it in
the write path, but this smaller fix does not include the change
to use SMBWrite2 when signatures are required (which, when enabled,
will make Writes more efficient and allocate less memory).
Currently Write2 is only used when signatures are not required,
but after more testing we will enable that as well.

Thanks to James Slepicka and Sam Flory for initial investigation.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 21:22:00 +00:00
Jes Sorensen
30c14e40ed [PATCH] avoid unaligned access when accessing poll stack
Commit 70674f95c0:

  [PATCH] Optimize select/poll by putting small data sets on the stack

resulted in the poll stack being 4-byte aligned on 64-bit architectures,
causing misaligned accesses to elements in the array.

This patch fixes it by declaring the stack in terms of 'long' instead
of 'char'.

Force alignment of poll and select stacks to long to avoid unaligned
access on 64 bit architectures.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:30:48 -08:00
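The alignment point in the commit above (30c14e40ed) can be checked in a few lines of C11. A char array only promises 1-byte alignment, while an array declared in terms of long must be aligned for long access. This is a minimal sketch with hypothetical names and an illustrative size, not the kernel's declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_BYTES 256   /* illustrative size, not the kernel's constant */

/* Returns 1 if p is suitably aligned for a long, 0 otherwise. */
static int is_long_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(long)) == 0;
}

/*
 * A buffer declared as long[] rather than char[]: the compiler has to
 * place it on a long boundary, wherever it sits in the frame.
 */
static int long_backed_buffer_is_aligned(void)
{
    long stack[STACK_BYTES / sizeof(long)];

    stack[0] = 0;   /* touch the buffer so it is really allocated */
    return is_long_aligned(stack);
}
```

With a char array the result would depend on what the compiler happened to place before it, which is the kind of accident the commit removes.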
Adrian Bunk
a244e1698a [PATCH] fs/namei.c: make lookup_hash() static
As announced, lookup_hash() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:01 -08:00
Eric W. Biederman
3e7e241f8c [PATCH] dcache: Add helper d_hash_and_lookup
It is very common to hash a dentry and then to call lookup.  If we take fs
specific hash functions into account the full hash logic can get ugly.
Further full_name_hash as an inline function is almost 100 bytes on x86 so
having a non-inline choice in some cases can measurably decrease code size.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
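The shape of a combined hash-and-lookup helper, as described in the commit above (3e7e241f8c), might look like the toy below. It is a sketch over a made-up hash table: the struct names, the hash function and table_insert are all hypothetical and unrelated to the real dcache code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8

struct entry { const char *name; struct entry *next; };
struct table { struct entry *bucket[NBUCKETS]; };

/* Toy string hash; NOT the kernel's full_name_hash(). */
static unsigned int name_hash(const char *name)
{
    unsigned int h = 0;

    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h;
}

/* Lookup that expects the caller to have hashed the name already. */
static struct entry *lookup(struct table *t, unsigned int hash, const char *name)
{
    for (struct entry *e = t->bucket[hash % NBUCKETS]; e; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* The helper: one out-of-line call instead of hash + lookup at each site. */
static struct entry *hash_and_lookup(struct table *t, const char *name)
{
    return lookup(t, name_hash(name), name);
}

static void table_insert(struct table *t, struct entry *e)
{
    unsigned int b = name_hash(e->name) % NBUCKETS;

    e->next = t->bucket[b];
    t->bucket[b] = e;
}
```

Callers then pass only the name, and the hashing policy lives in one place.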
Herbert Poetzl
e4e5d3fc80 [PATCH] cleanup in proc_check_chroot()
proc_check_chroot() does the check in a very unintuitive way (keeping a
copy of the argument, then modifying the argument), and has uncommented
side effects.

Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:59 -08:00
Trond Myklebust
993dfa8776 [PATCH] fs/locks.c: Fix sys_flock() race
sys_flock() currently has a race which can result in a double free in the
multi-thread case.

Thread 1			Thread 2

sys_flock(file, LOCK_EX)
				sys_flock(file, LOCK_UN)

If Thread 2 removes the lock from inode->i_flock before Thread 1 tests for
list_empty(&lock->fl_link) at the end of sys_flock, then both threads will
end up calling locks_free_lock for the same lock.

Fix is to make flock_lock_file() do the same as posix_lock_file(), namely
to make a copy of the request, so that the caller can always free the lock.

This also has the side-effect of fixing up a reference problem in the
lockd handling of flock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:56 -08:00
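The ownership rule behind the sys_flock() fix above (993dfa8776) can be sketched without threads: if the locking routine always inserts a private copy, the caller can free its request unconditionally and no pointer is ever freed twice. The names below are hypothetical and the list is a plain singly linked list, not the kernel's data structures.

```c
#include <assert.h>
#include <stdlib.h>

struct lock_sketch { int type; struct lock_sketch *next; };

/* Insert a private copy of *request; the caller keeps owning request. */
static int lock_file_copying(struct lock_sketch **list,
                             const struct lock_sketch *request)
{
    struct lock_sketch *copy = malloc(sizeof(*copy));

    if (!copy)
        return -1;
    *copy = *request;
    copy->next = *list;
    *list = copy;
    return 0;
}

/* Remove and free the first lock on the list (the "unlock" side). */
static int unlock_file(struct lock_sketch **list)
{
    struct lock_sketch *victim = *list;

    if (!victim)
        return -1;
    *list = victim->next;
    free(victim);
    return 0;
}

/* Caller side: the request is freed no matter what happened to the list. */
static int flock_sketch(struct lock_sketch **list, int type)
{
    struct lock_sketch *request = malloc(sizeof(*request));
    int err;

    if (!request)
        return -1;
    request->type = type;
    request->next = NULL;
    err = lock_file_copying(list, request);
    free(request);   /* always safe: the list never holds this pointer */
    return err;
}
```

The caller no longer has to ask whether its request ended up on the list, which is the question the racy list_empty() test was trying to answer.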
Amy Griffis
7a2bd3f7ef [PATCH] inotify: IN_DELETE events missing
IN_DELETE events are no longer generated for the removal of a file from a
watched directory.

This seems to be a result of clearing DCACHE_INOTIFY_PARENT_WATCHED in
d_delete() directly before calling fsnotify_nameremove().

Assuming the flag doesn't need to be cleared before dentry_iput(), this
should do the trick.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Acked-by: Robert Love <rml@novell.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
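A userspace check for the behaviour restored by the inotify commit above (7a2bd3f7ef) could look like the sketch below: watch a directory for IN_DELETE, remove a file from it, and see whether the event was queued. The helper name is hypothetical. It uses inotify_init1(), which is newer than this commit, so that the read cannot hang if no event arrives.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical check: create and unlink dir/name while watching dir.
 * Returns 1 if an IN_DELETE event naming the file was read, 0 if no
 * matching event was queued, -1 on setup errors.
 */
static int delete_event_seen(const char *dir, const char *name)
{
    char buf[sizeof(struct inotify_event) + NAME_MAX + 1]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    char path[PATH_MAX];
    const struct inotify_event *ev;
    int seen = -1;
    int ifd, fd;
    ssize_t n;

    ifd = inotify_init1(IN_NONBLOCK);
    if (ifd < 0)
        return -1;
    if (inotify_add_watch(ifd, dir, IN_DELETE) < 0)
        goto out;

    snprintf(path, sizeof(path), "%s/%s", dir, name);
    fd = open(path, O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        goto out;
    close(fd);
    if (unlink(path) < 0)
        goto out;

    /* Non-blocking read: fails with EAGAIN if nothing was queued. */
    n = read(ifd, buf, sizeof(buf));
    if (n >= (ssize_t)sizeof(*ev)) {
        ev = (const struct inotify_event *)buf;
        seen = (ev->mask & IN_DELETE) && ev->len > 0 &&
               strcmp(ev->name, name) == 0;
    } else {
        seen = 0;
    }
out:
    close(ifd);
    return seen;
}
```

Pointing it at a scratch directory, for instance one made with mkdtemp(), keeps the test file away from real data.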
OGAWA Hirofumi
094e320d76 [PATCH] fat: kill reserved names
On old MS-DOS these names were used as device names, so the current fat
driver doesn't allow a user to create them.  But many OSes, and now even
Windows, can actually create those names.

This patch removes the reserved name check.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The syscall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
(Same with fsync() and fdatasync().)

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00
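From userspace, the syscall added by the commit above (f79e2abb9b) is reachable through the glibc wrapper sync_file_range(). The sketch below starts writeout of a byte range and waits for it in a single call. SYNC_FILE_RANGE_WRITE is named in the commit message. The two wait flags are taken from the sync_file_range(2) manual page, and flush_range is a hypothetical helper name.

```c
#define _GNU_SOURCE   /* needed for sync_file_range() in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for earlier writeout of the range, start
 * writeout of dirty pages in it, then wait again, all in one syscall.
 * Returns 0 on success or a negative errno value.
 */
static int flush_range(int fd, off64_t offset, off64_t nbytes)
{
    unsigned int flags = SYNC_FILE_RANGE_WAIT_BEFORE |
                         SYNC_FILE_RANGE_WRITE |
                         SYNC_FILE_RANGE_WAIT_AFTER;

    assert(fd >= 0);
    if (sync_file_range(fd, offset, nbytes, flags) < 0)
        return -errno;
    return 0;
}
```

Passing only SYNC_FILE_RANGE_WRITE would request the `async' mode the commit message mentions, which may still turn synchronous when the queue is congested.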
Joe Korty
68eef3b479 [PATCH] Simplify proc/devices and fix early termination regression
Make baby-simple the code for /proc/devices.  Based on the proven design
for /proc/interrupts.

This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:

    # dd if=/proc/devices bs=1
    Character devices:
      1 mem
    27+0 records in
    27+0 records out

This should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.

[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:53 -08:00
Miklos Szeredi
5ce29646eb [PATCH] locks: don't panic
Don't panic!  Just BUG_ON().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:52 -08:00
Al Viro
347f217cec [PATCH] uml: __user annotations
__user annotations (hppfs)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:51 -08:00
Steve French
f1682e94a3 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 15:43:35 +00:00
Jeff Garzik
a0f0678025 [PATCH] splice exports
Woe be unto he who builds their filesystems as modules.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
[ Obscure quote from the infamous geek bible? ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 22:16:24 -08:00
Steve French
6910ab30a2 [CIFS] Fix unlink oops when indirectly called in rename error path
under heavy stress.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 03:37:08 +00:00