Commit Graph

6247 Commits

Author SHA1 Message Date
David Woodhouse
ac0c955d50 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2007-08-23 10:43:14 +01:00
Oleg Nesterov
abd96ecb29 exec: kill unsafe BUG_ON(sig->count) checks
de_thread:

	if (atomic_read(&oldsighand->count) <= 1)
		BUG_ON(atomic_read(&sig->count) != 1);

This is not safe without the rmb() in between.  The results of two
correctly ordered __exit_signal()->atomic_dec_and_test()'s could be seen
out of order on our CPU.

The same is true for the "thread_group_empty()" case, __unhash_process()'s
changes could be seen before atomic_dec_and_test(&sig->count).

On some platforms (including i386) atomic_read() doesn't provide even the
compiler barrier, in that case these checks are simply racy.

Remove these BUG_ON()'s. Alternatively, we can do something like

	BUG_ON( ({ smp_rmb(); atomic_read(&sig->count) != 1; }) );

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:47 -07:00
Ian Kent
1864f7bd58 autofs4: deadlock during create
Due to inconsistent locking in the VFS between calls to lookup and
revalidate deadlock can occur in the automounter.

The inconsistency is that the directory inode mutex is held for both lookup
and revalidate calls when called via lookup_hash whereas it is held only
for lookup during a path walk.  Consequently, if the mutex is held during a
call to revalidate autofs4 can't release the mutex to callback the daemon
as it can't know whether it owns the mutex.

This situation happens when a process tries to create a directory within an
automount and a second process also tries to create the same directory
between the lookup and the mkdir.  Since the first process has dropped the
mutex for the daemon callback, the second process takes it during
revalidate leading to deadlock between the autofs daemon and the second
process when the daemon tries to create the mount point directory.

After spending quite a bit of time trying to resolve this on more than one
occassion, using rather complex and ulgy approaches, it turns out that just
delaying the hashing of the dentry until the create operation works fine.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:46 -07:00
Oleg Nesterov
f9ee228bdc signalfd: make it group-wide, fix posix-timers scheduling
With this patch any thread can dequeue its own private signals via signalfd,
even if it was created by another sub-thread.

To do so, we pass "current" to dequeue_signal() if the caller is from the same
thread group. This also fixes the scheduling of posix timers broken by the
previous patch.

If the caller doesn't belong to this thread group, we can't handle __SI_TIMER
case properly anyway. Perhaps we should forbid the cross-process signalfd usage
and convert ctx->tsk to ctx->sighand.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Roland McGrath <roland@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:46 -07:00
Ryusuke Konishi
df06846416 eCryptfs: fix lookup error for special files
When ecryptfs_lookup() is called against special files, eCryptfs generates
the following errors because it tries to treat them like regular eCryptfs
files.

Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags [0x8000]
Error opening lower_file to read header region
Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95]
Valid metadata not found in header region or xattr region; treating file as unencrypted

For instance, the problem can be reproduced by the steps below.

  # mkdir /root/crypt /mnt/crypt
  # mount -t ecryptfs /root/crypt /mnt/crypt
  # mknod /mnt/crypt/c0 c 0 0
  # umount /mnt/crypt
  # mount -t ecryptfs /root/crypt /mnt/crypt
  # ls -l /mnt/crypt

This patch fixes it by adding a check similar to directories and
symlinks.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:44 -07:00
Andrew Morton
f4e35647f5 [JFFS2] fix printk warning in jffs2_block_check_erase()
fs/jffs2/erase.c: In function 'jffs2_block_check_erase':
fs/jffs2/erase.c:355: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long unsigned int'

and

fs/jffs2/erase.c: In function 'jffs2_erase_pending_blocks':
fs/jffs2/erase.c:404: warning: 'bad_offset' may be used uninitialized in this function

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-22 12:41:48 +01:00
David Woodhouse
9ed437c50d [JFFS2] Fix ACL vs. mode handling.
When POSIX ACL support was enabled, we weren't writing correct
legacy modes to the medium on inode creation, or when the ACL was set.
This meant that the permissions would be incorrect after the file system
was remounted.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-22 12:39:19 +01:00
Zach Brown
848c4dd515 dio: zero struct dio with kzalloc instead of manually
This patch uses kzalloc to zero all of struct dio rather than manually
trying to track which fields we rely on being zero.  It passed aio+dio
stress testing and some bug regression testing on ext3.

This patch was introduced by Linus in the conversation that lead up to
Badari's minimal fix to manually zero .map_bh.b_state in commit:

  6a648fa721

It makes the code a bit smaller.  Maybe a couple fewer cachelines to
load, if we're lucky:

   text    data     bss     dec     hex filename
3285925  568506 1304616 5159047  4eb887 vmlinux
3285797  568506 1304616 5158919  4eb807 vmlinux.patched

I was unable to measure a stable difference in the number of cpu cycles
spent in blockdev_direct_IO() when pushing aio+dio 256K reads at
~340MB/s.

So the resulting intent of the patch isn't a performance gain but to
avoid exposing ourselves to the risk of finding another field like
.map_bh.b_state where we rely on zeroing but don't enforce it in the
code.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-20 22:50:25 -07:00
David Woodhouse
b574864333 JFFS2 locking regression fix.
Commit a491486a20 introduced a locking
problem in JFFS2 -- we up() the alloc_sem when we weren't previously
holding it. This leads to all kinds of fun behaviour later.

There was a _reason_ for the
	if (1 /* alternative path needs testing */ ||
which the above-mentioned commit removed :)

Discovered and debugged by Giulio Fedel <giulio.fedel@andorsystems.com>

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-20 22:44:27 -07:00
Linus Torvalds
edd5f25f74 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Check return code on failed alloc
  [CIFS] Update CIFS project web site
  [CIFS] Fix hang in find_writable_file
2007-08-18 09:30:07 -07:00
Marcel Holtmann
d2d56c5f51 Reset current->pdeath_signal on SUID binary execution
This fixes a vulnerability in the "parent process death signal"
implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd.
and iSEC Security Research.

http://marc.info/?l=bugtraq&m=118711306802632&w=2

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-18 09:29:07 -07:00
Cyrill Gorcunov
5e6e623275 [CIFS] Check return code on failed alloc
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-08-18 00:15:20 +00:00
Steven Whitehouse
d18c4d687d [GFS2] Revert remounting w/o acl option leaves acls enabled
This reverts commit 569a7b6c2e. The
code was correct originally. The default setting for ACLs after a
remount should be to be the same as before the remount.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:34:40 +01:00
Steven Whitehouse
b9af7ca6d3 [GFS2] Fix setting of inherit jdata attr
Due to a mix up between the jdata attribute and inherit jdata attribute
it has not been possible to set the inherit jdata attribute on
directories. This is now fixed and the ioctl will report the inherit
jdata attribute for directories rather than the jdata attribute as it
did previously. This stems from our need to have the one bit in the
ioctl attr flags mean two different things according to whether the
underlying inode is a directory or not.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:34:11 +01:00
Steven Whitehouse
a867bb28c1 [GFS2] Fix incorrect error path in prepare_write()
The error path in prepare_write() was incorrect in the (very rare) event
that the transaction fails to start. The following prevents a NULL
pointer dereference,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:33:44 +01:00
Steven Whitehouse
6eefaf61f6 [GFS2] Fix incorrect return code in rgrp.c
The following patch fixes a bug where 0 was being used as a return code
to indicate "nothing to do" when in fact 0 was a valid block location
which might be returned by the function.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:33:15 +01:00
Bob Peterson
24c7387333 [GFS2] soft lockup in rgblk_search
This patch seems to fix the problem described in bugzilla bug 246114.
It was written by Steve Whitehouse with some tweaking by me.

The code was looping in the relatively new section of code designed to
search for and reuse unlinked inodes.  In cases where it was finding an
appropriate inode to reuse, it was looping around and finding the same
block over and over because a "<=" check should have been a "<" when
comparing the goal block to the last unlinked block found.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:32:43 +01:00
Bob Peterson
bdcb88562c [GFS2] soft lockup detected in databuf_lo_before_commit
This is part 2 of the patch for bug #245832, part 1 of which is already
in the git tree.

The problem was that sdp->sd_log_num_databuf was not always being
protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf
(which it is supposed to reflect) was protected.  That meant there
was a timing window during which gfs2_log_flush called
databuf_lo_before_commit and the count didn't match what was
really on the linked list in that window.  So when it ran out of
items on the linked list, it decremented total_dbuf from 0 to -1 and
thus never left the "while(total_dbuf)" loop.

The solution is to protect the variable sdp->sd_log_num_databuf so
that the value will always match the contents of the linked list,
and therefore the number will never go negative, and therefore, the
loop will be exited properly.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:32:04 +01:00
David Teigland
3650925893 [DLM] fix basts for granted PR waiting CW
Fix a long standing bug where a blocking callback would be missed
when there's a granted lock in PR mode and waiting locks in both
PR and CW modes (and the PR lock was added to the waiting queue
before the CW lock).  The logic simply compared the numerical values
of the modes to determine if a blocking callback was required, but in
the one case of PR and CW, the lower valued CW mode blocks the higher
valued PR mode.  We just need to add a special check for this PR/CW
case in the tests that decide when a blocking callback is needed.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:31:02 +01:00
Patrick Caulfield
9e5f2825a8 [DLM] More othercon fixes
The last patch to clean out 'othercon' structures only fixed half the problem.
The attached addresses the other situations too, and fixes bz#238490

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:30:36 +01:00
Jesper Juhl
1a2bf2eefb [DLM] Fix memory leak in dlm_add_member() when dlm_node_weight() returns less than zero
There's a memory leak in fs/dlm/member.c::dlm_add_member().

If "dlm_node_weight(ls->ls_name, nodeid)" returns < 0, then
we'll return without freeing the memory allocated to the (at
that point yet unused) 'memb'.
This patch frees the allocated memory in that case and thus
avoids the leak.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:30:04 +01:00
Patrick Caulfield
01c8cab258 [DLM] zero unused parts of sockaddr_storage
When we build a sockaddr_storage for an IP address, clear the unused parts as
they could be used for node comparisons.

I have seen this occasionally make sctp connections fail.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:29:27 +01:00
David Teigland
41684f9547 [DLM] fix NULL ls usage
Fix regression in recent patch "[DLM] variable allocation" which
attempts to dereference an "ls" struct when it's NULL.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:28:44 +01:00
Patrick Caulfield
25720c2d73 [DLM] Clear othercon pointers when a connection is closed
This patch clears the othercon pointer and frees the memory when a connnection
is closed. This could cause a small memory leak when nodes leave the cluster.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14 10:28:05 +01:00
Linus Torvalds
886c818348 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: set non-default s_time_gran during mount
  ocfs2: Retry sendpage() if it returns EAGAIN
  ocfs2: Fix rename/extend race
  [2.6 patch] ocfs2_insert_extent(): remove dead code
  ocfs2: Fix max offset calculations
  ocfs2: check ia_size limits in setattr
  ocfs2: Fix some casting errors related to file writes
  ocfs2: use s_maxbytes directly in ocfs2_change_file_space()
  ocfs2: Restrict inode changes in ocfs2_update_inode_atime()
2007-08-11 16:01:34 -07:00
Ryusuke Konishi
a75de1b379 eCryptfs: fix error handling in ecryptfs_init
ecryptfs_init() exits without doing any cleanup jobs if
ecryptfs_init_messaging() fails.  In that case, eCryptfs leaves
sysfs entries, leaks memory, and causes an invalid page fault.
This patch fixes the problem.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:40 -07:00
Ryusuke Konishi
202a21d691 eCryptfs: fix lookup error for special files
When ecryptfs_lookup() is called against special files, eCryptfs generates
the following errors because it tries to treat them like regular eCryptfs
files.

Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags
[0x8000]
Error opening lower_file to read header region
Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95]
Valid metadata not found in header region or xattr region; treating file as unencrypted

For instance, the problem can be reproduced by the steps below.

  # mkdir /root/crypt /mnt/crypt
  # mount -t ecryptfs /root/crypt /mnt/crypt
  # mknod /mnt/crypt/c0 c 0 0
  # umount /mnt/crypt
  # mount -t ecryptfs /root/crypt /mnt/crypt
  # ls -l /mnt/crypt

This patch fixes it by adding a check similar to directories and
symlinks.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:40 -07:00
Badari Pulavarty
6a648fa721 direct-io: fix error-path crashes
Need to initialize map_bh.b_state to zero.  Otherwise, in case of a faulty
user-buffer its possible to go into dio_zero_block() and submit a page by
mistake - since it checks for buffer_new().

http://marc.info/?l=linux-kernel&m=118551339032528&w=2

akpm: Linus had a (better) patch to just do a kzalloc() in there, but it got
lost.  Probably this version is better for -stable anwyay.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Joe Jin <joe.jin@oracle.com>
Acked-by: Zach Brown <zach.brown@oracle.com>
Cc: gurudas pai <gurudas.pai@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:40 -07:00
Mark Fasheh
e0dceaf0a4 ocfs2: set non-default s_time_gran during mount
We need to manually set this to '1' during mount, otherwise inode_setattr()
will chop off the nanosecond portion of our timestamps.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:27:58 -07:00
Sunil Mushran
ce17204ae6 ocfs2: Retry sendpage() if it returns EAGAIN
Instead of treating EAGAIN, returned from sendpage(), as an error, this
patch retries the operation.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:27:38 -07:00
Sunil Mushran
480214d71f ocfs2: Fix rename/extend race
If one process is extending a file while another is renaming it, there
exists a window when rename could flush the old inode's stale i_size to
disk. This patch recognizes the fact that rename is only updating the old
inode's ctime, so it ensures only that value is flushed to disk.

Signed-off-by: Sunil Mushran <sunil.musran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:27:10 -07:00
Adrian Bunk
6a18380e7d [2.6 patch] ocfs2_insert_extent(): remove dead code
This patch removes some now dead code.

Spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:26:03 -07:00
Mark Fasheh
5a25403175 ocfs2: Fix max offset calculations
ocfs2_max_file_offset() was over-estimating the largest file size for
several cases. This wasn't really a problem before, but now that we support
sparse files, it needs to be more accurate.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:25:49 -07:00
Mark Fasheh
ce76fd30ce ocfs2: check ia_size limits in setattr
We have to manually check the requested truncate size as the check in
vmtruncate() comes too late for Ocfs2.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:25:38 -07:00
Mark Fasheh
7c08d70c69 ocfs2: Fix some casting errors related to file writes
ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to
pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating
destination start for memcpy.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:25:27 -07:00
Mark Fasheh
a00cce356b ocfs2: use s_maxbytes directly in ocfs2_change_file_space()
There's no need to recalculate things via ocfs2_max_file_offset() as we've
already done that to fill s_maxbytes, so use that instead. We can also
un-export ocfs2_max_file_offset() then.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:25:07 -07:00
Mark Fasheh
c11e9fafb3 ocfs2: Restrict inode changes in ocfs2_update_inode_atime()
ocfs2_update_inode_atime() calls ocfs2_mark_inode_dirty() to push changes
from the struct inode into the ocfs2 disk inode. The problem is,
ocfs2_mark_inode_dirty() might change other fields, depending on what
happened to the struct inode. Since we don't always have locking to
serialize changes to other fields (like i_size, etc), just fix things up to
only touch the atime field.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-08-09 17:23:50 -07:00
Linus Torvalds
8b80fc02b8 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
  SUNRPC: Replace flush_workqueue() with cancel_work_sync() and friends
  NFS: Replace flush_scheduled_work with cancel_work_sync() and friends
  SUNRPC: Don't call gss_delete_sec_context() from an rcu context
  NFSv4: Don't call put_rpccred() from an rcu callback
  NFS: Fix NFSv4 open stateid regressions
  NFSv4: Fix a locking regression in nfs4_set_mode_locked()
  NFS: Fix put_nfs_open_context
  SUNRPC: Fix a race in rpciod_down()
2007-08-09 08:38:14 -07:00
David Woodhouse
09b3fba562 [JFFS2] Correct cleanmarker checks -- we should use only 8 bytes
Commit a7a6ace140 revamped the OOB
handling but accidentally switched to 12-byte cleanmarkers, which is
incompatible with what 'flash_eraseall -j' will do. So using
flash_eraseall -j and then trying to mount the 'empty' flash will fail,
because the cleanmarkers aren't recognised.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-09 17:28:20 +08:00
Trond Myklebust
3d39c691ff NFS: Replace flush_scheduled_work with cancel_work_sync() and friends
This will avoid deadlocks of the form:

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ee42>] __lock_acquire+0xc22/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c012edd9>] flush_workqueue+0x49/0x70
 [<c012ee0d>] flush_scheduled_work+0xd/0x10
 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
 [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
 [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
 [<c017b38d>] deactivate_super+0x7d/0xa0
 [<c018f94b>] mntput_no_expire+0x4b/0x80
 [<c018fd94>] expire_mount_list+0xe4/0x140
 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
 [<c012e61b>] run_workqueue+0x12b/0x1e0
 [<c012f05b>] worker_thread+0x9b/0x100
 [<c0131c72>] kthread+0x42/0x70
 [<c0104c0f>] kernel_thread_helper+0x7/0x18
 =======================

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 16:12:50 -04:00
Trond Myklebust
905f8d16e3 NFSv4: Don't call put_rpccred() from an rcu callback
Doing so would require us to introduce bh-safe locks into put_rpccred().
This patch fixes the lockdep complaint reported by Marc Dietrich:

inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (rpc_credcache_lock){-+..}, at: [<c01dc487>]
_atomic_dec_and_lock+0x17/0x60
{softirq-on-W} state was registered at:
  [<c013e870>] __lock_acquire+0x650/0x1030
  [<c013f2b1>] lock_acquire+0x61/0x80
  [<c02db9ac>] _spin_lock+0x2c/0x40
  [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
  [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
  [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
  [<dced3fd4>] a0 [sunrpc]
  [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
  [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
  [<dced65b3>] svc_register+0x93/0x160 [sunrpc]
  [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
  [<dced7053>] svc_create+0x13/0x20 [sunrpc]
  [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
  [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
  [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
  [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
  [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
  [<c017b444>] vfs_kern_mount+0x94/0x110
  [<c0190a62>] do_mount+0x1f2/0x7d0
  [<c01910a6>] sys_mount+0x66/0xa0
  [<c0104046>] syscall_call+0x7/0xb
  [<ffffffff>] 0xffffffff
irq event stamp: 5277830
hardirqs last  enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0
hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0
softirqs last  enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0
softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50

other info that might help us debug this:
no locks held by swapper/0.

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ccc3>] print_usage_bug+0x153/0x160
 [<c013d8b9>] mark_lock+0x449/0x620
 [<c013e824>] __lock_acquire+0x604/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c02db9ac>] _spin_lock+0x2c/0x40
 [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
 [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
 [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs]
 [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0
 [<c012fb52>] rcu_process_callbacks+0x12/0x30
 [<c0124218>] tasklet_action+0x38/0x80
 [<c0124125>] __do_softirq+0x55/0xc0
 [<c01241d7>] do_softirq+0x47/0x50
 [<c0124605>] irq_exit+0x35/0x40
 [<c0112463>] smp_apic_timer_interrupt+0x43/0x80
 [<c0104a77>] apic_timer_interrupt+0x33/0x38
 [<c02690df>] cpuidle_idle_call+0x6f/0x90
 [<c01023c3>] cpu_idle+0x43/0x70
 [<c02d8c27>] rest_init+0x47/0x50
 [<c03bcb6a>] start_kernel+0x22a/0x2b0
 [<00000000>] 0x0
 =======================

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:15:57 -04:00
Trond Myklebust
45328c354e NFS: Fix NFSv4 open stateid regressions
Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been
previously opened in these modes.

Also Fix the calculation of the mode in nfs4_close_prepare. We should only
issue an OPEN_DOWNGRADE if we're sure that we will still be holding the
correct open modes. This may not be the case if we've been doing delegated
opens.

Finally, there is no need to adjust the open mode bit flags in
nfs4_close_done(): that has already been done in nfs4_close_prepare().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:19 -04:00
Trond Myklebust
ba683031fa NFSv4: Fix a locking regression in nfs4_set_mode_locked()
We don't really need to clear &state->inode_states inside
nfs4_set_mode_locked, and doing so without holding the inode->i_lock would
in any case be a bug...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:18 -04:00
Trond Myklebust
5e11934d13 NFS: Fix put_nfs_open_context
We need to grab the inode->i_lock atomically with the last reference put in
order to remove the open context that is being freed from the
nfsi->open_files list.

Fix by converting the kref to a standard atomic counter and then using
atomic_dec_and_lock()...

Thanks to Arnd Bergmann for pointing out the problem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:17 -04:00
Masakazu Mokuno
313b0d3d86 [PATCH] remove duplicated ioctl entries in compat_ioctl.c
This patch removes some duplicated wireless ioctl entries in the array
'struct ioctl_trans ioctl_start[]' of fs/compat_ioctl.c

These entries are registered twice like:

	COMPATIBLE_IOCTL(SIOCGIWPRIV)

and

	HANDLE_IOCTL(SIOCGIWPRIV, do_wireless_ioctl)

Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-08-06 15:06:03 -04:00
David Woodhouse
b8e3ec30c2 [JFFS2] Print correct node offset when complaining about broken data CRC
Debugging the hardware problems in OLPC trac #1905 would be a whole lot
easier if the correct node offsets were printed for the offending nodes.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-02 21:43:46 +01:00
David Woodhouse
7b687707d7 [JFFS2] Fix suspend failure with JFFS2 GC thread.
The try_to_freeze() call was in the wrong place; we need it in the
signal-pending loop now that a pending freeze also makes
signal_pending() return true.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-02 21:43:03 +01:00
David Woodhouse
71c2339775 [JFFS2] Deletion dirents should be REF_NORMAL, not REF_PRISTINE.
Otherwise they'll never actually get garbage-collected.
Noted by Jonathan Larmour.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-02 21:39:50 +01:00
Joakim Tjernlund
5bd5c03c31 [JFFS2] Prevent oops after 'node added in wrong place' debug check
jffs2_add_physical_node_ref() should never really return error -- it's
an internal debugging check which triggered. We really need to work out
why and stop it happening. But in the meantime, let's make the failure
mode a little less nasty.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-02 21:36:35 +01:00
David Woodhouse
3ca135e16a [JFFS2] LZO compression should default off for compatibility.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-08-02 16:32:02 +01:00