linux/fs
Joel Becker 0e0333429a configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item()
When attaching default groups (subdirs) of a new group (in mkdir() or
in configfs_register()), configfs recursively takes inode's mutexes
along the path from the parent of the new group to the default
subdirs. This is needed to ensure that the VFS will not race with
operations on these sub-dirs. This is safe for the following reasons:

- the VFS allows one to lock first an inode and second one of its
  children (The lock subclasses for this pattern are respectively
  I_MUTEX_PARENT and I_MUTEX_CHILD);
- from this rule any inode path can be recursively locked in
  descending order as long as it stays under a single mountpoint and
  does not follow symlinks.

Unfortunately lockdep does not know (yet?) how to handle such
recursion.

I've tried to use Peter Zijlstra's lock_set_subclass() helper to
upgrade i_mutexes from I_MUTEX_CHILD to I_MUTEX_PARENT when we know
that we might recursively lock some of their descendant, but this
usage does not seem to fit the purpose of lock_set_subclass() because
it leads to several i_mutex locked with subclass I_MUTEX_PARENT by
the same task.

>From inside configfs it is not possible to serialize those recursive
locking with a top-level one, because mkdir() and rmdir() are already
called with inodes locked by the VFS. So using some
mutex_lock_nest_lock() is not an option.

I am proposing two solutions:
1) one that wraps recursive mutex_lock()s with
   lockdep_off()/lockdep_on().
2) (as suggested earlier by Peter Zijlstra) one that puts the
   i_mutexes recursively locked in different classes based on their
   depth from the top-level config_group created. This
   induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the
   nesting of configfs default groups whenever lockdep is activated
   but this limit looks reasonably high. Unfortunately, this alos
   isolates VFS operations on configfs default groups from the others
   and thus lowers the chances to detect locking issues.

This patch implements solution 1).

Solution 2) looks better from lockdep's point of view, but fails with
configfs_depend_item(). This needs to rework the locking
scheme of configfs_depend_item() by removing the variable lock recursion
depth, and I think that it's doable thanks to the configfs_dirent_lock.
For now, let's stick to solution 1).

Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-02-02 14:20:18 -08:00
..
9p fs/Kconfig: move 9p out 2009-01-22 13:16:01 +03:00
adfs fs/Kconfig: move adfs out 2009-01-22 13:15:56 +03:00
affs fs/Kconfig: move affs out 2009-01-22 13:15:56 +03:00
afs fs/Kconfig: move afs out 2009-01-22 13:16:01 +03:00
autofs fs/Kconfig: move autofs, autofs4 out 2009-01-22 13:15:54 +03:00
autofs4 fs/Kconfig: move autofs, autofs4 out 2009-01-22 13:15:54 +03:00
befs fs/Kconfig: move befs out 2009-01-22 13:15:57 +03:00
bfs fs/Kconfig: move bfs out 2009-01-22 13:15:57 +03:00
btrfs fs/Kconfig: move btrfs out 2009-01-22 13:15:54 +03:00
cifs cifs: make sure we allocate enough storage for socket address 2009-01-29 03:32:13 +00:00
coda fs/Kconfig: move coda out 2009-01-22 13:16:01 +03:00
configfs configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item() 2009-02-02 14:20:18 -08:00
cramfs fs/Kconfig: move cramfs out 2009-01-22 13:15:58 +03:00
debugfs debugfs: add helpers for exporting a size_t simple value 2009-01-07 10:00:16 -08:00
devpts zero i_uid/i_gid on inode allocation 2009-01-05 11:54:28 -05:00
dlm dlm: initialize file_lock struct in GETLK before copying conflicting lock 2009-01-21 15:28:45 -06:00
ecryptfs fs/Kconfig: move ecryptfs out 2009-01-22 13:15:56 +03:00
efs fs/Kconfig: move efs out 2009-01-22 13:15:57 +03:00
exportfs Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
ext2 ext2: also update the inode on disk when dir is IS_DIRSYNC 2009-01-15 16:39:42 -08:00
ext3 ext3: Add sanity check to make_indexed_dir 2009-01-16 11:13:47 -05:00
ext4 ext4: Remove bogus BUG() check in ext4_bmap() 2009-01-30 00:00:24 -05:00
fat fs/Kconfig: move fat out 2009-01-22 13:15:55 +03:00
freevxfs fs/Kconfig: move vxfs out 2009-01-22 13:15:58 +03:00
fuse Merge branch 'Kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/misc 2009-01-26 10:08:50 -08:00
gfs2 filesystem freeze: add error handling of write_super_lockfs/unlockfs 2009-01-09 16:54:42 -08:00
hfs fs/Kconfig: move hfs, hfsplus out 2009-01-22 13:15:57 +03:00
hfsplus fs/Kconfig: move hfs, hfsplus out 2009-01-22 13:15:57 +03:00
hostfs fs: symlink write_begin allocation context fix 2009-01-04 13:33:20 -08:00
hpfs fs/Kconfig: move hpfs out 2009-01-22 13:15:59 +03:00
hppfs CRED: Use creds in file structs 2008-11-14 10:39:25 +11:00
hugetlbfs hugetlb: unsigned ret cannot be negative 2009-01-06 15:59:08 -08:00
isofs fs/Kconfig: move iso9660, udf out 2009-01-22 13:15:55 +03:00
jbd jbd: remove excess kernel-doc notation 2009-01-08 08:31:01 -08:00
jbd2 ext4: fix wrong use of do_div 2009-01-11 22:34:01 -05:00
jffs2 [JFFS2] remove junk prototypes 2009-01-09 21:05:21 +00:00
jfs fs/Kconfig: move jfs out 2009-01-22 13:15:54 +03:00
lockd NLM: Clean up flow of control in make_socks() function 2009-01-07 15:40:44 -05:00
minix fs/Kconfig: move minix out 2009-01-22 13:15:58 +03:00
ncpfs fs/Kconfig: move the rest of ncpfs out 2009-01-22 13:16:01 +03:00
nfs fs/Kconfig: move nfs out 2009-01-22 13:16:00 +03:00
nfs_common SUNRPC: nfsacl_encode/nfsacl_decode should be exported as GPL-only 2008-12-23 15:21:32 -05:00
nfsd nfsd: only set file_lock.fl_lmops in nfsd4_lockt if a stateowner is found 2009-01-27 17:26:59 -05:00
nls
notify inotify: clean up inotify_read and fix locking problems 2009-01-26 10:08:05 -08:00
ntfs fs/Kconfig: move ntfs out 2009-01-22 13:15:55 +03:00
ocfs2 ocfs2: Fix possible deadlock in ocfs2_write_dquot() 2009-02-02 14:20:17 -08:00
omfs fs/Kconfig: move omfs out 2009-01-22 13:15:58 +03:00
openpromfs zero i_uid/i_gid on inode allocation 2009-01-05 11:54:28 -05:00
partitions block: fix bug in ptbl lookup cache 2009-01-09 21:46:13 +01:00
proc Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu 2009-01-09 14:00:58 -08:00
qnx4 fs/Kconfig: move qnx4 out 2009-01-22 13:15:59 +03:00
ramfs NOMMU: Fix cleanup handling in ramfs_nommu_get_umapped_area() 2009-01-08 12:04:46 +00:00
reiserfs fs/Kconfig: move reiserfs out 2009-01-22 13:15:53 +03:00
romfs fs/Kconfig: move romfs out 2009-01-22 13:15:59 +03:00
smbfs fs/Kconfig: move smbfs out 2009-01-22 13:16:01 +03:00
squashfs fs/Kconfig: move squashfs out 2009-01-22 13:15:58 +03:00
sysfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 2009-01-26 10:40:28 -08:00
sysv fs/Kconfig: move sysv out 2009-01-22 13:15:59 +03:00
ubifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-01-07 11:31:52 -08:00
udf fs/Kconfig: move iso9660, udf out 2009-01-22 13:15:55 +03:00
ufs fs/Kconfig: move ufs out 2009-01-22 13:16:00 +03:00
xfs Long btree pointers are still 64 bit on disk 2009-01-22 01:23:11 -06:00
aio.c [CVE-2009-0029] System call wrappers part 16 2009-01-14 14:15:25 +01:00
anon_inodes.c anon_inodes: use fops->owner for module refcount 2008-12-31 16:55:44 +02:00
attr.c CRED: Wrap task credential accesses in the filesystem subsystem 2008-11-14 10:39:05 +11:00
bad_inode.c kill ->dir_notify() 2008-12-31 18:07:43 -05:00
binfmt_aout.c sanitize ifdefs in binfmt_aout 2009-01-03 11:45:54 -08:00
binfmt_elf_fdpic.c FDPIC: Don't attempt to expand the userspace stack to fill the space allocated 2009-01-08 12:04:47 +00:00
binfmt_elf.c ELF: implement AT_RANDOM for glibc PRNG seeding 2009-01-08 08:31:12 -08:00
binfmt_em86.c
binfmt_flat.c FLAT: Don't attempt to expand the userspace stack to fill the space allocated 2009-01-08 12:04:47 +00:00
binfmt_misc.c fs/binfmt_misc.c: add terminating newline to /proc/sys/fs/binfmt_misc/status 2009-01-06 15:59:19 -08:00
binfmt_script.c
binfmt_som.c CRED: Make execve() take advantage of copy-on-write credentials 2008-11-14 10:39:24 +11:00
bio-integrity.c block: Remove obsolete BUG_ON 2009-01-30 12:34:36 +01:00
bio.c [SCSI] block: make blk_rq_map_user take a NULL user-space buffer for WRITE 2009-01-02 11:10:35 -06:00
block_dev.c filesystem freeze: implement generic freeze feature 2009-01-09 16:54:42 -08:00
buffer.c [CVE-2009-0029] System call wrappers part 10 2009-01-14 14:15:22 +01:00
char_dev.c fs: fix name overwrite in __register_chrdev_region() 2009-01-06 15:59:13 -08:00
compat_binfmt_elf.c
compat_ioctl.c tun: Add some missing TUN compat ioctl translations. 2009-01-29 16:53:35 -08:00
compat.c [CVE-2009-0029] Make sys_pselect7 static 2009-01-14 14:15:16 +01:00
dcache.c [CVE-2009-0029] System call wrappers part 20 2009-01-14 14:15:26 +01:00
dcookies.c [CVE-2009-0029] System call wrapper special cases 2009-01-14 14:15:18 +01:00
direct-io.c fs: truncate blocks outside i_size after O_DIRECT write error 2009-01-06 15:59:06 -08:00
dquot.c quota: Improve locking 2009-01-16 18:02:10 +01:00
drop_caches.c
eventfd.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
eventpoll.c epoll: drop max_user_instances and rely only on max_user_watches 2009-01-29 18:04:45 -08:00
exec.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
fcntl.c [CVE-2009-0029] System call wrappers part 15 2009-01-14 14:15:24 +01:00
fifo.c
file_table.c filp_cachep can be static in fs/file_table.c 2008-12-31 18:07:42 -05:00
file.c
filesystems.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
fs-writeback.c fs: sys_sync fix 2009-01-06 15:59:09 -08:00
generic_acl.c
inode.c partial revert of asynchronous inode delete 2009-01-09 13:15:49 -08:00
internal.h CRED: Make execve() take advantage of copy-on-write credentials 2008-11-14 10:39:24 +11:00
ioctl.c [CVE-2009-0029] System call wrappers part 15 2009-01-14 14:15:24 +01:00
ioprio.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
Kconfig fs/Kconfig: move 9p out 2009-01-22 13:16:01 +03:00
Kconfig.binfmt CORE_DUMP_DEFAULT_ELF_HEADERS depends on ELF_CORE 2009-01-09 16:54:41 -08:00
libfs.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-01-05 18:32:06 -08:00
locks.c [CVE-2009-0029] System call wrappers part 16 2009-01-14 14:15:25 +01:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus 2009-01-09 15:18:49 -08:00
mbcache.c
mpage.c do_mpage_readpage(): remove useless clear_buffer_mapped() call 2009-01-06 15:59:01 -08:00
namei.c [CVE-2009-0029] System call wrappers part 29 2009-01-14 14:15:30 +01:00
namespace.c [CVE-2009-0029] System call wrappers part 14 2009-01-14 14:15:24 +01:00
nfsctl.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
no-block.c
open.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
pipe.c [CVE-2009-0029] System call wrappers part 33 2009-01-14 14:15:32 +01:00
pnode.c
pnode.h
posix_acl.c CRED: Wrap task credential accesses in the filesystem subsystem 2008-11-14 10:39:05 +11:00
quota_tree.c quota: Split off quota tree handling into a separate file 2009-01-05 08:40:21 -08:00
quota_tree.h quota: Split off quota tree handling into a separate file 2009-01-05 08:40:21 -08:00
quota_v1.c quota: Move quotaio_v[12].h from include/linux/ to fs/ 2009-01-05 08:36:58 -08:00
quota_v2.c quota: Convert union in mem_dqinfo to a pointer 2009-01-05 08:40:21 -08:00
quota.c [CVE-2009-0029] System call wrappers part 20 2009-01-14 14:15:26 +01:00
quotaio_v1.h quota: Move quotaio_v[12].h from include/linux/ to fs/ 2009-01-05 08:36:58 -08:00
quotaio_v2.h quota: Split off quota tree handling into a separate file 2009-01-05 08:40:21 -08:00
read_write.c [CVE-2009-0029] System call wrappers part 20 2009-01-14 14:15:26 +01:00
read_write.h
readdir.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
select.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
seq_file.c Merge branch 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-01-03 12:04:39 -08:00
signalfd.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
splice.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
stack.c
stat.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
super.c [CVE-2009-0029] System call wrappers part 11 2009-01-14 14:15:23 +01:00
sync.c [CVE-2009-0029] System call wrappers part 09 2009-01-14 14:15:21 +01:00
timerfd.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
utimes.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
xattr_acl.c
xattr.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00