linux/fs
Al Viro acfec9a5a8 livelock avoidance in sget()
Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail.  The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1.  Along come two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_count and
trying to grab ->s_umount.  ->s_active is 3 now.  Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still on ->fs_supers because shutdown will *not* happen until
->s_active hits 0.  ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock.  And bumps its ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B having switched places.

The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN.  Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down.  Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.
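
The reordering described above can be illustrated with a small single-threaded
toy model.  This is a sketch under stated assumptions, not the kernel code: every
toy_* name is hypothetical, ->s_umount is not modelled as a real lock (waiters
simply take turns), and `born` stands in for MS_BORN.  It only shows why taking
the active reference up front keeps a stillborn superblock alive, while taking
a passive reference first lets the failed mount shut it down.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a superblock; only the fields the commit message mentions. */
struct toy_sb {
	int s_count;   /* passive references */
	int s_active;  /* active references */
	bool born;     /* stands in for MS_BORN */
	bool on_list;  /* still on ->fs_supers */
};

/* Drop one active reference; the last one takes the sb off the list. */
static void toy_deactivate(struct toy_sb *sb)
{
	assert(sb->s_active > 0);
	if (--sb->s_active == 0)
		sb->on_list = false;
}

/* A waiter finds sb on the list, just before it blocks on ->s_umount. */
static void toy_waiter_arrives(struct toy_sb *sb, bool fixed)
{
	if (fixed)
		sb->s_count++;   /* fixed ordering: passive reference only */
	else
		sb->s_active++;  /* old ordering: active reference up front */
}

/* The waiter now "holds ->s_umount".  Returns true if it may use sb. */
static bool toy_waiter_locked(struct toy_sb *sb, bool fixed)
{
	if (fixed) {
		bool ok = sb->born && sb->s_active > 0;
		if (ok)
			sb->s_active++;  /* only now take the active reference */
		sb->s_count--;
		return ok;
	}
	if (sb->born)
		return true;
	toy_deactivate(sb);              /* stillborn: give the reference back */
	return false;
}

/*
 * A mount fails while `waiters` other mounts are queued on the same sb.
 * Returns how many turns it takes until every waiter has stopped finding
 * the stillborn sb, or -1 if that has not happened after max_rounds turns.
 */
static int toy_rounds_until_settled(int waiters, bool fixed, int max_rounds)
{
	struct toy_sb sb = { .s_count = 1, .s_active = 1,
			     .born = false, .on_list = true };
	int queued = waiters;

	for (int i = 0; i < waiters; i++)
		toy_waiter_arrives(&sb, fixed);
	toy_deactivate(&sb);             /* the failing mount(2) gives up */

	for (int round = 0; round < max_rounds; round++) {
		if (queued == 0)
			return round;
		bool usable = toy_waiter_locked(&sb, fixed);
		assert(!usable);         /* sb never becomes born in this model */
		(void)usable;
		if (sb.on_list)
			toy_waiter_arrives(&sb, fixed); /* retry finds it again */
		else
			queued--;        /* retry finds nothing, waiter moves on */
	}
	return queued == 0 ? max_rounds : -1;
}
```

With the old ordering and two waiters, toy_rounds_until_settled(2, false, 1000)
never settles, because s_active keeps bouncing between 2 and 1; with the fixed
ordering the same call with `fixed` set to true settles once each waiter has
had a single turn.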

The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten.  The right way to deal with that is obviously to fix sget()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20 04:58:58 +04:00
9p Second round of 9p patches for the 3.11 merge window. 2013-07-11 10:21:23 -07:00
adfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
affs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
afs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-07-03 09:10:19 -07:00
autofs4 helper for reading ->d_count 2013-07-05 18:59:33 +04:00
befs [readdir] convert befs 2013-06-29 12:56:55 +04:00
bfs [readdir] convert bfs 2013-06-29 12:56:33 +04:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2013-07-09 12:33:09 -07:00
cachefiles mm: remove lru parameter from __pagevec_lru_add and remove parts of pagevec API 2013-07-03 16:07:31 -07:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2013-07-09 12:39:10 -07:00
cifs CIFS: Fix a deadlock when a file is reopened 2013-07-11 18:05:41 -05:00
coda helper for reading ->d_count 2013-07-05 18:59:33 +04:00
configfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-07-14 11:42:26 -07:00
cramfs [readdir] convert cramfs 2013-06-29 12:56:46 +04:00
debugfs debugfs: write_file_bool() - ensure strtobool() operates on valid data 2013-06-03 13:55:02 -07:00
devpts
dlm dlm: Avoid LVB truncation 2013-06-26 11:38:02 -05:00
ecryptfs Code cleanups and improved buffer handling during page crypto operations 2013-07-11 10:20:18 -07:00
efivarfs efivarfs: we can use simple_lookup() now 2013-07-14 17:48:35 +04:00
efs [readdir] convert efs 2013-06-29 12:56:31 +04:00
exofs Lots of bug fixes, cleanups and optimizations. In the bug fixes 2013-07-02 09:39:34 -07:00
exportfs [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
ext2 [O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now... 2013-06-29 12:57:10 +04:00
ext3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2013-07-09 12:08:43 -07:00
ext4 Various regression and bug fixes for ext4. 2013-07-14 21:47:51 -07:00
f2fs f2fs: fix readdir incorrectness 2013-07-08 13:35:48 +04:00
fat fatfs: add FAT_IOCTL_GET_VOLUME_ID 2013-07-09 10:33:25 -07:00
freevxfs [readdir] convert freevxfs 2013-06-29 12:56:53 +04:00
fscache FS-Cache: Don't use spin_is_locked() in assertions 2013-06-19 14:16:47 +01:00
fuse mm: use totalram_pages instead of num_physpages at runtime 2013-07-03 16:07:35 -07:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-07-03 09:10:19 -07:00
hfs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
hfsplus Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
hostfs [readdir] convert hostfs 2013-06-29 12:56:59 +04:00
hpfs Merge branch 'hpfs' from Mikulas Patocka 2013-07-04 11:22:55 -07:00
hppfs clean up scary strncpy(dst, src, strlen(src)) uses 2013-07-03 16:07:41 -07:00
hugetlbfs hugetlbfs: fix mmap failure in unaligned size request 2013-05-07 18:38:27 -07:00
isofs Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
jbd jbd: change journal_invalidatepage() to accept length 2013-05-21 23:26:36 -04:00
jbd2 jbd2: invalidate handle if jbd2_journal_restart() fails 2013-07-01 08:12:41 -04:00
jffs2 [readdir] convert jffs2 2013-06-29 12:56:47 +04:00
jfs A couple cleanups to JFS for 3.11 2013-07-11 10:19:34 -07:00
lockd drivers: avoid parsing names as kthread_run() format strings 2013-07-03 16:07:41 -07:00
logfs Lots of bug fixes, cleanups and optimizations. In the bug fixes 2013-07-02 09:39:34 -07:00
minix minix: bug widening a binary "not" operation 2013-06-29 12:57:35 +04:00
ncpfs ncpfs: fix error return code in ncp_parse_options() 2013-07-09 10:33:25 -07:00
nfs NFS: Allow nfs_updatepage to extend a write under additional circumstances 2013-07-09 19:32:50 -04:00
nfs_common
nfsd Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux 2013-07-11 10:17:13 -07:00
nilfs2 helper for reading ->d_count 2013-07-05 18:59:33 +04:00
nls
notify fsnotify: update comments concerning locking scheme 2013-07-09 10:33:20 -07:00
ntfs Lots of bug fixes, cleanups and optimizations. In the bug fixes 2013-07-02 09:39:34 -07:00
ocfs2 ocfs2: fix NULL pointer dereference when traversing o2hb_all_regions 2013-07-03 16:07:25 -07:00
omfs [readdir] convert omfs 2013-06-29 12:56:37 +04:00
openpromfs [readdir] convert openpromfs 2013-06-29 12:56:32 +04:00
proc fs/proc/kcore.c: using strlcpy() instead of strncpy() 2013-07-03 16:08:02 -07:00
pstore Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2013-07-04 10:29:23 -07:00
qnx4 [readdir] convert qnx4 2013-06-29 12:56:38 +04:00
qnx6 [readdir] convert qnx6 2013-06-29 12:56:39 +04:00
quota quota: Convert use of typedef ctl_table to struct ctl_table 2013-07-04 19:22:55 +02:00
ramfs
reiserfs Lots of bug fixes, cleanups and optimizations. In the bug fixes 2013-07-02 09:39:34 -07:00
romfs [readdir] convert romfs 2013-06-29 12:56:29 +04:00
squashfs [readdir] convert squashfs 2013-06-29 12:56:28 +04:00
sysfs Driver core patches for 3.11-rc1 2013-07-02 11:44:19 -07:00
sysv Don't pass inode to ->d_hash() and ->d_compare() 2013-06-29 12:57:36 +04:00
ubifs Only a single patch which fixes a message. 2013-07-05 12:08:47 -07:00
udf udf: provide ->tmpfile() 2013-06-29 12:57:12 +04:00
ufs [readdir] simple local unixlike: switch to ->iterate() 2013-06-29 12:46:47 +04:00
xfs xfs: update (#2) for 3.11-rc1 2013-07-13 11:40:24 -07:00
aio.c aio: fix wrong comment in aio_complete() 2013-07-03 16:08:06 -07:00
anon_inodes.c
attr.c
bad_inode.c [readdir] ->readdir() is gone 2013-06-29 12:57:04 +04:00
binfmt_aout.c mm: remove free_area_cache 2013-07-10 18:11:34 -07:00
binfmt_elf_fdpic.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2013-05-02 10:16:16 -07:00
binfmt_elf.c mm: remove free_area_cache 2013-07-10 18:11:34 -07:00
binfmt_em86.c
binfmt_flat.c new helper: read_code() 2013-04-29 15:40:23 -04:00
binfmt_misc.c binfmt_misc: reuse string_unescape_inplace() 2013-04-30 17:04:03 -07:00
binfmt_script.c
binfmt_som.c
bio-integrity.c
bio.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
block_dev.c Merge branch 'for-3.11/core' of git://git.kernel.dk/linux-block 2013-07-11 13:03:24 -07:00
buffer.c mm: vmscan: take page buffers dirty and locked state into account 2013-07-03 16:07:29 -07:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c compat.c: LOOP_CLR_FD is taken care of in loop.c itself... 2013-06-29 12:46:44 +04:00
compat.c [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
coredump.c coredump: '% at the end' shouldn't bypass core_uses_pid logic 2013-07-03 16:08:02 -07:00
coredump.h
dcache.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-07-03 09:10:19 -07:00
dcookies.c
direct-io.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
drop_caches.c
eventfd.c
eventpoll.c Merge branch 'akpm' (updates from Andrew Morton) 2013-07-03 17:12:13 -07:00
exec.c fs/exec.c:de_thread: mt-exec should update ->real_start_time 2013-07-03 16:08:03 -07:00
fcntl.c
fhandle.c
file_table.c fput: turn "list_head delayed_fput_list" into llist_head 2013-07-13 13:29:10 +04:00
file.c don't bother with deferred freeing of fdtables 2013-05-01 17:31:42 -04:00
filesystems.c
fs_struct.c
fs-writeback.c mm/writeback: don't check force_wait to handle bdi->work_list 2013-07-09 10:33:22 -07:00
generic_acl.c
inode.c allow the temp files created by open() to be linked to 2013-06-29 12:57:11 +04:00
internal.h constify rw_verify_area() 2013-06-29 12:57:34 +04:00
ioctl.c
ioprio.c
Kconfig efivarfs: Move to fs/efivarfs 2013-04-17 13:25:09 +01:00
Kconfig.binfmt fs: make binfmt support for #! scripts modular and removable 2013-04-30 17:04:04 -07:00
libfs.c make simple_lookup() usable for filesystems that set ->s_d_op 2013-07-14 17:43:25 +04:00
locks.c locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock 2013-07-08 13:36:42 +04:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
mbcache.c
mount.h get rid of full-hash scan on detaching vfsmounts 2013-04-09 14:12:52 -04:00
mpage.c
namei.c Safer ABI for O_TMPFILE 2013-07-13 13:26:37 +04:00
namespace.c create_mnt_ns: unidiomatic use of list_add() 2013-05-04 15:18:53 -04:00
no-block.c
open.c allow O_TMPFILE to work with O_WRONLY 2013-07-20 03:11:32 +04:00
pipe.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
pnode.c vfs: Fix invalid ida_remove() call 2013-05-31 15:16:33 -04:00
pnode.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
posix_acl.c
proc_namespace.c
read_write.c vfs: export lseek_execute() to modules 2013-07-03 16:23:27 +04:00
readdir.c [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
select.c net: rename include/net/ll_poll.h to include/net/busy_poll.h 2013-07-10 17:08:27 -07:00
seq_file.c seq_file: add seq_list_*_percpu helpers 2013-07-08 13:36:41 +04:00
signalfd.c
splice.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-07-03 09:10:19 -07:00
stack.c
stat.c
statfs.c
super.c livelock avoidance in sget() 2013-07-20 04:58:58 +04:00
sync.c
timerfd.c timerfd: Add alarm timers 2013-05-29 12:57:34 -07:00
utimes.c
xattr_acl.c
xattr.c