linux/fs
Dave Chinner 8af3dcd3c8 xfs: xlog_cil_force_lsn doesn't always wait correctly
When running a tight mount/unmount loop on an older kernel, RedHat
QE found that unmount would occasionally hang in
xfs_buf_unpin_wait() on the superblock buffer. Tracing and other
debug work by Eric Sandeen indicated that it was hanging on the
writing of the superblock during unmount immediately after logging
the superblock counters in a synchronous transaction. Further debug
indicated that the synchronous transaction was not waiting for
completion correctly, and we narrowed it down to
xlog_cil_force_lsn() returning NULLCOMMITLSN and hence not pushing
the transaction in the iclog buffer to disk correctly.

While this unmount superblock write code is now very different in
mainline kernels, the xlog_cil_force_lsn() code is identical, and it
was bisected to the backport of commit f876e44 ("xfs: always do log
forces via the workqueue"). This commit made the CIL push
asynchronous for log forces and hence exposed a race condition that
couldn't occur on a synchronous push.

Essentially, the xlog_cil_force_lsn() relied implicitly on the fact
that the sequence push would be complete by the time
xlog_cil_push_now() returned, resulting in the context being pushed
being in the committing list. When it was made asynchronous, it was
recognised that there was a race condition in detecting whether an
asynchronous push has started or not and code was added to handle
it.

Unfortunately, the fix was not quite right and left a race condition
where it it would detect an empty CIL while a push was in progress
before the context had been added to the committing list. This was
incorrectly seen as a "nothing to do" condition and so would tell
xfs_log_force_lsn() that there is nothing to wait for, and hence it
would push the iclogbufs in memory.

The fix is simple, but explaining the logic and the race condition
is a lot more complex. The fix is to add the context to the
committing list before we start emptying the CIL. This allows us to
detect the difference between an empty "do nothing" push and a push
that has not started by adding a discrete "emptying the CIL" state
to avoid the transient, incorrect "empty" condition that the
(unchanged) waiting code was seeing.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-23 15:57:59 +10:00
..
9p Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
adfs adfs: add __printf verification, fix format/argument mismatches 2014-08-08 15:57:24 -07:00
affs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
afs AFS: Correctly assemble the client UUID 2014-07-29 10:14:36 -07:00
autofs4 autofs4: comment typo: remove a a doubled word 2014-08-08 15:57:19 -07:00
befs fs/befs/linuxvfs.c: check superblock before dump operation 2014-08-08 15:57:20 -07:00
bfs fs/bfs: use bfs prefix for dump_imap 2014-08-08 15:57:24 -07:00
btrfs Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-08-16 09:06:55 -06:00
cachefiles fs/cachefiles: replace kerror by pr_err 2014-06-06 16:08:14 -07:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2014-08-13 17:43:29 -06:00
cifs Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6 2014-08-20 18:33:21 -05:00
coda fs/coda: use linux/uaccess.h 2014-08-08 15:57:20 -07:00
configfs fs/configfs: use pr_fmt 2014-06-04 16:53:53 -07:00
cramfs fs/cramfs/inode.c: use linux/uaccess.h 2014-08-08 15:57:25 -07:00
debugfs fs: debugfs: remove trailing whitespace 2014-07-09 16:58:21 -07:00
devpts fs/devpts/inode.c: convert printk to pr_foo() 2014-06-06 16:08:14 -07:00
dlm fs/dlm/debug_fs.c: remove unnecessary null test before debugfs_remove 2014-08-08 15:57:27 -07:00
ecryptfs write_iter variants of {__,}generic_file_aio_write() 2014-05-06 17:38:00 -04:00
efivarfs fs/efivarfs/super.c: use static const for dentry_operations 2014-06-04 16:54:14 -07:00
efs fs/efs/namei.c: return is not a function 2014-08-08 15:57:18 -07:00
exofs fs/exofs/ore_raid.c: replace count*size kzalloc by kcalloc 2014-08-08 15:57:24 -07:00
exportfs fs/exportfs/expfs.c: kernel-doc warning fixes 2014-06-04 16:54:14 -07:00
ext2 fs/ext2/super.c: Drop memory allocation cast 2014-07-15 22:40:22 +02:00
ext3 ext3: Count internal journal as bsddf overhead in ext3_statfs 2014-08-19 23:16:51 +02:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-08-11 11:44:11 -07:00
f2fs f2fs: use for_each_set_bit to simplify the code 2014-08-04 13:20:53 -07:00
fat Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
freevxfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
fscache fs/fscache: make ctl_table static 2014-08-06 18:01:12 -07:00
fuse switch iov_iter_get_pages() to passing maximal number of pages 2014-08-07 14:40:11 -04:00
gfs2 Merge branch 'sched/urgent' into sched/core, to merge fixes before applying new changes 2014-07-28 10:03:00 +02:00
hfs write_iter variants of {__,}generic_file_aio_write() 2014-05-06 17:38:00 -04:00
hfsplus Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
hostfs hostfs: support rename flags 2014-08-07 14:40:09 -04:00
hpfs fs/hpfs/dnode.c: fix suspect code indent 2014-08-08 15:57:22 -07:00
hppfs
hugetlbfs fs/hugetlbfs/inode.c: remove null test before kfree 2014-06-04 16:54:11 -07:00
isofs isofs: Fix unbounded recursion when processing relocated directories 2014-08-19 18:29:30 +02:00
jbd fs/jbd/revoke.c: replace shift loop by ilog2 2014-05-21 10:26:13 +02:00
jbd2 sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
jffs2 MTD updates for 3.17-rc1 2014-08-08 18:13:21 -07:00
jfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
kernfs Merge 3.16-rc6 into driver-core-next 2014-07-21 10:07:25 -07:00
lockd fs: lockd: Use ktime_get_ns() 2014-07-23 15:01:44 -07:00
logfs fs/logfs/readwrite.c: kernel-doc warning fixes 2014-08-06 18:01:12 -07:00
minix minix zmap block counts calculation fix 2014-08-08 15:57:20 -07:00
ncpfs fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul 2014-06-04 16:54:21 -07:00
nfs nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait 2014-08-22 18:04:44 -04:00
nfs_common fs/nfs_common/nfsacl.c: move EXPORT symbol after functions 2014-07-12 18:43:42 -04:00
nfsd Merge branch 'for-3.17' of git://linux-nfs.org/~bfields/linux 2014-08-09 14:31:18 -07:00
nilfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-08-11 11:44:11 -07:00
nls
notify list: fix order of arguments for hlist_add_after(_rcu) 2014-08-06 18:01:24 -07:00
ntfs ntfs: kernel-doc warning fixes 2014-08-06 18:01:12 -07:00
ocfs2 fs/ocfs2/slot_map.c: replace count*size kzalloc by kcalloc 2014-08-06 18:01:13 -07:00
omfs fs/omfs/inode.c: replace count*size kzalloc by kcalloc 2014-08-08 15:57:25 -07:00
openpromfs
proc Revert "proc: Point /proc/{mounts,net} at /proc/thread-self/{mounts,net} instead of /proc/self/{mounts,net}" 2014-08-10 21:24:59 -07:00
pstore fs/pstore/ram_core.c: replace count*size kmalloc by kmalloc_array 2014-08-08 15:57:25 -07:00
qnx4
qnx6 fs/qnx6: update debugging to current functions 2014-08-08 15:57:26 -07:00
quota fs/quota: kernel-doc warning fixes 2014-07-15 22:40:23 +02:00
ramfs fs/ramfs/file-nommu.c: replace count*size kzalloc by kcalloc 2014-08-08 15:57:18 -07:00
reiserfs Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2014-08-13 17:45:40 -06:00
romfs fs/romfs/super.c: add blank line after declarations 2014-08-08 15:57:25 -07:00
squashfs fs/squashfs/super.c: logging cleanup 2014-08-06 18:01:13 -07:00
sysfs kernfs: move the last knowledge of sysfs out from kernfs 2014-06-03 08:11:18 -07:00
sysv write_iter variants of {__,}generic_file_aio_write() 2014-05-06 17:38:00 -04:00
ubifs UBIFS: Add log overlap assertions 2014-07-31 15:52:51 +03:00
udf udf: avoid unneeded up_write when fail to add entry in ->symlink 2014-08-19 18:29:30 +02:00
ufs fs/ufs/inode.c: kernel-doc warning fixes 2014-08-08 15:57:21 -07:00
xfs xfs: xlog_cil_force_lsn doesn't always wait correctly 2014-09-23 15:57:59 +10:00
aio.c aio: fix reqs_available handling 2014-08-24 15:47:27 -07:00
anon_inodes.c
attr.c fs,userns: Change inode_capable to capable_wrt_inode_uidgid 2014-06-10 13:57:22 -07:00
bad_inode.c bad_inode: add ->rename2() 2014-08-07 14:40:09 -04:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c Merge branch 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-05 08:05:29 -07:00
binfmt_em86.c
binfmt_flat.c fs/binfmt_flat.c: make old_reloc() static 2014-06-04 16:54:21 -07:00
binfmt_misc.c binfmt_misc: add missing 'break' statement 2014-04-03 16:21:16 -07:00
binfmt_script.c
binfmt_som.c
block_dev.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
buffer.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c Bluetooth: Move HCI socket definitions into its own header file 2014-07-11 13:53:04 +03:00
compat.c locks: rename file-private locks to "open file description locks" 2014-04-22 08:23:58 -04:00
coredump.c coredump: fix the setting of PF_DUMPCORE 2014-07-23 15:10:54 -07:00
dcache.c fs: mark __d_obtain_alias static 2014-08-07 14:40:11 -04:00
dcookies.c
direct-io.c switch iov_iter_get_pages() to passing maximal number of pages 2014-08-07 14:40:11 -04:00
drop_caches.c fs: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
eventfd.c
eventpoll.c epoll: fix use-after-free in eventpoll_release_file 2014-06-16 17:21:59 -10:00
exec.c fork/exec: cleanup mm initialization 2014-08-08 15:57:23 -07:00
fcntl.c shm: add sealing API 2014-08-08 15:57:31 -07:00
fhandle.c
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
file.c fs/file.c: don't open-code kvfree() 2014-05-06 17:31:10 -04:00
filesystems.c sys_sysfs: Add CONFIG_SYSFS_SYSCALL 2014-04-03 16:21:05 -07:00
fs_pin.c make fs/{namespace,super}.c forget about acct.h 2014-08-07 14:40:09 -04:00
fs_struct.c
fs-writeback.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
inode.c mm: allow drivers to prevent new writable mappings 2014-08-08 15:57:31 -07:00
internal.h make fs/{namespace,super}.c forget about acct.h 2014-08-07 14:40:09 -04:00
ioctl.c
Kconfig
Kconfig.binfmt
libfs.c fs/libfs.c: add generic data flush to fsync 2014-06-04 16:53:55 -07:00
locks.c locks: move locks_free_lock calls in do_fcntl_add_lease outside spinlock 2014-08-14 10:07:47 -04:00
Makefile take fs_pin stuff to fs/* 2014-08-07 14:40:08 -04:00
mbcache.c fs/mbcache: replace __builtin_log2() with ilog2() 2014-06-25 22:08:29 -04:00
mount.h death to mnt_pinned 2014-08-07 14:40:09 -04:00
mpage.c fs/block_dev.c: add bdev_read_page() and bdev_write_page() 2014-06-04 16:54:02 -07:00
namei.c namei: trivial fix to vfs_rename_dir comment 2014-08-07 14:40:10 -04:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-08-11 11:44:11 -07:00
no-block.c
open.c vfs: fix check for fallocate on active swapfile 2014-08-01 02:36:04 -04:00
pipe.c new helper: copy_page_from_iter() 2014-05-06 17:39:42 -04:00
pnode.c smarter propagate_mnt() 2014-04-01 23:19:08 -04:00
pnode.h smarter propagate_mnt() 2014-04-01 23:19:08 -04:00
posix_acl.c posix_acl: handle NULL ACL in posix_acl_equiv_mode 2014-05-06 13:58:42 -04:00
proc_namespace.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
read_write.c switch simple generic_file_aio_read() users to ->read_iter() 2014-05-06 17:37:55 -04:00
readdir.c fanotify: create FAN_ACCESS event for readdir 2014-06-04 16:53:52 -07:00
select.c
seq_file.c fs/seq_file: fallback to vmalloc allocation 2014-07-03 09:21:54 -07:00
signalfd.c
splice.c Merge commit '9f12600fe425bc28f0ccba034a77783c09c15af4' into for-linus 2014-06-12 00:28:09 -04:00
stack.c
stat.c
statfs.c
super.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2014-08-13 17:45:40 -06:00
sync.c
timerfd.c timerfd: Use ktime_mono_to_real() 2014-07-23 10:18:02 -07:00
utimes.c
xattr.c simple_xattr: permit 0-size extended attributes 2014-07-23 15:10:55 -07:00