linux/fs
Jan Kara ec4cb1aa2b ext4: fix jbd2 warning under heavy xattr load
When heavily exercising xattr code the assertion that
jbd2_journal_dirty_metadata() shouldn't return error was triggered:

WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237
jbd2_journal_dirty_metadata+0x1ba/0x260()

CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G    W 3.10.0-ceph-00049-g68d04c9 #1
Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011
 ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968
 ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80
 0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978
Call Trace:
 [<ffffffff816311b0>] dump_stack+0x19/0x1b
 [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
 [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260
 [<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140
 [<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0
 [<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910
 [<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0
 [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140
 [<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50
 [<ffffffff811935ce>] generic_setxattr+0x6e/0x90
 [<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0
 [<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0
 [<ffffffff8119421e>] setxattr+0x13e/0x1e0
 [<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0
 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
 [<ffffffff8118c65c>] ? fget_light+0x3c/0x130
 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
 [<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70
 [<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100
 [<ffffffff816407c2>] system_call_fastpath+0x16/0x1b

The reason for the warning is that buffer_head passed into
jbd2_journal_dirty_metadata() didn't have journal_head attached. This is
caused by the following race of two ext4_xattr_release_block() calls:

CPU1                                CPU2
ext4_xattr_release_block()          ext4_xattr_release_block()
lock_buffer(bh);
/* False */
if (BHDR(bh)->h_refcount == cpu_to_le32(1))
} else {
  le32_add_cpu(&BHDR(bh)->h_refcount, -1);
  unlock_buffer(bh);
                                    lock_buffer(bh);
                                    /* True */
                                    if (BHDR(bh)->h_refcount == cpu_to_le32(1))
                                      get_bh(bh);
                                      ext4_free_blocks()
                                        ...
                                        jbd2_journal_forget()
                                          jbd2_journal_unfile_buffer()
                                          -> JH is gone
  error = ext4_handle_dirty_xattr_block(handle, inode, bh);
  -> triggers the warning

We fix the problem by moving ext4_handle_dirty_xattr_block() under the
buffer lock. Sadly this cannot be done in nojournal mode as that
function can call sync_dirty_buffer() which would deadlock. Luckily in
nojournal mode the race is harmless (we only dirty already freed buffer)
and thus for nojournal mode we leave the dirtying outside of the buffer
lock.

Reported-by: Sage Weil <sage@inktank.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-04-07 10:54:21 -04:00
..
9p mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
adfs fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
affs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
afs mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
autofs4 autofs: fix symlinks aren't checked for expiry 2014-01-23 16:36:59 -08:00
befs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
bfs mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
btrfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
cachefiles Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2014-04-04 14:03:05 -07:00
ceph ceph: fix __dcache_readdir() 2014-02-17 12:37:13 -08:00
cifs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
coda Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
configfs
cramfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
debugfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
devpts fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
dlm dlm: use INFO for recovery messages 2014-02-14 11:54:44 -06:00
ecryptfs Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2014-04-04 14:03:05 -07:00
efivarfs efivarfs: 'efivarfs_file_write' function reorganization 2014-03-04 16:16:16 +00:00
efs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
exofs mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
exportfs
ext2 Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
ext3 Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
ext4 ext4: fix jbd2 warning under heavy xattr load 2014-04-07 10:54:21 -04:00
f2fs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
fat Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
freevxfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
fscache FS-Cache: Handle removal of unadded object to the fscache_object_list rb tree 2014-02-17 13:47:35 -08:00
fuse Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
gfs2 Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
hfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
hfsplus Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
hostfs mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
hpfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
hppfs
hugetlbfs mm, hugetlb: unify region structure handling 2014-04-03 16:20:59 -07:00
isofs fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
jbd jbd: Revise KERN_EMERG error messages 2013-12-04 12:27:46 +01:00
jbd2 jbd2: improve error messages for inconsistent journal heads 2014-03-12 16:38:03 -04:00
jffs2 Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
jfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
kernfs Merge branch 'akpm' (incoming from Andrew) 2014-04-03 16:22:16 -07:00
lockd lockd: send correct lock when granting a delayed lock. 2014-02-13 14:55:02 -05:00
logfs mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
minix Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
ncpfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
nfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
nfs_common
nfsd Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2014-04-04 14:03:05 -07:00
nilfs2 Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
nls nls: have register_nls() set ->owner 2014-01-25 03:14:05 -05:00
notify fanotify: move unrelated handling from copy_event_to_user() 2014-04-03 16:20:51 -07:00
ntfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
ocfs2 Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
omfs mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
openpromfs fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
proc Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
pstore Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
qnx4 fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
qnx6 fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
quota quota: provide function to grab quota structure reference 2014-04-03 16:20:54 -07:00
ramfs fs/ramfs: move ramfs_aops to inode.c 2014-01-23 16:36:58 -08:00
reiserfs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
romfs fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
squashfs fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
sysfs Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" 2014-03-25 20:54:57 -07:00
sysv Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
ubifs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
udf Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
ufs Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
xfs xfs: update for 3.15-rc1 2014-04-04 15:50:08 -07:00
aio.c Merge git://git.kvack.org/~bcrl/aio-next 2013-12-22 11:03:49 -08:00
anon_inodes.c vfs: Allocate anon_inode_inode in anon_inode_init() 2014-03-27 09:52:54 -07:00
attr.c fs: fix iversion handling 2013-12-05 16:36:21 -06:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c fs, kernel: permit disabling the uselib syscall 2014-04-03 16:21:05 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: add missing 'break' statement 2014-04-03 16:21:16 -07:00
binfmt_script.c
binfmt_som.c
bio-integrity.c Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-block 2014-04-01 19:19:15 -07:00
bio.c Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-04-03 13:05:42 -07:00
block_dev.c mm + fs: store shadow entries in page cache 2014-04-03 16:21:01 -07:00
buffer.c Merge branch 'master' into for-next 2014-02-20 14:54:28 +01:00
char_dev.c
compat_binfmt_elf.c binfmt_elf: add ELF_HWCAP2 to compat auxv entries 2014-03-04 08:05:21 +00:00
compat_ioctl.c fs/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types 2014-03-06 16:30:44 +01:00
compat.c Merge branch 'locks-3.15' of git://git.samba.org/jlayton/linux 2014-04-04 14:21:20 -07:00
coredump.c coredump: make __get_dumpable/get_dumpable inline, kill fs/coredump.h 2014-01-23 16:37:01 -08:00
dcache.c vfs: add cross-rename 2014-04-01 17:08:43 +02:00
dcookies.c fs/compat: fix lookup_dcookie() parameter handling 2014-01-29 16:22:40 -08:00
direct-io.c xfs: update for 3.15-rc1 2014-04-04 15:50:08 -07:00
drop_caches.c drop_caches: add some documentation and info message 2014-04-03 16:21:04 -07:00
eventfd.c eventfd_ctx_fdget(): use fdget() instead of fget() 2014-01-25 03:13:04 -05:00
eventpoll.c epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL 2014-01-02 14:40:30 -08:00
exec.c fs, kernel: permit disabling the uselib syscall 2014-04-03 16:21:05 -07:00
fcntl.c locks: add new fcntl cmd values for handling file private locks 2014-03-31 08:24:43 -04:00
fhandle.c
file_table.c Merge branch 'locks-3.15' of git://git.samba.org/jlayton/linux 2014-04-04 14:21:20 -07:00
file.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-03-31 11:05:24 -07:00
filesystems.c sys_sysfs: Add CONFIG_SYSFS_SYSCALL 2014-04-03 16:21:05 -07:00
fs_struct.c
fs-writeback.c One of the main highlights this time, is not the patches themselves 2014-04-04 14:49:16 -07:00
inode.c Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
internal.h
ioctl.c
ioprio.c
Kconfig kernfs: add CONFIG_KERNFS 2014-02-07 16:08:57 -08:00
Kconfig.binfmt
libfs.c
locks.c locks: make locks_mandatory_area check for file-private locks 2014-03-31 08:24:43 -04:00
Makefile kernfs: add CONFIG_KERNFS 2014-02-07 16:08:57 -08:00
mbcache.c ext4: each filesystem creates and uses its own mb_cache 2014-03-18 19:24:49 -04:00
mount.h switch mnt_hash to hlist 2014-03-30 19:18:51 -04:00
mpage.c
namei.c Merge branch 'locks-3.15' of git://git.samba.org/jlayton/linux 2014-04-04 14:21:20 -07:00
namespace.c switch mnt_hash to hlist 2014-03-30 19:18:51 -04:00
no-block.c
open.c xfs: update for 3.15-rc1 2014-04-04 15:50:08 -07:00
pipe.c fs/pipe.c: skip file_update_time on frozen fs 2014-01-23 16:37:00 -08:00
pnode.c switch mnt_hash to hlist 2014-03-30 19:18:51 -04:00
pnode.h switch mnt_hash to hlist 2014-03-30 19:18:51 -04:00
posix_acl.c One of the main highlights this time, is not the patches themselves 2014-04-04 14:49:16 -07:00
proc_namespace.c fs/proc_namespace.c: simplify testing nsp and nsp->mnt_ns 2014-01-23 16:37:02 -08:00
read_write.c Merge branch 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2014-03-31 14:32:17 -07:00
readdir.c
select.c
seq_file.c
signalfd.c
splice.c fuse: fix pipe_buf_operations 2014-01-22 19:36:57 +01:00
stack.c
stat.c
statfs.c
super.c fs: push sync_filesystem() down to the file system's remount_fs() 2014-03-13 10:14:33 -04:00
sync.c Revert "writeback: do not sync data dirtied after sync start" 2014-02-22 02:02:28 +01:00
timerfd.c timerfd: support CLOCK_BOOTTIME clock 2014-01-23 16:57:40 -08:00
utimes.c
xattr.c