linux/fs/ext4
Theodore Ts'o decbd919f4 ext4: only call ext4_jbd2_file_inode when an inode has been extended
In delayed allocation mode, it's important to only call
ext4_jbd2_file_inode when the file has been extended.  This is
necessary to avoid a race which first got introduced in commit
678aaf481, but which was made much more common with the introduction
of the "punch hole" functionality.  (Especially when dioread_nolock
was enabled, at which point I could reliably reproduce this problem
with xfstests #74.)

The race is this: Suppose that while writing back a delayed allocation
inode we need to map delalloc blocks and we run out of space in the
journal, *and* at the same time the inode is already on the committing
transaction's t_inode_list (for example, because ext4_jbd2_file_inode()
was called during a punch hole operation).  The commit operation then
waits for the inode to finish all of its pending writebacks by calling
filemap_fdatawait().  But that inode has one or more pages with the
PageWriteback flag set, and the writeback that would clear them is
itself stuck waiting for journal space.  So the commit operation waits
forever, the writeback of the inode can never take place, and the
kjournald thread and the writeback thread end up waiting for each
other.
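
To make the circular wait concrete, here is a rough userspace sketch
(not kernel code).  It models the two threads as nodes in a tiny
wait-for graph and checks for a cycle.  Every name in it (wait_graph,
commit_race_deadlocks, and so on) is hypothetical and exists only for
this illustration; the two boolean inputs stand in for "the inode is
on t_inode_list" and "writeback ran out of journal space".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model: each task waits for at most one other task. */
enum { KJOURNALD = 0, WRITEBACK = 1, NR_TASKS = 2, NOBODY = -1 };

struct wait_graph {
	/* index: the waiting task; value: the task it waits for, or NOBODY */
	int waits_for[NR_TASKS];
};

static void wait_graph_init(struct wait_graph *g)
{
	for (int i = 0; i < NR_TASKS; i++)
		g->waits_for[i] = NOBODY;
}

static void wait_graph_block(struct wait_graph *g, int waiter, int target)
{
	assert(waiter >= 0 && waiter < NR_TASKS);
	assert(target >= 0 && target < NR_TASKS);
	g->waits_for[waiter] = target;
}

/* Follow the wait chain from 'start'; a path back to 'start' is a deadlock. */
static bool wait_graph_deadlocked(const struct wait_graph *g, int start)
{
	int cur = g->waits_for[start];

	for (int hops = 0; hops < NR_TASKS && cur != NOBODY; hops++) {
		if (cur == start)
			return true;
		cur = g->waits_for[cur];
	}
	return false;
}

/*
 * Build the scenario described above:
 *  - if the inode is on the committing transaction's t_inode_list, the
 *    commit (kjournald) waits for the inode's writeback to finish;
 *  - if writeback ran out of journal space, it waits for the commit.
 */
static bool commit_race_deadlocks(bool inode_on_t_inode_list,
				  bool writeback_needs_journal_space)
{
	struct wait_graph g;

	wait_graph_init(&g);
	if (inode_on_t_inode_list)
		wait_graph_block(&g, KJOURNALD, WRITEBACK);
	if (writeback_needs_journal_space)
		wait_graph_block(&g, WRITEBACK, KJOURNALD);
	return wait_graph_deadlocked(&g, KJOURNALD);
}
```

In this model the cycle only closes when both conditions hold, so
keeping the inode off the t_inode_list is enough to break it.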

It's important at this point to recall why an inode is placed on the
t_inode_list; it is to provide the data=ordered guarantees that we
don't end up exposing stale data.  In the case where we are truncating
or punching a hole in the inode, there is no possibility that stale
data could be exposed in the first place, so we don't need to put the
inode on the t_inode_list!
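
The resulting rule can be sketched as a small predicate.  This is an
illustrative userspace model under stated assumptions, not the code
in inode.c: struct toy_inode and needs_ordered_filing() are made-up
names, and "extended" is assumed here to mean that the new on-disk
size is larger than the one currently recorded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few inode fields that matter here. */
struct toy_inode {
	long long i_disksize;	/* size currently recorded on disk */
	bool ordered_mode;	/* true if data=ordered applies to this inode */
};

/*
 * Return true if the inode would need to go on the transaction's inode
 * list.  Only growth of the file can expose stale data; a truncate or
 * punch hole never makes new_disksize larger than i_disksize.
 */
static bool needs_ordered_filing(const struct toy_inode *inode,
				 long long new_disksize)
{
	assert(new_disksize >= 0);

	if (!inode->ordered_mode)
		return false;
	return new_disksize > inode->i_disksize;
}
```

With this rule, a writeback that leaves the size unchanged, or an
operation that shrinks the file, never files the inode, so the commit
has nothing of that inode's to wait for.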

The right long-term fix is to get rid of data=ordered mode altogether,
and only update the extent tree or indirect blocks after the data has
been written.  Until then, this change will also avoid some
unnecessary waiting in the commit operation.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Allison Henderson <achender@linux.vnet.ibm.com>
Cc: Jan Kara <jack@suse.cz>
2011-09-06 02:37:06 -04:00
acl.c switch posix_acl_equiv_mode() to umode_t * 2011-08-01 02:10:06 -04:00
acl.h fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
balloc.c ext4: refactor duplicated block placement code 2011-06-28 10:01:31 -04:00
bitmap.c ext4: Change unsigned long to unsigned int 2008-11-05 00:14:04 -05:00
block_validity.c ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
dir.c ext4: Use ext4_error_file() to print the pathname to the corrupted inode 2011-01-10 12:10:55 -05:00
ext4_extents.h ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap 2011-06-06 00:06:52 -04:00
ext4_jbd2.c jbd2: add debugging information to jbd2_journal_dirty_metadata() 2011-09-04 10:18:14 -04:00
ext4_jbd2.h ext4: Fix ext4_should_writeback_data() for no-journal mode 2011-08-13 11:25:18 -04:00
ext4.h ext4: improve handling of conflicting mount options 2011-09-03 18:22:38 -04:00
extents.c jbd2: add debugging information to jbd2_journal_dirty_metadata() 2011-09-04 10:18:14 -04:00
file.c fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
fsync.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-08-01 13:56:03 -10:00
hash.c ext4: Add support for non-native signed/unsigned htree hash algorithms 2008-10-28 13:21:44 -04:00
ialloc.c ext4: use the correct error exit path in ext4_init_inode_table() 2011-08-01 06:32:19 -04:00
indirect.c ext4: flush any pending end_io requests before DIO reads w/dioread_nolock 2011-08-19 19:13:32 -04:00
inode.c ext4: only call ext4_jbd2_file_inode when an inode has been extended 2011-09-06 02:37:06 -04:00
ioctl.c ext4: prevent parallel resizers by atomic bit ops 2011-07-26 21:35:44 -04:00
Kconfig ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured 2009-12-21 10:54:09 -05:00
Makefile ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
mballoc.c ext4: prevent memory leaks from ext4_mb_init_backend() on error path 2011-08-01 17:41:46 -04:00
mballoc.h ext4: remove ac_repeats from ext4_allocation_context 2011-07-23 16:18:55 -04:00
migrate.c ext4: set extents flag when migrating file to use extents 2011-05-03 09:34:42 -04:00
mmp.c ext4: add support for multiple mount protection 2011-05-24 18:31:25 -04:00
move_extent.c ext4: Fix max file size and logical block counting of extent format file 2011-06-06 00:05:17 -04:00
namei.c ext4: call ext4_handle_dirty_metadata with correct inode in ext4_dx_add_entry 2011-08-31 12:02:51 -04:00
page-io.c ext4: remove i_mutex lock in ext4_evict_inode to fix lockdep complaining 2011-08-31 11:50:51 -04:00
resize.c ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info 2011-08-01 08:45:38 -04:00
super.c ext4: improve handling of conflicting mount options 2011-09-03 18:22:38 -04:00
symlink.c ext4: symlink must be handled via filesystem specific operation 2010-05-16 02:00:00 -04:00
truncate.h ext4: move common truncate functions to header file 2011-06-27 19:16:04 -04:00
xattr_security.c fs/vfs/security: pass last path component to LSM on inode creation 2011-02-01 11:12:29 -05:00
xattr_trusted.c ext4: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr_user.c ext4: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr.c ext4: add flag to ext4_has_free_blocks 2011-05-25 07:41:26 -04:00
xattr.h fs/vfs/security: pass last path component to LSM on inode creation 2011-02-01 11:12:29 -05:00