linux/fs/xfs
Brian Foster e4229d6b0b xfs: fix eofblocks race with file extending async dio writes
It's possible for post-eof blocks to end up being used for direct I/O
writes. dio write performs an upfront unwritten extent allocation, sends
the dio and then updates the inode size (if necessary) on write
completion. If a file release occurs while a file extending dio write is
in flight, it is possible to mistake the post-eof blocks for speculative
preallocation and incorrectly truncate them from the inode. This means
that the resulting dio write completion can discover a hole and allocate
new blocks rather than perform unwritten extent conversion.

This requires a strange mix of I/O and is thus not likely to reproduce
in real world workloads. It is intermittently reproduced by generic/299.
The error manifests as an assert failure due to transaction overrun
because the aforementioned write completion transaction has only
reserved enough blocks for btree operations:

  XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
   file: fs/xfs//xfs_trans.c, line: 309

The root cause is that xfs_free_eofblocks() uses i_size to truncate
post-eof blocks from the inode, but async, file extending direct writes
do not update i_size until write completion, long after inode locks are
dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
to the incorrect size.

Update xfs_free_eofblocks() to serialize against dio similar to how
extending writes are serialized against i_size updates before post-eof
block zeroing. Specifically, wait on dio while under the iolock. This
ensures that dio write completions have updated i_size before post-eof
blocks are processed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 16:32:25 -08:00
..
libxfs xfs: remove unused struct declarations 2017-01-30 16:32:25 -08:00
Kconfig xfs: implement iomap based buffered write path 2016-06-21 09:53:44 +10:00
kmem.c xfs: improve kmem_realloc 2016-04-06 09:47:01 +10:00
kmem.h xfs: improve kmem_realloc 2016-04-06 09:47:01 +10:00
Makefile xfs: introduce the CoW fork 2016-10-04 18:06:40 -07:00
mrlock.h
uuid.c
uuid.h
xfs_acl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
xfs_acl.h
xfs_aops.c xfs: Timely free truncated dirty pages 2017-01-11 10:20:04 -08:00
xfs_aops.h xfs: use iomap_dio_rw 2016-11-30 14:37:15 +11:00
xfs_attr_inactive.c xfs: make several functions static 2016-06-01 17:38:15 +10:00
xfs_attr_list.c xfs: several xattr functions can be void 2016-12-05 12:32:14 +11:00
xfs_attr.h xfs: several xattr functions can be void 2016-12-05 12:32:14 +11:00
xfs_bmap_item.c xfs: when replaying bmap operations, don't let unlinked inodes get reaped 2016-10-04 11:05:44 -07:00
xfs_bmap_item.h xfs: log bmap intent items 2016-10-04 11:05:44 -07:00
xfs_bmap_util.c xfs: fix eofblocks race with file extending async dio writes 2017-01-30 16:32:25 -08:00
xfs_bmap_util.h xfs: pull up iolock from xfs_free_eofblocks() 2017-01-30 16:32:25 -08:00
xfs_buf_item.c xfs: normalize "infinite" retries in error configs 2016-09-14 07:51:30 +10:00
xfs_buf_item.h
xfs_buf.c xfs: clear _XBF_PAGES from buffers when readahead page 2017-01-25 20:24:57 -08:00
xfs_buf.h xfs: use rhashtable to track buffer cache 2016-12-07 17:36:36 +11:00
xfs_dir2_readdir.c xfs: remove i_iolock and use i_rwsem in the VFS inode instead 2016-11-30 14:33:25 +11:00
xfs_discard.c xfs: rmap btree requires more reserved free space 2016-08-03 11:38:24 +10:00
xfs_discard.h
xfs_dquot_item.c xfs: allocate log vector buffers outside CIL context lock 2016-07-22 09:52:35 +10:00
xfs_dquot_item.h
xfs_dquot.c xfs: don't wrap ID in xfs_dq_get_next_id 2017-01-17 11:43:38 -08:00
xfs_dquot.h
xfs_error.c Merge branch 'xfs-4.8-misc-fixes-3' into for-next 2016-07-20 11:51:08 +10:00
xfs_error.h xfs: simulate per-AG reservations being critically low 2016-10-05 16:26:31 -07:00
xfs_export.c xfs: abstract block export operations from nfsd layouts 2016-07-15 15:31:29 -04:00
xfs_export.h
xfs_extent_busy.c xfs: remote attribute blocks aren't really userdata 2016-09-26 08:21:28 +10:00
xfs_extent_busy.h
xfs_extfree_item.c xfs: remove unnecessary parentheses from log redo item recovery functions 2016-08-03 12:29:32 +10:00
xfs_extfree_item.h xfs: refactor redo intent item processing 2016-08-03 11:23:49 +10:00
xfs_file.c xfs: sync eofblocks scans under iolock are livelock prone 2017-01-30 16:32:25 -08:00
xfs_filestream.c Merge branch 'xfs-4.9-log-recovery-fixes' into for-next 2016-10-03 09:56:28 +11:00
xfs_filestream.h
xfs_fsops.c xfs: remove boilerplate around xfs_btree_init_block 2017-01-30 16:32:24 -08:00
xfs_fsops.h xfs: preallocate blocks for worst-case btree expansion 2016-10-05 16:26:27 -07:00
xfs_globals.c xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_icache.c xfs: sync eofblocks scans under iolock are livelock prone 2017-01-30 16:32:25 -08:00
xfs_icache.h xfs: sync eofblocks scans under iolock are livelock prone 2017-01-30 16:32:25 -08:00
xfs_icreate_item.c fs: xfs: xfs_icreate_item: constify xfs_item_ops structure 2016-11-28 14:57:42 +11:00
xfs_icreate_item.h
xfs_inode_item.c xfs: provide helper for counting extents from if_bytes 2016-11-08 12:59:42 +11:00
xfs_inode_item.h
xfs_inode.c xfs: pull up iolock from xfs_free_eofblocks() 2017-01-30 16:32:25 -08:00
xfs_inode.h xfs: remove i_iolock and use i_rwsem in the VFS inode instead 2016-11-30 14:33:25 +11:00
xfs_ioctl32.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
xfs_ioctl32.h
xfs_ioctl.c Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
xfs_ioctl.h xfs: don't pass ioflags around in the ioctl path 2016-07-20 11:29:35 +10:00
xfs_iomap.c iomap: constify struct iomap_ops 2017-01-30 16:32:25 -08:00
xfs_iomap.h iomap: constify struct iomap_ops 2017-01-30 16:32:25 -08:00
xfs_iops.c xfs: sanity check inode mode when creating new dentry 2017-01-17 11:42:22 -08:00
xfs_iops.h xfs: Propagate dentry down to inode_change_ok() 2016-09-22 10:56:19 +02:00
xfs_itable.c xfs: create a separate cow extent size hint for the allocator 2016-10-05 16:26:26 -07:00
xfs_itable.h
xfs_linux.h xfs: make the ASSERT() condition likely 2017-01-17 11:41:41 -08:00
xfs_log_cil.c xfs: allocate log vector buffers outside CIL context lock 2016-07-22 09:52:35 +10:00
xfs_log_priv.h xfs: rework log recovery to submit buffers on LSN boundaries 2016-09-26 08:22:16 +10:00
xfs_log_recover.c Merge branch 'xfs-4.10-misc-fixes-3' into for-next 2016-12-07 17:42:30 +11:00
xfs_log.c xfs: don't print warnings when xfs_log_force fails 2017-01-09 13:45:01 -08:00
xfs_log.h xfs: remove unused struct declarations 2017-01-30 16:32:25 -08:00
xfs_message.c
xfs_message.h
xfs_mount.c xfs: use rhashtable to track buffer cache 2016-12-07 17:36:36 +11:00
xfs_mount.h xfs: use per-AG reservations for the finobt 2017-01-25 07:49:35 -08:00
xfs_mru_cache.c
xfs_mru_cache.h
xfs_ondisk.h xfs: define the on-disk refcount btree format 2016-10-03 09:11:18 -07:00
xfs_pnfs.c xfs: remove i_iolock and use i_rwsem in the VFS inode instead 2016-11-30 14:33:25 +11:00
xfs_pnfs.h xfs: remove i_iolock and use i_rwsem in the VFS inode instead 2016-11-30 14:33:25 +11:00
xfs_qm_bhv.c
xfs_qm_syscalls.c xfs: better xfs_trans_alloc interface 2016-04-06 09:19:55 +10:00
xfs_qm.c xfs: prevent quotacheck from overloading inode lru 2017-01-27 09:32:30 -08:00
xfs_qm.h
xfs_quota.h
xfs_quotaops.c
xfs_refcount_item.c xfs: fix double-cleanup when CUI recovery fails 2017-01-03 18:39:32 -08:00
xfs_refcount_item.h xfs: log refcount intent items 2016-10-03 09:11:21 -07:00
xfs_reflink.c vfs: fix isize/pos/len checks for reflink & dedupe 2016-12-22 23:00:23 -05:00
xfs_reflink.h xfs: use new extent lookup helpers in xfs_reflink_trim_irec_to_next_cow 2016-11-24 11:39:50 +11:00
xfs_rmap_item.c xfs: convert unwritten status of reverse mappings for shared files 2016-10-05 16:26:29 -07:00
xfs_rmap_item.h xfs: convert RUI log formats to use variable length arrays 2016-09-19 10:24:27 +10:00
xfs_rtalloc.c xfs: rename flist/free_list to dfops 2016-08-03 11:19:29 +10:00
xfs_rtalloc.h xfs: make several functions static 2016-06-01 17:38:15 +10:00
xfs_stats.c xfs: make xfs btree stats less huge 2016-12-05 14:38:58 +11:00
xfs_stats.h xfs: make xfs btree stats less huge 2016-12-05 14:38:58 +11:00
xfs_super.c xfs: deprecate barrier/nobarrier mount option 2016-12-09 16:49:54 +11:00
xfs_super.h xfs: quiesce the filesystem after recovery on readonly mount 2016-09-26 08:21:44 +10:00
xfs_symlink.c xfs: remove i_iolock and use i_rwsem in the VFS inode instead 2016-11-30 14:33:25 +11:00
xfs_symlink.h
xfs_sysctl.c xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_sysctl.h xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_sysfs.c xfs: fix max_retries _show and _store functions 2017-01-03 20:34:17 -08:00
xfs_sysfs.h xfs: configurable error behavior via sysfs 2016-05-18 10:58:51 +10:00
xfs_trace.c xfs: rework xfs_bmap_free callers to use xfs_defer_ops 2016-08-03 11:15:38 +10:00
xfs_trace.h xfs: remove unused struct declarations 2017-01-30 16:32:25 -08:00
xfs_trans_ail.c
xfs_trans_bmap.c xfs: implement deferred bmbt map/unmap operations 2016-10-04 11:05:44 -07:00
xfs_trans_buf.c
xfs_trans_dquot.c
xfs_trans_extfree.c xfs: set up per-AG free space reservations 2016-09-19 10:30:52 +10:00
xfs_trans_inode.c fs: Replace current_fs_time() with current_time() 2016-09-27 21:06:22 -04:00
xfs_trans_priv.h
xfs_trans_refcount.c xfs: connect refcount adjust functions to upper layers 2016-10-03 09:11:22 -07:00
xfs_trans_rmap.c xfs: add shared rmap map/unmap/convert log item types 2016-10-05 16:26:29 -07:00
xfs_trans.c Merge branch 'xfs-4.9-reflink-prep' into for-next 2016-10-03 09:52:31 +11:00
xfs_trans.h xfs: remove unused struct declarations 2017-01-30 16:32:25 -08:00
xfs_xattr.c xfs: several xattr functions can be void 2016-12-05 12:32:14 +11:00
xfs.h