linux/fs/ocfs2
Jiufei Xue 814ce69432 ocfs2: fix a tiny race that leads file system read-only
When o2hb detects that a node is down, it first adds the dead node to
the recovery map and creates ocfs2rec, which will replay the journal for
the dead node.  The o2hb thread then calls
dlm_do_local_recovery_cleanup() to delete the locks of the dead node.
Once those locks are gone, locks for other nodes can be granted, and
those nodes may modify metadata before the journal of the dead node has
been replayed.  The details are as follows.

     N1                         N2                   N3(master)
modify the extent tree of
inode, and commit
dirty metadata to journal,
then goes down.
                                                 o2hb thread detects
                                                 N1 going down, sets
                                                 recovery map and
                                                 deletes the lock of N1.

                                                 dlm_thread flushes ast
                                                 for the lock of N2.
                        does not detect the
                        death of N1, so recovery
                        map is empty.

                        read inode from disk
                        without replaying
                        the journal of N1 and
                        modify the extent tree
                        of the inode that N1
                        had modified.
                                                 ocfs2rec recovers the
                                                 journal of N1.
                                                 The modification of N2
                                                 is lost.

The modifications by N1 and N2 are not serialized, which leads to a
read-only file system.  We can set the recovery_waiting flag on the lock
resource after deleting the lock of the dead node, to prevent other
nodes from getting the lock before dlm recovery.  After dlm recovery,
the recovery map on N2 is no longer empty, so
ocfs2_inode_lock_full_nested() will wait for ocfs2 recovery.
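
The gating idea described above can be illustrated with a small
user-space model.  This is only a sketch under stated assumptions, not
the kernel code: struct lockres, the RES_* flag names and the three
helper functions are hypothetical names made up for this example, and
the real dlm state machine has many more states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag names for this sketch (not the kernel's definitions). */
enum {
	RES_RECOVERING       = 1 << 0,
	RES_RECOVERY_WAITING = 1 << 1,
};

/* Hypothetical, heavily simplified lock resource. */
struct lockres {
	unsigned int state;
	int owner;		/* node holding the lock, -1 if none */
};

/*
 * Heartbeat reported dead_node: drop its lock, but mark the resource so
 * that no other node is granted the lock before recovery has run.
 */
static void cleanup_dead_node(struct lockres *res, int dead_node)
{
	assert(res != NULL);
	if (res->owner == dead_node) {
		res->owner = -1;
		res->state |= RES_RECOVERY_WAITING;
	}
}

/* Grant the lock only if the resource is free and not awaiting recovery. */
static bool try_grant(struct lockres *res, int node)
{
	assert(res != NULL);
	if (res->state & (RES_RECOVERING | RES_RECOVERY_WAITING))
		return false;
	if (res->owner != -1)
		return false;
	res->owner = node;
	return true;
}

/* Recovery finished: clear the flags so waiters can be granted. */
static void finish_recovery(struct lockres *res)
{
	assert(res != NULL);
	res->state &= ~(RES_RECOVERING | RES_RECOVERY_WAITING);
}
```

In this model, a request from N2 that arrives between
cleanup_dead_node() and finish_recovery() is refused, which corresponds
to closing the window shown in the diagram.  The second half of the fix,
where N2 waits in ocfs2_inode_lock_full_nested() for journal replay, is
not modelled here.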

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
..
cluster ocfs2/cluster: replace the interrupt safe spinlocks with common ones 2016-03-15 16:55:16 -07:00
dlm ocfs2: fix a tiny race that leads file system read-only 2016-03-15 16:55:16 -07:00
dlmfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
acl.c ocfs2: take inode lock in ocfs2_iop_set/get_acl() 2015-09-04 16:54:41 -07:00
acl.h ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
alloc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
alloc.h ocfs2: constify ocfs2_extent_tree_operations structures 2016-01-14 16:00:49 -08:00
aops.c ocfs2: unlock inode if deleting inode from orphan fails 2016-02-27 10:28:52 -08:00
aops.h ocfs2: remove OCFS2_IOCB_SEM lock type in direct io 2015-06-24 17:49:39 -07:00
blockcheck.c
blockcheck.h
buffer_head_io.c ocfs2: clear the rest of the buffers on error 2015-09-04 16:54:41 -07:00
buffer_head_io.h
dcache.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
dcache.h ocfs2: revert iput deferring code in ocfs2_drop_dentry_lock 2014-04-03 16:20:55 -07:00
dir.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
dir.h VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
dlmglue.c ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock 2016-01-21 17:20:51 -08:00
dlmglue.h ocfs2: avoid blocking in ocfs2_mark_lockres_freeing() in downconvert thread 2014-04-03 16:20:55 -07:00
export.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
export.h
extent_map.c ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
extent_map.h
file.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
file.h ocfs2: prepare some interfaces used in append direct io 2015-02-16 17:56:04 -08:00
heartbeat.c
heartbeat.h
inode.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
inode.h ocfs2: only take lock if dio entry when recover orphans 2015-11-05 19:34:48 -08:00
ioctl.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ioctl.h
journal.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
journal.h ocfs2: add functions to add and remove inode in orphan dir 2015-02-16 17:56:04 -08:00
Kconfig
localalloc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
localalloc.h ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters 2014-02-06 13:48:51 -08:00
locks.c ocfs2: fix flock panic issue 2015-12-29 17:45:49 -08:00
locks.h
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00
mmap.c ocfs2: fix return value from ocfs2_page_mkwrite() 2016-03-09 15:43:42 -08:00
mmap.h
move_extents.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
move_extents.h
namei.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
namei.h ocfs2: do not include dio entry in case of orphan scan 2015-11-05 19:34:48 -08:00
ocfs1_fs_compat.h
ocfs2_fs.h treewide: fix typos in comment blocks 2015-08-07 14:46:24 +02:00
ocfs2_ioctl.h
ocfs2_lockid.h
ocfs2_lockingver.h
ocfs2_trace.h ocfs2: fix a tiny race when running dirop_fileop_racer 2014-06-23 16:47:45 -07:00
ocfs2.h ocfs2: add errors=continue 2015-09-04 16:54:41 -07:00
quota_global.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
quota_local.c ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
quota.h quota: constify qtree_fmt_operations structures 2016-01-04 10:58:35 +01:00
refcounttree.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
refcounttree.h ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page 2013-08-13 17:57:49 -07:00
reservations.c ocfs2: make resv_lock spinlock static 2015-02-10 14:30:29 -08:00
reservations.h
resize.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
resize.h
slot_map.c ocfs2: fix slot overwritten if storage link down during mount 2016-01-14 16:00:49 -08:00
slot_map.h
stack_o2cb.c ocfs2: avoid a pointless delay in o2cb_cluster_check() 2015-04-14 16:48:57 -07:00
stack_user.c char: make misc_deregister a void function 2015-08-05 10:35:49 -07:00
stackglue.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
stackglue.h ocfs2: pass ocfs2_cluster_connection to ocfs2_this_node 2014-01-21 16:19:41 -08:00
suballoc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
suballoc.h ocfs2: rollback alloc_dinode counts when ocfs2_block_group_set_bits() failed 2014-04-03 16:20:56 -07:00
super.c ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump() 2016-03-15 16:55:16 -07:00
super.h ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
symlink.c switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
symlink.h
sysfile.c ocfs2: avoid system inode ref confusion by adding mutex lock 2014-04-03 16:20:57 -07:00
sysfile.h
uptodate.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
uptodate.h
xattr.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
xattr.h ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00