linux/fs/ocfs2
Junxiao Bi 34aa8dac48 ocfs2: dlm: fix lock migration crash
This issue was introduced by commit 800deef3f6 ("ocfs2: use
list_for_each_entry where benefical") in 2007 where it replaced
list_for_each with list_for_each_entry.  The variable "lock" will point
to invalid data if "tmpq" list is empty and a panic will be triggered
due to this.  Sunil advised reverting it back, but the old version was
also not right.  At the end of the outer for loop, that
list_for_each_entry will also set "lock" to an invalid data, then in the
next loop, if the "tmpq" list is empty, "lock" will be an stale invalid
data and cause the panic.  So reverting the list_for_each back and reset
"lock" to NULL to fix this issue.

Another concern is that this seemes can not happen because the "tmpq"
list should not be empty.  Let me describe how.

old lock resource owner(node 1):                                  migratation target(node 2):
image there's lockres with a EX lock from node 2 in
granted list, a NR lock from node x with convert_type
EX in converting list.
dlm_empty_lockres() {
 dlm_pick_migration_target() {
   pick node 2 as target as its lock is the first one
   in granted list.
 }
 dlm_migrate_lockres() {
   dlm_mark_lockres_migrating() {
     res->state |= DLM_LOCK_RES_BLOCK_DIRTY;
     wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res));
	 //after the above code, we can not dirty lockres any more,
     // so dlm_thread shuffle list will not run
                                                                   downconvert lock from EX to NR
                                                                   upconvert lock from NR to EX
<<< migration may schedule out here, then
<<< node 2 send down convert request to convert type from EX to
<<< NR, then send up convert request to convert type from NR to
<<< EX, at this time, lockres granted list is empty, and two locks
<<< in the converting list, node x up convert lock followed by
<<< node 2 up convert lock.

	 // will set lockres RES_MIGRATING flag, the following
	 // lock/unlock can not run
     dlm_lockres_release_ast(dlm, res);
   }

   dlm_send_one_lockres()
                                                                 dlm_process_recovery_data()
                                                                   for (i=0; i<mres->num_locks; i++)
                                                                     if (ml->node == dlm->node_num)
                                                                       for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
                                                                        list_for_each_entry(lock, tmpq, list)
                                                                        if (lock) break; <<< lock is invalid as grant list is empty.
                                                                       }
                                                                       if (lock->ml.node != ml->node)
                                                                         BUG() >>> crash here
 }

I see the above locks status from a vmcore of our internal bug.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:20:54 -07:00
..
cluster Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
dlm ocfs2: dlm: fix lock migration crash 2014-04-03 16:20:54 -07:00
dlmfs ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00
acl.c ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
acl.h ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
alloc.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
alloc.h ocfs2: Add ocfs2_trim_fs for SSD trim support. 2011-05-23 23:37:18 -07:00
aops.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
aops.h ocfs2: change ip_unaligned_aio to of type mutex from atomit_t 2014-04-03 16:20:53 -07:00
blockcheck.c ocfs2: kill endianness abuses in blockcheck.c 2012-05-29 23:28:35 -04:00
blockcheck.h
buffer_head_io.c ocfs2: return ENOMEM when sb_getblk() fails 2013-11-13 12:09:00 +09:00
buffer_head_io.h
dcache.c ocfs2: needs ->d_lock to poke in ->d_parent->d_inode from ->d_revalidate() 2013-09-29 22:02:20 -04:00
dcache.h
dir.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
dir.h [readdir] convert ocfs2 2013-06-29 12:57:02 +04:00
dlmglue.c ocfs2: pass ocfs2_cluster_connection to ocfs2_this_node 2014-01-21 16:19:41 -08:00
dlmglue.h
export.c fs: encode_fh: return FILEID_INVALID if invalid fid_type 2013-02-26 02:46:10 -05:00
export.h
extent_map.c ocfs2: fix the end cluster offset of FIEMAP 2013-09-11 15:56:53 -07:00
extent_map.h ocfs2: Implement llseek() 2011-07-25 14:58:15 -07:00
file.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
file.h ->permission() sanitizing: don't pass flags to ->permission() 2011-07-20 01:43:24 -04:00
heartbeat.c ocfs2: Remove mlog(0) from fs/ocfs2/heartbeat.c 2011-02-23 21:17:39 +08:00
heartbeat.h
inode.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
inode.h ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
ioctl.c ocfs2: adjust minlen with discard_granularity in the FITRIM ioctl 2014-01-21 16:19:42 -08:00
ioctl.h
journal.c ocfs2: use i_size_read() to access i_size 2013-09-11 15:56:30 -07:00
journal.h ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
Kconfig
localalloc.c ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters 2014-02-06 13:48:51 -08:00
localalloc.h ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters 2014-02-06 13:48:51 -08:00
locks.c
locks.h
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00
mmap.c kill f_vfsmnt 2013-02-26 02:46:10 -05:00
mmap.h
move_extents.c ocfs2: remove redundant ocfs2_alloc_dinode_update_counts() and ocfs2_block_group_set_bits() 2014-01-21 16:19:42 -08:00
move_extents.h Ocfs2/move_extents: move/defrag extents within a certain range. 2011-05-25 15:17:12 +08:00
namei.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
namei.h
ocfs1_fs_compat.h
ocfs2_fs.h Revert wrong fixes for common misspellings 2011-04-26 23:31:11 -07:00
ocfs2_ioctl.h Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2. 2011-05-25 15:17:08 +08:00
ocfs2_lockid.h
ocfs2_lockingver.h
ocfs2_trace.h ocfs2: lighten up allocate transaction 2013-09-11 15:56:28 -07:00
ocfs2.h ocfs2: add clustername to cluster connection 2014-01-21 16:19:41 -08:00
quota_global.c ocfs2: fix quota file corruption 2014-03-04 07:55:48 -08:00
quota_local.c ocfs2: fix quota file corruption 2014-03-04 07:55:48 -08:00
quota.h
refcounttree.c ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
refcounttree.h ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page 2013-08-13 17:57:49 -07:00
reservations.c ocfs2: Remove masklog ML_RESERVATIONS. 2011-02-23 22:10:56 +08:00
reservations.h Fix common misspellings 2011-03-31 11:26:23 -03:00
resize.c ocfs2: do not call brelse() if group_bh is not initialized in ocfs2_group_add() 2013-11-13 12:09:01 +09:00
resize.h
slot_map.c ocfs2: Clean up messages in the fs 2011-07-24 10:34:54 -07:00
slot_map.h
stack_o2cb.c ocfs2: pass ocfs2_cluster_connection to ocfs2_this_node 2014-01-21 16:19:41 -08:00
stack_user.c ocfs2: fix sparse non static symbol warning 2014-01-21 16:19:42 -08:00
stackglue.c ocfs2: check if cluster name exists before deref 2014-03-28 13:56:58 -07:00
stackglue.h ocfs2: pass ocfs2_cluster_connection to ocfs2_this_node 2014-01-21 16:19:41 -08:00
suballoc.c ocfs2: remove redundant ocfs2_alloc_dinode_update_counts() and ocfs2_block_group_set_bits() 2014-01-21 16:19:42 -08:00
suballoc.h ocfs2: remove redundant ocfs2_alloc_dinode_update_counts() and ocfs2_block_group_set_bits() 2014-01-21 16:19:42 -08:00
super.c ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file 2014-04-03 16:20:53 -07:00
super.h treewide: use __printf not __attribute__((format(printf,...))) 2011-10-31 17:30:54 -07:00
symlink.c ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path 2013-02-26 02:46:12 -05:00
symlink.h ocfs: simplify symlink handling 2012-05-29 23:28:40 -04:00
sysfile.c ocfs2: remove kfree() redundant null checks 2013-02-21 17:22:19 -08:00
sysfile.h
uptodate.c ocfs2: Remove masklog ML_UPTODATE. 2011-02-24 16:22:20 +08:00
uptodate.h
xattr.c ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
xattr.h ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00