linux/fs/ocfs2/dlm
piaojun 60c7ec9ee4 ocfs2/dlm: wait for dlm recovery done when migrating all lock resources
Wait for dlm recovery to finish when migrating all lock resources.
Otherwise a new lock resource may be left behind after the node leaves
the dlm domain, and that leftover lock resource will cause other nodes
to BUG.

        NodeA                       NodeB                NodeC

  umount:
    dlm_unregister_domain()
      dlm_migrate_all_locks()

                                   NodeB down

  do recovery for NodeB
  and collect a new lockres
  from other live nodes:

    dlm_do_recovery
      dlm_remaster_locks
        dlm_request_all_locks:

    dlm_mig_lockres_handler
      dlm_new_lockres
        __dlm_insert_lockres

  finally NodeA becomes the
  master of the new lockres
  and leaves the domain:
    dlm_leave_domain()

                                                    mount:
                                                      dlm_join_domain()

                                                    touch a file and ask for
                                                    the owner of the new
                                                    lockres; all the other
                                                    nodes answer 'NO', so
                                                    NodeC decides to be the
                                                    owner and sends an
                                                    assert master msg to the
                                                    other nodes:
                                                    dlmlock()
                                                      dlm_get_lock_resource()
                                                        dlm_do_assert_master()

                                                    other nodes receive the msg
                                                    and find two masters exist,
                                                    which finally triggers BUG in
                                                    dlm_assert_master_handler()
                                                    -->BUG();
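
The race, and the effect of waiting for recovery, can be illustrated with
a small userspace model.  This is only a sketch under stated assumptions:
struct toy_domain and the toy_*() functions are hypothetical names made up
for illustration, not the kernel's data structures or the code changed by
this patch, and the concurrent recovery thread is replaced by explicit
calls so that the interleaving is deterministic.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a dlm domain (not the kernel struct). */
struct toy_domain {
	int  nr_lockres;       /* lock resources this node still masters */
	bool recovery_active;  /* recovery for a dead node is still running */
	bool migrate_done;     /* migration finished, no more recovery allowed */
};

/* One migration pass: hand every mastered lockres to another node. */
static int toy_migrate_pass(struct toy_domain *d)
{
	int moved = d->nr_lockres;

	d->nr_lockres = 0;
	return moved;
}

/* Recovery completes and makes this node master of one new lockres. */
static void toy_recovery_step(struct toy_domain *d)
{
	if (d->migrate_done || !d->recovery_active)
		return;
	d->nr_lockres += 1;
	d->recovery_active = false;
}

/*
 * Model of umount: migrate everything, then leave the domain.
 * Returns the number of lockres left behind on the leaving node.
 */
static int toy_leave_domain(struct toy_domain *d, bool wait_for_recovery)
{
	for (;;) {
		if (toy_migrate_pass(d) > 0)
			continue;		/* moved something, scan again */

		if (wait_for_recovery && d->recovery_active) {
			/* Let recovery finish first, then rescan. */
			toy_recovery_step(d);
			continue;
		}

		if (wait_for_recovery)
			d->migrate_done = true;
		break;
	}

	/* Recovery that completes late, racing with the domain leave. */
	toy_recovery_step(d);

	return d->nr_lockres;
}
```

Starting from a domain with nr_lockres = 2 and recovery_active = true,
toy_leave_domain(&d, false) returns 1, which corresponds to the stray
lockres in the diagram above, while toy_leave_domain(&d, true) returns 0.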

Link: http://lkml.kernel.org/r/5AAA6E25.7090303@huawei.com
Fixes: bc9838c4d4 ("dlm: allow dlm do recovery during shutdown")
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:22 -07:00
dlmapi.h ocfs2/trivial: Remove trailing whitespaces 2010-01-25 19:20:51 -08:00
dlmast.c ocfs2/dlm: clean up unused stack variable in dlm_do_local_ast() 2018-04-05 21:36:22 -07:00
dlmcommon.h ocfs2/dlm: wait for dlm recovery done when migrating all lock resources 2018-04-05 21:36:22 -07:00
dlmconvert.c ocfs2/dlm: fix race between convert and migration 2016-09-19 15:36:16 -07:00
dlmconvert.h
dlmdebug.c locking/atomic, kref: Add kref_read() 2017-01-14 11:37:18 +01:00
dlmdebug.h ocfs2/dlm: fix memory leak of dlm_debug_ctxt 2016-07-26 16:19:19 -07:00
dlmdomain.c ocfs2/dlm: wait for dlm recovery done when migrating all lock resources 2018-04-05 21:36:22 -07:00
dlmdomain.h ocfs2/dlm: don't handle migrate lockres if already in shutdown 2018-04-05 21:36:22 -07:00
dlmlock.c ocfs2: remove unnecessary null pointer check before kmem_cache_destroy() 2018-04-05 21:36:22 -07:00
dlmmaster.c ocfs2: remove unnecessary null pointer check before kmem_cache_destroy() 2018-04-05 21:36:22 -07:00
dlmrecovery.c ocfs2/dlm: wait for dlm recovery done when migrating all lock resources 2018-04-05 21:36:22 -07:00
dlmthread.c ocfs2/dlm: continue to purge recovery lockres when recovery master goes down 2016-08-02 17:31:41 -04:00
dlmunlock.c locking/atomic, kref: Add kref_read() 2017-01-14 11:37:18 +01:00
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00