A mirror of the official Linux kernel repository just in case
Alexander Aring f74dacb4c8 dlm: fix recovery of middle conversions
In one special case, recovery is unable to reliably rebuild
lock state by simply recreating lkb structs as sent from the
lock holders.  That case is when the lkbs include conversions
between PR and CW modes.

The recovery code has always recognized this special case,
but the implementation has always been broken, and would set
invalid modes in recovered lkbs.  Unpredictable or bogus
errors could then be returned for further locking calls on
these locks.

This bug has gone unnoticed for so long due to some
combination of:
- applications never or infrequently converting between PR/CW
- recovery not occurring during these conversions
- if the recovery bug does occur, the caller may not notice,
  depending on what further locking calls are made, e.g. if
  the lock is simply unlocked it may go unnoticed

However, a core analysis from a recent gfs2 bug report points
to this broken code.

PR = Protected Read
CW = Concurrent Write
PR and CW are incompatible
PR and PR are compatible
CW and CW are compatible
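
The PR/CW rules above can be written as a small lookup. This is an
illustrative sketch only: the names COMPAT and modes_compatible are
hypothetical and do not come from the dlm source, which has its own
compatibility matrix covering all lock modes.

```python
# Illustrative only: PR/CW compatibility as stated in the commit message.
PR = "PR"  # Protected Read
CW = "CW"  # Concurrent Write

# Unordered pairs of modes that may be granted at the same time.
# frozenset([PR, PR]) collapses to frozenset([PR]), so same-mode pairs
# are stored as single-element sets.
COMPAT = {frozenset([PR]), frozenset([CW])}


def modes_compatible(a, b):
    """Return True if modes a and b can be held concurrently."""
    return frozenset([a, b]) in COMPAT
```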

Example 1

node C, resource R
granted: PR node A
granted: PR node B
granted: NL node C
granted: NL node D

- A sends convert PR->CW to C
- C fails before A gets a reply
- recovery occurs

At this point, A does not know if it still holds
the lock in PR, or if its conversion to CW was granted:
- If A's conversion to CW was granted, then another
  node's CW lock may also have been granted.
- If A's conversion to CW was not granted, it still
  holds a PR lock, and other nodes may also hold PR locks.

So, the new master of R cannot simply recreate the lock
from A using granted mode PR and requested mode CW.
The new master must look at all the recovered locks to
determine the correct granted modes, and ensure that all
the recovered locks are recreated in compatible states.

The correct lock recovery steps in this example are:
- node D becomes the new master of R
- node B sends D its lkb, granted PR
- node A sends D its lkb, convert PR->CW
- D determines the correct lock state is:
  granted: PR node B
  convert: PR->CW node A

The lkb sent by each node was recreated without
any change on the new master node.

Example 2

node C, resource R
granted: PR node A
granted: NL node C
granted: NL node D
waiting: CW node B

- A sends convert PR->CW to C
- C grants the conversion to CW for A
- C grants the waiting request for CW to B
- C sends granted message to B, but fails
  before it can send the granted message to A
- B receives the granted message from C

At this point:
- A believes it is converting PR->CW
- B believes it is holding a CW lock

The correct lock recovery steps in this example are:
- node D becomes the new master of R
- node A sends D its lkb, convert PR->CW
- node B sends D its lkb, granted CW
- D determines the correct lock state is:
  granted: CW node B
  granted: CW node A

The lkb sent by B is recreated without change,
but the lkb sent by A is changed because its granted
mode (PR) was not compatible with B's granted CW.

Fixes to make this work correctly:

recover_convert_waiter: should not make any changes
to a converting lkb that is still waiting for a reply
message.  It was previously setting grmode to IV, which
is an invalid state, so the lkb would not be handled
correctly by other code.

receive_rcom_lock_args: was checking the wrong lkb field
(wait_type instead of status) to determine if the lkb is
being converted, and in need of inspection for this special
recovery.  It was also setting grmode to IV in the lkb,
causing it to be mishandled by other code.
Now, this function just puts the lkb, directly as sent,
onto the convert queue of the resource being recovered,
and corrects it in recover_conversion() later, if needed.

recover_conversion: the job of this function is to detect
and correct lkb states for the special PR/CW conversions.
The new code now checks for recovered lkbs on the granted
queue with grmode PR or CW, and takes the real grmode from
that.  Then it looks for lkbs on the convert queue with an
incompatible grmode (i.e. grmode PR when the real grmode is
CW, or vice versa).  These converting lkbs need to be fixed.
They are fixed by temporarily setting their grmode to NL,
so that grmodes are not incompatible and won't confuse other
locking code.  The converting lkb will then be granted at
the end of recovery, replacing the temporary NL grmode.
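
The recover_conversion logic described above can be modelled outside
the kernel as follows. This is a simplified sketch, not the kernel
implementation: the Lkb class, its fields, and grant_after_recovery
are hypothetical stand-ins, only the NL, PR and CW modes are modelled,
and NL is assumed to be compatible with every mode.

```python
from dataclasses import dataclass
from typing import List, Optional

NL, PR, CW = "NL", "PR", "CW"


def compatible(a, b):
    # Simplified model: NL is compatible with everything,
    # PR and CW are compatible only with themselves.
    return NL in (a, b) or a == b


@dataclass
class Lkb:
    node: str
    grmode: str
    rqmode: Optional[str] = None  # set only while converting
    status: str = "granted"       # "granted" or "convert"


def recover_conversion(lkbs: List[Lkb]) -> None:
    # Take the real grmode from a recovered granted lkb in PR or CW.
    real = next((l.grmode for l in lkbs
                 if l.status == "granted" and l.grmode in (PR, CW)), None)
    if real is None:
        return
    # A converting lkb whose grmode conflicts with the real grmode
    # gets a temporary NL grmode.
    for l in lkbs:
        if l.status == "convert" and l.grmode in (PR, CW) and l.grmode != real:
            l.grmode = NL


def grant_after_recovery(lkbs: List[Lkb]) -> None:
    # Stand-in for the grant pass at the end of recovery: grant a
    # conversion when its rqmode is compatible with all other grmodes.
    for l in lkbs:
        if l.status != "convert":
            continue
        if all(compatible(l.rqmode, o.grmode) for o in lkbs if o is not l):
            l.grmode, l.rqmode, l.status = l.rqmode, None, "granted"


# Example 2: A is converting PR->CW, B already holds CW.
example2 = [Lkb("A", PR, CW, "convert"), Lkb("B", CW)]
recover_conversion(example2)
grant_after_recovery(example2)
```

After these two calls both lkbs in example2 are granted CW, matching
Example 2.  Running the same functions on Example 1 (B granted PR, A
converting PR->CW) leaves both lkbs unchanged, with A still converting.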

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-11-15 13:39:36 -06:00
arch x86: kvm: fix build error 2024-09-29 14:47:33 -07:00
block for-6.12/block-20240925 2024-09-25 14:56:40 -07:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto KEYS: prevent NULL pointer dereference in find_asymmetric_key() 2024-09-20 19:49:49 +03:00
Documentation mhu-v3, omap2+ : fix kconfig dependencies 2024-09-29 09:53:04 -07:00
drivers mhu-v3, omap2+ : fix kconfig dependencies 2024-09-29 09:53:04 -07:00
fs dlm: fix recovery of middle conversions 2024-11-15 13:39:36 -06:00
include dma-mapping fixes for Linux 6.12 2024-09-29 09:35:10 -07:00
init Rust changes for v6.12 2024-09-25 10:25:40 -07:00
io_uring for-6.12/io_uring-20240922 2024-09-24 11:11:38 -07:00
ipc struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
kernel Locking changes for v6.12: 2024-09-29 08:51:30 -07:00
lib bitmap-for-6.12 2024-09-27 12:10:45 -07:00
LICENSES LICENSES: add 0BSD license text 2024-09-01 20:43:24 -07:00
mm 19 hotfixes. 13 are cc:stable. 2024-09-27 10:27:22 -07:00
net Three CephFS fixes from Xiubo and Luis and a bunch of assorted 2024-09-28 08:40:36 -07:00
rust Rust changes for v6.12 2024-09-25 10:25:40 -07:00
samples [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
scripts Reduce Coccinelle choices in string_choices.cocci 2024-09-28 21:33:11 +02:00
security One bugfix patch, one preparation patch, and one conversion patch. 2024-09-27 12:03:48 -07:00
sound [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
tools linux_kselftest-next-6.12-rc1-fixes 2024-09-29 08:37:03 -07:00
usr initramfs: shorten cmd_initfs in usr/Makefile 2024-07-16 01:07:52 +09:00
virt x86: 2024-09-28 09:20:14 -07:00
.clang-format clang-format: Update with v6.11-rc1's for_each macro list 2024-08-02 13:20:31 +02:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
.mailmap Summary 2024-09-24 11:08:40 -07:00
.rustfmt.toml
COPYING
CREDITS MAINTAINERS: Mark powerpc spufs as orphaned 2024-08-19 21:27:56 +10:00
Kbuild
Kconfig
MAINTAINERS Modules changes for v6.12-rc1 2024-09-28 09:06:15 -07:00
Makefile Linux 6.12-rc1 2024-09-29 15:06:19 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems that may result from upgrading your kernel.