mirror of https://github.com/torvalds/linux.git (synced 2024-11-27 22:51:35 +00:00)
doc: update ext4 and journalling docs to include fast commit feature
This patch adds necessary documentation for fast commits.

Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20201015203802.3597742-2-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
parent e0770e9142
commit f5b8b297b0
@@ -28,6 +28,17 @@ metadata are written to disk through the journal. This is slower but
safest. If ``data=writeback``, dirty data blocks are not flushed to the
disk before the metadata are written to disk through the journal.

In ``data=ordered`` mode, Ext4 also supports fast commits, which
reduce commit latency significantly. The default ``data=ordered``
mode works by logging metadata blocks to the journal. In fast commit
mode, Ext4 only stores the minimal delta needed to recreate the
affected metadata in fast commit space that is shared with JBD2.
Once the fast commit area fills up, or if a fast commit is not possible,
or if the JBD2 commit timer goes off, Ext4 performs a traditional full commit.
A full commit invalidates all the fast commits that happened before
it and thus makes the fast commit area empty for further fast
commits. This feature needs to be enabled at mkfs time.

The journal inode is typically inode 8. The first 68 bytes of the
journal inode are replicated in the ext4 superblock. The journal itself
is a normal (but hidden) file within the filesystem. The file usually
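The fallback rule in the fast commit paragraph of the hunk above can be sketched as a small decision function. This is an illustrative userspace model, not code from Ext4 or JBD2: ``struct demo_fc_state``, ``demo_choose_commit()`` and ``demo_after_commit()`` are hypothetical names, and resetting the timer flag after a full commit is an assumption made only to keep the model self-consistent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state that decides between commit types. */
struct demo_fc_state {
    size_t fc_area_bytes;      /* total size of the fast commit area */
    size_t fc_used_bytes;      /* bytes consumed by earlier fast commits */
    bool   fc_eligible;        /* false if the change cannot be logged as a delta */
    bool   commit_timer_fired; /* true once the periodic commit timer expired */
};

enum demo_commit_kind { DEMO_FAST_COMMIT, DEMO_FULL_COMMIT };

/* Pick a fast commit only when none of the three fallback conditions holds. */
enum demo_commit_kind demo_choose_commit(const struct demo_fc_state *st,
                                         size_t delta_bytes)
{
    assert(st->fc_used_bytes <= st->fc_area_bytes);

    if (!st->fc_eligible || st->commit_timer_fired)
        return DEMO_FULL_COMMIT;
    if (delta_bytes > st->fc_area_bytes - st->fc_used_bytes)
        return DEMO_FULL_COMMIT; /* fast commit area would overflow */
    return DEMO_FAST_COMMIT;
}

/* A full commit invalidates earlier fast commits, so the area becomes empty. */
void demo_after_commit(struct demo_fc_state *st, enum demo_commit_kind kind,
                       size_t delta_bytes)
{
    if (kind == DEMO_FULL_COMMIT) {
        st->fc_used_bytes = 0;
        st->commit_timer_fired = false; /* assumption: timer is re-armed */
    } else {
        st->fc_used_bytes += delta_bytes;
    }
}
```

With a 4096-byte area of which 4000 bytes are used, a 64-byte delta still fits as a fast commit, while a 200-byte delta forces a full commit that empties the area again.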
@@ -609,3 +620,58 @@ bytes long (but uses a full block):
     - h\_commit\_nsec
     - Nanoseconds component of the above timestamp.

Fast commits
~~~~~~~~~~~~

The fast commit area is organized as a log of tag-length-values (TLVs). Each
TLV begins with a ``struct ext4_fc_tl`` that stores the tag and the length
of the entire field. It is followed by a variable-length, tag-specific value.
Here is the list of supported tags and their meanings:

.. list-table::
   :widths: 8 20 20 32
   :header-rows: 1

   * - Tag
     - Meaning
     - Value struct
     - Description
   * - EXT4_FC_TAG_HEAD
     - Fast commit area header
     - ``struct ext4_fc_head``
     - Stores the TID of the transaction after which these fast commits should
       be applied.
   * - EXT4_FC_TAG_ADD_RANGE
     - Add extent to inode
     - ``struct ext4_fc_add_range``
     - Stores the inode number and the extent to be added to this inode
   * - EXT4_FC_TAG_DEL_RANGE
     - Remove logical offsets from inode
     - ``struct ext4_fc_del_range``
     - Stores the inode number and the logical offset range that needs to be
       removed
   * - EXT4_FC_TAG_CREAT
     - Create directory entry for a newly created file
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode number and directory entry of the
       newly created file
   * - EXT4_FC_TAG_LINK
     - Link a directory entry to an inode
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode number and directory entry
   * - EXT4_FC_TAG_UNLINK
     - Unlink a directory entry of an inode
     - ``struct ext4_fc_dentry_info``
     - Stores the parent inode number, inode number and directory entry

   * - EXT4_FC_TAG_PAD
     - Padding (unused area)
     - None
     - Unused bytes in the fast commit area.

   * - EXT4_FC_TAG_TAIL
     - Mark the end of a fast commit
     - ``struct ext4_fc_tail``
     - Stores the TID of the commit and the CRC of the fast commit that this
       tag ends

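To make the TLV layout in the hunk above concrete, here is a minimal userspace sketch of walking such a log. It is not the Ext4 replay code. The 4-byte header (a 16-bit tag followed by a 16-bit length, both little-endian), the reading of the length as counting only the value bytes, and the ``DEMO_TAG_*`` numbers are all assumptions chosen for illustration; they are not taken from the on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative tag numbers; NOT the on-disk EXT4_FC_TAG_* values. */
enum {
    DEMO_TAG_HEAD = 1,
    DEMO_TAG_ADD_RANGE = 2,
    DEMO_TAG_PAD = 3,
    DEMO_TAG_TAIL = 4
};

/* Assumed header size: 16-bit tag + 16-bit value length, little-endian. */
#define DEMO_TL_SIZE 4u

static uint16_t demo_get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk the log and count value-carrying records (everything except PAD)
 * up to and including the first TAIL. Returns -1 if a record overruns
 * the buffer or if no TAIL record is found.
 */
int demo_count_records(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL);
    while (len - off >= DEMO_TL_SIZE) {
        uint16_t tag = demo_get_le16(buf + off);
        uint16_t vlen = demo_get_le16(buf + off + 2);

        off += DEMO_TL_SIZE;
        if (vlen > len - off)
            return -1;          /* value would run past the buffer */
        off += vlen;            /* skip the tag-specific value */

        if (tag == DEMO_TAG_PAD)
            continue;           /* padding carries no information */
        count++;
        if (tag == DEMO_TAG_TAIL)
            return count;       /* end of this fast commit */
    }
    return -1;                  /* ran out of data before a TAIL */
}
```

A real replay path would additionally dispatch on each tag to the matching value struct from the table and verify the CRC stored in the tail; both are left out here to keep the sketch short.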
@@ -132,6 +132,39 @@ The opportunities for abuse and DOS attacks with this should be obvious,
if you allow unprivileged userspace to trigger codepaths containing
these calls.

Fast commits
~~~~~~~~~~~~

JBD2 also allows you to perform file-system specific delta commits known as
fast commits. In order to use fast commits, you first need to call
:c:func:`jbd2_fc_init` and tell it how many blocks at the end of the journal
area should be reserved for fast commits. Along with that, you will also need
to set the following callbacks that perform the corresponding work:

``journal->j_fc_cleanup_cb``: Cleanup function called after every full commit and
fast commit.

``journal->j_fc_replay_cb``: Replay function called for replay of fast commit
blocks.

A file system is free to perform fast commits as and when it wants, as long as it
gets permission from JBD2 to do so by calling the function
:c:func:`jbd2_fc_begin_commit()`. Once a fast commit is done, the client
file system should tell JBD2 about it by calling
:c:func:`jbd2_fc_end_commit()`. If the file system wants JBD2 to perform a full
commit immediately after stopping the fast commit, it can do so by calling
:c:func:`jbd2_fc_end_commit_fallback()`. This is useful if the fast commit operation
fails for some reason and the only way to guarantee consistency is for JBD2 to
perform the full traditional commit.

JBD2 also provides helper functions to manage fast commit buffers. A file system can use
:c:func:`jbd2_fc_get_buf()` and :c:func:`jbd2_fc_wait_bufs()` to allocate
and wait on IO completion of fast commit buffers.

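The begin/end/fallback call order described above can be sketched with a small userspace mock. Every ``demo_*`` name below is a hypothetical stand-in. The mock does not use the kernel's ``journal_t`` or the real ``jbd2_fc_*`` signatures, and ``write_delta`` merely represents the file-system step that would fill buffers obtained with :c:func:`jbd2_fc_get_buf()` and wait on them with :c:func:`jbd2_fc_wait_bufs()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the journal; NOT the kernel's journal_t. */
struct demo_journal {
    bool fc_in_progress; /* set between begin and end of a fast commit */
    int  full_commits;   /* full commits requested through the fallback */
    int  cleanup_calls;  /* number of times the cleanup callback ran */
    void (*fc_cleanup_cb)(struct demo_journal *j); /* models j_fc_cleanup_cb */
};

/* Stand-in for jbd2_fc_begin_commit(): permission is refused while busy. */
static int demo_fc_begin_commit(struct demo_journal *j)
{
    if (j->fc_in_progress)
        return -1;
    j->fc_in_progress = true;
    return 0;
}

/* Stand-in for jbd2_fc_end_commit(): the fast commit completed. */
static void demo_fc_end_commit(struct demo_journal *j)
{
    j->fc_in_progress = false;
    if (j->fc_cleanup_cb)
        j->fc_cleanup_cb(j);
}

/* Stand-in for jbd2_fc_end_commit_fallback(): stop, then do a full commit. */
static void demo_fc_end_commit_fallback(struct demo_journal *j)
{
    j->fc_in_progress = false;
    j->full_commits++;
    if (j->fc_cleanup_cb)
        j->fc_cleanup_cb(j); /* cleanup also runs after a full commit */
}

/* Sample callbacks used to exercise the mock. */
void demo_count_cleanup(struct demo_journal *j) { j->cleanup_calls++; }
bool demo_write_ok(void) { return true; }
bool demo_write_fail(void) { return false; }

/*
 * Hypothetical file-system routine. Returns 0 after a fast commit,
 * 1 after falling back to a full commit, -1 if permission was refused.
 */
int demo_fs_fast_commit(struct demo_journal *j, bool (*write_delta)(void))
{
    assert(j != NULL && write_delta != NULL);

    if (demo_fc_begin_commit(j) != 0)
        return -1;
    if (!write_delta()) {
        demo_fc_end_commit_fallback(j);
        return 1;
    }
    demo_fc_end_commit(j);
    return 0;
}
```

Calling ``demo_fs_fast_commit(&j, demo_write_fail)`` on an idle mock journal returns 1 and increments ``full_commits``, which mirrors the case where a failed fast commit is made consistent by a traditional full commit.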
Currently, only Ext4 implements fast commits. For details of its implementation
of fast commits, please refer to the top level comments in
fs/ext4/fast_commit.c.

Summary
~~~~~~~
