linux/Documentation
Christian Brauner b40508ca5d
Merge patch series "timekeeping/fs: multigrain timestamp redux"
Jeff Layton <jlayton@kernel.org> says:

The VFS has always used coarse-grained timestamps when updating the
ctime and mtime after a change. This has the benefit of allowing
filesystems to optimize away a lot of metadata updates, down to around 1
per jiffy, even when a file is under heavy writes.

Unfortunately, this has always been an issue when we're exporting via
NFSv3, which relies on timestamps to validate caches. A lot of changes
can happen in a jiffy, so timestamps aren't sufficient to help the
client decide when to invalidate the cache. Even with NFSv4, a lot of
exported filesystems don't properly support a change attribute and are
subject to the same problems with timestamp granularity. Other
applications have similar issues with timestamps (e.g. backup
applications).

If we were to always use fine-grained timestamps, that would improve the
situation, but that becomes rather expensive, as the underlying
filesystem would have to log a lot more metadata updates.

What we need is a way to only use fine-grained timestamps when they are
being actively queried. Use the (unused) top bit in inode->i_ctime_nsec
as a flag that indicates whether the current timestamps have been
queried via stat() or the like. When it's set, we allow the kernel to
use a fine-grained timestamp iff it's necessary to make the ctime show
a different value.
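
As a rough userspace illustration of the flag idea (not the kernel
code itself), the sketch below packs a "queried" bit into the top bit of
a 32-bit nanoseconds field. That bit is free because valid nanosecond
values stay below 10^9, which fits in 30 bits. The names toy_inode,
CTIME_QUERIED and the toy_* helpers are hypothetical and exist only for
this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000u
/* Hypothetical flag name: bit 31 is never used by a valid nsec value. */
#define CTIME_QUERIED (1u << 31)

/* Toy stand-in for the inode fields discussed above. */
struct toy_inode {
	int64_t ctime_sec;
	_Atomic uint32_t ctime_nsec; /* low bits: nsec, top bit: queried */
};

/* Store a new ctime; a fresh timestamp starts with the flag cleared. */
static void toy_set_ctime(struct toy_inode *inode, int64_t sec, uint32_t nsec)
{
	assert(nsec < NSEC_PER_SEC);
	inode->ctime_sec = sec;
	atomic_store(&inode->ctime_nsec, nsec);
}

/* What a stat()-like reader might do: set the flag, return the bare nsec. */
static uint32_t toy_query_ctime_nsec(struct toy_inode *inode)
{
	uint32_t old = atomic_fetch_or(&inode->ctime_nsec, CTIME_QUERIED);

	return old & ~CTIME_QUERIED;
}

/* Has anyone looked at the current timestamp since it was written? */
static bool toy_ctime_was_queried(struct toy_inode *inode)
{
	return (atomic_load(&inode->ctime_nsec) & CTIME_QUERIED) != 0;
}
```

A writer would check toy_ctime_was_queried() before deciding whether a
coarse-grained value is good enough for the next update.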

This solves the problem of being able to distinguish the timestamp
between updates, but introduces a new problem: it's now possible for a
file being changed to get a fine-grained timestamp. A file that is
altered just a bit later can then get a coarse-grained one that appears
older than the earlier fine-grained time. This violates timestamp
ordering guarantees.

To remedy this, keep a global monotonic atomic64_t value that acts as a
timestamp floor.  When we go to stamp a file, we first get the later of
the current floor value and the current coarse-grained time. If the
inode ctime hasn't been queried then we just attempt to stamp it with
that value.

If it has been queried, then first see whether the current coarse time
is later than the existing ctime. If it is, then we accept that value.
If it isn't, then we get a fine-grained time and try to swap that into
the global floor. Whether that succeeds or fails, we take the resulting
floor time, convert it to realtime and try to swap that into the ctime.

We take the result of the ctime swap whether it succeeds or fails, since
either is just as valid.
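
The stamping steps above can be sketched in userspace with C11 atomics.
This is a minimal model under several assumptions, not the kernel
implementation: the clocks are plain variables a test can set
(fake_coarse_ns, fake_fine_ns), the whole ctime is a single 64-bit
nanosecond count with the queried flag in bit 63, and the
monotonic-to-realtime conversion is skipped by using one time domain.
All names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define QUERIED (UINT64_C(1) << 63) /* hypothetical flag bit */

/* Fake clocks, set by the caller; real code would read the timekeeper. */
static uint64_t fake_coarse_ns;
static uint64_t fake_fine_ns;

/* Global floor: the latest fine-grained time handed out so far. */
static _Atomic uint64_t mg_floor;

struct toy_inode {
	_Atomic uint64_t ctime; /* nanoseconds, plus QUERIED in the top bit */
};

/* Later of the current floor and the current coarse-grained time. */
static uint64_t coarse_with_floor(void)
{
	uint64_t floor = atomic_load(&mg_floor);

	return floor > fake_coarse_ns ? floor : fake_coarse_ns;
}

/* Fetch a fine-grained time and try to swap it into the floor. */
static uint64_t fine_via_floor(void)
{
	uint64_t old = atomic_load(&mg_floor);
	uint64_t fine = fake_fine_ns;

	if (atomic_compare_exchange_strong(&mg_floor, &old, fine))
		return fine;
	return old; /* lost the race: use the floor someone else installed */
}

/* A stat()-like read: mark the timestamp as queried and return it. */
static uint64_t toy_query_ctime(struct toy_inode *inode)
{
	return atomic_fetch_or(&inode->ctime, QUERIED) & ~QUERIED;
}

/* Stamp the inode and return the ctime it ends up with. */
static uint64_t toy_update_ctime(struct toy_inode *inode)
{
	uint64_t cur = atomic_load(&inode->ctime);
	uint64_t now = coarse_with_floor();

	/* Queried and coarse time would not move the ctime: go fine-grained. */
	if ((cur & QUERIED) && now <= (cur & ~QUERIED))
		now = fine_via_floor();

	if (now == cur)
		return now; /* unqueried and identical: nothing to do */

	if (atomic_compare_exchange_strong(&inode->ctime, &cur, now))
		return now;
	return cur & ~QUERIED; /* swap failed: the other value is just as valid */
}
```

In this model, a second inode stamped shortly after a fine-grained
update picks up the floor rather than the older coarse time, so its
ctime cannot appear earlier than the first inode's.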

Filesystems can opt into this by setting the FS_MGTIME fstype flag.
Others should be unaffected (other than being subject to the same floor
value as multigrain filesystems).

* patches from https://lore.kernel.org/r/20241002-mgtime-v10-0-d1c4717f5284@kernel.org:
  tmpfs: add support for multigrain timestamps
  btrfs: convert to multigrain timestamps
  ext4: switch to multigrain timestamps
  xfs: switch to multigrain timestamps
  Documentation: add a new file documenting multigrain timestamps
  fs: add percpu counters for significant multigrain timestamp events
  fs: tracepoints around multigrain timestamp events
  fs: handle delegated timestamps in setattr_copy_mgtime
  fs: have setattr_copy handle multigrain timestamps appropriately
  fs: add infrastructure for multigrain timestamps

Link: https://lore.kernel.org/r/20241002-mgtime-v10-0-d1c4717f5284@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-10 10:20:57 +02:00
ABI Char/Misc and other driver changes for 6.12-rc1 2024-09-26 10:13:08 -07:00
accel drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
accounting
admin-guide x86: 2024-09-28 09:20:14 -07:00
arch arm64 fixes for 6.12-rc2: 2024-10-04 12:20:09 -07:00
block docs: block: Fix grammar and spelling mistakes in bfq-iosched.rst 2024-09-05 14:38:10 -06:00
bpf docs/bpf: Add missing BPF program types to docs 2024-09-12 10:56:41 -07:00
cdrom
core-api vfs-6.12-rc2.fixes.2 2024-10-03 09:22:50 -07:00
cpu-freq
crypto
dev-tools The core clk framework is left largely untouched this time around except for 2024-09-23 15:01:48 -07:00
devicetree sound fixes for 6.12-rc2 2024-10-04 11:29:46 -07:00
doc-guide
driver-api platform/x86: wmi: Update WMI driver API documentation 2024-10-06 12:48:52 +02:00
fault-injection Fix typo "allocateed" to allocated 2024-08-26 15:37:25 -06:00
fb
features x86: remove PG_uncached 2024-09-03 21:15:46 -07:00
filesystems Merge patch series "timekeeping/fs: multigrain timestamp redux" 2024-10-10 10:20:57 +02:00
firmware_class
firmware-guide
fpga
gpu Short summary of fixes pull: 2024-10-01 08:15:55 +10:00
hid Documentation: hid: intel-ish-hid: Add vendor custom firmware loading 2024-08-19 21:12:27 +02:00
hwmon hwmon: Remove devm_hwmon_device_unregister() API function 2024-09-13 07:27:36 -07:00
i2c i2c: testunit: add SMBusAlert trigger 2024-08-26 15:15:48 +02:00
iio doc: iio: ad4695: update for calibration support 2024-09-03 18:49:43 +01:00
images
infiniband
input
isdn
kbuild kbuild: doc: replace "gcc" in external module description 2024-09-24 03:07:21 +09:00
kernel-hacking
leds - Limited LED current based on thermal conditions in the QCOM flash LED driver. 2024-09-23 14:20:11 -07:00
litmus-tests
livepatch Documentation: livepatch: Correct release locks antonym 2024-09-04 13:42:27 +02:00
locking
maintainer
mhi
misc-devices
mm ALong with the usual shower of singleton patches, notable patch series in 2024-09-21 07:29:05 -07:00
netlabel
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-09-12 17:11:24 -07:00
networking doc: net: napi: Update documentation for napi_schedule_irqoff 2024-10-03 12:07:29 +02:00
nvdimm
nvme Remove duplicate "and" in 'Linux NVMe docs. 2024-09-10 15:44:20 -06:00
PCI Documentation: PCI: fix typo in pci.rst 2024-09-10 15:30:42 -06:00
pcmcia
peci
power Documentation: PM: Discourage use of deprecated macros 2024-09-04 14:37:57 +02:00
process Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
RCU Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a', 'nocb.09.09.24a', 'rcutorture.14.08.24a', 'rcustall.09.09.24a', 'srcu.12.08.24a', 'rcu.tasks.14.08.24a', 'rcu_scaling_tests.15.08.24a', 'fixes.12.08.24a' and 'misc.11.08.24a' into next.09.09.24a 2024-09-09 00:09:47 +05:30
rust Rust changes for v6.12 2024-09-25 10:25:40 -07:00
scheduler sched_ext: Provide a sysfs enable_seq counter 2024-09-23 06:53:02 -10:00
scsi
security documentation: add IPE documentation 2024-08-20 14:03:47 -04:00
sound Docs/sound: Add documentation for userspace-driven ALSA timers 2024-08-18 09:55:54 +02:00
sphinx docs: kerneldoc-preamble.sty: Suppress extra spaces in CJK literal blocks 2024-09-05 14:16:41 -06:00
sphinx-static
spi
staging xz: remove XZ_EXTERN and extern from functions 2024-09-01 20:43:27 -07:00
target
tee
timers treewide: Fix wrong singular form of jiffies in comments 2024-09-08 20:47:40 +02:00
tools
trace tracing/Documentation: Start a document on how to debug with tracing 2024-08-26 13:54:08 -04:00
translations move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
usb usb: gadget: f_uac1: Change volume name and remove alt names 2024-08-13 18:11:35 +02:00
userspace-api Landlock updates for v6.12-rc1 2024-09-24 10:40:11 -07:00
virt x86: 2024-09-28 09:20:14 -07:00
w1
watchdog [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
wmi platform/x86: dell-ddv: Fix typo in documentation 2024-10-06 12:47:40 +02:00
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
docutils.conf
dontdiff Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
index.rst
Kconfig
Makefile
memory-barriers.txt docs/memory-barriers.txt: Remove left-over references to "CACHE COHERENCY" 2024-09-13 23:56:44 -07:00
SubmittingPatches
subsystem-apis.rst