linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-22 04:02:20 +00:00

Author	SHA1	Message	Date
Kent Overstreet	8863d1e092	bcachefs: BCH_IOCTL_QUERY_ACCOUNTING Add a new ioctl that can return the new accounting counter types; it takes as input a bitmask of accounting types to return. This will be used for returning e.g. compression accounting and rebalance_work accounting. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:15 -04:00
Reed Riley	7f3dc6c98b	bcachefs: support REMAP_FILE_DEDUP in bch2_remap_file_range By removing the early-exit when REMAP_FILE_DEDUP is set, we should be able to support the fideduperange ioctl, albeit less efficiently than if we handled some of the extent locking and comparison logic inside bcachefs. Extent comparison logic already exists inside of `__generic_remap_file_range_prep`. Signed-off-by: Reed Riley <reed@riley.engineer> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:15 -04:00
Hongbo Li	7a254053a5	bcachefs: support FS_IOC_SETFSLABEL Implement support for FS_IOC_SETFSLABEL ioctl to set filesystem label. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:15 -04:00
Hongbo Li	81bce3cf2b	bcachefs: support get fs label Implement support for FS_IOC_GETFSLABEL ioctl to read filesystem label. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:15 -04:00
Hongbo Li	8a4ef7e28a	bcachefs: implement FS_IOC_GETVERSION to support lsattr In this patch we add the FS_IOC_GETVERSION ioctl for getting i_generation from inode, after that, users can list file's generation number by using "lsattr". Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	889fb3dc5d	bcachefs: Unlock trans when waiting for user input in fsck We can't hold locks while waiting for user input, that's a deadlock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Youling Tang	747d1d6c7e	bcachefs: Add tracepoints for bch2_sync_fs() and bch2_fsync() Add trace_bch2_sync_fs() and trace_bch2_fsync() implementations. The output in trace is as follows: sync-29779 [000] ..... 193.700935: bch2_sync_fs: dev 254,16 wait 1 <...>-40027 [002] ..... 342.535227: bch2_fsync: dev 254,32 ino 4099 parent 4096 datasync 1 Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Youling Tang	8b0882505d	bcachefs: track writeback errors using the generic tracking infrastructure We already using mapping_set_error() in bch2_writepage_io_done(), so all we need to do is to use file_check_and_advance_wb_err() when handling fsync() requests in bch2_fsync(). Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Ariel Miculas	f8b0147364	bcachefs: bch2_dir_emit() - fix directory reads in the fuse driver Commit `0c0cbfdb84` dropped the ctx->pos update before the call to dir_emit. This breaks the userspace implementation, causing the directory reads to be stuck in an infinite loop. This doesn't happen in the kernel because the vfs handles the updates to ctx->pos, but in the fuse implementation nobody updates it. Signed-off-by: Ariel Miculas <ariel.miculas@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	7ed122aea2	bcachefs: twf: delete dead struct fields Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	d37dd9b604	bcachefs: bch2_stdio_redirect_readline_timeout() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	0c97c437e3	bcachefs: twf: convert bch2_stdio_redirect_readline() to darray We now read the line from the buffer atomically, which means we have to allow the buffer to grow past STDIO_REDIRECT_BUFSIZE if we're waiting for a full line - this behaviour is necessary for stdio_redirect_readline_timeout() in the next patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	36008d5d01	bcachefs: Plumb more logging through stdio redirect Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	a850bde649	bcachefs: fsck_err() may now take a btree_trans fsck_err() now optionally takes a btree_trans; if the current thread has one, it is required that it be passed. The next patch will use this to unlock when waiting for user input. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	38e3ca275c	bcachefs: btree_types bitmask cleanups Make things more consistent and ensure that we're using u64 bitfields - key types and btree ids are already around 32 bits. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	174722de55	bcachefs: Delete old assertion for online fsck the order in which btree_gc walks keys have changed, so we no longer have the sort of issues with online fsck this assertion was warning about. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	38ad9dc8c6	bcachefs: Initialize gc buckets in alloc trigger Needed for online fsck; we need the trigger to initialize newly allocated buckets and generation number changes while gc is running. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	9ab55df599	bcachefs: Walk leaf to root in btree_gc Next change will move gc_alloc_start initialization into the alloc trigger, so we have to mark those first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	86d46471d5	bcachefs: Don't block journal when finishing check_allocations() Blocking the journal was needed to finish checking old style accounting, but that code is gone and it's not needed in the alloc rewrite, mark_lock is sufficient for synchronization. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	5645c32ccf	bcachefs: bch2_fs_get_tree() cleanup - improve error paths - call bch2_fs_start() separately, after applying late-parsed options Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	25ee25e637	bcachefs: Kill bch2_mount() Fold into bch2_fs_get_tree() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	b9efa9673e	bcachefs: Eytzinger accumulation for accounting keys The btree write buffer takes as input keys from the journal, sorts them, deduplicates them, and flushes them back to the btree in sorted order. The disk space accounting rewrite is moving accounting to normal btree keys, with update (in this case deltas) accumulated in the write buffer and then flushed to the btree; but this is going to increase the number of keys handled by the write buffer by perhaps as much as a factor of 3x-5x. The overhead from copying around and sorting this many keys would cause a significant performance regression, but: there is huge locality in updates to accounting keys that we can take advantage of. Instead of appending accounting keys to the list of keys to be sorted, this patch adds an eytzinger search tree of recently seen accounting keys. We look up the accounting key in the eytzinger search tree and apply the delta directly, adding it if it doesn't exist, and periodically prune the eytzinger tree of unused entries. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	20ac515a9c	bcachefs: bch_acct_rebalance_work Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	6af91147b6	bcachefs: bch_acct_btree Add counters for how much disk space we're using per btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	6675c37662	bcachefs: bch_acct_snapshot Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:14 -04:00
Kent Overstreet	72c2778780	bcachefs: bch2_fs_usage_base_to_text() Helper to show raw accounting in sysfs, mainly for debugging. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	f93bb76ba2	bcachefs: bch2_fs_accounting_to_text() Helper to show raw accounting in sysfs, mainly for debugging. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	91f44781d5	bcachefs: Convert bch2_compression_stats_to_text() to new accounting We no longer have to walk the whole btree to calculate compression stats. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	bfcaa9079d	bcachefs: bch_acct_compression This adds per-compression-type accounting of compressed and uncompressed size as well as number of extents - meaning we can now see compression ratio (without walking the whole filesystem). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	5668e5deec	bcachefs: bch2_verify_accounting_clean() Verify that the in-memory accounting verifies the on-disk accounting after a clean shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	00839addfc	bcachefs: Convert bch2_replicas_gc2() to new accounting bch2_replicas_gc2() is used for garbage collection superblock replicas entries that are empty - this converts it to the new accounting scheme. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	fb23d57a6d	bcachefs: Convert gc to new accounting Rewrite fsck/gc for the new accounting scheme. This adds a second set of in-memory accounting counters for gc to use; like with other parts of gc we run all trigger in TRIGGER_GC mode, then compare what we calculated to existing in-memory accounting at the end. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	4c4a7d48bd	bcachefs: Kill replicas_journal_res More dead code deletion Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	66a57684c6	bcachefs: Kill fs_usage_online More dead code deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	fe5eddc0d0	bcachefs: Kill bch2_fs_usage_to_text() Dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	8bb8d683a4	bcachefs: Delete journal-buf-sharded old style accounting More deletion of dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	5b9bc272e6	bcachefs: Kill writing old accounting to journal More ripping out of the old disk space accounting. Note that the new disk space accounting is incompatible with the old, and writing out old style disk space accounting with the new code is infeasible. This means upgrading and downgrading past this version requires regenerating accounting. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	3afb8dbf03	bcachefs: kill bch2_fs_usage_read() With bch2_ioctl_fs_usage(), this is now dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	6b39638b84	bcachefs: Convert bch2_ioctl_fs_usage() to new accounting This converts bch2_ioctl_fs_usage() to read from the new disk accounting, via bch2_fs_replicas_usage_read(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	72a6bb098c	bcachefs: Kill bch2_fs_usage_initialize() Deleting code for the old disk accounting scheme. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	f5095b9f85	bcachefs: dev_usage updated by new accounting Reading disk accounting now requires an eytzinger lookup (see: bch2_accounting_mem_read()), but the per-device counters are used frequently enough that we'd like to still be able to read them with just a percpu sum, as in the old code. This patch special cases the device counters; when we update in-memory accounting we also update the old style percpu counters if it's a deice counter update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	2e8d686a4a	bcachefs: Coalesce accounting keys before journal replay This fixes a performance regression in journal replay; without colaescing accounting keys we have multiple keys at the same position, which means journal_keys_peek_upto() has to skip past many overwritten keys - turning journal replay into an O(n^2) algorithm. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	1d16c605cc	bcachefs: Disk space accounting rewrite Main part of the disk accounting rewrite. This is a wholesale rewrite of the existing disk space accounting, which relies on percepu counters that are sharded by journal buffer, and rolled up and added to each journal write. With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters. Now, in memory accounting requires a single set of percpu counters - not multiple for each in flight journal buffer - and in the future we'll probably also have counters that don't use in memory percpu counters, they're not strictly required. An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	5d9667d1d6	bcachefs: btree write buffer knows how to accumulate bch_accounting keys Teach the btree write buffer how to accumulate accounting keys - instead of having the newer key overwrite the older key as we do with other updates, we need to add them together. Also, add a flag so that write buffer flush knows when journal replay is finished flushing accounting, and teach it to hold accounting keys until that flag is set. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	9dec2a473b	bcachefs: Accumulate accounting keys in journal replay Until accounting keys hit the btree, they are deltas, not new versions of the existing key; this means we have to teach journal replay to accumulate them. Additionally, the journal doesn't track precisely which entries have been flushed to the btree; it only tracks a range of entries that may possibly still need to be flushed. That means we need to compare accounting keys against the version in the btree and only flush updates that are newer. There's another wrinkle with the write buffer: if the write buffer starts flushing accounting keys before journal replay has finished flushing accounting keys, journal replay will see the version number from the new updates and updates from the journal will be lost. To avoid this, journal replay has to flush accounting keys first, and we'll be adding a flag so that write buffer flush knows to hold accounting keys until then. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Kent Overstreet	2744e5c9eb	bcachefs: KEY_TYPE_accounting New key type for the disk space accounting rewrite. - Holds a variable sized array of u64s (may be more than one for accounting e.g. compressed and uncompressed size, or buckets and sectors for a given data type) - Updates are deltas, not new versions of the key: this means updates to accounting can happen via the btree write buffer, which we'll be teaching to accumulate deltas. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:13 -04:00
Thomas Bertschinger	929d954330	bcachefs: use new mount API This updates bcachefs to use the new mount API: - Update the file_system_type to use the new init_fs_context() function. - Define the new fs_context_operations functions. - No longer register bch2_mount() and bch2_remount(); these are now called via the new fs_context functions. - Define a new helper type, bch2_opts_parse that includes a struct bch_opts and additionally a printbuf used to save options that can't be parsed until after the FS is opened. This enables us to parse as many options as possible prior to opening the filesystem while saving those options that need the open FS for later parsing. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:12 -04:00
Thomas Bertschinger	1c12d1caf8	bcachefs: Add error code to defer option parsing This introduces a new error code, option_needs_open_fs, which is used to indicate that an attempt was made to parse a mount option prior to opening a filesystem, when that mount option requires an open filesystem in order to be validated. Returning this error results in bch2_parse_one_mount_opt() saving that option for later parsing, after the filesystem is opened. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:12 -04:00
Thomas Bertschinger	9b7f0b5d3d	bcachefs: add printbuf arg to bch2_parse_mount_opts() Mount options that take the name of a device that may be part of a filesystem, for example "metadata_target", cannot be validated until after the filesystem has been opened. However, an attempt to parse those options may be made prior to the filesystem being opened. This change adds a printbuf parameter to bch2_parse_mount_opts() which will be used to save those mount options, when they are supplied prior to the FS being opened, so that they can be parsed later. This functionality is not currently needed, but will be used after bcachefs starts using the new mount API to parse mount options. This is because using the new mount API, we will process mount options prior to opening the FS, but the new API doesn't provide a convenient way to "replay" mount option parsing. So we save these options ourselves to accomplish this. This change also splits out the code to parse a single option into bch2_parse_one_mount_opt(), which will be useful when using the new mount API which deals with a single mount option at a time. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:12 -04:00
Kent Overstreet	7773df19c3	bcachefs: metadata version bucket_stripe_sectors New on disk format version for bch_alloc->stripe_sectors and BCH_DATA_unstriped - accounting for unstriped data in stripe buckets. Upgrade/downgrade requires regenerating alloc info - but only if erasure coding is in use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-07-14 19:00:12 -04:00

1 2 3 4 5 ...

1281600 Commits