linux

Author	SHA1	Message	Date
Darrick J. Wong	4b4c1326fd	xfs: treat CoW fork operations as delalloc for quota accounting Since the CoW fork only exists in memory, it is incorrect to update the on-disk quota block counts when we modify the CoW fork. Unlike the data fork, even real extents in the CoW fork are only delalloc-style reservations (on-disk they're owned by the refcountbt) so they must not be tracked in the on disk quota info. Ensure the i_delayed_blks accounting reflects this too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	01c2e13dca	xfs: only grab shared inode locks for source file during reflink Reflink and dedupe operations remap blocks from a source file into a destination file. The destination file needs exclusive locks on all levels because we're updating its block map, but the source file isn't undergoing any block map changes so we can use a shared lock. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	7c2d238ac6	xfs: allow xfs_lock_two_inodes to take different EXCL/SHARED modes Refactor xfs_lock_two_inodes to take separate locking modes for each inode. Specifically, this enables us to take a SHARED lock on one inode and an EXCL lock on the other. The lock class (MMAPLOCK/ILOCK) must be the same for each inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	1364b1d4b5	xfs: reflink should break pnfs leases before sharing blocks Before we share blocks between files, we need to break the pnfs leases on the layout before we start slicing and dicing the block map. The structure of this function sets us up for the lock contention reduction in the next patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	c47b74fb2d	xfs: don't clobber inobt/finobt cursors when xref with rmap Even if we can't use the inobt/finobt cursors to count the number of inode btree blocks, we are never allowed to clobber the cursor of the btree being checked, so don't do this. Found by fuzzing level = ones in xfs/364. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	70c57dcd60	xfs: skip CoW writes past EOF when writeback races with truncate Every so often we blow the ASSERT(type != XFS_IO_COW) in xfs_map_blocks when running fsstress, as we do in generic/269. The cause of this is writeback racing with truncate -- writeback doesn't take the iolock, so truncate can sneak in to decrease i_size and truncate page cache while writeback is gathering buffer heads to schedule writeout. If we hit this race on a block that has a CoW mapping, we'll get a valid imap from the CoW fork but the reduced i_size trims the mapping to zero length (which makes it invalid), so we call xfs_map_blocks to try again. This doesn't do much anyway, since any mapping we get out of that will also be invalid, so we might as well skip the assert and just stop. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Amir Goldstein	acd1d71598	xfs: preserve i_rdev when recycling a reclaimable inode Commit `66f364649d` ("xfs: remove if_rdev") moved storing of rdev value for special inodes to VFS inodes, but forgot to preserve the value of i_rdev when recycling a reclaimable xfs_inode. This was detected by xfstest overlay/017 with inodex=on mount option and xfs base fs. The test does a lookup of overlay chardev and blockdev right after drop caches. Overlayfs inodes hold a reference on underlying xfs inodes when mount option index=on is configured. If drop caches reclaim xfs inodes, before it relclaims overlayfs inodes, that can sometimes leave a reclaimable xfs inode and that test hits that case quite often. When that happens, the xfs inode cache remains broken (zere i_rdev) until the next cycle mount or drop caches. Fixes: `66f364649d` ("xfs: remove if_rdev") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	751f3767c2	xfs: refactor accounting updates out of xfs_bmap_btalloc Move all the inode and quota accounting updates out of xfs_bmap_btalloc in preparation for fixing some quota accounting problems with copy on write. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	22431bf3df	xfs: refactor inode verifier corruption error printing Refactor inode verifier error reporting into a non-libxfs function so that we aren't encoding the message format in libxfs. This also changes the kernel dmesg output to resemble buffer verifier errors more closely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Darrick J. Wong	67a3f6d014	xfs: make tracepoint inode number format consistent Fix all the inode number formats to be consistently (0x%llx) in all trace point definitions. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Darrick J. Wong	beaae8cd58	xfs: always zero di_flags2 when we free the inode Always zero the di_flags2 field when we free the inode so that we never end up with an on-disk record for an unallocated inode that also has the reflink iflag set. This is in keeping with the general principle that only files can have the reflink iflag set, even though we'll zero out di_flags2 if we ever reallocate the inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Darrick J. Wong	09ac862397	xfs: call xfs_qm_dqattach before performing reflink operations Ensure that we've attached all the necessary dquots before performing reflink operations so that quota accounting is accurate. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Shan Hai	6ca30729c2	xfs: bmap code cleanup Remove the extent size hint and realtime inode relevant code from the xfs_bmapi_reserve_delalloc since it is not called on the inode with extent size hint set or on a realtime inode. Signed-off-by: Shan Hai <shan.hai@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Carlos Maiolino	643c8c05e7	Use list_head infra-structure for buffer's log items list Now that buffer's b_fspriv has been split, just replace the current singly linked list of xfs_log_items, by the list_head infrastructure. Also, remove the xfs_log_item argument from xfs_buf_resubmit_failed_buffers(), there is no need for this argument, once the log items can be walked through the list_head in the buffer. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style cleanups] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Carlos Maiolino	fb1755a645	Split buffer's b_fspriv field By splitting the b_fspriv field into two different fields (b_log_item and b_li_list). It's possible to get rid of an old ABI workaround, by using the new b_log_item field to store xfs_buf_log_item separated from the log items attached to the buffer, which will be linked in the new b_li_list field. This way, there is no more need to reorder the log items list to place the buf_log_item at the beginning of the list, simplifying a bit the logic to handle buffer IO. This also opens the possibility to change buffer's log items list into a proper list_head. b_log_item field is still defined as a void *, because it is still used by the log buffers to store xlog_in_core structures, and there is no need to add an extra field on xfs_buf just for xlog_in_core. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style changes] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Carlos Maiolino	70a2065533	Get rid of xfs_buf_log_item_t typedef Take advantage of the rework on xfs_buf log items list, to get rid of ths typedef for xfs_buf_log_item. This patch also fix some indentation alignment issues found along the way. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Jeff Layton	d17260fd5f	xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing If XFS_ILOG_CORE is already set then go ahead and increment it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Dave Chinner <dchinner@redhat.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	f0e2828062	xfs: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Dave Chinner <dchinner@redhat.com>	2018-01-29 06:42:21 -05:00
Darrick J. Wong	75d4a13b1f	xfs: fix non-debug build compiler warnings Fix compiler warning on non-debug build Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:47 -08:00
Darrick J. Wong	4bb73d0147	xfs: check sb_agblocks and sb_agblklog when validating superblock Currently, we don't check sb_agblocks or sb_agblklog when we validate the superblock, which means that we can fuzz garbage values into those values and the mount succeeds. This leads to all sorts of UBSAN warnings in xfs/350 since we can then coerce other parts of xfs into shifting by ridiculously large values. Once we've validated agblocks, make sure the agcount makes sense. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-17 21:00:47 -08:00
Darrick J. Wong	be78ff0e72	xfs: recheck reflink / dirty page status before freeing CoW reservations Eryu Guan reported seeing occasional hangs when running generic/269 with a new fsstress that supports clonerange/deduperange. The cause of this hang is an infinite loop when we convert the CoW fork extents from unwritten to real just prior to writing the pages out; the infinite loop happens because there's nothing in the CoW fork to convert, and so it spins forever. The fundamental issue here is that when we go to perform these CoW fork conversions, we're supposed to have an extent waiting for us, but the low space CoW reaper has snuck in and blown them away! There are four conditions that can dissuade the reaper from touching our file -- no reflink iflag; dirty page cache; writeback in progress; or directio in progress. We check the four conditions prior to taking the locks, but we neglect to recheck them once we have the locks, which is how we end up whacking the writeback that's in progress. Therefore, refactor the four checks into a helper function and call it once again once we have the locks to make sure we really want to reap the inode. While we're at it, add an ASSERT for this weird condition so that we'll fail noisily if we ever screw this up again. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Tested-by: Eryu Guan <eguan@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-17 21:00:47 -08:00
Darrick J. Wong	a5f460b168	xfs: check that br_blockcount doesn't overflow xfs_bmbt_irec.br_blockcount is declared as xfs_filblks_t, which is an unsigned 64-bit integer. Though the bmbt helpers will never set a value larger than 2^21 (since the underlying on-disk extent record has a length field that is only 21 bits wide), we should be a little defensive about checking that a bmbt record doesn't exceed what we're expecting or overflow into the next AG. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:47 -08:00
Darrick J. Wong	55e45429ce	xfs: btree format ifork loader should check for zero numrecs A btree format inode fork with zero records makes no sense, so reject it if we see it, or else we can miscalculate memory allocations. Found by zeroes fuzzing {a,u3}.bmbt.numrecs in xfs/{374,378,412} with KASAN. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	79a69bf8dc	xfs: attr leaf verifier needs to check for obviously bad count In the attribute leaf verifier, we can check for obviously bad values of firstused and count so that later attempts at lasthash don't run off the end of the memory buffer. Found by ones fuzzing hdr.count in xfs/400 with KASAN. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	ce92d29ddf	xfs: directory scrubber must walk through data block to offset In xfs_scrub_dir_rec, we must walk through the directory block entries to arrive at the offset given by the hash structure. If we blindly trust the hash address, we can end up midway into a directory entry and stray outside the block. Found by lastbit fuzzing lents[3].address in xfs/390 with KASAN enabled. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	638a717489	xfs: don't iunlock unlocked inodes Don't iunlock an unlocked inode, which can happen if the parent pointer scrubber bails out with sc->ip unlocked while trying to grab the parent directory inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	cf1b0b8b1a	xfs: scrub in-core metadata Whenever we load a buffer, explicitly re-call the structure verifier to ensure that memory isn't corrupting things. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	561f648ab2	xfs: cross-reference the block mappings when possible Use an inode's block mappings to cross-reference inode block counters. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	46d9bfb5e7	xfs: cross-reference the realtime bitmap While we're scrubbing various btrees, cross-reference the records with the other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	f6d5fc21fd	xfs: cross-reference refcount btree during scrub During metadata btree scrub, we should cross-reference with the reference counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:46 -08:00
Darrick J. Wong	dbde19da96	xfs: cross-reference the rmapbt data with the refcountbt Cross reference the refcount data with the rmap data to check that the number of rmaps for a given block match the refcount of that block, and that CoW blocks (which are owned entirely by the refcountbt) are tracked as well. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	d852657ccf	xfs: cross-reference reverse-mapping btree When scrubbing various btrees, we should cross-reference the records with the reverse mapping btree and ensure that traversing the btree finds the same number of blocks that the rmapbt thinks are owned by that btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	2e6f27561b	xfs: cross-reference inode btrees during scrub Cross-reference the inode btrees with the other metadata when we scrub the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	e1134b12fd	xfs: cross-reference bnobt records with cntbt Scrub should make sure that each bnobt record has a corresponding cntbt record. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	52dc4b44af	xfs: cross-reference with the bnobt When we're scrubbing various btrees, cross-reference the records with the bnobt to ensure that we don't also think the space is free. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	166d76410d	xfs: introduce scrubber cross-referencing stubs Create some stubs that will be used to cross-reference metadata records. The actual cross-referencing will be filled in by subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	858333dcf0	xfs: check btree block ownership with bnobt/rmapbt when scrubbing btree When scanning a metadata btree block, cross-reference the block location with the free space btree and the reverse mapping btree to ensure that the rmapbt knows about the block and the bnobt does not. Add a mechanism to defer checks when we happen to be scanning the bnobt/rmapbt itself because it's less efficient to repeatedly clone and destroy the cursor. This patch provides the framework to make btree block owner checks happen; the actual meat will be added in subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	9a7e269566	xfs: fix a few erroneous process_error calls in the scrubbers There are a few places where we make a libxfs api call on behalf of some object other than the one we're scrubbing but inadvertently call the regular process_error function. When this happens we mark the object corrupt even though it was corruption in /some other/ object that actually produced the -EFSCORRUPTED code. The correct output flag for these situations is SCRUB_OFLAG_XFAIL, not SCRUB_OFLAG_CORRUPT, so fix this now that we also have a helper to set these. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:45 -08:00
Darrick J. Wong	64b12563b2	xfs: set up scrub cross-referencing helpers Create some helper functions that we'll use later to deal with problems we might encounter while cross referencing metadata with other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:44 -08:00
Darrick J. Wong	49db55eca5	xfs: add scrub cross-referencing helpers for the refcount btrees Add a couple of functions to the refcount btrees that will be used to cross-reference metadata against the refcountbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:44 -08:00
Darrick J. Wong	ed7c52d4bf	xfs: add scrub cross-referencing helpers for the rmap btrees Add a couple of functions to the rmap btrees that will be used to cross-reference metadata against the rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:44 -08:00
Darrick J. Wong	2e001266b6	xfs: add scrub cross-referencing helpers for the inode btrees Add a couple of functions to the inode btrees that will be used to cross-reference metadata against the inobt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:44 -08:00
Darrick J. Wong	ce1d802e6a	xfs: add scrub cross-referencing helpers for the free space btrees Add a couple of functions to the free space btrees that will be used to cross-reference metadata against the bnobt/cntbt, and a generic btree function that provides the real implementation. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-17 21:00:44 -08:00
Brian Foster	c468562879	xfs: cancel tx on xfs_defer_finish() error during xattr set/remove Chris Dunlop reports a problem where an xattr operation fails, reports the following error to syslog and hangs during unmount: ================================================ [ BUG: lock held when returning to user space! ] ... ------------------------------------------------ <PID> is leaving the kernel with locks still held! 1 lock held by <PID>: #0: (sb_internal){......}, at: [<ffffffffa07692a3>] xfs_trans_alloc+0xe3/0x130 [xfs] The failure/shutdown occurs during deferred ops processing which leads to an error return from xfs_defer_finish() via xfs_attr_leaf_addname(). While the root cause of the failure is unknown corruption, the cause of the subsequent BUG above and unmount hang is failure to cancel the transaction before returning to userspace. The transaction is not cancelled because the out_defer_cancel error handling paths in the xfs_attr_[leaf\|node]_[add\|remove]name() functions clear args.trans without releasing the transaction. The callers therefore lose the reference to the transaction and fail to cancel it. Since xfs_attr_[set\|remove]() always cancel args.trans when != NULL and xfs_defer_finish()->...->xfs_trans_roll() should always return with a valid transaction, update the leaf/node xattr functions to not reset args.trans in the error path responsible for cancelling deferred ops. Reported-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-16 14:53:28 -08:00
Brian Foster	ad90bb585c	xfs: account finobt blocks properly in perag reservation XFS started using the perag metadata reservation pool for free inode btree blocks in commit `76d771b4cb` ("xfs: use per-AG reservations for the finobt"). To handle backwards compatibility, finobt blocks are accounted against the pool so long as the full reservation is available at mount time. Otherwise the ->m_inotbt_nores flag is set and the filesystem falls back to the traditional per-transaction finobt reservation. This commit has two problems: - finobt blocks are always accounted against the metadata reservation on allocation, regardless of ->m_inotbt_nores state - finobt blocks are never returned to the reservation pool on free The first problem affects reflink+finobt filesystems where the full finobt reservation is not available at mount time. finobt blocks are essentially stolen from the reflink reservation, putting refcountbt management at risk of allocation failure. The second problem is an unconditional leak of metadata reservation whenever finobt is enabled. Update the finobt block allocation callouts to consider ->m_inotbt_nores and account blocks appropriately. Blocks should be consistently accounted against the metadata pool when ->m_inotbt_nores is false and otherwise tagged as RESV_NONE. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-12 14:09:08 -08:00
Colin Ian King	a8789a5ae2	xfs: fix check on struct_version for versions 4 or greater It appears that the check for versions 4 or more is incorrect and is off-by-one. Fix this. Detected by CoverityScan, CID#1463775 ("Logically dead code") Fixes: `ac503a4cc9` ("xfs: refactor the geometry structure filling function") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-12 14:09:08 -08:00
Xiongwei Song	1da0618993	xfs: destroy mutex pag_ici_reclaim_lock before free The mutex pag_ici_reclaim_lock of xfs_perag_t structure is initialized in xfs_initialize_perag. If happen errors in xfs_initialize_perag, or free resources in xfs_free_perag, wo need to destroy the mutex before free perag. Signed-off-by: Xiongwei Song <sxwjean@me.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-12 14:09:08 -08:00
Darrick J. Wong	c96900435f	xfs: use %px for data pointers when debugging Starting with commit `57e734423a` ("vsprintf: refactor %pK code out of pointer"), the behavior of the raw '%p' printk format specifier was changed to print a 32-bit hash of the pointer value to avoid leaking kernel pointers into dmesg. For most situations that's good. This is /undesirable/ behavior when we're trying to debug XFS, however, so define a PTR_FMT that prints the actual pointer when we're in debug mode. Note that %p for tracepoints still prints the raw pointer, so in the long run we could consider rewriting some of these messages as tracepoints. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-12 14:09:08 -08:00
Darrick J. Wong	aff68a5502	xfs: use %pS printk format for direct instruction addresses Use the %pS instead of the %pF printk format specifier for printing symbols from direct addresses. This is needed for the ia64, ppc64 and parisc64 architectures. While we're at it, be consistent with the capitalization of the 'S'. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-12 14:09:08 -08:00
Darrick J. Wong	3d170aa242	xfs: change 0x%p -> %p in print messages Since %p prepends "0x" to the outputted string, we can drop the prefix. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-12 14:09:08 -08:00
Darrick J. Wong	c219b01579	xfs: clarify units in the failed metadata io message If a metadata IO error happens, we report the location of the failed IO request in units of daddrs. However, the printk message misleads people into thinking that the units are fs blocks, so fix the reported units. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-09 15:18:07 -08:00
Darrick J. Wong	46c59736d8	xfs: harden directory integrity checks some more If a malicious filesystem image contains a block+ format directory wherein the directory inode's core.mode is set such that S_ISDIR(core.mode) == 0, and if there are subdirectories of the corrupted directory, an attempt to traverse up the directory tree will crash the kernel in __xfs_dir3_data_check. Running the online scrub's parent checks will tend to do this. The crash occurs because the directory inode's d_ops get set to xfs_dir[23]_nondir_ops (it's not a directory) but the parent pointer scrubber's indiscriminate call to xfs_readdir proceeds past the ASSERT if we have non fatal asserts configured. Fix the null pointer dereference crash in __xfs_dir3_data_check by looking for S_ISDIR or wrong d_ops; and teach the parent scrubber to bail out if it is fed a non-directory "parent". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-09 11:11:42 -08:00
Darrick J. Wong	ac503a4cc9	xfs: refactor the geometry structure filling function Refactor the geometry structure filling function to use the superblock to fill the fields. While we're at it, make the function less indenty and use some whitespace to make the function easier to read. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-08 10:54:48 -08:00
Darrick J. Wong	c368ebcd4c	xfs: hoist xfs_fs_geometry to libxfs Move xfs_fs_geometry to libxfs so that we can clean up the fs geometry reporting in xfsprogs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-08 10:54:48 -08:00
Darrick J. Wong	b872af2c87	xfs: trace log reservations at mount time At each mount, emit the transaction reservation type information via tracepoints. This makes it easier to compare the log reservation info calculated by the kernel and xfsprogs so that we can more easily diagnose minimum log size failures on freshly formatted filesystems. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	9c712a1346	xfs: dump the first 128 bytes of any corrupt buffer Increase the corrupt buffer dump to the first 128 bytes since v5 filesystems have larger block headers than before. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	d9418ed08a	xfs: teach error reporting functions to take xfs_failaddr_t Convert the two other error reporting functions to take xfs_failaddr_t when the caller wishes to capture a code pointer instead of the classic void * pointer. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	eebf3cab9c	xfs: standardize quota verification function outputs Rename xfs_dqcheck to xfs_dquot_verify and make it return an xfs_failaddr_t like every other structure verifier function. This enables us to check on-disk quotas in the same way that we check everything else. Callers are now responsible for logging errors, as XFS_QMOPT_DOWARN goes away. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	eeea798028	xfs: separate dquot repair into a separate function Move the dquot repair code into a separate function and remove XFS_QMOPT_DQREPAIR in favor of calling the helper directly. Remove other dead code because quotacheck is the only caller of DQREPAIR. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	b55725974c	xfs: create a new buf_ops pointer to verify structure metadata Expose all metadata structure buffer verifier functions via buf_ops. These will be used by the online scrub mechanism to look for problems with buffers that are already sitting around in memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	8ba92d43d4	xfs: fail out of xfs_attr3_leaf_lookup_int if it looks corrupt If the xattr leaf block looks corrupt, return -EFSCORRUPTED to userspace instead of ASSERTing on debug kernels or running off the end of the buffer on regular kernels. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	9cfb9b4747	xfs: provide a centralized method for verifying inline fork data Replace the current haphazard dir2 shortform verifier callsites with a centralized verifier function that can be called either with the default verifier functions or with a custom set. This helps us strengthen integrity checking while providing us with flexibility for repair tools. xfs_repair wants this to be able to supply its own verifier functions when trying to fix possibly corrupt metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:47 -08:00
Darrick J. Wong	dc042c2d8f	xfs: refactor short form directory structure verifier function Change the short form directory structure verifier function to return the instruction pointer of a failing check or NULL if everything's ok. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	0795e004fd	xfs: create structure verifier function for short form symlinks Create a function to check the structure of short form symlink targets. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	1e1bbd8e7e	xfs: create structure verifier function for shortform xattrs Create a function to perform structure verification for short form extended attributes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	71493b839e	xfs: move inode fork verifiers to xfs_dinode_verify Consolidate the fork size and format verifiers to xfs_dinode_verify so that we can reject bad inodes earlier and in a single place. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	50aa90ef03	xfs: verify dinode header first Move the v3 inode integrity information (crc, owner, metauuid) before we look at anything else in the inode so that we don't waste time on a torn write or a totally garbled block. This makes xfs_dinode_verify more consistent with the other verifiers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	bc1a09b8e3	xfs: refactor verifier callers to print address of failing check Refactor the callers of verifiers to print the instruction address of a failing check. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	a6a781a58b	xfs: have buffer verifier functions report failing address Modify each function that checks the contents of a metadata buffer to return the instruction address of the failing test so that we can report more precise failure errors to the log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:46 -08:00
Darrick J. Wong	31ca03c92c	xfs: refactor xfs_verifier_error and xfs_buf_ioerror Since all verification errors also mark the buffer as having an error, we can combine these two calls. Later we'll add a xfs_failaddr_t parameter to promote the idea of reporting corruption errors and the address of the failing check to enable better debugging reports. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:45 -08:00
Darrick J. Wong	9101d3707b	xfs: remove XFS_WANT_CORRUPTED_RETURN from dir3 data verifiers Since __xfs_dir3_data_check verifies on-disk metadata, we can't have it noisily blowing asserts and hanging the system on corrupt data coming in off the disk. Instead, have it return a boolean like all the other checker functions, and only have it noisily fail if we fail in debug mode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:45 -08:00
Darrick J. Wong	e1e55aaf1c	xfs: refactor short form btree pointer verification Now that we have xfs_verify_agbno, use it to verify short form btree pointers instead of open-coding them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:45 -08:00
Darrick J. Wong	8368a6019d	xfs: refactor long-format btree header verification routines Create two helper functions to verify the headers of a long format btree block. We'll use this later for the realtime rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:45 -08:00
Darrick J. Wong	59f6fec3bd	xfs: remove XFS_FSB_SANITY_CHECK We already have a function to verify fsb pointers, so get rid of the last users of the (less robust) macro. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:54:45 -08:00
Darrick J. Wong	d658e72b4a	xfs: distinguish between corrupt inode and invalid inum in xfs_scrub_get_inode In xfs_scrub_get_inode, we don't do a good enough job distinguishing EINVAL returns from xfs_iget w/ IGET_UNTRUSTED -- this can happen if the passed in inode number is invalid (past eofs, inobt says it isn't an inode) or if the inum is actually valid but the inode buffer fails verifier. In the first case we still want to return ENOENT, but in the second case we want to capture the corruption error. Therefore, if xfs_iget returns EINVAL, try the raw imap lookup. If that succeeds, we conclude it's a corruption error, otherwise we just bounce out to userspace. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:49:04 -08:00
Darrick J. Wong	1ad1205e71	xfs: always grab transaction when scrubbing inode Always allocate a transaction for inode scrubbing, even if the _iget fails. This is something that is nice to have now for consistency with the other scrubbers but will become critical when we get to online repair where we'll actually use the transaction + raw buffer read to fix the verifier errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:49:03 -08:00
Darrick J. Wong	2b9e9b5771	xfs: xfs_scrub_bmap should use for_each_xfs_iext Refactor xfs_scrub_bmap to use for_each_xfs_iext now that it exists. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:49:03 -08:00
Darrick J. Wong	e5b37faa93	xfs: catch a few more error codes when scrubbing secondary sb The superblock validation routines return a variety of error codes to reject a mount request. For scrub we can assume that the mount succeeded, so if we see these things appear when scrubbing secondary sb X, we can treat them all like corruption. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:49:02 -08:00
Darrick J. Wong	5a0f433745	xfs: ignore agfl read errors when not scrubbing agfl In xfs_scrub_ag_read_headers, if we're not scrubbing the AGFL but hit a read error reading the AGFL, we should reset the error code so that it doesn't propagate up into the caller. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:49:02 -08:00
Brian Foster	c017cb5ddf	xfs: eliminate duplicate icreate tx reservation functions The create transaction reservation calculation has two different branches of code depending on whether the filesystem is a v5 format fs or older. Each branch considers the max reservation between the allocation case (new chunk allocation + record insert) and the modify case (chunk exists, record modification) of inode allocation. The modify case is the same for both superblock versions with the exception of the finobt. The finobt helper checks the feature bit, however, and so the modify case already shares the same code. Now that inode chunk allocation has been refactored into a helper that checks the superblock version to calculate the appropriate reservation for the create transaction, the only remaining difference between the create and icreate branches is the call to the finobt helper. As noted above, the finobt helper is a no-op when the feature is not enabled. Therefore, these branches are effectively duplicate and can be condensed. Remove the xfs_calc_create_() branch of functions and update the various callers to use the xfs_calc_icreate_() variant. The latter creates the same reservation size for v4 create transactions as the removed branch. As such, this patch does not result in transaction reservation changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:38 -08:00
Brian Foster	57af33e451	xfs: refactor inode chunk alloc/free tx reservation The reservation for the various forms of inode allocation is scattered across several different functions. This includes two variants of chunk allocation (v5 icreate transactions vs. older create transactions) and the inode free transaction. To clean up some of this code and clarify the purpose of specific allocfree reservations, continue the pattern of defining helper functions for smaller operational units of broader transactions. Refactor the reservation into an inode chunk alloc/free helper that considers the various conditions based on filesystem format. An inode chunk free involves an extent free and buffer invalidations. The latter requires reservation for log headers only. An inode chunk allocation modifies the free space btrees and logs the chunk on v4 supers. v5 supers initialize the inode chunk using ordered buffers and so do not log the chunk. As a side effect of this refactoring, add one more allocfree res to the ifree transaction. Technically this does not serve a specific purpose because inode chunks are freed via deferred operations and thus occur after a transaction roll. tr_ifree has a bit of a history of tx overruns caused by too many agfl fixups during sustained file deletion workloads, so add this extra reservation as a form of padding nonetheless. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:38 -08:00
Brian Foster	f03c78f397	xfs: include an allocfree res for inobt modifications Analysis of recent reports of log reservation overruns and code inspection has uncovered that the reservations associated with inode operations may not cover the worst case scenarios. In particular, many cases only include one allocfree res. for a particular operation even though said operations may also entail AGFL fixups and inode btree block allocations in addition to the actual inode chunk allocation. This can easily turn into two or three block allocations (or frees) per operation. In theory, the only way to define the worst case reservation is to include an allocfree res for each individual allocation in a transaction. Since that is impractical (we can perform multiple agfl fixups per tx and not every allocation results in a full tree operation), we need to find a reasonable compromise that addresses the deficiency in practice without blowing out the size of the transactions. Since the inode btrees are not filled by the AGFL, record insertion and removal can directly result in block allocations and frees depending on the shape of the tree. These allocations and frees occur in the same transaction context as the inobt update itself, but are separate from the allocation/free that might be required for an inode chunk. Therefore, it makes sense to assume that an [f]inobt insert/remove can directly result in one or more block allocations on behalf of the tree. Refactor the inode transaction reservations to include one allocfree res. per inode btree modification to cover allocations required by the tree itself. This separates the reservation required to allocate the inode chunk from the reservation required for inobt record insertion/removal. Apply the same logic to the finobt. This results in killing off the finobt modify condition because we no longer assume that the broader transaction reservation will cover finobt block allocations and finobt shape changes can occur in either of the inobt allocation or modify situations. Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:37 -08:00
Brian Foster	a606ebdb85	xfs: truncate transaction does not modify the inobt The truncate transaction does not ever modify the inode btree, but includes an associated log reservation. Update xfs_calc_itruncate_reservation() to remove the reservation associated with inobt updates. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:37 -08:00
Brian Foster	e8341d9f63	xfs: fix up agi unlinked list reservations The current AGI unlinked list addition and removal reservations do not reflect the worst case log usage. An unlinked list removal can log up to two on-disk inode clusters but only includes reservation for one. An unlinked list addition logs the on-disk cluster but includes reservation for an in-core inode. Update the AGI unlinked list reservation helpers to calculate the correct worst case reservation for the associated operations. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:36 -08:00
Brian Foster	a6f485908d	xfs: include inobt buffers in ifree tx log reservation The tr_ifree transaction handles inode unlinks and inode chunk frees. The current transaction calculation does not accurately reflect worst case changes to the inode btree, however. The inobt portion of the current transaction reservation only covers modification of a single inobt buffer (for the particular inode record). This is a historical artifact from the days before XFS supported full inode chunk removal. When support for inode chunk removal was added in commit 254f6311ed1b ("Implement deletion of inode clusters in XFS."), the additional log reservation required for chunk removal was not added correctly. The new reservation only considered the header overhead of associated buffers rather than the full contents of the btrees and AGF and AGFL buffers affected by the transaction. The reservation for the free space btrees was subsequently fixed up in commit 5fe6abb82f76 ("Add space for inode and allocation btrees to ITRUNCATE log reservation"), but the res. for full inobt joins has never been added. Further review of the ifree reservation uncovered a couple more problems: - The undocumented +2 blocks are intended for the AGF and AGFL, but are also not sized correctly and should be logged as full sectors (not FSBs). - The additional single block header is undocumented and serves no apparent purpose. Update xfs_calc_ifree_reservation() to include a full inobt join in the reservation calculation. Refactor the undocumented blocks appropriately and fix up the comments to reflect the current calculation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:36 -08:00
Brian Foster	2c8f626539	xfs: print transaction log reservation on overrun The transaction dump code displays the content and reservation consumption of a particular transaction in the event of an overrun. It currently displays the reservation associated with the transaction ticket, but not the original reservation attached to the transaction. The latter value reflects the original transaction reservation calculation before additional reservation overhead is assigned, such as for the CIL context header and potential split region headers. Update xlog_print_trans() to also print the original transaction reservation in the event of overrun. This provides a reference point to identify how much reservation overhead was added to a particular ticket by xfs_log_calc_unit_res(). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:35 -08:00
Darrick J. Wong	29c1c123a3	xfs: scrub inode nsec fields Check that the nanosecond fields in each timestamp aren't larger than a billion. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2018-01-08 10:41:35 -08:00
Eric Sandeen	8e63083762	xfs: move all scrub input checking to xfs_scrub_validate There were ad-hoc checks for some scrub types but not others; mark each scrub type with ... it's type, and use that to validate the allowed and/or required input fields. Moving these checks out of xfs_scrub_setup_ag_header makes it a thin wrapper, so unwrap it in the process. Signed-off-by: Eric Sandeen <sandeen@redhat.com> [darrick: add xfs_ prefix to enum, check scrub args after checking type] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:34 -08:00
Eric Sandeen	0a085ddf0e	xfs: factor out scrub input checking Do this before adding more core checks. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:34 -08:00
Eric Sandeen	bfb3e9b926	xfs: explicitly initialize meta_scrub_ops array by type An implicit mapping to type by order of initialization seems error-prone, and doesn't lend itself to cscope-ing. Also add sanity checks about size of array vs. max types, and a defensive check that ->scrub exists before using it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:33 -08:00
Richard Wareing	a015831596	xfs: Show realtime device stats on statfs calls if realtime flags set - Reports realtime device free blocks in statfs calls if (realtime) inheritance bit is set on the inode of directory, or realtime flag in the case of files. This is a bit more intuitive, especially for use-cases which are using a much larger device for the realtime device. - Add XFS_IS_REALTIME_MOUNT option to gate based on the existence of a realtime device on the mount, similar to the XFS_IS_REALTIME_INODE option. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Richard Wareing <rwareing@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-08 10:41:33 -08:00
Jan Kara	c0b2462597	dax: pass detailed error code from dax_iomap_fault() Ext4 needs to pass through error from its iomap handler to the page fault handler so that it can properly detect ENOSPC and force transaction commit and retry the fault (and block allocation). Add argument to dax_iomap_fault() for passing such error. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2018-01-07 16:38:43 -05:00
Linus Torvalds	12e971b652	Changes since last update: - Fix resource cleanup of failed quota initialization - Fix integer overflow problems wrt s_maxbytes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCgAGBQJaS8yUAAoJEPh/dxk0SrTrrNMP+gLCitWenObhf6uA0Aysb3Vr EnhNFaqZA7RRLbQRwLESblvhExp9WTrtFmWOAFh1Q0ETBEIazIGkXfKDeOChxaCY LMPb83vQarZoV++HoiBeFbShf39dFw2ufGHyveZwvxk4kgYgQRFzIVZbRTg7CA/C nMLPZ9IBDBhEwnCVpH+gKJMcU6j5I9IIePwaEIKnB0o99fsEgZfnM0B4Wl0DRrzn nE6DOvkGZiNF4on1J2KgL2rB0r+VEyyMtBTCRs519rEaa8ACFUQDqEqoUIC92SnS pD/n9S2JwVH1dLX7cRoiMQcX/r4do83LlK0IvMswApMuNqYRQU6332lwosdgo7KQ 8+antAlVKuqMAGNvhVWMy1DuaRO5gCqRwL1wpzebNHsw4eRsDD2MNkeLXbM2P2oL 5OflIrPLMlLORlPtwbJclm8CcnQzQGMAa5yEDJcU1PIWH/urdRd+KqWQ+N0Zfj6m J3L4tXDY61hqwZ8BISe+/9iFDooGV/6Ri4mbez4UWiN6UfaKKokaFZzbo2n3VTb9 Htx5KsrzslfGWAnoeIT9GnyFhT4te9IHT69jl2AorvxpmdXdfOI8TgrzS8TzuKGD N6TadC4IZGLLpww+rND6Bywdc8/garmFbck+/nVdMRwNAsZUE+m08OrNFMCqmYms p9jIA2tRh94Hu4Awi8hG =2rs/ -----END PGP SIGNATURE----- Merge tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull XFS fixes from Darrick Wong: "I have just a few fixes for bugs and resource cleanup problems this week: - Fix resource cleanup of failed quota initialization - Fix integer overflow problems wrt s_maxbytes" * tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix s_maxbytes overflow problems xfs: quota: check result of register_shrinker() xfs: quota: fix missed destroy of qi_tree_lock	2018-01-05 12:59:32 -08:00
Darrick J. Wong	b4d8ad7fd3	xfs: fix s_maxbytes overflow problems Fix some integer overflow problems if offset + count happen to be large enough to cause an integer overflow. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-02 10:16:32 -08:00
Aliaksei Karaliou	3a3882ff26	xfs: quota: check result of register_shrinker() xfs_qm_init_quotainfo() does not check result of register_shrinker() which was tagged as __must_check recently, reported by sparse. Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> [darrick: move xfs_qm_destroy_quotainos nearer xfs_qm_init_quotainos] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-02 10:16:32 -08:00
Aliaksei Karaliou	2196881566	xfs: quota: fix missed destroy of qi_tree_lock xfs_qm_destroy_quotainfo() does not destroy quotainfo->qi_tree_lock while destroys quotainfo->qi_quotaofflock. Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-02 10:16:32 -08:00
Adam Borowski	91581e4c60	fs/*/Kconfig: drop links to 404-compliant http://acl.bestbits.at This link is replicated in most filesystems' config stanzas. Referring to an archived version of that site is pointless as it mostly deals with patches; user documentation is available elsewhere. Signed-off-by: Adam Borowski <kilobyte@angband.pl> CC: Alexander Viro <viro@zeniv.linux.org.uk> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com> Acked-by: David Sterba <dsterba@suse.com> Acked-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Chao Yu <yuchao0@huawei.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Acked-by: Steve French <smfrench@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2018-01-01 12:45:37 -07:00
Linus Torvalds	fca0e39b2b	Changes since last update: - Fix a locking problem during xattr block conversion that could lead to the log checkpointing thread to try to write an incomplete buffer to disk, which leads to a corruption shutdown - Fix a null pointer dereference when removing delayed allocation extents - Remove post-eof speculative allocations when reflinking a block past current inode size so that we don't just leave them there and assert on inode reclaim - Relax an assert which didn't accurately reflect the way locking works and would trigger under heavy io load - Avoid infinite loop when cancelling copy on write extents after a writeback failure - Try to avoid copy on write transaction reservation overflows when remapping after a successful write - Fix various problems with the copy-on-write reservation automatic garbage collection not being cleaned up properly during a ro remount - Fix problems with rmap log items being processed in the wrong order, leading to corruption shutdowns - Fix problems with EFI recovery wherein the "remove any rmapping if present" mechanism wasn't actually doing anything, which would lead to corruption problems later when the extent is reallocated, leading to multiple rmaps for the same extent -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCgAGBQJaO+dwAAoJEPh/dxk0SrTrY8YP/R9AXH3Wt6S2QGGjZfXURa22 /cioJKFl8hWay00ZT8Zcj4Pdx6R+stvausj5ECDvpdWZG+d28e61c1bxg+bqRYO5 JWXikWnAa80RQ5uEjOXHoUjAgk6u6YYuQHEuHH/xA0nL4Cw98WLSzLjqk7ZU53rx P17dgUWWHta/w8OpxG9UG5pxvNW3VRitiyCMWxa2gzBPncHnCk3fu9lInpDzH9S+ xakwCRtfiAykoOG/O5pnMg6vw5r6ENwK7DymxXgqF+Vv/HzgMbeJs+9UON2eACtp ECHGffN4pXpqWVcGDMs5cWCOfLUEjxCrotMLYpIrdZs5DptmOcOWpQpHWl4JiaXB rqAxx3D0Yo+00ENponM01un8UgCXF5gqsDGyTzn99aPpDVqxCJw1XmSdOXRhcnnF At2raUkXF+nbqaVwL3Y7ZJuOKs1hi3HpsYwwfvClR8cTFk/BaY6sQ4QnVR0Ggkg6 8lZxeDb8VdoUjWO11sX1edwGtR8g+p3PSHiUFSnh1JsbP2I0R+TV+j5Y9rMotxFT Eq6+Ehp889GeSpEBCrDpMgNIABMjBxoi5JvOwXSUNhF5Rh/1Vf//7v31nXcyVlah a95IhCYfQLFMtaYaGr2ElvdO+Qs1+ppsD207I4H86XotjRkvD7U+mJoYm9EaujQX jgUDdZEsP5h5DX524VHU =i51V -----END PGP SIGNATURE----- Merge tag 'xfs-4.15-fixes-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "Here are some XFS fixes for 4.15-rc5. Apologies for the unusually large number of patches this late, but I wanted to make sure the corruption fixes were really ready to go. Changes since last update: - Fix a locking problem during xattr block conversion that could lead to the log checkpointing thread to try to write an incomplete buffer to disk, which leads to a corruption shutdown - Fix a null pointer dereference when removing delayed allocation extents - Remove post-eof speculative allocations when reflinking a block past current inode size so that we don't just leave them there and assert on inode reclaim - Relax an assert which didn't accurately reflect the way locking works and would trigger under heavy io load - Avoid infinite loop when cancelling copy on write extents after a writeback failure - Try to avoid copy on write transaction reservation overflows when remapping after a successful write - Fix various problems with the copy-on-write reservation automatic garbage collection not being cleaned up properly during a ro remount - Fix problems with rmap log items being processed in the wrong order, leading to corruption shutdowns - Fix problems with EFI recovery wherein the "remove any rmapping if present" mechanism wasn't actually doing anything, which would lead to corruption problems later when the extent is reallocated, leading to multiple rmaps for the same extent" * tag 'xfs-4.15-fixes-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: only skip rmap owner checks for unknown-owner rmap removal xfs: always honor OWN_UNKNOWN rmap removal requests xfs: queue deferred rmap ops for cow staging extent alloc/free in the right order xfs: set cowblocks tag for direct cow writes too xfs: remove leftover CoW reservations when remounting ro xfs: don't be so eager to clear the cowblocks tag on truncate xfs: track cowblocks separately in i_flags xfs: allow CoW remap transactions to use reserve blocks xfs: avoid infinite loop when cancelling CoW blocks after writeback failure xfs: relax is_reflink_inode assert in xfs_reflink_find_cow_mapping xfs: remove dest file's post-eof preallocations before reflinking xfs: move xfs_iext_insert tracepoint to report useful information xfs: account for null transactions in bunmapi xfs: hold xfs_buf locked between shortform->leaf conversion and the addition of an attribute xfs: add the ability to join a held buffer to a defer_ops	2017-12-22 12:27:27 -08:00
Darrick J. Wong	68c58e9b9a	xfs: only skip rmap owner checks for unknown-owner rmap removal For rmap removal, refactor the rmap owner checks into a separate function, then skip the checks if we are performing an unknown-owner removal. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-12-21 08:48:38 -08:00
Darrick J. Wong	33df3a9cf9	xfs: always honor OWN_UNKNOWN rmap removal requests Calling xfs_rmap_free with an unknown owner is supposed to remove any rmaps covering that range regardless of owner. This is used by the EFI recovery code to say "we're freeing this, it mustn't be owned by anything anymore", but for whatever reason xfs_free_ag_extent filters them out. Therefore, remove the filter and make xfs_rmap_unmap actually treat it as a wildcard owner -- free anything that's already there, and if there's no owner at all then that's fine too. There are two existing callers of bmap_add_free that take care the rmap deferred ops themselves and use OWN_UNKNOWN to skip the EFI-based rmap cleanup; convert these to use OWN_NULL (via helpers), and now we really require that an RUI (if any) gets added to the defer ops before any EFI. Lastly, now that xfs_free_extent filters out OWN_NULL rmap free requests, growfs will have to consult directly with the rmap to ensure that there aren't any rmaps in the grown region. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-12-21 08:48:38 -08:00

1 2 3 4 5 ...

4982 Commits