linux

Author	SHA1	Message	Date
Bob Peterson	2eb5909dee	GFS2: Don't try to end a non-existent transaction in unlink Before this patch, if function gfs2_unlink failed to get a valid transaction (for example, not enough journal blocks) it would go to label out_end_trans which did gfs2_trans_end. But if the trans_begin failed, there's no transaction to end, and trying to do so results in: kernel BUG at fs/gfs2/trans.c:117! This patch changes the goto so that it does not try to end a non-existent transaction. Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2018-01-29 10:00:23 -07:00
Christoph Hellwig	1e369b0e19	xfs: remove experimental tag for reflinks But reject reflink + DAX file systems for now until the code to support reflinks on DAX is actually implemented. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: port to 4.16] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:24 -08:00
Darrick J. Wong	6d8a45ce29	xfs: don't screw up direct writes when freesp is fragmented xfs_bmap_btalloc is given a range of file offset blocks that must be allocated to some data/attr/cow fork. If the fork has an extent size hint associated with it, the request will be enlarged on both ends to try to satisfy the alignment hint. If free space is fragmentated, sometimes we can allocate some blocks but not enough to fulfill any of the requested range. Since bmapi_allocate always trims the new extent mapping to match the originally requested range, this results in bmapi_write returning zero and no mapping. The consequences of this vary -- buffered writes will simply re-call bmapi_write until it can satisfy at least one block from the original request. Direct IO overwrites notice nmaps == 0 and return -ENOSPC through the dio mechanism out to userspace with the weird result that writes fail even when we have enough space because the ENOSPC return overrides any partial write status. For direct CoW writes the situation was disastrous because nobody notices us returning an invalid zero-length wrong-offset mapping to iomap and the write goes off into space. Therefore, if free space is so fragmented that we managed to allocate some space but not enough to map into even a single block of the original allocation request range, we should break the alignment hint in order to guarantee at least some forward progress for the direct write. If we return a short allocation to iomap_apply it'll call back about the remaining blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:24 -08:00
Darrick J. Wong	9f37bd11b4	xfs: check reflink allocation mappings There's a really bad bug in xfs_reflink_allocate_cow -- if bmapi_write can return a zero error code but no mappings. This happens if there's an extent size hint (which causes allocation requests to be rounded to extsz granularity internally), but there wasn't a big enough chunk of free space to start filling at the extsz granularity and fill even one block of the range that we actually requested. In any case, if we got no mappings we can't possibly do anything useful with the contents of imap, so we must bail out with ENOSPC here. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:24 -08:00
Darrick J. Wong	0c6dda7a1c	iomap: warn on zero-length mappings Don't let the iomap callback get away with feeding us a garbage zero length mapping -- there was a bug in xfs that resulted in those leaking out to hilarious effect. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:24 -08:00
Darrick J. Wong	4b4c1326fd	xfs: treat CoW fork operations as delalloc for quota accounting Since the CoW fork only exists in memory, it is incorrect to update the on-disk quota block counts when we modify the CoW fork. Unlike the data fork, even real extents in the CoW fork are only delalloc-style reservations (on-disk they're owned by the refcountbt) so they must not be tracked in the on disk quota info. Ensure the i_delayed_blks accounting reflects this too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	01c2e13dca	xfs: only grab shared inode locks for source file during reflink Reflink and dedupe operations remap blocks from a source file into a destination file. The destination file needs exclusive locks on all levels because we're updating its block map, but the source file isn't undergoing any block map changes so we can use a shared lock. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	7c2d238ac6	xfs: allow xfs_lock_two_inodes to take different EXCL/SHARED modes Refactor xfs_lock_two_inodes to take separate locking modes for each inode. Specifically, this enables us to take a SHARED lock on one inode and an EXCL lock on the other. The lock class (MMAPLOCK/ILOCK) must be the same for each inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	1364b1d4b5	xfs: reflink should break pnfs leases before sharing blocks Before we share blocks between files, we need to break the pnfs leases on the layout before we start slicing and dicing the block map. The structure of this function sets us up for the lock contention reduction in the next patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	c47b74fb2d	xfs: don't clobber inobt/finobt cursors when xref with rmap Even if we can't use the inobt/finobt cursors to count the number of inode btree blocks, we are never allowed to clobber the cursor of the btree being checked, so don't do this. Found by fuzzing level = ones in xfs/364. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	70c57dcd60	xfs: skip CoW writes past EOF when writeback races with truncate Every so often we blow the ASSERT(type != XFS_IO_COW) in xfs_map_blocks when running fsstress, as we do in generic/269. The cause of this is writeback racing with truncate -- writeback doesn't take the iolock, so truncate can sneak in to decrease i_size and truncate page cache while writeback is gathering buffer heads to schedule writeout. If we hit this race on a block that has a CoW mapping, we'll get a valid imap from the CoW fork but the reduced i_size trims the mapping to zero length (which makes it invalid), so we call xfs_map_blocks to try again. This doesn't do much anyway, since any mapping we get out of that will also be invalid, so we might as well skip the assert and just stop. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:23 -08:00
Amir Goldstein	acd1d71598	xfs: preserve i_rdev when recycling a reclaimable inode Commit `66f364649d` ("xfs: remove if_rdev") moved storing of rdev value for special inodes to VFS inodes, but forgot to preserve the value of i_rdev when recycling a reclaimable xfs_inode. This was detected by xfstest overlay/017 with inodex=on mount option and xfs base fs. The test does a lookup of overlay chardev and blockdev right after drop caches. Overlayfs inodes hold a reference on underlying xfs inodes when mount option index=on is configured. If drop caches reclaim xfs inodes, before it relclaims overlayfs inodes, that can sometimes leave a reclaimable xfs inode and that test hits that case quite often. When that happens, the xfs inode cache remains broken (zere i_rdev) until the next cycle mount or drop caches. Fixes: `66f364649d` ("xfs: remove if_rdev") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	751f3767c2	xfs: refactor accounting updates out of xfs_bmap_btalloc Move all the inode and quota accounting updates out of xfs_bmap_btalloc in preparation for fixing some quota accounting problems with copy on write. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-01-29 07:27:23 -08:00
Darrick J. Wong	22431bf3df	xfs: refactor inode verifier corruption error printing Refactor inode verifier error reporting into a non-libxfs function so that we aren't encoding the message format in libxfs. This also changes the kernel dmesg output to resemble buffer verifier errors more closely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Darrick J. Wong	67a3f6d014	xfs: make tracepoint inode number format consistent Fix all the inode number formats to be consistently (0x%llx) in all trace point definitions. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Darrick J. Wong	beaae8cd58	xfs: always zero di_flags2 when we free the inode Always zero the di_flags2 field when we free the inode so that we never end up with an on-disk record for an unallocated inode that also has the reflink iflag set. This is in keeping with the general principle that only files can have the reflink iflag set, even though we'll zero out di_flags2 if we ever reallocate the inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Darrick J. Wong	09ac862397	xfs: call xfs_qm_dqattach before performing reflink operations Ensure that we've attached all the necessary dquots before performing reflink operations so that quota accounting is accurate. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2018-01-29 07:27:22 -08:00
Shan Hai	6ca30729c2	xfs: bmap code cleanup Remove the extent size hint and realtime inode relevant code from the xfs_bmapi_reserve_delalloc since it is not called on the inode with extent size hint set or on a realtime inode. Signed-off-by: Shan Hai <shan.hai@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Carlos Maiolino	643c8c05e7	Use list_head infra-structure for buffer's log items list Now that buffer's b_fspriv has been split, just replace the current singly linked list of xfs_log_items, by the list_head infrastructure. Also, remove the xfs_log_item argument from xfs_buf_resubmit_failed_buffers(), there is no need for this argument, once the log items can be walked through the list_head in the buffer. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style cleanups] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Carlos Maiolino	fb1755a645	Split buffer's b_fspriv field By splitting the b_fspriv field into two different fields (b_log_item and b_li_list). It's possible to get rid of an old ABI workaround, by using the new b_log_item field to store xfs_buf_log_item separated from the log items attached to the buffer, which will be linked in the new b_li_list field. This way, there is no more need to reorder the log items list to place the buf_log_item at the beginning of the list, simplifying a bit the logic to handle buffer IO. This also opens the possibility to change buffer's log items list into a proper list_head. b_log_item field is still defined as a void *, because it is still used by the log buffers to store xlog_in_core structures, and there is no need to add an extra field on xfs_buf just for xlog_in_core. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: minor style changes] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Carlos Maiolino	70a2065533	Get rid of xfs_buf_log_item_t typedef Take advantage of the rework on xfs_buf log items list, to get rid of ths typedef for xfs_buf_log_item. This patch also fix some indentation alignment issues found along the way. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-01-29 07:27:22 -08:00
Jeff Layton	3a8c7231d5	btrfs: only dirty the inode in btrfs_update_time if something was changed At this point, we know that "now" and the file times may differ, and we suspect that the i_version has been flagged to be bumped. Attempt to bump the i_version, and only mark the inode dirty if that actually occurred or if one of the times was updated. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: David Sterba <dsterba@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	d17260fd5f	xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing If XFS_ILOG_CORE is already set then go ahead and increment it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Dave Chinner <dchinner@redhat.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	e38cf302b2	fs: only set S_VERSION when updating times if necessary We only really need to update i_version if someone has queried for it since we last incremented it. By doing that, we can avoid having to update the inode if the times haven't changed. If the times have changed, then we go ahead and forcibly increment the counter, under the assumption that we'll be going to the storage anyway, and the increment itself is relatively cheap. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz>	2018-01-29 06:42:21 -05:00
Jeff Layton	f0e2828062	xfs: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Dave Chinner <dchinner@redhat.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	bb8c2d66bc	ufs: use new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	cc56c33e78	ocfs2: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz>	2018-01-29 06:42:21 -05:00
Jeff Layton	1f15a550f5	nfsd: convert to new i_version API Mostly just making sure we use the "get" wrappers so we know when it is being fetched for later use. Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	1eb5d98f16	nfs: convert to new i_version API For NFS, we just use the "raw" API since the i_version is mostly managed by the server. The exception there is when the client holds a write delegation, but we only need to bump it once there anyway to handle CB_GETATTR. Tested-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:21 -05:00
Jeff Layton	ee73f9a52a	ext4: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Theodore Ts'o <tytso@mit.edu>	2018-01-29 06:42:21 -05:00
Jeff Layton	e1d747d9b6	ext2: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz>	2018-01-29 06:42:20 -05:00
Jeff Layton	317bc94780	exofs: switch to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:20 -05:00
Jeff Layton	c7f88c4e78	btrfs: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: David Sterba <dsterba@suse.com>	2018-01-29 06:42:20 -05:00
Jeff Layton	a01179e6eb	afs: convert to new i_version API For AFS, it's generally treated as an opaque value, so we use the *_raw variants of the API here. Note that AFS has quite a different definition for this counter. AFS only increments it on changes to the data to the data in regular files and contents of the directories. Inode metadata changes do not result in a version increment. We'll need to reconcile that somehow if we ever want to present this to userspace via statx. Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:20 -05:00
Jeff Layton	9dffe569d9	affs: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:20 -05:00
Jeff Layton	2489dbabea	fat: convert to new i_version API Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:20 -05:00
Jeff Layton	ae5e165d85	fs: new API for handling inode->i_version Add a documentation blob that explains what the i_version field is, how it is expected to work, and how it is currently implemented by various filesystems. We already have inode_inc_iversion. Add several other functions for manipulating and accessing the i_version counter. For now, the implementation is trivial and basically works the way that all of the open-coded i_version accesses work today. Future patches will convert existing users of i_version to use the new API, and then convert the backend implementation to do things more efficiently. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz>	2018-01-29 06:41:30 -05:00
Trond Myklebust	e231c6879c	NFS: Fix a race between mmap() and O_DIRECT When locking the file in order to do O_DIRECT on it, we must unmap any mmapped ranges on the pagecache so that we can flush out the dirty data. Fixes: `a5864c999d` ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.8+	2018-01-28 22:00:15 -05:00
Achilles Gaikwad	36c7ce4a17	fs/cifs/cifsacl.c Fixes typo in a comment Signed-off-by: Achilles Gaikwad <achillesgaikwad@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-28 09:19:45 -06:00
Trond Myklebust	128159f292	NFS: Remove a redundant call to unmap_mapping_range() We don't need to call unmap_mapping_range() prior to calling nfs_sync_mapping(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-28 09:35:54 -05:00
Steve French	ab2c643309	update internal version number for cifs.ko To version 2.11 Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-26 17:03:01 -06:00
Andrés Souto	cd1aca29fa	cifs: add .splice_write add splice_write support in cifs vfs using iter_file_splice_write Signed-off-by: Andrés Souto <kai670@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-26 17:03:01 -06:00
Aurelien Aptel	4a1360d01d	CIFS: document tcon/ses/server refcount dance Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-26 17:03:00 -06:00
Steve French	6b314714ff	move a few externs to smbdirect.h to eliminate warning Quiet minor sparse warnings in new SMB3 rdma patch series ("symbol was not declared ...") by moving these externs to smbdirect.h Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-26 17:03:00 -06:00
Aurelien Aptel	97f4b7276b	CIFS: zero sensitive data when freeing also replaces memset()+kfree() by kzfree(). Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Cc: <stable@vger.kernel.org>	2018-01-26 17:03:00 -06:00
Steve French	2026b06e9c	Cleanup some minor endian issues in smb3 rdma Minor cleanup of some sparse warnings (including a few misc endian fixes for the new smb3 rdma code) Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-26 17:03:00 -06:00
Aurelien Aptel	02cf5905e3	CIFS: dump IPC tcon in debug proc file dump it as first share with an "IPC: " prefix. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-26 17:03:00 -06:00
Aurelien Aptel	63a83b861c	CIFS: use tcon_ipc instead of use_ipc parameter of SMB2_ioctl Since IPC now has a tcon object, the caller can just pass it. This allows domain-based DFS requests to work with smb2+. Link: https://bugzilla.samba.org/show_bug.cgi?id=12917 Fixes: `9d49640a21` ("CIFS: implement get_dfs_refer for SMB2+") Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-26 17:03:00 -06:00
Aurelien Aptel	b327a717e5	CIFS: make IPC a regular tcon * Remove ses->ipc_tid. * Make IPC$ regular tcon. * Add a direct pointer to it in ses->tcon_ipc. * Distinguish PIPE tcon from IPC tcon by adding a tcon->pipe flag. All IPC tcons are pipes but not all pipes are IPC. * All TreeConnect functions now cannot take a NULL tcon object. The IPC tcon has the same lifetime as the session it belongs to. It is created when the session is created and destroyed when the session is destroyed. Since no mounts directly refer to the IPC tcon, its refcount should always be set to initialisation value (1). Thus we make sure cifs_put_tcon() skips it. If the mount request resulting in a new session being created requires encryption, try to require it too for IPC. * set SERVER_NAME_LENGTH to serverName actual size The maximum length of an ipv6 string representation is defined in INET6_ADDRSTRLEN as 45+1 for null but lets keep what we know works. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-26 17:03:00 -06:00
Martin Brandenburg	6793f1c450	orangefs: fix deadlock; do not write i_size in read_iter After do_readv_writev, the inode cache is invalidated anyway, so i_size will never be read. It will be fetched from the server which will also know about updates from other machines. Fixes deadlock on 32-bit SMP. See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2 Signed-off-by: Martin Brandenburg <martin@omnibond.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-01-25 17:26:24 -08:00
Jake Daryll Obina	5bdd0c6f89	jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path If jffs2_iget() fails for a newly-allocated inode, jffs2_do_clear_inode() can get called twice in the error handling path, the first call in jffs2_iget() itself and the second through iget_failed(). This can result to a use-after-free error in the second jffs2_do_clear_inode() call, such as shown by the oops below wherein the second jffs2_do_clear_inode() call was trying to free node fragments that were already freed in the first jffs2_do_clear_inode() call. [ 78.178860] jffs2: error: (1904) jffs2_do_read_inode_internal: CRC failed for read_inode of inode 24 at physical location 0x1fc00c [ 78.178914] Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b7b [ 78.185871] pgd = ffffffc03a567000 [ 78.188794] [6b6b6b6b6b6b6b7b] pgd=0000000000000000, pud=0000000000000000 [ 78.194968] Internal error: Oops: 96000004 [#1] PREEMPT SMP ... [ 78.513147] PC is at rb_first_postorder+0xc/0x28 [ 78.516503] LR is at jffs2_kill_fragtree+0x28/0x90 [jffs2] [ 78.520672] pc : [<ffffff8008323d28>] lr : [<ffffff8000eb1cc8>] pstate: 60000105 [ 78.526757] sp : ffffff800cea38f0 [ 78.528753] x29: ffffff800cea38f0 x28: ffffffc01f3f8e80 [ 78.532754] x27: 0000000000000000 x26: ffffff800cea3c70 [ 78.536756] x25: 00000000dc67c8ae x24: ffffffc033d6945d [ 78.540759] x23: ffffffc036811740 x22: ffffff800891a5b8 [ 78.544760] x21: 0000000000000000 x20: 0000000000000000 [ 78.548762] x19: ffffffc037d48910 x18: ffffff800891a588 [ 78.552764] x17: 0000000000000800 x16: 0000000000000c00 [ 78.556766] x15: 0000000000000010 x14: 6f2065646f6e695f [ 78.560767] x13: 6461657220726f66 x12: 2064656c69616620 [ 78.564769] x11: 435243203a6c616e x10: 7265746e695f6564 [ 78.568771] x9 : 6f6e695f64616572 x8 : ffffffc037974038 [ 78.572774] x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008 [ 78.576775] x5 : 002f91d85bd44a2f x4 : 0000000000000000 [ 78.580777] x3 : 0000000000000000 x2 : 000000403755e000 [ 78.584779] x1 : 6b6b6b6b6b6b6b6b x0 : 6b6b6b6b6b6b6b6b ... [ 79.038551] [<ffffff8008323d28>] rb_first_postorder+0xc/0x28 [ 79.042962] [<ffffff8000eb5578>] jffs2_do_clear_inode+0x88/0x100 [jffs2] [ 79.048395] [<ffffff8000eb9ddc>] jffs2_evict_inode+0x3c/0x48 [jffs2] [ 79.053443] [<ffffff8008201ca8>] evict+0xb0/0x168 [ 79.056835] [<ffffff8008202650>] iput+0x1c0/0x200 [ 79.060228] [<ffffff800820408c>] iget_failed+0x30/0x3c [ 79.064097] [<ffffff8000eba0c0>] jffs2_iget+0x2d8/0x360 [jffs2] [ 79.068740] [<ffffff8000eb0a60>] jffs2_lookup+0xe8/0x130 [jffs2] [ 79.073434] [<ffffff80081f1a28>] lookup_slow+0x118/0x190 [ 79.077435] [<ffffff80081f4708>] walk_component+0xfc/0x28c [ 79.081610] [<ffffff80081f4dd0>] path_lookupat+0x84/0x108 [ 79.085699] [<ffffff80081f5578>] filename_lookup+0x88/0x100 [ 79.089960] [<ffffff80081f572c>] user_path_at_empty+0x58/0x6c [ 79.094396] [<ffffff80081ebe14>] vfs_statx+0xa4/0x114 [ 79.098138] [<ffffff80081ec44c>] SyS_newfstatat+0x58/0x98 [ 79.102227] [<ffffff800808354c>] __sys_trace_return+0x0/0x4 [ 79.106489] Code: d65f03c0 f9400001 b40000e1 aa0103e0 (f9400821) The jffs2_do_clear_inode() call in jffs2_iget() is unnecessary since iget_failed() will eventually call jffs2_do_clear_inode() if needed, so just remove it. Fixes: `5451f79f5f` ("iget: stop JFFS2 from using iget() and read_inode()") Reviewed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jake Daryll Obina <jake.obina@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-25 19:34:30 -05:00
Alexey Dobriyan	b35d786b67	dcache: delete unused d_hash_mask Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-25 19:34:30 -05:00
Alexey Dobriyan	854d3e6343	dcache: subtract d_hash_shift from 32 in advance Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-25 19:34:29 -05:00
Eric Biggers	01950a349e	fs/buffer.c: fold init_buffer() into init_page_buffers() Since commit `e76004093d` ("fs/buffer.c: remove unnecessary init operation after allocating buffer_head"), there are no callers of init_buffer() outside of init_page_buffers(). So just fold it into init_page_buffers(). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-25 19:34:28 -05:00
Eric Biggers	4bfd054ae1	fs: fold __inode_permission() into inode_permission() Since commit `9c630ebefe` ("ovl: simplify permission checking"), overlayfs doesn't call __inode_permission() anymore, which leaves no users other than inode_permission(). So just fold it back into inode_permission(). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-25 19:34:28 -05:00
Chao Yu	1c1d35df71	f2fs: support inode creation time This patch adds creation time field in inode layout to support showing kstat.btime in ->statx. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-01-25 14:10:39 -08:00
Benjamin Coddington	f34462c3c8	pnfs/blocklayout: Ensure disk address in block device map It's possible that the device map is smaller than the offset into the device for the I/O we're adding. Add a check for it and bail out, otherwise we risk botching the bio calculations that follow. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trondmy@gmail.com>	2018-01-25 16:42:35 -05:00
Benjamin Coddington	b39604755c	pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors Fixup the field types to match their use. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trondmy@gmail.com>	2018-01-25 16:42:35 -05:00
Yunlei He	068c3cd858	f2fs: rebuild sit page from sit info in mem This patch rebuild sit page from sit info in mem instead of issue a read io. I test this method and the result is as below: Pre: mmc_perf_test-12061 [001] ...1 976.819992: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [001] ...1 976.856446: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 998.976946: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 999.023269: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1022.060772: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1022.111034: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [002] ...1 1070.127643: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1070.187352: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1095.942124: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1095.995975: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1122.535091: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1122.586521: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [001] ...1 1147.897487: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [001] ...1 1147.959438: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1177.926951: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [002] ...1 1177.976823: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [002] ...1 1204.176087: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [002] ...1 1204.239046: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit Some sit flush consume more than 50ms. Now: mmc_perf_test-2187 [007] ...1 196.840684: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [007] ...1 196.841258: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [007] ...1 219.430582: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [007] ...1 219.431144: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [002] ...1 243.638678: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [000] ...1 243.638980: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [002] ...1 265.392180: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [002] ...1 265.392245: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [000] ...1 290.309051: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [000] ...1 290.309116: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [003] ...1 317.144209: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [003] ...1 317.145913: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [005] ...1 343.224954: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [005] ...1 343.225574: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [000] ...1 370.239846: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [000] ...1 370.241138: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [001] ...1 397.029043: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [001] ...1 397.030750: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [003] ...1 425.386377: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [003] ...1 425.387735: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit Most sit flush consume no more than 1ms. Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-01-25 10:44:34 -08:00
Chao Yu	3b60d802d9	f2fs: stop issuing discard if fs is readonly If filesystem is readonly, stop to issue discard in daemon. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-01-25 10:40:10 -08:00
Chao Yu	6819b884e0	f2fs: clean up duplicated assignment in init_discard_policy Remove duplicated codes of assignment for .max_requests and .io_aware_gran. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-01-25 10:40:01 -08:00
Chao Yu	2882d34310	f2fs: use GFP_F2FS_ZERO for cleanup Clean up codes with GFP_F2FS_ZERO, no logic changes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-01-25 10:39:49 -08:00
Bob Peterson	4519eaad72	GFS2: Fix minor comment typo Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2018-01-25 10:18:06 -07:00
Linus Torvalds	525273fb2e	for-4.15 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlppIo8ACgkQxWXV+ddt WDuu9A//e29aPiicPrtIovNfVCzev774UddFBhIiEqyfbO0QvHk7Ld0d9+htR38W eAwxJG3gH4VS9yOFAyk/LrsjpQZYzP32wjGoIXQlz8J1SzxALlILLmjpYkZWHwZX UqAomYDnRjFu4jsHZ/AAyUNngYER8aeH+aO1PudppBiSCIVpk1TAVRnkt04gCJ7e CigFgMdRc8pT5P0T6Rsv2+W/yJC7sU0BgDIBjmUcnduqEAxRYl0zsJzpP0IYPPRa cAnpyu/ApK/m9mlLWi0SfUyNvePFWNxfA2QDBO5G6FwM6C2f5x/zBGfu29wAupVJ czdOSR5uqCd6WGZHfvaQ4cgQ69AE4lk68zijHRNESPa7tVZ9mQt713h2BqAX6Fus z+y45ti9CrFoOhaoXCSjUbpJZQ7YKWrat44Qi8pJaKGxqiqXpT8oVcBfjofu+Drn vnDrU+7WvST5TxcWw/JBWaFfft1dkbdKX+EYZW8sVJAmA/e0+aIfElsTuvdUQxuA aLnsyvsYQHVcA/mArD12sScQ13DDejR4ija8bP5tpmodxquUouwEAtKPoFnXVT6k 8hbb6qPLsiBerRMSSuaNaFF/ESi36bsizH3O99bYyyxlR8bXi7irHZY9VPWc1OK3 4twoDO0RhCxDHPY5mrIcKdlp9HiiUBPGos3BCaOAg8rKHd/QP7Q= =EOAu -----END PGP SIGNATURE----- Merge tag 'for-4.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "It's been reported recently that readdir can list stale entries under some conditions. Fix it." * tag 'for-4.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix stale entries in readdir	2018-01-25 09:03:10 -08:00
Colin Ian King	37e12f5551	cifs: remove redundant duplicated assignment of pointer 'node' Node is assigned twice to rb_first(root), first during declaration time and second after a taking a spin lock, so we have a duplicated assignment. Remove the first assignment because it is redundant and also not protected by the spin lock. Cleans up clang warning: fs/cifs/connect.c:4435:18: warning: Value stored to 'node' during its initialization is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:07 -06:00
Arnd Bergmann	e36c048a9b	CIFS: SMBD: work around gcc -Wmaybe-uninitialized warning GCC versions from 4.9 to 6.3 produce a false-positive warning when dealing with a conditional spin_lock_irqsave(): fs/cifs/smbdirect.c: In function 'smbd_recv_buf': include/linux/spinlock.h:260:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized] This function calls some sleeping interfaces, so it is clear that it does not get called with interrupts disabled and there is no need to save the irq state before taking the spinlock. This lets us remove the variable, which makes the function slightly more efficient and avoids the warning. A further cleanup could do the same change for other functions in this file, but I did not want to take this too far for now. Fixes: ac69f66e54ca ("CIFS: SMBD: Implement function to receive data via RDMA receive") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-24 19:49:07 -06:00
Daniel N Pettersson	9aca7e4544	cifs: Fix autonegotiate security settings mismatch Autonegotiation gives a security settings mismatch error if the SMB server selects an SMBv3 dialect that isn't SMB3.02. The exact error is "protocol revalidation - security settings mismatch". This can be tested using Samba v4.2 or by setting the global Samba setting max protocol = SMB3_00. The check that fails in smb3_validate_negotiate is the dialect verification of the negotiate info response. This is because it tries to verify against the protocol_id in the global smbdefault_values. The protocol_id in smbdefault_values is SMB3.02. In SMB2_negotiate the protocol_id in smbdefault_values isn't updated, it is global so it probably shouldn't be, but server->dialect is. This patch changes the check in smb3_validate_negotiate to use server->dialect instead of server->vals->protocol_id. The patch works with autonegotiate and when using a specific version in the vers mount option. Signed-off-by: Daniel N Pettersson <danielnp@axis.com> Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org>	2018-01-24 19:49:07 -06:00
kbuild test robot	9084432c31	CIFS: SMBD: _smbd_get_connection() can be static Fixes: 07495ff5d9bc ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Long Li <longli@microsoft.com>	2018-01-24 19:49:07 -06:00
Long Li	8801e90233	CIFS: SMBD: Disable signing on SMB direct transport Currently the CIFS SMB Direct implementation (experimental) doesn't properly support signing. Disable it when SMB Direct is in use for transport. Signing will be enabled in future after it is implemented. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:07 -06:00
Long Li	08a3b9690f	CIFS: SMBD: Add SMB Direct debug counters For debugging and troubleshooting, export SMBDirect debug counters to /proc/fs/cifs/DebugData. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:07 -06:00
Long Li	bd3dcc6a22	CIFS: SMBD: Upper layer performs SMB read via RDMA write through memory registration If I/O size is larger than rdma_readwrite_threshold, use RDMA write for SMB read by specifying channel SMB2_CHANNEL_RDMA_V1 or SMB2_CHANNEL_RDMA_V1_INVALIDATE in the SMB packet, depending on SMB dialect used. Append a smbd_buffer_descriptor_v1 to the end of the SMB packet and fill in other values to indicate this SMB read uses RDMA write. There is no need to read from the transport for incoming payload. At the time SMB read response comes back, the data is already transferred and placed in the pages by RDMA hardware. When SMB read is finished, deregister the memory regions if RDMA write is used for this SMB read. smbd_deregister_mr may need to do local invalidation and sleep, if server remote invalidation is not used. There are situations where the MID may not be created on I/O failure, under which memory region is deregistered when read data context is released. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:07 -06:00
Long Li	74dcf418fe	CIFS: SMBD: Read correct returned data length for RDMA write (SMB read) I/O This patch is for preparing upper layer doing SMB read via RDMA write. When RDMA write is used for SMB read, the returned data length is in DataRemaining in the response packet. Reading it properly by adding a parameter to specifiy where the returned data length is. Add the defition for memory registration to wdata and return the correct length based on if RDMA write is used. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:07 -06:00
Long Li	db223a590d	CIFS: SMBD: Upper layer performs SMB write via RDMA read through memory registration When sending I/O, if size is larger than rdma_readwrite_threshold we prepare to send SMB write packet for a RDMA read via memory registration. The actual I/O is done by remote peer through local RDMA hardware. Modify the relevant fields in the packet accordingly, and append a smbd_buffer_descriptor_v1 to the end of the SMB write packet. On write I/O finish, deregister the memory region if this was for a RDMA read. If remote invalidation is not used, the call to smbd_deregister_mr will do local invalidation and possibly wait. Memory region is normally deregistered in MID callback as soon as it's used. There are situations where the MID may not be created on I/O failure, under which memory region is deregistered when write data context is released. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:07 -06:00
Long Li	c739858334	CIFS: SMBD: Implement RDMA memory registration Memory registration is used for transferring payload via RDMA read or write. After I/O is done, memory registrations are recovered and reused. This process can be time consuming and is done in a work queue. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-24 19:49:06 -06:00
Long Li	9762c2d080	CIFS: SMBD: Upper layer sends data via RDMA send With SMB Direct connected, use it for sending data via RDMA send. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	d649e1bba3	CIFS: SMBD: Implement function to send data via RDMA send The transport doesn't maintain send buffers or send queue for transferring payload via RDMA send. There is no data copy in the transport on send. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	2fef137a2e	CIFS: SMBD: Upper layer receives data via RDMA receive With SMB Direct connected, use it for receiving data via RDMA receive. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	f64b78fd18	CIFS: SMBD: Implement function to receive data via RDMA receive On the receive path, the transport maintains receive buffers and a reassembly queue for transferring payload via RDMA recv. There is data copy in the transport on recv when it copies the payload to upper layer. The transport recognizes the RFC1002 header length use in the SMB upper layer payloads in CIFS. Because this length is mainly used for TCP and not applicable to RDMA, it is handled as a out-of-band information and is never sent over the wire, and the trasnport behaves like TCP to upper layer by processing and exposing the length correctly on data payloads. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	09902f8dc8	CIFS: SMBD: Set SMB Direct maximum read or write size for I/O When connecting over SMB Direct, the transport negotiates its maximum I/O sizes with the server and determines how to choose to do RDMA send/recv vs read/write. Expose these maximum I/O sizes to upper layer so we will get the correct sized payloads. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	bce9ce7cc0	CIFS: SMBD: Upper layer destroys SMB Direct session on shutdown or umount When upper layer wants to umount, make it call shutdown on transport when SMB Direct is used. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	8ef130f9ec	CIFS: SMBD: Implement function to destroy a SMB Direct connection Add function to tear down a SMB Direct connection. This is used by upper layer to free all SMB Direct connection and transport resources. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	781a8050f2	CIFS: SMBD: Upper layer reconnects to SMB Direct session Do a reconnect on SMB Direct when it is used as the connection. Reconnect can happen for many reasons and it's mostly the decision of SMB2 upper layer. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:06 -06:00
Long Li	ad57b8e172	CIFS: SMBD: Implement function to reconnect to a SMB Direct transport Add function to implement a reconnect to SMB Direct. This involves tearing down the current connection and establishing/negotiating a new connection. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Long Li	2f8946464b	CIFS: SMBD: Upper layer connects to SMBDirect session When "rdma" is specified in the mount option, make CIFS connect to SMB Direct. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-01-24 19:49:06 -06:00
Randy Dunlap	0933d6fa74	cifs: fix build errors for SMB_DIRECT Prevent build errors when CIFS=y and INFINIBAND=m. fs/cifs/smbdirect.o: In function `smbd_qp_async_error_upcall': smbdirect.c:(.text+0x28c): undefined reference to `ib_event_msg' fs/cifs/smbdirect.o: In function `smbd_destroy_rdma_work': smbdirect.c:(.text+0xfde): undefined reference to `ib_drain_qp' smbdirect.c:(.text+0xfea): undefined reference to `rdma_destroy_qp' smbdirect.c:(.text+0x12a0): undefined reference to `ib_free_cq' smbdirect.c:(.text+0x12ac): undefined reference to `ib_free_cq' smbdirect.c:(.text+0x12b8): undefined reference to `ib_dealloc_pd' smbdirect.c:(.text+0x12c4): undefined reference to `rdma_destroy_id' fs/cifs/smbdirect.o: In function `_smbd_get_connection': smbdirect.c:(.text+0x168c): undefined reference to `rdma_create_id' smbdirect.c:(.text+0x1713): undefined reference to `rdma_resolve_addr' smbdirect.c:(.text+0x1780): undefined reference to `rdma_resolve_route' smbdirect.c:(.text+0x17e3): undefined reference to `rdma_destroy_id' smbdirect.c:(.text+0x183d): undefined reference to `rdma_destroy_id' smbdirect.c:(.text+0x199d): undefined reference to `ib_alloc_cq' smbdirect.c:(.text+0x19d9): undefined reference to `ib_alloc_cq' smbdirect.c:(.text+0x1a89): undefined reference to `rdma_create_qp' smbdirect.c:(.text+0x1b3c): undefined reference to `rdma_connect' smbdirect.c:(.text+0x2538): undefined reference to `rdma_destroy_qp' smbdirect.c:(.text+0x2549): undefined reference to `ib_free_cq' smbdirect.c:(.text+0x255a): undefined reference to `ib_free_cq' smbdirect.c:(.text+0x2563): undefined reference to `ib_dealloc_pd' smbdirect.c:(.text+0x256c): undefined reference to `rdma_destroy_id' smbdirect.c:(.text+0x25f0): undefined reference to `__ib_alloc_pd' smbdirect.c:(.text+0x26bb): undefined reference to `rdma_disconnect' fs/cifs/smbdirect.o: In function `smbd_disconnect_rdma_work': smbdirect.c:(.text+0x62): undefined reference to `rdma_disconnect' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org (moderated for non-subscribers) Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-24 19:49:06 -06:00
Matthew Wilcox	f04a703c3d	cifs: Fix missing put_xid in cifs_file_strict_mmap If cifs_zap_mapping() returned an error, we would return without putting the xid that we got earlier. Restructure cifs_file_strict_mmap() and cifs_file_mmap() to be more similar to each other and have a single point of return that always puts the xid. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org>	2018-01-24 19:49:06 -06:00
Long Li	d8ec913b17	CIFS: SMBD: export protocol initial values For use-configurable SMB Direct protocol values, export them to /proc/fs/cifs. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:06 -06:00
Long Li	399f9539d9	CIFS: SMBD: Implement function to create a SMB Direct connection The upper layer calls this function to connect to peer through SMB Direct. Each SMB Direct connection is based on a RDMA RC Queue Pair. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:05 -06:00
Long Li	f198186aa9	CIFS: SMBD: Establish SMB Direct connection Add code to implement the core functions to establish a SMB Direct connection. 1. Establish an RDMA connection to SMB server. 2. Negotiate and setup SMB Direct protocol. 3. Implement idle connection timer and credit management. SMB Direct is enabled by setting CONFIG_CIFS_SMB_DIRECT. Add to Makefile to enable building SMB Direct. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:05 -06:00
Long Li	03bee01d62	CIFS: SMBD: Add SMB Direct protocol initial values and constants To prepare for protocol implementation, add constants and user-configurable values for the SMB Direct protocol. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber.redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:05 -06:00
Long Li	8339dd32fb	CIFS: SMBD: Add rdma mount option Add "rdma" to CIFS mount options to connect to SMB Direct. Add checks to validate this is used on SMB 3.X dialects. To connect to SMBDirect, use "mount.cifs -o rdma,vers=3.x". At the time of this patch, 3.x can be 3.0, 3.02 or 3.1.1. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber.redhat.com>	2018-01-24 19:49:05 -06:00
Long Li	2b6ed88037	CIFS: SMBD: Introduce kernel config option CONFIG_CIFS_SMB_DIRECT Build SMB Direct code when this option is set. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber.redhat.com>	2018-01-24 19:49:05 -06:00
Long Li	2dabfd5bab	CIFS: SMBD: Add parameter rdata to smb2_new_read_req This patch is for preparing upper layer for doing SMB read via RDMA write. When we assemble the SMB read packet header, we need to know the I/O layout if this request is to use a RDMA write. rdata has all the information we need for memory registration. Add rdata to smb2_new_read_req. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber.redhat.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	3cecf4865c	cifs: avoid a kmalloc in smb2_send_recv/SendReceive2 for the common case In both functions, use an array of 8 (arbitrary but should be big enough for all current uses) iov and avoid having to kmalloc the array for the common case. If 8 is too small, then fall back to the original behaviour and use kmalloc/kfree. This should not change any behaviour but should save us a tiny amount of cpu cycles. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	305428acf0	cifs: remove small_smb2_init Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	8eb7998e79	cifs: remove rfc1002 header from smb2_lease_ack Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	5dfe69a407	cifs: remove unused variable from SMB2_read Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	21ad9487ca	cifs: remove rfc1002 header from smb2_oplock_break we get from server Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	b2fb7fecc9	cifs: remove rfc1002 header from smb2_query_info_req Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>	2018-01-24 19:49:05 -06:00
Ronnie Sahlberg	7c00c3a625	cifs: remove rfc1002 header from smb2_query_directory_req Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>	2018-01-24 19:49:05 -06:00

1 2 3 4 5 ...

52321 Commits