linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-21 19:41:42 +00:00

Author	SHA1	Message	Date
Linus Torvalds	850925a813	Revert patches causing inode collision problems -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmccC7UACgkQq06b7GqY 5nBYbA/7BP/Me80+ofClszxd0F3nlXCGJPe31vC36EeFsap5p1TrIM7jvj0iR0Zr HXLxyTwocrubFgSE28rUrSS2nhmk016g3EUEVeoaO+hYSbvtdrRhoTSCubEwb4N5 68vUz1ebGqZ35oT9oVZeAd+xdwxpKPktqrN6zCJsyWbSiZevE8YMkV+dgYrGde+Y j7BMypcxtQ2+u8donJ1PnGz04k0RNhqcqGGED0VuNw1Sp6quY8n0EWvHEeYjrb+N hFYB1lSeA245fVm8bmnUFHLJ2w47bxfJZ8hANbO9C0zLI98vlmH6xItXQnGofc4P 1VtCHxoZnIFMMjK/iTPjvi4JsanNr0mS1Aa5w8HAHqyHYeIoBCpn78XFf79IhFAl pO3RV6nxE8fELXBM39Yyq7I3rQfyFmNBeob0UWIbazQ7cC54xBHpu0RDMj2gD6xy TBbZL2euRfGEswKWfrdR32U1jzSCv+jTAWew9sdx+z0RX9inRyaNBHFVsjOnVFZS iLJn20cotgw0WU2OhghWvgsV4o/Ckx4eJAC9oNxZ9pT1s7BqSvW1qg/Zlpn9wYgf UGQGXrCdXGY6XZ3W53c2joS+QWtmU6N8mAC6jCeqQm8pCQQeiTHE1QycyNc0xPGd ELhQYzCA1hEiNDki3e/vqzbPScpAer9dc/6hMKUT55SVfjcCBoc= =O6o1 -----END PGP SIGNATURE----- Merge tag '9p-for-6.12-rc5' of https://github.com/martinetd/linux Pull more 9p reverts from Dominique Martinet: "Revert patches causing inode collision problems. The code simplification introduced significant regressions on servers that do not remap inode numbers when exporting multiple underlying filesystems with colliding inodes. See the top-most revert (commit `be2ca38253`) for details. This problem had been ignored for too long and the reverts will also head to stable (6.9+). I'm confident this set of patches gets us back to previous behaviour (another related patch had already been reverted back in April and we're almost back to square 1, and the rest didn't touch inode lifecycle)" * tag '9p-for-6.12-rc5' of https://github.com/martinetd/linux: Revert "fs/9p: simplify iget to remove unnecessary paths" Revert "fs/9p: fix uaf in in v9fs_stat2inode_dotl" Revert "fs/9p: remove redundant pointer v9ses" Revert " fs/9p: mitigate inode collisions"	2024-10-25 15:25:02 -07:00
Dominique Martinet	be2ca38253	Revert "fs/9p: simplify iget to remove unnecessary paths" This reverts commit `724a08450f`. This code simplification introduced significant regressions on servers that do not remap inode numbers when exporting multiple underlying filesystems with colliding inodes, as can be illustrated with simple tmpfs exports in qemu with remapping disabled: ``` # host side cd /tmp/linux-test mkdir m1 m2 mount -t tmpfs tmpfs m1 mount -t tmpfs tmpfs m2 mkdir m1/dir m2/dir echo foo > m1/dir/foo echo bar > m2/dir/bar # guest side # started with -virtfs local,path=/tmp/linux-test,mount_tag=tmp,security_model=mapped-file mount -t 9p -o trans=virtio,debug=1 tmp /mnt/t ls /mnt/t/m1/dir # foo ls /mnt/t/m2/dir # bar (works ok if directry isn't open) # cd to keep first dir's inode alive cd /mnt/t/m1/dir ls /mnt/t/m2/dir # foo (should be bar) ``` Other examples can be crafted with regular files with fscache enabled, in which case I/Os just happen to the wrong file leading to corruptions, or guest failing to boot with: \| VFS: Lookup of 'com.android.runtime' in 9p 9p would have caused loop In theory, we'd want the servers to be smart enough and ensure they never send us two different files with the same 'qid.path', but while qemu has an option to remap that is recommended (and qemu prints a warning if this case happens), there are many other servers which do not (kvmtool, nfs-ganesha, probably diod...), we should at least ensure we don't cause regressions on this: - assume servers can't be trusted and operations that should get a 'new' inode properly do so. commit `d05dcfdf5e` (" fs/9p: mitigate inode collisions") attempted to do this, but v9fs_fid_iget_dotl() was not called so some higher level of caching got in the way; this needs to be fixed properly before we can re-apply the patches. - if we ever want to really simplify this code, we will need to add some negotiation with the server at mount time where the server could claim they handle this properly, at which point we could optimize this out. (but that might not be needed at all if we properly handle the 'new' check?) Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/all/20240408141436.GA17022@redhat.com/ Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-4-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	26f8dd2dde	Revert "fs/9p: fix uaf in in v9fs_stat2inode_dotl" This reverts commit `11763a8598`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-3-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	fedd06210b	Revert "fs/9p: remove redundant pointer v9ses" This reverts commit `10211b4a23`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-2-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	f69999b5f9	Revert " fs/9p: mitigate inode collisions" This reverts commit `d05dcfdf5e`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-1-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:08 +09:00
Dominique Martinet	f009e946c1	Revert "9p: Enable multipage folios" This reverts commit `1325e4a91a`. using multipage folios apparently break some madvise operations like MADV_PAGEOUT which do not reliably unload the specified page anymore, Revert the patch until that is figured out. Reported-by: Andrii Nakryiko <andrii@kernel.org> Fixes: `1325e4a91a` ("9p: Enable multipage folios") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-24 11:24:05 -07:00
Linus Torvalds	9197b73fd7	Mashed-up update that I sat on too long: - fix for multiple slabs created with the same name - enable multipage folios - theorical fix to also look for opened fids by inode if none was found by dentry -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmcS81AACgkQq06b7GqY 5nACpBAAtXOGRjg+dushCwUVKBlnI3oTwE2G+ywnphNZg2A0emlMOxos7x1OTiM3 Fu0b10MCUWHIXo4jD6ALVPWITJTfjiXR8s90Q/ozypcIXXhkDDShhV31b2h6Iplr YyKyjEehDFRiS7rqWC2a9mce99sOpwdQRmnssnWbjYvpJ4imFbl+50Z1I5Nc/Omu j2y02eMuikiWF/shKj0Dx1mmpZ4InSv3kvlM+V2D2YdWKNonGZe/xFZhid95LXmr Upt55R8k9qR2pn4VU22eKP6c34DIZGDlrcQdPUCNP5QuaAdGZov3TjNQdjE1bJmF E2QdxvUNfvvHqlvaRrlWa27uMgXMcy7QV3LEKwmo3tmaYVw2PDMRbFXc9zQdxy91 zqXjjGasnwzE8ca36y79vZjFTHAyY5VK/3cHCL3ai+ysu4UL3k2QgmVegREG/xKk G8Nz4UO/R6s8Wc2VqxKJdZS5NMLlADS+Aes0PG+9AxQz7iR9Ktgwrw39KDxMi+Lm PeH3Gz2rP9+EPoa3usoBQtvvvmJKM/Wb9qdPW9vTtRbRJ7bVclJoizFoLMA/TiW1 Jru+HYGBO75s8RynwEDLMiJhkjZWHfVgDjPsY6YsGVH8W2gOcJ7egQ2J2EsuurN3 tzKz4uQilV+VeDuWs8pWKrX/c3Y3KpSYV+oayg7Je7LoTlQBmU8= =VG4t -----END PGP SIGNATURE----- Merge tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux Pull 9p fixes from Dominique Martinet: "Mashed-up update that I sat on too long: - fix for multiple slabs created with the same name - enable multipage folios - theorical fix to also look for opened fids by inode if none was found by dentry" [ Enabling multi-page folios should have been done during the merge window, but it's a one-liner, and the actual meat of the enablement is in netfs and already in use for other filesystems... - Linus ] * tag '9p-for-6.12-rc4' of https://github.com/martinetd/linux: 9p: Avoid creating multiple slab caches with the same name 9p: Enable multipage folios 9p: v9fs_fid_find: also lookup by inode if not found dentry	2024-10-19 08:44:10 -07:00
David Howells	1325e4a91a	9p: Enable multipage folios Enable support for multipage folios on the 9P filesystem. This is all handled through netfslib and is already enabled on AFS and CIFS also. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox <willy@infradead.org> cc: v9fs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Message-ID: <20240620173137.610345-7-dhowells@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-09-23 05:51:27 +09:00
Dominique Martinet	38d222b316	9p: v9fs_fid_find: also lookup by inode if not found dentry It's possible for v9fs_fid_find "find by dentry" branch to not turn up anything despite having an entry set (because e.g. uid doesn't match), in which case the calling code will generally make an extra lookup to the server. In this case we might have had better luck looking by inode, so fall back to look up by inode if we have one and the lookup by dentry failed. Message-Id: <20240523210024.1214386-1-asmadeus@codewreck.org> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-09-23 05:51:27 +09:00
David Howells	ee4cdf7ba8	netfs: Speed up buffered reading Improve the efficiency of buffered reads in a number of ways: (1) Overhaul the algorithm in general so that it's a lot more compact and split the read submission code between buffered and unbuffered versions. The unbuffered version can be vastly simplified. (2) Read-result collection is handed off to a work queue rather than being done in the I/O thread. Multiple subrequests can be processes simultaneously. (3) When a subrequest is collected, any folios it fully spans are collected and "spare" data on either side is donated to either the previous or the next subrequest in the sequence. Notes: () Readahead expansion is massively slows down fio, presumably because it causes a load of extra allocations, both folio and xarray, up front before RPC requests can be transmitted. () RDMA with cifs does appear to work, both with SIW and RXE. (*) PG_private_2-based reading and copy-to-cache is split out into its own file and altered to use folio_queue. Note that the copy to the cache now creates a new write transaction against the cache and adds the folios to be copied into it. This allows it to use part of the writeback I/O code. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20240814203850.2240469-20-dhowells@redhat.com/ # v2 Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-09-12 12:20:41 +02:00
Dominique Martinet	e3786b29c5	9p: Fix DIO read through netfs If a program is watching a file on a 9p mount, it won't see any change in size if the file being exported by the server is changed directly in the source filesystem, presumably because 9p doesn't have change notifications, and because netfs skips the reads if the file is empty. Fix this by attempting to read the full size specified when a DIO read is requested (such as when 9p is operating in unbuffered mode) and dealing with a short read if the EOF was less than the expected read. To make this work, filesystems using netfslib must not set NETFS_SREQ_CLEAR_TAIL if performing a DIO read where that read hit the EOF. I don't want to mandatorily clear this flag in netfslib for DIO because, say, ceph might make a read from an object that is not completely filled, but does not reside at the end of file - and so we need to clear the excess. This can be tested by watching an empty file over 9p within a VM (such as in the ktest framework): while true; do read content; if [ -n "$content" ]; then echo $content; break; fi; done < /host/tmp/foo then writing something into the empty file. The watcher should immediately display the file content and break out of the loop. Without this fix, it remains in the loop indefinitely. Fixes: `80105ed2fd` ("9p: Use netfslib read/write_iter") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218916 Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/1229195.1723211769@warthog.procyon.org.uk cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-08-13 13:53:09 +02:00
Linus Torvalds	397a83ab97	Two fixes headed to stable trees: - some trace event was dumping uninitialized values - a missing lock somewhere that was thought to have exclusive access, and it turned out not to -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmZWgYUACgkQq06b7GqY 5nD+QQ/+JODfSn9l9JyHgXco0mIpQeldFUYoHPv3UwHr14MF8nux5HjzsupviAik vHBV5C2v6nOgAZWHpX4Rz+EaMNgjIwL2f0wLZMYh1Ho+lLr6+G0fN5iN3vHWmE4C w90qstKKhWf493pW+65IzzFp55vG7PPG8S81ZqbdxpdgMoBVpdtXDjedPOf9uzFi hkfGYWlmbrqkJ8pW4cvnlBkcraKgDDQndTRG4AQLtiLctpDk8/n95KeJpYZvgxX8 30Vu09QjgFzTGur/QFdB8UC0ZEaDALtSKfBDjVwTZBvxA1uM6S1v2Ll6wiufvJ2H gTPtSwZ7CP601NDFdNmtDIsrJSp617d9xjBzFPIwJmX8tJplzy6sKYuVB0xe+gic 4u3xK2I60H5D1Fw0dpWhW4MdgHkyKcEOb+EJ2zj3SmosgmOvLb7hDZ81Vc6FH4SX oLmMIj99Ks8U+TTZvY2lt51wxCXYaHF93feIOKnDEa7dF8gYy3/+C/0ztWbE+csF xqy7iIB2HWhN8/jtIOruiQlcx4JBJr2eZd+Vw/mCWhVvLXA5zbdAqKXEEBFNSyNk RXnk5KnlpgSoNec/z4lv1RJRidwic7TBkA6Q3/cgUuP39SoF4AnS0qcuUbivjKhc 8RTInCO/iaruMJEZdlksFUCRo9iJQL/DdM4F9elX+DHL15TpZ+E= =7DGy -----END PGP SIGNATURE----- Merge tag '9p-for-6.10-rc2' of https://github.com/martinetd/linux Pull 9p fixes from Dominique Martinet: "Two fixes headed to stable trees: - a trace event was dumping uninitialized values - a missing lock that was thought to have exclusive access, and it turned out not to" * tag '9p-for-6.10-rc2' of https://github.com/martinetd/linux: 9p: add missing locking around taking dentry fid list net/9p: fix uninit-value in p9_client_rpc()	2024-05-29 09:25:15 -07:00
David Howells	f89ea63f1c	netfs, 9p: Fix race between umount and async request completion There's a problem in 9p's interaction with netfslib whereby a crash occurs because the 9p_fid structs get forcibly destroyed during client teardown (without paying attention to their refcounts) before netfslib has finished with them. However, it's not a simple case of deferring the clunking that p9_fid_put() does as that requires the p9_client record to still be present. The problem is that netfslib has to unlock pages and clear the IN_PROGRESS flag before destroying the objects involved - including the fid - and, in any case, nothing checks to see if writeback completed barring looking at the page flags. Fix this by keeping a count of outstanding I/O requests (of any type) and waiting for it to quiesce during inode eviction. Reported-by: syzbot+df038d463cca332e8414@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005be0aa061846f8d6@google.com/ Reported-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000b86c5e06130da9c6@google.com/ Reported-by: syzbot+1527696d41a634cc1819@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000041f960618206d7e@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/755891.1716560771@warthog.procyon.org.uk Tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Hillf Danton <hdanton@sina.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Reported-and-tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-27 13:12:13 +02:00
Dominique Martinet	c898afdc15	9p: add missing locking around taking dentry fid list Fix a use-after-free on dentry's d_fsdata fid list when a thread looks up a fid through dentry while another thread unlinks it: UAF thread: refcount_t: addition on 0; use-after-free. p9_fid_get linux/./include/net/9p/client.h:262 v9fs_fid_find+0x236/0x280 linux/fs/9p/fid.c:129 v9fs_fid_lookup_with_uid linux/fs/9p/fid.c:181 v9fs_fid_lookup+0xbf/0xc20 linux/fs/9p/fid.c:314 v9fs_vfs_getattr_dotl+0xf9/0x360 linux/fs/9p/vfs_inode_dotl.c:400 vfs_statx+0xdd/0x4d0 linux/fs/stat.c:248 Freed by: p9_fid_destroy (inlined) p9_client_clunk+0xb0/0xe0 linux/net/9p/client.c:1456 p9_fid_put linux/./include/net/9p/client.h:278 v9fs_dentry_release+0xb5/0x140 linux/fs/9p/vfs_dentry.c:55 v9fs_remove+0x38f/0x620 linux/fs/9p/vfs_inode.c:518 vfs_unlink+0x29a/0x810 linux/fs/namei.c:4335 The problem is that d_fsdata was not accessed under d_lock, because d_release() normally is only called once the dentry is otherwise no longer accessible but since we also call it explicitly in v9fs_remove that lock is required: move the hlist out of the dentry under lock then unref its fids once they are no longer accessible. Fixes: `154372e67d` ("fs/9p: fix create-unlink-getattr idiom") Cc: stable@vger.kernel.org Reported-by: Meysam Firouzi Reported-by: Amirmohammad Eftekhar Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Message-ID: <20240521122947.1080227-1-asmadeus@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-05-23 20:29:09 +09:00
David Howells	c245868524	netfs: Remove the old writeback code Remove the old writeback code. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org	2024-05-01 18:07:38 +01:00
David Howells	2df86547b2	netfs: Cut over to using new writeback code Cut over to using the new writeback code. The old code is #ifdef'd out or otherwise removed from compilation to avoid conflicts and will be removed in a future patch. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org	2024-05-01 18:07:37 +01:00
David Howells	5fb70e7275	netfs, 9p: Implement helpers for new write code Implement the helpers for the new write code in 9p. There's now an optional ->prepare_write() that allows the filesystem to set the parameters for the next write, such as maximum size and maximum segment count, and an ->issue_write() that is called to initiate an (asynchronous) write operation. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org	2024-05-01 18:07:37 +01:00
David Howells	40fb4828d5	9p: Use alternative invalidation to using launder_folio Use writepages-based flushing invalidation instead of invalidate_inode_pages2() and ->launder_folio(). This will allow ->launder_folio() to be removed eventually. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org	2024-05-01 18:07:34 +01:00
Eric Van Hensbergen	d05dcfdf5e	fs/9p: mitigate inode collisions Detect and mitigate inode collsions that now occur since we fixed 9p generating duplicate inode structures. Underlying cause of these appears to be a race condition between reuse of inode numbers in underlying file system and cleanup of inode numbers in the client. Enabling caching makes this much more likely to happen as it increases cleanup latency due to writebacks. Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-04-22 15:34:27 +00:00
Joakim Sindholt	7fd524b9bd	fs/9p: drop inodes immediately on non-.L too Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-04-11 23:40:55 +00:00
Eric Van Hensbergen	824f06ff81	fs/9p: Revert "fs/9p: fix dups even in uncached mode" This reverts commit `be57855f50`. It caused a regression involving duplicate inode numbers in some tester trees. The bad behavior seems to be dependent on inode reuse policy in underlying file system, so it did not trigger in my test setup. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-04-11 23:36:33 +00:00
Eric Van Hensbergen	6e45a30fe5	fs/9p: remove erroneous nlink init from legacy stat2inode In 9p2000 legacy mode, stat2inode initializes nlink to 1, which is redundant with what alloc_inode should have already set. 9p2000.u overrides this with extensions if present in the stat structure, and 9p2000.L incorporates nlink into its stat structure. At the very least this probably messes with directory nlink accounting in legacy mode. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-04-09 23:53:00 +00:00
Jeff Layton	7a84602297	9p: explicitly deny setlease attempts 9p is a remote network protocol, and it doesn't support asynchronous notifications from the server. Ensure that we don't hand out any leases since we can't guarantee they'll be broken when a file's contents change. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-28 19:52:55 +00:00
Joakim Sindholt	4e5d208cc9	fs/9p: fix the cache always being enabled on files with qid flags I'm not sure why this check was ever here. After updating to 6.6 I suddenly found caching had been turned on by default and neither cache=none nor the new directio would turn it off. After walking through the new code very manually I realized that it's because the caching has to be, in effect, turned off explicitly by setting P9L_DIRECT and whenever a file has a flag, in my case QTAPPEND, it doesn't get set. Setting aside QTDIR which seems to ignore the new fid->mode entirely, the rest of these either should be subject to the same cache rules as every other QTFILE or perhaps very explicitly not cached in the case of QTAUTH. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-28 15:10:29 +00:00
Joakim Sindholt	87de39e705	fs/9p: translate O_TRUNC into OTRUNC This one hits both 9P2000 and .u as it appears v9fs has never translated the O_TRUNC flag. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-28 15:10:28 +00:00
Joakim Sindholt	cd25e15e57	fs/9p: only translate RWX permissions for plain 9P2000 Garbage in plain 9P2000's perm bits is allowed through, which causes it to be able to set (among others) the suid bit. This was presumably not the intent since the unix extended bits are handled explicitly and conditionally on .u. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-28 13:59:23 +00:00
Eric Van Hensbergen	6630036b7c	fs/9p: fix uninitialized values during inode evict If an iget fails due to not being able to retrieve information from the server then the inode structure is only partially initialized. When the inode gets evicted, references to uninitialized structures (like fscache cookies) were being made. This patch checks for a bad_inode before doing anything other than clearing the inode from the cache. Since the inode is bad, it shouldn't have any state associated with it that needs to be written back (and there really isn't a way to complete those anyways). Reported-by: syzbot+eb83fe1cce5833cd66a0@syzkaller.appspotmail.com Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-25 14:16:06 +00:00
Colin Ian King	10211b4a23	fs/9p: remove redundant pointer v9ses Pointer v9ses is being assigned the value from the return of inlined function v9fs_inode2v9ses (which just returns inode->i_sb->s_fs_info). The pointer is not used after the assignment, so the variable is redundant and can be removed. Cleans up clang scan warnings such as: fs/9p/vfs_inode_dotl.c:300:28: warning: variable 'v9ses' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-25 00:34:35 +00:00
Lizhi Xu	11763a8598	fs/9p: fix uaf in in v9fs_stat2inode_dotl The incorrect logical order of accessing the st object code in v9fs_fid_iget_dotl is causing this uaf. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-and-tested-by: syzbot+7a3d75905ea1a830dbe5@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Tested-by: Breno Leitao <leitao@debian.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-25 00:34:35 +00:00
Linus Torvalds	c442a42363	fs/9p changes for the 6.9 merge window This pull request includes a number of patches addressing improvements in the cache portions of the 9p client. The biggest improvements have to do with fixing handling of inodes and eliminating duplicate structures and unnecessary allocation/release of inode structures and many associated unnecessary protocol traffic. This also dramatically reduced code complexity across the code and sets us up to add proper temporal cache capabilities. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElpbw0ZalkJikytFRiP/V+0pf/5gFAmX0XX0ACgkQiP/V+0pf /5jdgg//UCY+RTB9UNr/B1I5sj1m36l0BfFB/jaAMApH5/bQPF2mgtjD1jQL2qk2 gbAvR25erDdKiggRAbNU6OGVBIKXHiUPnL5d6UNgeyQQu1wSzStCejTfoQ/mwgTl 1N2Uge63zz5L0XgJm31gPcKSlJAD1r70UgsPw0ldkimEPqiBQjQu5VQqUWKgrkBe hcdXbuZbmXBLfZv9i0XI9BmwIeaSRwY2qx/JUr2eCoAXrDRNfwBRVMAfqpJ6L6I0 56C5Q2EDjipUnYp1Z3fqg80IUa+MnoF3rbMyMj54JBWxAPyMspamUBltYlZb5pRv zvjSJk3Uof8Wzc+beVVbyV2NmC6gC/cm2MnSYZBTX1KWTMl46eiloSrFz+Pv1/O1 8rcZFQuw8uAox9/4qc1lPCAC0hZVUFelj8NpCPN8aHS3cF0eN/MCQ3qfBkFgo8um jMYTUiai1tjdxWUWrUSpSKFHzBcL986+1bnE7qwI3qPXrxrnB9kOhzSilDJSuHnV Ei6zinvMEUBmKYLdNaCQ9N1KTKPr0bqXIlEvk4LDaGLGkZbI7eKuN5mCxZB0DXCc 65WSc5G9NYux9QjobR+TUwwLdiQH84w/3zYS41r/as7VqABcj/X/2b2ZfQ6/WBuF I0CmSWz2gfcpQQc642dzZJk7UP4wOVtsW3F2JAQ1GPq/D5UOmWU= =1LU1 -----END PGP SIGNATURE----- Merge tag '9p-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: "This includes a number of patches addressing improvements in the cache portions of the 9p client. The biggest improvements have to do with fixing handling of inodes and eliminating duplicate structures and unnecessary allocation/release of inode structures and many associated unnecessary protocol traffic. This also dramatically reduced code complexity across the code and sets us up to add proper temporal cache capabilities" * tag '9p-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: fix dups even in uncached mode fs/9p: simplify iget to remove unnecessary paths fs/9p: rework qid2ino logic fs/9p: Eliminate now unused v9fs_get_inode fs/9p: Eliminate redundant non-cache path in mknod fs/9p: remove walk and inode allocation from symlink fs/9p: convert mkdir to use get_new_inode fs/9p: switch vfsmount to use v9fs_get_new_inode	2024-03-15 10:10:31 -07:00
Linus Torvalds	f88c3fb81c	mm, slab: remove last vestiges of SLAB_MEM_SPREAD Yes, yes, I know the slab people were planning on going slow and letting every subsystem fight this thing on their own. But let's just rip off the band-aid and get it over and done with. I don't want to see a number of unnecessary pull requests just to get rid of a flag that no longer has any meaning. This was mainly done with a couple of 'sed' scripts and then some manual cleanup of the end result. Link: https://lore.kernel.org/all/CAHk-=wji0u+OOtmAOD-5JV3SXcRJF___k_+8XNKmak0yd5vW1Q@mail.gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-03-12 20:32:19 -07:00
Jeff Layton	459c814a3c	9p: adapt to breakup of struct file_lock Most of the existing APIs have remained the same, but subsystems that access file_lock fields directly need to reach into struct file_lock_core now. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-34-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:11:41 +01:00
Jeff Layton	a69ce85ec9	filelock: split common fields into struct file_lock_core In a future patch, we're going to split file leases into their own structure. Since a lot of the underlying machinery uses the same fields move those into a new file_lock_core, and embed that inside struct file_lock. For now, add some macros to ensure that we can continue to build while the conversion is in progress. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-17-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:11:38 +01:00
Jeff Layton	75a1bbe60a	9p: rename fl_type variable in v9fs_file_do_lock In later patches, we're going to introduce some macros that conflict with the variable name here. Rename it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-5-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:11:35 +01:00
Eric Van Hensbergen	be57855f50	fs/9p: fix dups even in uncached mode In uncached mode we were still seeing duplicate getattr requests because of aggressive dropping of inodes. Inode "freshness" is guarded by other mechanisms when caches are disabled so this is unnecessary and increases overhead of almost every operation. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	724a08450f	fs/9p: simplify iget to remove unnecessary paths Remove the additional comparison operators and switch to simply lookup by inode number (aka qid.path). Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	b91a26696e	fs/9p: rework qid2ino logic This changes from a function to a macro because we can figure out if we are 32 or 64 bit at compile time. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	f61c906a7d	fs/9p: Eliminate now unused v9fs_get_inode Now with all inode allocation going through get_from_fid functions we can remove v9fs_get_inode and reduce us down to a single inode allocation path. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	2dc92e5975	fs/9p: Eliminate redundant non-cache path in mknod Like symlink, mknod had a seperate path with different inode allocation -- but this seems unnecessary, so eliminating this path. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	6bb2932722	fs/9p: remove walk and inode allocation from symlink Symlink had a bunch of extra operations which essentially end up discarded. It was walking the fid to the new file and creating an inode for it, but those semantics are part of tsymlink. This did prepopulate the cache, but that also seems potentially unnecessary and frought with peril. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:55 +00:00
Eric Van Hensbergen	44c53ac097	fs/9p: convert mkdir to use get_new_inode mkdir had different code paths for inode creation, cache used the get_new_inode_from_fid helper, but non-cached used v9fs_get_inode. Collapsed into a single implementation across both as there should be no difference. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:55 +00:00
Eric Van Hensbergen	fe1371d0f8	fs/9p: switch vfsmount to use v9fs_get_new_inode In the process of cleaning up inode number allocation, I noticed several functions which didn't use the standard helper allocators. This patch fixes the allocation in the mount entrypoint. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:55 +00:00
David Howells	252cf7b2ea	9p: Use length of data written to the server in preference to error In v9fs_upload_to_server(), we pass the error to netfslib to terminate the subreq rather than the amount of data written - even if we did actually write something. Further, we assume that the write is always entirely done if successful - but it might have been partially complete - as returned by p9_client_write(), but we ignore that. Fix this by indicating the amount written by preference and only returning the error if we didn't write anything. (We might want to return both in future if both are available as this might be useful as to whether we retry or not.) Suggested-by: Dominique Martinet <asmadeus@codewreck.org> Link: https://lore.kernel.org/r/ZZULNQAZ0n0WQv7p@codewreck.org/ Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org	2024-01-04 13:15:31 +00:00
David Howells	6c2c1e0009	9p: Do a couple of cleanups Do a couple of cleanups to 9p: (1) Remove a couple of unused variables. (2) Turn a BUG_ON() into a warning, consolidate with another warning and make the warning message include the inode number rather than whatever's in i_private (which will get hashed anyway). Suggested-by: Dominique Martinet <asmadeus@codewreck.org> Link: https://lore.kernel.org/r/ZZULNQAZ0n0WQv7p@codewreck.org/ Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org	2024-01-04 13:14:13 +00:00
David Howells	9546ac78b2	9p: Fix initialisation of netfs_inode for 9p The 9p filesystem is calling netfs_inode_init() in v9fs_init_inode() - before the struct inode fields have been initialised from the obtained file stats (ie. after v9fs_stat2inode*() has been called), but netfslib wants to set a couple of its fields from i_size. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org	2024-01-03 14:53:01 +00:00
David Howells	80105ed2fd	9p: Use netfslib read/write_iter Use netfslib's read and write iteration helpers, allowing netfslib to take over the management of the page cache for 9p files and to manage local disk caching. In particular, this eliminates write_begin, write_end, writepage and all mentions of struct page and struct folio from 9p. Note that netfslib now offers the possibility of write-through caching if that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in v9inode->netfs.flags in v9fs_set_netfs_context(). Note also this is untested as I can't get ganesha.nfsd to correctly parse the config to turn on 9p support. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org	2023-12-28 09:45:28 +00:00
David Howells	100ccd18bb	netfs: Optimise away reads above the point at which there can be no data Track the file position above which the server is not expected to have any data (the "zero point") and preemptively assume that we can satisfy requests by filling them with zeroes locally rather than attempting to download them if they're over that line - even if we've written data back to the server. Assume that any data that was written back above that position is held in the local cache. Note that we have to split requests that straddle the line. Make use of this to optimise away some reads from the server. We need to set the zero point in the following circumstances: (1) When we see an extant remote inode and have no cache for it, we set the zero_point to i_size. (2) On local inode creation, we set zero_point to 0. (3) On local truncation down, we reduce zero_point to the new i_size if the new i_size is lower. (4) On local truncation up, we don't change zero_point. (5) On local modification, we don't change zero_point. (6) On remote invalidation, we set zero_point to the new i_size. (7) If stored data is discarded from the pagecache or culled from fscache, we must set zero_point above that if the data also got written to the server. (8) If dirty data is written back to the server, but not fscache, we must set zero_point above that. (9) If a direct I/O write is made, set zero_point above that. Assuming the above, any read from the server at or above the zero_point position will return all zeroes. The zero_point value can be stored in the cache, provided the above rules are applied to it by any code that culls part of the local cache. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org	2023-12-28 09:45:27 +00:00
David Howells	c1ec4d7c2e	netfs: Provide invalidate_folio and release_folio calls Provide default invalidate_folio and release_folio calls. These will need to interact with invalidation correctly at some point. They will be needed if netfslib is to make use of folio->private for its own purposes. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org	2023-12-24 15:08:51 +00:00
David Howells	c9c4ff12df	netfs: Move pinning-for-writeback from fscache to netfs Move the resource pinning-for-writeback from fscache code to netfslib code. This is used to keep a cache backing object pinned whilst we have dirty pages on the netfs inode in the pagecache such that VM writeback will be able to reach it. Whilst we're at it, switch the parameters of netfs_unpin_writeback() to match ->write_inode() so that it can be used for that directly. Note that this mechanism could be more generically useful than that for network filesystems. Quite often they have to keep around other resources (e.g. authentication tokens or network connections) until the writeback is complete. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org	2023-12-24 15:08:49 +00:00
David Howells	4498a8eccc	netfs, fscache: Remove ->begin_cache_operation Remove ->begin_cache_operation() in favour of just calling fscache directly. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Christian Brauner <christian@brauner.io> cc: linux-fsdevel@vger.kernel.org cc: linux-cachefs@redhat.com	2023-12-24 15:08:48 +00:00

1 2 3 4 5 ...

807 Commits