linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 12:42:02 +00:00

Author	SHA1	Message	Date
David Howells	1eda8bab70	afs: Add support for the UAE error table Add support for mapping AFS UAE abort codes to Linux errno values. Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-28 18:37:53 +01:00
David S. Miller	d96ff269a0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 21:06:39 -07:00
Linus Torvalds	cd0f3aaebc	AFS fixes -----BEGIN PGP SIGNATURE----- iQIVAwUAXRMn5vu3V2unywtrAQICpA/+IIINk6MJVQDzGhOnvWrbGdPnOdJEUyLN B9U4bLZJRg/j+Sqodn+fXIfsEO4FQflkSJD+xoBi4pzBZcr0xkLUVOog/1S7dv4J bPVT9p2f3ITNiatmisOrUe1InuHa6Wb/cUnQaLLRhd7NqbawKGRQG4tv4CGwKn67 dJIOOm/iTCs1ACES4C5QOpU7/DWK38Pn3BbnN21bFzDgfbtbdDTaFFkhFtXy78oB Gcj5g+ULpkKBcuJThFuJUPZ9E4qICNZR4kJXEULSvykDDRzluhJmQ+v8btm6NJsq hMqTrT9M2y114V1OqXj3me7tA6wOEAfTQ0WzpzF2SmyFQKnSly/EkWc4HZXFD/8O BczCcABUbuKNE/pJSELx6k1M0+00QfeLcjHPc6joZFCni3lMdYWOncn/syyHw5P+ rc9JQsy3+dLcFsaVQ5eGmX6NDc70dCrAlS6MllIzSBcwAVCctTKwm0meaSW6B2y6 VymPy+cqi1RxMKyiQ0hAeU7Xe6yqFcl6rtonfCQqRLxkfzrCXkDp6/ELOXBzDft1 ey6+N3WsmWW7YSPuM/SIZKV66rshlflj0w+FRluZEEAF1NYeYqXUDvK/S8KC9kPG AXUDvhI+tBpxg1AVz94JN714VmkbY23xV0g44eQsdqSQm2YvsxiFCSWZZ6L/KEWe kWQc6BGDCB0= =YTdG -----END PGP SIGNATURE----- Merge tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "The in-kernel AFS client has been undergoing testing on opendev.org on one of their mirror machines. They are using AFS to hold data that is then served via apache, and Ian Wienand had reported seeing oopses, spontaneous machine reboots and updates to volumes going missing. This patch series appears to have fixed the problem, very probably due to patch (2), but it's not 100% certain. (1) Fix the printing of the "vnode modified" warning to exclude checks on files for which we don't have a callback promise from the server (and so don't expect the server to tell us when it changes). Without this, for every file or directory for which we still have an in-core inode that gets changed on the server, we may get a message logged when we next look at it. This can happen in bulk if, for instance, someone does "vos release" to update a R/O volume from a R/W volume and a whole set of files are all changed together. We only really want to log a message if the file changed and the server didn't tell us about it or we failed to track the state internally. (2) Fix accidental corruption of either afs_vlserver struct objects or the the following memory locations (which could hold anything). The issue is caused by a union that points to two different structs in struct afs_call (to save space in the struct). The call cleanup code assumes that it can simply call the cleanup for one of those structs if not NULL - when it might be actually pointing to the other struct. This means that every Volume Location RPC op is going to corrupt something. (3) Fix an uninitialised spinlock. This isn't too bad, it just causes a one-off warning if lockdep is enabled when "vos release" is called, but the spinlock still behaves correctly. (4) Fix the setting of i_block in the inode. This causes du, for example, to produce incorrect results, but otherwise should not be dangerous to the kernel" * tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix setting of i_blocks afs: Fix uninitialised spinlock afs_volume::cb_break_lock afs: Fix vlserver record corruption afs: Fix over zealous "vnode modified" warnings	2019-06-28 08:34:12 +08:00
David Howells	2e12256b9a	keys: Replace uid/gid/perm permissions checking with an ACL Replace the uid/gid/perm permissions checking on a key with an ACL to allow the SETATTR and SEARCH permissions to be split. This will also allow a greater range of subjects to represented. ============ WHY DO THIS? ============ The problem is that SETATTR and SEARCH cover a slew of actions, not all of which should be grouped together. For SETATTR, this includes actions that are about controlling access to a key: (1) Changing a key's ownership. (2) Changing a key's security information. (3) Setting a keyring's restriction. And actions that are about managing a key's lifetime: (4) Setting an expiry time. (5) Revoking a key. and (proposed) managing a key as part of a cache: (6) Invalidating a key. Managing a key's lifetime doesn't really have anything to do with controlling access to that key. Expiry time is awkward since it's more about the lifetime of the content and so, in some ways goes better with WRITE permission. It can, however, be set unconditionally by a process with an appropriate authorisation token for instantiating a key, and can also be set by the key type driver when a key is instantiated, so lumping it with the access-controlling actions is probably okay. As for SEARCH permission, that currently covers: (1) Finding keys in a keyring tree during a search. (2) Permitting keyrings to be joined. (3) Invalidation. But these don't really belong together either, since these actions really need to be controlled separately. Finally, there are number of special cases to do with granting the administrator special rights to invalidate or clear keys that I would like to handle with the ACL rather than key flags and special checks. =============== WHAT IS CHANGED =============== The SETATTR permission is split to create two new permissions: (1) SET_SECURITY - which allows the key's owner, group and ACL to be changed and a restriction to be placed on a keyring. (2) REVOKE - which allows a key to be revoked. The SEARCH permission is split to create: (1) SEARCH - which allows a keyring to be search and a key to be found. (2) JOIN - which allows a keyring to be joined as a session keyring. (3) INVAL - which allows a key to be invalidated. The WRITE permission is also split to create: (1) WRITE - which allows a key's content to be altered and links to be added, removed and replaced in a keyring. (2) CLEAR - which allows a keyring to be cleared completely. This is split out to make it possible to give just this to an administrator. (3) REVOKE - see above. Keys acquire ACLs which consist of a series of ACEs, and all that apply are unioned together. An ACE specifies a subject, such as: () Possessor - permitted to anyone who 'possesses' a key () Owner - permitted to the key owner () Group - permitted to the key group () Everyone - permitted to everyone Note that 'Other' has been replaced with 'Everyone' on the assumption that you wouldn't grant a permit to 'Other' that you wouldn't also grant to everyone else. Further subjects may be made available by later patches. The ACE also specifies a permissions mask. The set of permissions is now: VIEW Can view the key metadata READ Can read the key content WRITE Can update/modify the key content SEARCH Can find the key by searching/requesting LINK Can make a link to the key SET_SECURITY Can change owner, ACL, expiry INVAL Can invalidate REVOKE Can revoke JOIN Can join this keyring CLEAR Can clear this keyring The KEYCTL_SETPERM function is then deprecated. The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set, or if the caller has a valid instantiation auth token. The KEYCTL_INVALIDATE function then requires INVAL. The KEYCTL_REVOKE function then requires REVOKE. The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an existing keyring. The JOIN permission is enabled by default for session keyrings and manually created keyrings only. ====================== BACKWARD COMPATIBILITY ====================== To maintain backward compatibility, KEYCTL_SETPERM will translate the permissions mask it is given into a new ACL for a key - unless KEYCTL_SET_ACL has been called on that key, in which case an error will be returned. It will convert possessor, owner, group and other permissions into separate ACEs, if each portion of the mask is non-zero. SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY. WRITE permission turns on WRITE, REVOKE and, if a keyring, CLEAR. JOIN is turned on if a keyring is being altered. The KEYCTL_DESCRIBE function translates the ACL back into a permissions mask to return depending on possessor, owner, group and everyone ACEs. It will make the following mappings: (1) INVAL, JOIN -> SEARCH (2) SET_SECURITY -> SETATTR (3) REVOKE -> WRITE if SETATTR isn't already set (4) CLEAR -> WRITE Note that the value subsequently returned by KEYCTL_DESCRIBE may not match the value set with KEYCTL_SETATTR. ======= TESTING ======= This passes the keyutils testsuite for all but a couple of tests: (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed if the type doesn't have ->read(). You still can't actually read the key. (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't work as Other has been replaced with Everyone in the ACL. Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-27 23:03:07 +01:00
David Howells	a58946c158	keys: Pass the network namespace into request_key mechanism Create a request_key_net() function and use it to pass the network namespace domain tag into DNS revolver keys and rxrpc/AFS keys so that keys for different domains can coexist in the same keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org	2019-06-27 23:02:12 +01:00
Zhengyuan Liu	ee102584ef	fs/afs: use struct_size() in kzalloc() As Gustavo said in other patches doing the same replace, we can now use the new struct_size() helper to avoid leaving these open-coded and prone to type mistake. Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 18:12:17 +01:00
David Howells	4521819369	afs: Trace afs_server usage Add a tracepoint (afs_server) to track the afs_server object usage count. Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 18:12:17 +01:00
David Howells	051d25250b	afs: Add some callback management tracepoints Add a couple of tracepoints to track callback management: (1) afs_cb_miss - Logs when we were unable to apply a callback, either due to the inode being discarded or due to a competing thread applying a callback first. (2) afs_cb_break - Logs when we attempted to clear the noted callback promise, either due to the server explicitly breaking the callback, the callback promise lapsing or a local event obsoleting it. Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 18:12:16 +01:00
David Howells	fa59f52f5b	afs: afs_unlink() doesn't need to check dentry->d_inode Don't check that dentry->d_inode is valid in afs_unlink(). We should be able to take that as given. This caused Smatch to issue the following warning: fs/afs/dir.c:1392 afs_unlink() error: we previously assumed 'vnode' could be null (see line 1375) Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 18:12:16 +01:00
David Howells	2cd42d19cf	afs: Fix setting of i_blocks The setting of i_blocks, which is calculated from i_size, has got accidentally misordered relative to the setting of i_size when initially setting up an inode. Further, i_blocks isn't updated by afs_apply_status() when the size is updated. To fix this, break the i_size/i_blocks setting out into a helper function and call it from both places. Fixes: `a58823ac45` ("afs: Fix application of status and callback to be under same lock") Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 18:12:02 +01:00
David Howells	90fa9b6452	afs: Fix uninitialised spinlock afs_volume::cb_break_lock Fix the cb_break_lock spinlock in afs_volume struct by initialising it when the volume record is allocated. Also rename the lock to cb_v_break_lock to distinguish it from the lock of the same name in the afs_server struct. Without this, the following trace may be observed when a volume-break callback is received: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 2 PID: 50 Comm: kworker/2:1 Not tainted 5.2.0-rc1-fscache+ #3045 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Workqueue: afs SRXAFSCB_CallBack Call Trace: dump_stack+0x67/0x8e register_lock_class+0x23b/0x421 ? check_usage_forwards+0x13c/0x13c __lock_acquire+0x89/0xf73 lock_acquire+0x13b/0x166 ? afs_break_callbacks+0x1b2/0x3dd _raw_write_lock+0x2c/0x36 ? afs_break_callbacks+0x1b2/0x3dd afs_break_callbacks+0x1b2/0x3dd ? trace_event_raw_event_afs_server+0x61/0xac SRXAFSCB_CallBack+0x11f/0x16c process_one_work+0x2c5/0x4ee ? worker_thread+0x234/0x2ac worker_thread+0x1d8/0x2ac ? cancel_delayed_work_sync+0xf/0xf kthread+0x11f/0x127 ? kthread_park+0x76/0x76 ret_from_fork+0x24/0x30 Fixes: `68251f0a68` ("afs: Fix whole-volume callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 16:49:35 +01:00
David Howells	a6853b9ce8	afs: Fix vlserver record corruption Because I made the afs_call struct share pointers to an afs_server object and an afs_vlserver object to save space, afs_put_call() calls afs_put_server() on afs_vlserver object (which is only meant for the afs_server object) because it sees that call->server isn't NULL. This means that the afs_vlserver object gets unpredictably and randomly modified, depending on what config options are set (such as lockdep). Fix this by getting rid of the union and having two non-overlapping pointers in the afs_call struct. Fixes: `ffba718e93` ("afs: Get rid of afs_call::reply[]") Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 16:49:35 +01:00
David Howells	3647e42b55	afs: Fix over zealous "vnode modified" warnings Occasionally, warnings like this: vnode modified 2af7 on {10000b:1} [exp 2af2] YFS.FetchStatus(vnode) are emitted into the kernel log. This indicates that when we were applying the updated vnode (file) status retrieved from the server to an inode we saw that the data version number wasn't what we were expecting (in this case it's 0x2af7 rather than 0x2af2). We've usually received a callback from the server prior to this point - or the callback promise has lapsed - so the warning is merely informative and the state is to be expected. Fix this by only emitting the warning if the we still think that we have a valid callback promise and haven't received a callback. Also change the format slightly so so that the new data version doesn't look like part of the text, the like is prefixed with "kAFS: " and the message is ranked as a warning. Fixes: `31143d5d51` ("AFS: implement basic file write support") Reported-by: Ian Wienand <iwienand@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-06-20 16:49:34 +01:00
Amir Goldstein	49246466a9	fsnotify: move fsnotify_nameremove() hook out of d_delete() d_delete() was piggy backed for the fsnotify_nameremove() hook when in fact not all callers of d_delete() care about fsnotify events. For all callers of d_delete() that may be interested in fsnotify events, we made sure to call one of fsnotify_{unlink,rmdir}() hooks before calling d_delete(). Now we can move the fsnotify_nameremove() call from d_delete() to the fsnotify_{unlink,rmdir}() hooks. Two explicit calls to fsnotify_nameremove() from nfs/afs sillyrename are also removed. This will cause a change of behavior - nfs/afs will NOT generate an fsnotify delete event when renaming over a positive dentry. This change is desirable, because it is consistent with the behavior of all other filesystems. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2019-06-20 14:47:44 +02:00
David S. Miller	a6cdeeb16b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Some ISDN files that got removed in net-next had some changes done in mainline, take the removals. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-07 11:00:14 -07:00
Florian Westphal	35ebfc22fe	afs: do not send list of client addresses David Howells says: I'm told that there's not really any point populating the list. Current OpenAFS ignores it, as does AuriStor - and IBM AFS 3.6 will do the right thing. The list is actually useless as it's the client's view of the world, not the servers, so if there's any NAT in the way its contents are invalid. Further, it doesn't support IPv6 addresses. On that basis, feel free to make it an empty list and remove all the interface enumeration. V1 of this patch reworked the function to use a new helper for the ifa_list iteration to avoid sparse warnings once the proper __rcu annotations get added in struct in_device later. But, in light of the above, just remove afs_get_ipv4_interfaces. Compile tested only. Cc: David Howells <dhowells@redhat.com> Cc: linux-afs@lists.infradead.org Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-02 18:06:26 -07:00
Thomas Gleixner	2874c5fd28	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:32 -07:00
Thomas Gleixner	b4d0d230cc	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public licence as published by the free software foundation either version 2 of the licence or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 114 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-24 17:27:11 +02:00
Thomas Gleixner	ec8f24b7fa	treewide: Add SPDX license identifier - Makefile/Kconfig Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:46 +02:00
David Howells	39db9815da	afs: Fix application of the results of a inline bulk status fetch Fix afs_do_lookup() such that when it does an inline bulk status fetch op, it will update inodes that are already extant (something that afs_iget() doesn't do) and to cache permits for each inode created (thereby avoiding a follow up FS.FetchStatus call to determine this). Extant inodes need looking up in advance so that their cb_break counters before and after the operation can be compared. To this end, the inode pointers are cached so that they don't need looking up again after the op. Fixes: `5cf9dd55a0` ("afs: Prospectively look up extra files when doing a single lookup") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	b835915325	afs: Pass pre-fetch server and volume break counts into afs_iget5_set() Pass the server and volume break counts from before the status fetch operation that queried the attributes of a file into afs_iget5_set() so that the new vnode's break counters can be initialised appropriately. This allows detection of a volume or server break that happened whilst we were fetching the status or setting up the vnode. Fixes: `c435ee3455` ("afs: Overhaul the callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	a38a75581e	afs: Fix unlink to handle YFS.RemoveFile2 better Make use of the status update for the target file that the YFS.RemoveFile2 RPC op returns to correctly update the vnode as to whether the file was actually deleted or just had nlink reduced. Fixes: `30062bd13e` ("afs: Implement YFS support in the fs client") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	61c347ba55	afs: Clear AFS_VNODE_CB_PROMISED if we detect callback expiry Fix afs_validate() to clear AFS_VNODE_CB_PROMISED on a vnode if we detect any condition that causes the callback promise to be broken implicitly, including server break (cb_s_break), volume break (cb_v_break) or callback expiry. Fixes: `ae3b7361dc` ("afs: Fix validation/callback interaction") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	f642404a04	afs: Make vnode->cb_interest RCU safe Use RCU-based freeing for afs_cb_interest struct objects and use RCU on vnode->cb_interest. Use that change to allow afs_check_validity() to use read_seqbegin_or_lock() instead of read_seqlock_excl(). This also requires the caller of afs_check_validity() to hold the RCU read lock across the call. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	c925bd0ac4	afs: Split afs_validate() so first part can be used under LOOKUP_RCU Split afs_validate() so that the part that decides if the vnode is still valid can be used under LOOKUP_RCU conditions from afs_d_revalidate(). Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	7c71245866	afs: Don't save callback version and type fields Don't save callback version and type fields as the version is about the format of the callback information and the type is relative to the particular RPC call. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 22:23:21 +01:00
David Howells	a58823ac45	afs: Fix application of status and callback to be under same lock When applying the status and callback in the response of an operation, apply them in the same critical section so that there's no race between checking the callback state and checking status-dependent state (such as the data version). Fix this by: (1) Allocating a joint {status,callback} record (afs_status_cb) before calling the RPC function for each vnode for which the RPC reply contains a status or a status plus a callback. A flag is set in the record to indicate if a callback was actually received. (2) These records are passed into the RPC functions to be filled in. The afs_decode_status() and yfs_decode_status() functions are removed and the cb_lock is no longer taken. (3) xdr_decode_AFSFetchStatus() and xdr_decode_YFSFetchStatus() no longer update the vnode. (4) xdr_decode_AFSCallBack() and xdr_decode_YFSCallBack() no longer update the vnode. (5) vnodes, expected data-version numbers and callback break counters (cb_break) no longer need to be passed to the reply delivery functions. Note that, for the moment, the file locking functions still need access to both the call and the vnode at the same time. (6) afs_vnode_commit_status() is now given the cb_break value and the expected data_version and the task of applying the status and the callback to the vnode are now done here. This is done under a single taking of vnode->cb_lock. (7) afs_pages_written_back() is now called by afs_store_data() rather than by the reply delivery function. afs_pages_written_back() has been moved to before the call point and is now given the first and last page numbers rather than a pointer to the call. (8) The indicator from YFS.RemoveFile2 as to whether the target file actually got removed (status.abort_code == VNOVNODE) rather than merely dropping a link is now checked in afs_unlink rather than in xdr_decode_YFSFetchStatus(). Supplementary fixes: () afs_cache_permit() now gets the caller_access mask from the afs_status_cb object rather than picking it out of the vnode's status record. afs_fetch_status() returns caller_access through its argument list for this purpose also. () afs_inode_init_from_status() now uses a write lock on cb_lock rather than a read lock and now sets the callback inside the same critical section. Fixes: `c435ee3455` ("afs: Overhaul the callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	4571577f16	afs: Always get the reply time Always ask for the reply time from AF_RXRPC as it's used to calculate the callback expiry time and lock expiry times, so it's needed by most FS operations. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	87182759cd	afs: Fix order-1 allocation in afs_do_lookup() afs_do_lookup() will do an order-1 allocation to allocate status records if there are more than 39 vnodes to stat. Fix this by allocating an array of {status,callback} records for each vnode we want to examine using vmalloc() if larger than a page. This not only gets rid of the order-1 allocation, but makes it easier to grow beyond 50 records for YFS servers. It also allows us to move to {status,callback} tuples for other calls too and makes it easier to lock across the application of the status and the callback to the vnode. Fixes: `5cf9dd55a0` ("afs: Prospectively look up extra files when doing a single lookup") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	ffba718e93	afs: Get rid of afs_call::reply[] Replace the afs_call::reply[] array with a bunch of typed members so that the compiler can use type-checking on them. It's also easier for the eye to see what's going on. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	fefb2483dc	afs: Don't pass the vnode pointer through into the inline bulk status op Don't pass the vnode pointer through into the inline bulk status op. We want to process the status records outside of it anyway. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	fd711586bb	afs: Fix double inc of vnode->cb_break When __afs_break_callback() clears the CB_PROMISED flag, it increments vnode->cb_break to trigger a future refetch of the status and callback - however it also calls afs_clear_permits(), which also increments vnode->cb_break. Fix this by removing the increment from afs_clear_permits(). Whilst we're at it, fix the conditional call to afs_put_permits() as the function checks to see if the argument is NULL, so the check is redundant. Fixes: `be080a6f43` ("afs: Overhaul permit caching"); Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	c7226e407b	afs: Fix lock-wait/callback-break double locking __afs_break_callback() holds vnode->lock around its call of afs_lock_may_be_available() - which also takes that lock. Fix this by not taking the lock in __afs_break_callback(). Also, there's no point checking the granted_locks and pending_locks queues; it's sufficient to check lock_state, so move that check out of afs_lock_may_be_available() into __afs_break_callback() to replace the queue checks. Fixes: `e8d6c55412` ("AFS: implement file locking") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	d9052dda8a	afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set Don't invalidate the callback promise on a directory if the AFS_VNODE_DIR_VALID flag is not set (which indicates that the directory contents are invalid, due to edit failure, callback break, page reclaim). The directory will be reloaded next time the directory is accessed, so clearing the callback flag at this point may race with a reload of the directory and cancel it's recorded callback promise. Fixes: `f3ddee8dc4` ("afs: Fix directory handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	781070551c	afs: Fix calculation of callback expiry time Fix the calculation of the expiry time of a callback promise, as obtained from operations like FS.FetchStatus and FS.FetchData. The time should be based on the timestamp of the first DATA packet in the reply and the calculation needs to turn the ktime_t timestamp into a time64_t. Fixes: `c435ee3455` ("afs: Overhaul the callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	3b05e528cb	afs: Make dynamic root population wait uninterruptibly for proc_cells_lock Make dynamic root population wait uninterruptibly for proc_cells_lock. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	20b8391fff	afs: Make some RPC operations non-interruptible Make certain RPC operations non-interruptible, including: () Set attributes () Store data We don't want to get interrupted during a flush on close, flush on unlock, writeback or an inode update, leaving us in a state where we still need to do the writeback or update. () Extend lock () Release lock We don't want to get lock extension interrupted as the file locks on the server are time-limited. Interruption during lock release is less of an issue since the lock is time-limited, but it's better to complete the release to avoid a several-minute wait to recover it. Setting the lock isn't a problem if it's interrupted since we can just return to the user and tell them they were interrupted - at which point they can elect to retry. (*) Silly unlink We want to remove silly unlink files if we can, rather than leaving them for the salvager to clear up. Note that whilst these calls are no longer interruptible, they do have timeouts on them, so if the server stops responding the call will fail with something like ETIME or ECONNRESET. Without this, the following: kAFS: Unexpected error from FS.StoreData -512 appears in dmesg when a pending store data gets interrupted and some processes may just hang. Additionally, make the code that checks/updates the server record ignore failure due to interruption if the main call is uninterruptible and if the server has an address list. The next op will check it again since the expiration time on the old list has past. Fixes: `d2ddc776a4` ("afs: Overhaul volume and server record caching and fileserver rotation") Reported-by: Jonathan Billings <jsbillings@jsbillings.org> Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	b960a34b73	rxrpc: Allow the kernel to mark a call as being non-interruptible Allow kernel services using AF_RXRPC to indicate that a call should be non-interruptible. This allows kafs to make things like lock-extension and writeback data storage calls non-interruptible. If this is set, signals will be ignored for operations on that call where possible - such as waiting to get a call channel on an rxrpc connection. It doesn't prevent UDP sendmsg from being interrupted, but that will be handled by packet retransmission. rxrpc_kernel_recv_data() isn't affected by this since that never waits, preferring instead to return -EAGAIN and leave the waiting to the caller. Userspace initiated calls can't be set to be uninterruptible at this time. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	0ab4c95948	afs: Fix error propagation from server record check/update afs_check/update_server_record() should be setting fc->error rather than fc->ac.error as they're called from within the cursor iteration function. afs_fs_cursor::error is where the error code of the attempt to call the operation on multiple servers is integrated and is the final result, whereas afs_addr_cursor::error is used to hold the error from individual iterations of the call loop. (Note there's also an afs_vl_cursor which also wraps afs_addr_cursor for accessing VL servers rather than file servers). Fix this by setting fc->error in the afs_check/update_server_record() so that any error incurred whilst talking to the VL server correctly propagates to the final result. This results in: kAFS: Unexpected error from FS.StoreData -512 being seen, even though the store-data op is non-interruptible. The error is actually coming from the server record update getting interrupted. Fixes: `d2ddc776a4` ("afs: Overhaul volume and server record caching and fileserver rotation") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	94f699c9cd	afs: Fix the maximum lifespan of VL and probe calls If an older AFS server doesn't support an operation, it may accept the call and then sit on it forever, happily responding to pings that make kafs think that the call is still alive. Fix this by setting the maximum lifespan of Volume Location service calls in particular and probe calls in general so that they don't run on endlessly if they're not supported. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	51eba99970	afs: Fix "kAFS: AFS vnode with undefined type 0" Under some circumstances afs_select_fileserver() can return without setting an error in fc->error. The problem is in the no_more_servers segment where the accumulated errors from attempts to contact various servers are integrated into an afs_error-type variable 'e'. The resultant error code is, however, then abandoned. Fix this by getting the error out of e.error and putting it in 'error' so that the next part will store it into fc->error. Not doing this causes a report like the following: kAFS: AFS vnode with undefined type 0 kAFS: A=0 m=0 s=0 v=0 kAFS: vnode 20000025:1:1 because the code following the server selection loop then sees what it thinks is a successful invocation because fc.error is 0. However, it can't apply the status record because it's all zeros. The report is followed on the first instance with a trace looking something like: dump_stack+0x67/0x8e afs_inode_init_from_status.isra.2+0x21b/0x487 afs_fetch_status+0x119/0x1df afs_iget+0x130/0x295 afs_get_tree+0x31d/0x595 vfs_get_tree+0x1f/0xe8 fc_mount+0xe/0x36 afs_d_automount+0x328/0x3c3 follow_managed+0x109/0x20a lookup_fast+0x3bf/0x3f8 do_last+0xc3/0x6a4 path_openat+0x1af/0x236 do_filp_open+0x51/0xae ? _raw_spin_unlock+0x24/0x2d ? __alloc_fd+0x1a5/0x1b7 do_sys_open+0x13b/0x1e8 do_syscall_64+0x7d/0x1b3 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `4584ae96ae` ("afs: Fix missing net error handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 15:48:20 +01:00
David Howells	d5c32c89b2	afs: Fix cell DNS lookup Currently, once configured, AFS cells are looked up in the DNS at regular intervals - which is a waste of resources if those cells aren't being used. It also leads to a problem where cells preloaded, but not configured, before the network is brought up end up effectively statically configured with no VL servers and are unable to get any. Fix this by not doing the DNS lookup until the first time a cell is touched. It is waited for if we don't have any cached records yet, otherwise the DNS lookup to maintain the record is done in the background. This has the downside that the first time you touch a cell, you now have to wait for the upcall to do the required DNS lookups rather than them already being cached. Further, the record is not replaced if the old record has at least one server in it and the new record doesn't have any. Fixes: `0a5143f2f8` ("afs: Implement VL server rotation") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 12:58:23 +01:00
David Howells	d0660f0b3b	dns_resolver: Allow used keys to be invalidated Allow used DNS resolver keys to be invalidated after use if the caller is doing its own caching of the results. This reduces the amount of resources required. Fix AFS to invalidate DNS results to kill off permanent failure records that get lodged in the resolver keyring and prevent future lookups from happening. Fixes: `0a5143f2f8` ("afs: Implement VL server rotation") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-15 17:35:54 +01:00
David Howells	ca1cbbdce9	afs: Fix afs_cell records to always have a VL server list record Fix it such that afs_cell records always have a VL server list record attached, even if it's a dummy one, so that various checks can be removed. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-15 17:35:53 +01:00
David Howells	6b8812fc8e	afs: Fix missing lock when replacing VL server list When afs_update_cell() replaces the cell->vl_servers list, it uses RCU protocol so that proc is protected, but doesn't take ->vl_servers_lock to protect afs_start_vl_iteration() (which does actually take a shared lock). Fix this by making afs_update_cell() take an exclusive lock when replacing ->vl_servers. Fixes: `0a5143f2f8` ("afs: Implement VL server rotation") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-15 17:35:53 +01:00
David Howells	773e0c4025	afs: Fix afs_xattr_get_yfs() to not try freeing an error value afs_xattr_get_yfs() tries to free yacl, which may hold an error value (say if yfs_fs_fetch_opaque_acl() failed and returned an error). Fix this by allocating yacl up front (since it's a fixed-length struct, unlike afs_acl) and passing it in to the RPC function. This also allows the flags to be placed in the object rather than passing them through to the RPC function. Fixes: `ae46578b96` ("afs: Get YFS ACLs and information through xattrs") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-15 17:35:53 +01:00
David Howells	cc1dd5c85c	afs: Fix incorrect error handling in afs_xattr_get_acl() Fix incorrect error handling in afs_xattr_get_acl() where there appears to be a redundant assignment before return, but in fact the return should be a goto to the error handling at the end of the function. Fixes: `260f082bae` ("afs: Get an AFS3 ACL as an xattr") Addresses-Coverity: ("Unused Value") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Joe Perches <joe@perches.com>	2019-05-15 17:35:53 +01:00
David Howells	a1b879eefc	afs: Fix key leak in afs_release() and afs_evict_inode() Fix afs_release() to go through the cleanup part of the function if FMODE_WRITE is set rather than exiting through vfs_fsync() (which skips the cleanup). The cleanup involves discarding the refs on the key used for file ops and the writeback key record. Also fix afs_evict_inode() to clean up any left over wb keys attached to the inode/vnode when it is removed. Fixes: `5a81327616` ("afs: Do better accretion of small writes on newly created content") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-15 12:32:34 +01:00
Linus Torvalds	e5fef2a973	AFS Development -----BEGIN PGP SIGNATURE----- iQIVAwUAXNGpJPu3V2unywtrAQIVVQ//ZaEhskofcXYCsyO9pXshVKCZmp1pZ9Q3 ecbTbrF18guwHfM25LURidtjBmEAeuG5NOac/XHxcUbn5NUVzBQ1FircTVmLgtGY yjrBmMSqIDYhghslLAv78/HibdHJ+Flqy3RWAMyDMecTvx7VGx4idZQl5QIDbNEb GNvGP3WRiHG8tm6dykfm3afQoAS+n5seBBPDFucqPzAYa/Z/mBLgaZRKbmuMwEAe Q2mAf7vhYgw55JzeTSZZ4sWGP9Z+9Mi/18Hu8QvJwsrJW+jHlzJHtJp0EphSa3Xs YIRx+6AQ7WqAhnBUzzY5nBzMClIfMv1GrCG/6rXTTI/UYX65kVAP5M8EW6BAI8oX Fz2hJqCIvF8ZCSxIYLqizlEkxmEvfmwYxueX9km/+dfTma+MIaajMge+n3fDYmls S4RONn2LuqVeIw3m8DtKUBr7VRP0J9s1z0O4kubCtZt5PKNekvzSQSMIc17sXSST Uuo7aL3W6Lxk4bLMmB8o/Rf2RHBZlhmpPk8rF+I6jd0Q45SDV/TttqygyvKZseDo MZbnmBiDElDWXyKE6gxQqdC13tpb3MlCPv1L+xKDPArXe9yjq2XvHY4NtYBMCa5U iO1v+6W1JrGh8bkE72YuxKcBVVOStQxhHGU4D8WKZjOI7oeU87U7AD/8kSRhKQni VRXY1z87sZk= =yiyv -----END PGP SIGNATURE----- Merge tag 'afs-next-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS updates from David Howells: "A set of fix and development patches for AFS for 5.2. Summary: - Fix the AFS file locking so that sqlite can run on an AFS mount and also so that firefox and gnome can use a homedir that's mounted through AFS. This required emulation of fine-grained locking when the server will only support whole-file locks and no upgrade/downgrade. Four modes are provided, settable by mount parameter: "flock=local" - No reference to the server "flock=openafs" - Fine-grained locks are local-only, whole-file locks require sufficient server locks "flock=strict" - All locks require sufficient server locks "flock=write" - Always get an exclusive server lock If the volume is a read-only or backup volume, then flock=local for that volume. - Log extra information for a couple of cases where the client mucks up somehow: AFS vnode with undefined type and dir check failure - in both cases we seem to end up with unfilled data, but the issues happen infrequently and are difficult to reproduce at will. - Implement silly rename for unlink() and rename(). - Set i_blocks so that du can get some information about usage. - Fix xattr handlers to return the right amount of data and to not overflow buffers. - Implement getting/setting raw AFS and YFS ACLs as xattrs" * tag 'afs-next-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Implement YFS ACL setting afs: Get YFS ACLs and information through xattrs afs: implement acl setting afs: Get an AFS3 ACL as an xattr afs: Fix getting the afs.fid xattr afs: Fix the afs.cell and afs.volume xattr handlers afs: Calculate i_blocks based on file size afs: Log more information for "kAFS: AFS vnode with undefined type\n" afs: Provide mount-time configurable byte-range file locking emulation afs: Add more tracepoints afs: Implement sillyrename for unlink and rename afs: Add directory reload tracepoint afs: Handle lock rpc ops failing on a file that got deleted afs: Improve dir check failure reports afs: Add file locking tracepoints afs: Further fix file locking afs: Fix AFS file locking to allow fine grained locks afs: Calculate lock extend timer from set/extend reply reception afs: Split wait from afs_make_call()	2019-05-07 20:51:58 -07:00
Linus Torvalds	b4b52b881c	Wimplicit-fallthrough patches for 5.2-rc1 Hi Linus, This is my very first pull-request. I've been working full-time as a kernel developer for more than two years now. During this time I've been fixing bugs reported by Coverity all over the tree and, as part of my work, I'm also contributing to the KSPP. My work in the kernel community has been supervised by Greg KH and Kees Cook. OK. So, after the quick introduction above, please, pull the following patches that mark switch cases where we are expecting to fall through. These patches are part of the ongoing efforts to enable -Wimplicit-fallthrough. They have been ignored for a long time (most of them more than 3 months, even after pinging multiple times), which is the reason why I've created this tree. Most of them have been baking in linux-next for a whole development cycle. And with Stephen Rothwell's help, we've had linux-next nag-emails going out for newly introduced code that triggers -Wimplicit-fallthrough to avoid gaining more of these cases while we work to remove the ones that are already present. I'm happy to let you know that we are getting close to completing this work. Currently, there are only 32 of 2311 of these cases left to be addressed in linux-next. I'm auditing every case; I take a look into the code and analyze it in order to determine if I'm dealing with an actual bug or a false positive, as explained here: https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/ While working on this, I've found and fixed the following missing break/return bugs, some of them introduced more than 5 years ago: `84242b82d8` `7850b51b6c` `5e420fe635` `09186e5034` `b5be853181` `7264235ee7` `cc5034a5d2` `479826cc86` `5340f23df8` `df997abeeb` `2f10d82373` `307b00c5e6` `5d25ff7a54` `a7ed5b3e7d` `c24bfa8f21` `ad0eaee619` `9ba8376ce1` `dc586a60a1` `a8e9b186f1` `4e57562b48` `60747828ea` `c5b974bee9` `cc44ba9116` `2c930e3d0a` Once this work is finish, we'll be able to universally enable "-Wimplicit-fallthrough" to avoid any of these kinds of bugs from entering the kernel again. Thanks Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAlzQR2IACgkQRwW0y0cG 2zEJbQ//X930OcBtT/9DRW4XL1Jeq0Mjssz/GLX2Vpup5CwwcTROG65no80Zezf/ yQRWnUjGX0OBv/fmUK32/nTxI/7k7NkmIXJHe0HiEF069GEENB7FT6tfDzIPjU8M qQkB8NsSUWJs3IH6BVynb/9MGE1VpGBDbYk7CBZRtRJT1RMM+3kQPucgiZMgUBPo Yd9zKwn4i/8tcOCli++EUdQ29ukMoY2R3qpK4LftdX9sXLKZBWNwQbiCwSkjnvJK I6FDiA7RaWH2wWGlL7BpN5RrvAXp3z8QN/JZnivIGt4ijtAyxFUL/9KOEgQpBQN2 6TBRhfTQFM73NCyzLgGLNzvd8awem1rKGSBBUvevaPbgesgM+Of65wmmTQRhFNCt A7+e286X1GiK3aNcjUKrByKWm7x590EWmDzmpmICxNPdt5DHQ6EkmvBdNjnxCMrO aGA24l78tBN09qN45LR7wtHYuuyR0Jt9bCmeQZmz7+x3ICDHi/+Gw7XPN/eM9+T6 lZbbINiYUyZVxOqwzkYDCsdv9+kUvu3e4rPs20NERWRpV8FEvBIyMjXAg6NAMTue K+ikkyMBxCvyw+NMimHJwtD7ho4FkLPcoeXb2ZGJTRHixiZAEtF1RaQ7dA05Q/SL gbSc0DgLZeHlLBT+BSWC2Z8SDnoIhQFXW49OmuACwCUC68NHKps= =k30z -----END PGP SIGNATURE----- Merge tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva: "Mark switch cases where we are expecting to fall through. This is part of the ongoing efforts to enable -Wimplicit-fallthrough. Most of them have been baking in linux-next for a whole development cycle. And with Stephen Rothwell's help, we've had linux-next nag-emails going out for newly introduced code that triggers -Wimplicit-fallthrough to avoid gaining more of these cases while we work to remove the ones that are already present. We are getting close to completing this work. Currently, there are only 32 of 2311 of these cases left to be addressed in linux-next. I'm auditing every case; I take a look into the code and analyze it in order to determine if I'm dealing with an actual bug or a false positive, as explained here: https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/ While working on this, I've found and fixed the several missing break/return bugs, some of them introduced more than 5 years ago. Once this work is finished, we'll be able to universally enable "-Wimplicit-fallthrough" to avoid any of these kinds of bugs from entering the kernel again" * tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits) memstick: mark expected switch fall-throughs drm/nouveau/nvkm: mark expected switch fall-throughs NFC: st21nfca: Fix fall-through warnings NFC: pn533: mark expected switch fall-throughs block: Mark expected switch fall-throughs ASN.1: mark expected switch fall-through lib/cmdline.c: mark expected switch fall-throughs lib: zstd: Mark expected switch fall-throughs scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs scsi: ppa: mark expected switch fall-through scsi: osst: mark expected switch fall-throughs scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs scsi: lpfc: lpfc_nvme: Mark expected switch fall-through scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs scsi: lpfc: lpfc_els: Mark expected switch fall-throughs scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs scsi: imm: mark expected switch fall-throughs scsi: csiostor: csio_wr: mark expected switch fall-through ...	2019-05-07 12:48:10 -07:00
Linus Torvalds	168e153d5e	Merge branch 'work.icache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs inode freeing updates from Al Viro: "Introduction of separate method for RCU-delayed part of ->destroy_inode() (if any). Pretty much as posted, except that destroy_inode() stashes ->free_inode into the victim (anon-unioned with ->i_fops) before scheduling i_callback() and the last two patches (sockfs conversion and folding struct socket_wq into struct socket) are excluded - that pair should go through netdev once davem reopens his tree" * 'work.icache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (58 commits) orangefs: make use of ->free_inode() shmem: make use of ->free_inode() hugetlb: make use of ->free_inode() overlayfs: make use of ->free_inode() jfs: switch to ->free_inode() fuse: switch to ->free_inode() ext4: make use of ->free_inode() ecryptfs: make use of ->free_inode() ceph: use ->free_inode() btrfs: use ->free_inode() afs: switch to use of ->free_inode() dax: make use of ->free_inode() ntfs: switch to ->free_inode() securityfs: switch to ->free_inode() apparmor: switch to ->free_inode() rpcpipe: switch to ->free_inode() bpf: switch to ->free_inode() mqueue: switch to ->free_inode() ufs: switch to ->free_inode() coda: switch to ->free_inode() ...	2019-05-07 10:57:05 -07:00
David Howells	f5e4546347	afs: Implement YFS ACL setting Implement the setting of YFS ACLs in AFS through the interface of setting the afs.yfs.acl extended attribute on the file. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
David Howells	ae46578b96	afs: Get YFS ACLs and information through xattrs The YFS/AuriStor variant of AFS provides more capable ACLs and provides per-volume ACLs and per-file ACLs as well as per-directory ACLs. It also provides some extra information that can be retrieved through four ACLs: (1) afs.yfs.acl The YFS file ACL (not the same format as afs.acl). (2) afs.yfs.vol_acl The YFS volume ACL. (3) afs.yfs.acl_inherited "1" if a file's ACL is inherited from its parent directory, "0" otherwise. (4) afs.yfs.acl_num_cleaned The number of of ACEs removed from the ACL by the server because the PT entries were removed from the PTS database (ie. the subject is no longer known). Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
Joe Gorse	b10494af49	afs: implement acl setting Implements the setting of ACLs in AFS by means of setting the afs.acl extended attribute on the file. Signed-off-by: Joe Gorse <jhgorse@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
David Howells	260f082bae	afs: Get an AFS3 ACL as an xattr Implement an xattr on AFS files called "afs.acl" that retrieves a file's ACL. It returns the raw AFS3 ACL from the result of calling FS.FetchACL, leaving any interpretation to userspace. Note that whilst YFS servers will respond to FS.FetchACL, this will render a more-advanced YFS ACL down. Use "afs.yfs.acl" instead for that. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
David Howells	a2f611a3dc	afs: Fix getting the afs.fid xattr The AFS3 FID is three 32-bit unsigned numbers and is represented as three up-to-8-hex-digit numbers separated by colons to the afs.fid xattr. However, with the advent of support for YFS, the FID is now a 64-bit volume number, a 96-bit vnode/inode number and a 32-bit uniquifier (as before). Whilst the sprintf in afs_xattr_get_fid() has been partially updated (it currently ignores the upper 32 bits of the 96-bit vnode number), the size of the stack-based buffer has not been increased to match, thereby allowing stack corruption to occur. Fix this by increasing the buffer size appropriately and conditionally including the upper part of the vnode number if it is non-zero. The latter requires the lower part to be zero-padded if the upper part is non-zero. Fixes: `3b6492df41` ("afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
David Howells	c73aa4102f	afs: Fix the afs.cell and afs.volume xattr handlers Fix the ->get handlers for the afs.cell and afs.volume xattrs to pass the source data size to memcpy() rather than target buffer size. Overcopying the source data occasionally causes the kernel to oops. Fixes: `d3e3b7eac8` ("afs: Add metadata xattrs") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
Marc Dionne	c0abbb5791	afs: Calculate i_blocks based on file size While it's not possible to give an accurate number for the blocks used on the server, populate i_blocks based on the file size so that 'du' can give a reasonable estimate. The value is rounded up to 1K granularity, for consistency with what other AFS clients report, and the servers' 1K usage quota unit. Note that the value calculated by 'du' at the root of a volume can still be slightly lower than the quota usage on the server, as 0-length files are charged 1 quota block, but are reported as occupying 0 blocks. Again, this is consistent with other AFS clients. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
David Howells	b134d687dd	afs: Log more information for "kAFS: AFS vnode with undefined type\n" Log more information when "kAFS: AFS vnode with undefined type\n" is displayed due to a vnode record being retrieved from the server that appears to have a duff file type (usually 0). This prints more information to try and help pin down the problem. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-07 16:48:44 +01:00
Al Viro	51b9fe48c4	afs: switch to use of ->free_inode() debugging printks left in ->destroy_inode() and so's the update of inode count; we could take the latter to RCU-delayed part (would take only moving the check on module exit past rcu_barrier() there), but debugging output ought to either stay where it is or go into ->evict_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-05-01 22:43:26 -04:00
David Howells	6c6c1d63c2	afs: Provide mount-time configurable byte-range file locking emulation Provide byte-range file locking emulation that can be configured at mount time to one of four modes: (1) flock=local. Locking is done locally only and no reference is made to the server. (2) flock=openafs. Byte-range locking is done locally only; whole-file locking is done with reference to the server. Whole-file locks cannot be upgraded unless the client holds an exclusive lock. (3) flock=strict. Byte-range and whole-file locking both require a sufficient whole-file lock on the server. (4) flock=write. As strict, but the client always gets an exclusive whole-file lock on the server. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:52 +01:00
David Howells	80548b0399	afs: Add more tracepoints Add four more tracepoints: (1) afs_make_fs_call1 - Split from afs_make_fs_call but takes a filename to log also. (2) afs_make_fs_call2 - Like the above but takes two filenames to log. (3) afs_lookup - Log the result of doing a successful lookup, including a negative result (fid 0:0). (4) afs_get_tree - Log the set up of a volume for mounting. It also extends the name buffer on the afs_edit_dir tracepoint to 24 chars and puts quotes around the filename in the text representation. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:51 +01:00
David Howells	79ddbfa500	afs: Implement sillyrename for unlink and rename Implement sillyrename for AFS unlink and rename, using the NFS variant implementation as a basis. Note that the asynchronous file locking extender/releaser has to be notified with a state change to stop it complaining if there's a race between that and the actual file deletion. A tracepoint, afs_silly_rename, is also added to note the silly rename and the cleanup. The afs_edit_dir tracepoint is given some extra reason indicators and the afs_flock_ev tracepoint is given a silly-delete file lock cancellation indicator. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:51 +01:00
David Howells	99987c5600	afs: Add directory reload tracepoint Add a tracepoint (afs_reload_dir) to indicate when a directory is being reloaded. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:51 +01:00
David Howells	cdfb26b40d	afs: Handle lock rpc ops failing on a file that got deleted Holding a file lock on an AFS file does not prevent it from being deleted on the server, so we need to handle an error resulting from that when we try setting, extending or releasing a lock. Fix this by adding a "deleted" lock state and cancelling the lock extension process for that file and aborting all waiters for the lock. Fixes: `0fafdc9f88` ("afs: Fix file locking") Reported-by: Jonathan Billings <jsbillin@umich.edu> Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:51 +01:00
David Howells	445b10289f	afs: Improve dir check failure reports Improve the content of directory check failure reports from: kAFS: afs_dir_check_page(6d57): bad magic 1/2 is 0000 to dump more information about the individual blocks in a directory page. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:51 +01:00
David Howells	d46966013b	afs: Add file locking tracepoints Add two tracepoints for monitoring AFS file locking. Firstly, add one that follows the operational part: echo 1 >/sys/kernel/debug/tracing/events/afs/afs_flock_op/enable And add a second that more follows the event-driven part: echo 1 >/sys/kernel/debug/tracing/events/afs/afs_flock_ev/enable Individual file_lock structs seen by afs are tagged with debugging IDs that are displayed in the trace log to make it easier to see what's going on, especially as setting the first lock always seems to involve copying the file_lock twice. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:50 +01:00
David Howells	4be5975aea	afs: Further fix file locking Further fix the file locking in the afs filesystem client in a number of ways, including: (1) Don't submit the operation to obtain a lock from the server in a work queue context, but rather do it in the process context of whoever issued the requesting system call. (2) The owner of the file_lock struct at the front of the pending_locks queue now owns right to talk to the server. (3) Write locks can be instantly granted if they don't overlap with any other locks and we have a write lock on the server. (4) In the event of an authentication/permission error, all other matching pending locks requests are also immediately aborted. (5) Properly use VFS core locks_lock_file_wait() to distribute the server lock amongst local client locks, including waiting for the lock to become available. Test with: sqlite3 /afs/.../scratch/billings.sqlite <<EOF CREATE TABLE hosts ( hostname varchar(80), shorthost varchar(80), room varchar(30), building varchar(30), PRIMARY KEY(shorthost) ); EOF With the version of sqlite3 that I have, this should fail consistently with EAGAIN, whether or not the program is straced (which introduces some delays between lock syscalls). Fixes: `0fafdc9f88` ("afs: Fix file locking") Reported-by: Jonathan Billings <jsbillin@umich.edu> Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:50 +01:00
David Howells	68ce801ffd	afs: Fix AFS file locking to allow fine grained locks Fix AFS file locking to allow fine grained locks as some applications, such as firefox, won't work if they can't take such locks on certain state files - thereby preventing the use of kAFS to distribute a home directory. Note that this cannot be made completely functional as the protocol only has provision for whole-file locks, so there exists the possibility of a process deadlocking itself by getting a partial read-lock on a file first and then trying to get a non-overlapping write-lock - but we got the server's read lock with the first lock, so we're now stuck. OpenAFS solves this by just granting any partial-range lock directly without consulting the server - and hoping there's no remote collision. I want to implement that in a separate patch and it requires a bit more thought. Fixes: 8d6c554126b8 ("AFS: implement file locking") Reported-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:50 +01:00
David Howells	a690f60a2b	afs: Calculate lock extend timer from set/extend reply reception Record the timestamp on the first reply DATA packet received in response to a set- or extend-lock operation, then use this to calculate the time remaining till the lock expires rather than using whatever time the requesting process wakes up and finishes processing the operation as a base. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:50 +01:00
David Howells	0b9bf3812a	afs: Split wait from afs_make_call() Split the call to afs_wait_for_call_to_complete() from afs_make_call() to make it easier to handle asynchronous calls and to make it easier to convert a synchronous call to an asynchronous one in future, for instance when someone tries to interrupt an operation by pressing Ctrl-C. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-25 14:26:50 +01:00
David Howells	eeba1e9cf3	afs: Fix in-progess ops to ignore server-level callback invalidation The in-kernel afs filesystem client counts the number of server-level callback invalidation events (CB.InitCallBackState* RPC operations) that it receives from the server. This is stored in cb_s_break in various structures, including afs_server and afs_vnode. If an inode is examined by afs_validate(), say, the afs_server copy is compared, along with other break counters, to those in afs_vnode, and if one or more of the counters do not match, it is considered that the server's callback promise is broken. At points where this happens, AFS_VNODE_CB_PROMISED is cleared to indicate that the status must be refetched from the server. afs_validate() issues an FS.FetchStatus operation to get updated metadata - and based on the updated data_version may invalidate the pagecache too. However, the break counters are also used to determine whether to note a new callback in the vnode (which would set the AFS_VNODE_CB_PROMISED flag) and whether to cache the permit data included in the YFSFetchStatus record by the server. The problem comes when the server sends us a CB.InitCallBackState op. The first such instance doesn't cause cb_s_break to be incremented, but rather causes AFS_SERVER_FL_NEW to be cleared - but thereafter, say some hours after last use and all the volumes have been automatically unmounted and the server has forgotten about the client[], this will* likely cause an increment. [] There are other circumstances too, such as the server restarting or needing to make space in its callback table. Note that the server won't send us a CB.InitCallBackState op until we talk to it again. So what happens is: (1) A mount for a new volume is attempted, a inode is created for the root vnode and vnode->cb_s_break and AFS_VNODE_CB_PROMISED aren't set immediately, as we don't have a nominated server to talk to yet - and we may iterate through a few to find one. (2) Before the operation happens, afs_fetch_status(), say, notes in the cursor (fc.cb_break) the break counter sum from the vnode, volume and server counters, but the server->cb_s_break is currently 0. (3) We send FS.FetchStatus to the server. The server sends us back CB.InitCallBackState. We increment server->cb_s_break. (4) Our FS.FetchStatus completes. The reply includes a callback record. (5) xdr_decode_AFSCallBack()/xdr_decode_YFSCallBack() check to see whether the callback promise was broken by checking the break counter sum from step (2) against the current sum. This fails because of step (3), so we don't set the callback record and, importantly, don't set AFS_VNODE_CB_PROMISED on the vnode. This does not preclude the syscall from progressing, and we don't loop here rechecking the status, but rather assume it's good enough for one round only and will need to be rechecked next time. (6) afs_validate() it triggered on the vnode, probably called from d_revalidate() checking the parent directory. (7) afs_validate() notes that AFS_VNODE_CB_PROMISED isn't set, so doesn't update vnode->cb_s_break and assumes the vnode to be invalid. (8) afs_validate() needs to calls afs_fetch_status(). Go back to step (2) and repeat, every time the vnode is validated. This primarily affects volume root dir vnodes. Everything subsequent to those inherit an already incremented cb_s_break upon mounting. The issue is that we assume that the callback record and the cached permit information in a reply from the server can't be trusted after getting a server break - but this is wrong since the server makes sure things are done in the right order, holding up our ops if necessary[]. [*] There is an extremely unlikely scenario where a reply from before the CB.InitCallBackState could get its delivery deferred till after - at which point we think we have a promise when we don't. This, however, requires unlucky mass packet loss to one call. AFS_SERVER_FL_NEW tries to paper over the cracks for the initial mount from a server we've never contacted before, but this should be unnecessary. It's also further insulated from the problem on an initial mount by querying the server first with FS.GetCapabilities, which triggers the CB.InitCallBackState. Fix this by (1) Remove AFS_SERVER_FL_NEW. (2) In afs_calc_vnode_cb_break(), don't include cb_s_break in the calculation. (3) In afs_cb_is_broken(), don't include cb_s_break in the check. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-13 08:37:37 +01:00
Marc Dionne	21bd68f196	afs: Unlock pages for __pagevec_release() __pagevec_release() complains loudly if any page in the vector is still locked. The pages need to be locked for generic_error_remove_page(), but that function doesn't actually unlock them. Unlock the pages afterwards. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jonathan Billings <jsbillin@umich.edu>	2019-04-13 08:37:37 +01:00
David Howells	8022c4b95c	afs: Differentiate abort due to unmarshalling from other errors Differentiate an abort due to an unmarshalling error from an abort due to other errors, such as ENETUNREACH. It doesn't make sense to set abort code RXGEN_*_UNMARSHAL in such a case, so use RX_USER_ABORT instead. Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-13 08:37:37 +01:00
Andi Kleen	d2abfa86ff	afs: Avoid section confusion in CM_NAME __tracepoint_str cannot be const because the tracepoint_str section is not read-only. Remove the stray const. Cc: dhowells@redhat.com Cc: viro@zeniv.linux.org.uk Signed-off-by: Andi Kleen <ak@linux.intel.com>	2019-04-13 08:37:36 +01:00
Arnd Bergmann	ba25b81e3a	afs: avoid deprecated get_seconds() get_seconds() has a limited range on 32-bit architectures and is deprecated because of that. While AFS uses the same limits for its inode timestamps on the wire protocol, let's just use the simpler current_time() as we do for other file systems. This will still zero out the 'tv_nsec' field of the timestamps internally. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com>	2019-04-13 08:37:36 +01:00
Marc Dionne	f7f1dd3162	afs: Check for rxrpc call completion in wait loop Check the state of the rxrpc call backing an afs call in each iteration of the call wait loop in case the rxrpc call has already been terminated at the rxrpc layer. Interrupt the wait loop and mark the afs call as complete if the rxrpc layer call is complete. There were cases where rxrpc errors were not passed up to afs, which could result in this loop waiting forever for an afs call to transition to AFS_CALL_COMPLETE while the rx call was already complete. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-12 16:57:23 -07:00
Marc Dionne	4611da30d6	rxrpc: Make rxrpc_kernel_check_life() indicate if call completed Make rxrpc_kernel_check_life() pass back the life counter through the argument list and return true if the call has not yet completed. Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-12 16:57:23 -07:00
Gustavo A. R. Silva	e690c9e3f4	afs: Mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that in many cases I placed a /* Fall through */ comment at the bottom of the case, which what GCC is expecting to find. In other cases I had to tweak a bit the format of the comments. This patch suppresses ALL missing-break-in-switch false positives in fs/afs Addresses-Coverity-ID: 115042 ("Missing break in switch") Addresses-Coverity-ID: 115043 ("Missing break in switch") Addresses-Coverity-ID: 115045 ("Missing break in switch") Addresses-Coverity-ID: 1357430 ("Missing break in switch") Addresses-Coverity-ID: 115047 ("Missing break in switch") Addresses-Coverity-ID: 115050 ("Missing break in switch") Addresses-Coverity-ID: 115051 ("Missing break in switch") Addresses-Coverity-ID: 1467806 ("Missing break in switch") Addresses-Coverity-ID: 1467807 ("Missing break in switch") Addresses-Coverity-ID: 1467811 ("Missing break in switch") Addresses-Coverity-ID: 115041 ("Missing break in switch") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>	2019-04-08 18:35:56 -05:00
David Howells	8c7ae38d1c	afs: Fix StoreData op marshalling The marshalling of AFS.StoreData, AFS.StoreData64 and YFS.StoreData64 calls generated by ->setattr() ops for the purpose of expanding a file is incorrect due to older documentation incorrectly describing the way the RPC 'FileLength' parameter is meant to work. The older documentation says that this is the length the file is meant to end up at the end of the operation; however, it was never implemented this way in any of the servers, but rather the file is truncated down to this before the write operation is effected, and never expanded to it (and, indeed, it was renamed to 'TruncPos' in 2014). Fix this by setting the position parameter to the new file length and doing a zero-lengh write there. The bug causes Xwayland to SIGBUS due to unexpected non-expansion of a file it then mmaps. This can be tested by giving the following test program a filename in an AFS directory: #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/mman.h> int main(int argc, char argv[]) { char p; int fd; if (argc != 2) { fprintf(stderr, "Format: test-trunc-mmap <file>\n"); exit(2); } fd = open(argv[1], O_RDWR \| O_CREAT \| O_TRUNC); if (fd < 0) { perror(argv[1]); exit(1); } if (ftruncate(fd, 0x140008) == -1) { perror("ftruncate"); exit(1); } p = mmap(NULL, 4096, PROT_READ \| PROT_WRITE, MAP_SHARED, fd, 0); if (p == MAP_FAILED) { perror("mmap"); exit(1); } p[0] = 'a'; if (munmap(p, 4096) < 0) { perror("munmap"); exit(1); } if (close(fd) < 0) { perror("close"); exit(1); } exit(0); } Fixes: `31143d5d51` ("AFS: implement basic file write support") Reported-by: Jonathan Billings <jsbillin@umich.edu> Tested-by: Jonathan Billings <jsbillin@umich.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-28 08:54:20 -07:00
Linus Torvalds	7b47a9e7c8	Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount infrastructure updates from Al Viro: "The rest of core infrastructure; no new syscalls in that pile, but the old parts are switched to new infrastructure. At that point conversions of individual filesystems can happen independently; some are done here (afs, cgroup, procfs, etc.), there's also a large series outside of that pile dealing with NFS (quite a bit of option-parsing stuff is getting used there - it's one of the most convoluted filesystems in terms of mount-related logics), but NFS bits are the next cycle fodder. It got seriously simplified since the last cycle; documentation is probably the weakest bit at the moment - I considered dropping the commit introducing Documentation/filesystems/mount_api.txt (cutting the size increase by quarter ;-), but decided that it would be better to fix it up after -rc1 instead. That pile allows to do followup work in independent branches, which should make life much easier for the next cycle. fs/super.c size increase is unpleasant; there's a followup series that allows to shrink it considerably, but I decided to leave that until the next cycle" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits) afs: Use fs_context to pass parameters over automount afs: Add fs_context support vfs: Add some logging to the core users of the fs_context log vfs: Implement logging through fs_context vfs: Provide documentation for new mount API vfs: Remove kern_mount_data() hugetlbfs: Convert to fs_context cpuset: Use fs_context kernfs, sysfs, cgroup, intel_rdt: Support fs_context cgroup: store a reference to cgroup_ns into cgroup_fs_context cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper cgroup_do_mount(): massage calling conventions cgroup: stash cgroup_root reference into cgroup_fs_context cgroup2: switch to option-by-option parsing cgroup1: switch to option-by-option parsing cgroup: take options parsing into ->parse_monolithic() cgroup: fold cgroup1_mount() into cgroup1_get_tree() cgroup: start switching to fs_context ipc: Convert mqueue fs to fs_context proc: Add fs_context support to procfs ...	2019-03-12 14:08:19 -07:00
Nikolay Borisov	b5420237ec	mm: refactor readahead defines in mm.h All users of VM_MAX_READAHEAD actually convert it to kbytes and then to pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This simplifies the expression in every filesystem. Also rename the macro to VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused VM_MIN_READAHEAD [akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen] Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: David Howells <dhowells@redhat.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-12 10:04:01 -07:00
David Howells	c99c2171fc	afs: Use fs_context to pass parameters over automount Alter the AFS automounting code to create and modify an fs_context struct when parameterising a new mount triggered by an AFS mountpoint rather than constructing device name and option strings. Also remove the cell=, vol= and rwpath options as they are then redundant. The reason they existed is because the 'device name' may be derived literally from a mountpoint object in the filesystem, so default cell and parent-type information needed to be passed in by some other method from the automount routines. The vol= option didn't end up being used. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric W. Biederman <ebiederm@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-02-28 03:29:39 -05:00
David Howells	13fcc68370	afs: Add fs_context support Add fs_context support to the AFS filesystem, converting the parameter parsing to store options there. This will form the basis for namespace propagation over mountpoints within the AFS model, thereby allowing AFS to be used in containers more easily. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-02-28 03:29:38 -05:00
David Howells	7d762d6914	afs: Fix manually set volume location server list When a cell with a volume location server list is added manually by echoing the details into /proc/net/afs/cells, a record is added but the flag saying it has been looked up isn't set. This causes the VL server rotation code to wait forever, with the top of /proc/pid/stack looking like: afs_select_vlserver+0x3a6/0x6f3 afs_vl_lookup_vldb+0x4b/0x92 afs_create_volume+0x25/0x1b9 ... with the thread stuck in afs_start_vl_iteration() waiting for AFS_CELL_FL_NO_LOOKUP_YET to be cleared. Fix this by clearing AFS_CELL_FL_NO_LOOKUP_YET when setting up a record if that record's details were supplied manually. Fixes: `0a5143f2f8` ("afs: Implement VL server rotation") Reported-by: Dave Botsch <dwb7@cornell.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-02-25 11:59:07 -08:00
David Howells	34fa47612b	afs: Fix race in async call refcounting There's a race between afs_make_call() and afs_wake_up_async_call() in the case that an error is returned from rxrpc_kernel_send_data() after it has queued the final packet. afs_make_call() will try and clean up the mess, but the call state may have been moved on thereby causing afs_process_async_call() to also try and to delete the call. Fix this by: (1) Getting an extra ref for an asynchronous call for the call itself to hold. This makes sure the call doesn't evaporate on us accidentally and will allow the call to be retained by the caller in a future patch. The ref is released on leaving afs_make_call() or afs_wait_for_call_to_complete(). (2) In the event of an error from rxrpc_kernel_send_data(): (a) Don't set the call state to AFS_CALL_COMPLETE until after the call has been aborted and ended. This prevents afs_deliver_to_call() from doing anything with any notifications it gets. (b) Explicitly end the call immediately to prevent further callbacks. (c) Cancel any queued async_work and wait for the work if it's executing. This allows us to be sure the race won't recur when we change the state. We put the work queue's ref on the call if we managed to cancel it. (d) Put the call's ref that we got in (1). This belongs to us as long as the call is in state AFS_CALL_CL_REQUESTING. Fixes: `341f741f04` ("afs: Refcount the afs_call struct") Signed-off-by: David Howells <dhowells@redhat.com>	2019-01-17 15:17:28 +00:00
David Howells	7a75b0079a	afs: Provide a function to get a ref on a call Provide a function to get a reference on an afs_call struct. Signed-off-by: David Howells <dhowells@redhat.com>	2019-01-17 15:17:28 +00:00
David Howells	59d49076ae	afs: Fix key refcounting in file locking code Fix the refcounting of the authentication keys in the file locking code. The vnode->lock_key member points to a key on which it expects to be holding a ref, but it isn't always given an extra ref, however. Fixes: `0fafdc9f88` ("afs: Fix file locking") Signed-off-by: David Howells <dhowells@redhat.com>	2019-01-17 15:17:28 +00:00
Marc Dionne	4882a27cec	afs: Don't set vnode->cb_s_break in afs_validate() A cb_interest record is not necessarily attached to the vnode on entry to afs_validate(), which can cause an oops when we try to bring the vnode's cb_s_break up to date in the default case (ie. no current callback promise and the vnode has not been deleted). Fix this by simply removing the line, as vnode->cb_s_break will be set when needed by afs_register_server_cb_interest() when we next get a callback promise from RPC call. The oops looks something like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 ... RIP: 0010:afs_validate+0x66/0x250 [kafs] ... Call Trace: afs_d_revalidate+0x8d/0x340 [kafs] ? __d_lookup+0x61/0x150 lookup_dcache+0x44/0x70 ? lookup_dcache+0x44/0x70 __lookup_hash+0x24/0xa0 do_unlinkat+0x11d/0x2c0 __x64_sys_unlink+0x23/0x30 do_syscall_64+0x4d/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `ae3b7361dc` ("afs: Fix validation/callback interaction") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-01-17 15:15:52 +00:00
Marc Dionne	5edc22cc1d	afs: Set correct lock type for the yfs CreateFile A lock type of 0 is "LockRead", which makes the fileserver record an unintentional read lock on the new file. This will cause problems later on if the file is the subject of locking operations. The correct default value should be -1 ("LockNone"). Fix the operation marshalling code to set the value and provide an enum to symbolise the values whilst we're at it. Fixes: `30062bd13e` ("afs: Implement YFS support in the fs client") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-01-10 17:12:05 +00:00
Gustavo A. R. Silva	c2b8bd49d3	afs: Use struct_size() in kzalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void ) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-01-10 17:12:05 +00:00
Nikolay Borisov	f86196ea87	fs: don't open code lru_to_page() Multiple filesystems open code lru_to_page(). Rectify this by moving the macro from mm_inline (which is specific to lru stuff) to the more generic mm.h header and start using the macro where appropriate. No functional changes. Link: http://lkml.kernel.org/r/20181129104810.23361-1-nborisov@suse.com Link: https://lkml.kernel.org/r/20181129075301.29087-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Pankaj gupta <pagupta@redhat.com> Acked-by: "Yan, Zheng" <zyan@redhat.com> [ceph] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:48 -08:00
Davidlohr Bueso	08d405c8b8	fs/: remove caller signal_pending branch predictions This is already done for us internally by the signal machinery. [akpm@linux-foundation.org: fix fs/buffer.c] Link: http://lkml.kernel.org/r/20181116002713.8474-7-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:48 -08:00
Linus Torvalds	5f1ca5c619	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes all over the place. The iov_iter one is this cycle regression (splice from UDP triggering WARN_ON()), the rest is older" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: afs: Use d_instantiate() rather than d_add() and don't d_drop() afs: Fix missing net error handling afs: Fix validation/callback interaction iov_iter: teach csum_and_copy_to_iter() to handle pipe-backed ones exportfs: do not read dentry after free exportfs: fix 'passing zero to ERR_PTR()' warning aio: fix failure to put the file pointer sysv: return 'err' instead of 0 in __sysv_write_inode	2018-11-30 10:47:50 -08:00
David Howells	73116df7bb	afs: Use d_instantiate() rather than d_add() and don't d_drop() Use d_instantiate() rather than d_add() and don't d_drop() in afs_vnode_new_inode(). The dentry shouldn't be removed as it's not changing its name. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-11-29 21:08:14 -05:00
David Howells	4584ae96ae	afs: Fix missing net error handling kAFS can be given certain network errors (EADDRNOTAVAIL, EHOSTDOWN and ERFKILL) that it doesn't handle in its server/address rotation algorithms. They cause the probing and rotation to abort immediately rather than rotating. Fix this by: (1) Abstracting out the error prioritisation from the VL and FS rotation algorithms into a common function and expand usage into the server probing code. When multiple errors are available, this code selects the one we'd prefer to return. (2) Add handling for EADDRNOTAVAIL, EHOSTDOWN and ERFKILL. Fixes: `0fafdc9f88` ("afs: Fix file locking") Fixes: 0338747d8454 ("afs: Probe multiple fileservers simultaneously") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-11-29 21:08:14 -05:00
David Howells	ae3b7361dc	afs: Fix validation/callback interaction When afs_validate() is called to validate a vnode (inode), there are two unhandled cases in the fastpath at the top of the function: (1) If the vnode is promised (AFS_VNODE_CB_PROMISED is set), the break counters match and the data has expired, then there's an implicit case in which the vnode needs revalidating. This has no consequences since the default "valid = false" set at the top of the function happens to do the right thing. (2) If the vnode is not promised and it hasn't been deleted (AFS_VNODE_DELETED is not set) then there's a default case we're not handling in which the vnode is invalid. If the vnode is invalid, we need to bring cb_s_break and cb_v_break up to date before we refetch the status. As a consequence, once the server loses track of the client (ie. sufficient time has passed since we last sent it an operation), it will send us a CB.InitCallBackState* operation when we next try to talk to it. This calls afs_init_callback_state() which increments afs_server::cb_s_break, but this then doesn't propagate to the afs_vnode record. The result being that every afs_validate() call thereafter sends a status fetch operation to the server. Clarify and fix this by: (A) Setting valid in all the branches rather than initialising it at the top so that the compiler catches where we've missed. (B) Restructuring the logic in the 'promised' branch so that we set valid to false if the callback is due to expire (or has expired) and so that the final case is that the vnode is still valid. (C) Adding an else-statement that ups cb_s_break and cb_v_break if the promised and deleted cases don't match. Fixes: `c435ee3455` ("afs: Overhaul the callback handling") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-11-29 21:08:14 -05:00
David Howells	7150ceaacb	rxrpc: Fix life check The life-checking function, which is used by kAFS to make sure that a call is still live in the event of a pending signal, only samples the received packet serial number counter; it doesn't actually provoke a change in the counter, rather relying on the server to happen to give us a packet in the time window. Fix this by adding a function to force a ping to be transmitted. kAFS then keeps track of whether there's been a stall, and if so, uses the new function to ping the server, resetting the timeout to allow the reply to come back. If there's a stall, a ping and the call is still stalled in the same place after another period, then the call will be aborted. Fixes: `bc5e3a546d` ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Fixes: `f4d15fb6f9` ("rxrpc: Provide functions for allowing cleaner handling of signals") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 11:35:40 -08:00
David Howells	3bf0fb6f33	afs: Probe multiple fileservers simultaneously Send probes to all the unprobed fileservers in a fileserver list on all addresses simultaneously in an attempt to find out the fastest route whilst not getting stuck for 20s on any server or address that we don't get a reply from. This alleviates the problem whereby attempting to access a new server can take a long time because the rotation algorithm ends up rotating through all servers and addresses until it finds one that responds. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-24 00:41:09 +01:00
David Howells	18ac61853c	afs: Fix callback handling In some circumstances, the callback interest pointer is NULL, so in such a case we can't dereference it when checking to see if the callback is broken. This causes an oops in some circumstances. Fix this by replacing the function that worked out the aggregate break counter with one that actually does the comparison, and then make that return true (ie. broken) if there is no callback interest as yet (ie. the pointer is NULL). Fixes: `68251f0a68` ("afs: Fix whole-volume callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-24 00:41:09 +01:00

1 2 3 4 5 ...

679 Commits