linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-02 17:11:33 +00:00

Author	SHA1	Message	Date
Al Viro	5e261246ce	logfs: no need to lock directory in lseek Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:42:19 -04:00
Al Viro	51a16a9cd5	switch ecryptfs to ->iterate_shared Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:42:18 -04:00
Al Viro	a063ff1e43	Merge branch 'for-linus' into work.lookups	2016-05-09 11:41:30 -04:00
Al Viro	5963ded8fe	9p: switch to ->iterate_shared() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:16 -04:00
Al Viro	98d4b8d8f0	fat: switch to ->iterate_shared() ... and make that weird ioctl lock directory only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:15 -04:00
Al Viro	d375570fa8	romfs, squashfs: switch to ->iterate_shared() don't need to lock directory in ->llseek(), either Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:15 -04:00
Al Viro	c51da20c48	more trivial ->iterate_shared conversions Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:14 -04:00
Al Viro	060ff688ca	lustre: don't need to lock inode in directory lseek Note that lustre has its private mutex protecting directory pagecache; if they ever remove it, they'll need to be careful with PageChecked() use. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:14 -04:00
Al Viro	8cb0d2c1c7	kernfs: no point locking directory around that generic_file_llseek() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:13 -04:00
Al Viro	a01b3007ff	configfs_readdir(): make safe under shared lock Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:41:13 -04:00
Al Viro	884be17535	nfs: per-name sillyunlink exclusion use d_alloc_parallel() for sillyunlink/lookup exclusion and explicit rwsem (nfs_rmdir() being a writer and nfs_call_unlink() - a reader) for rmdir/sillyunlink one. That ought to make lookup/readdir/!O_CREAT atomic_open really parallel on NFS. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 11:39:45 -04:00
Al Viro	99d825822e	get_rock_ridge_filename(): handle malformed NM entries Payloads of NM entries are not supposed to contain NUL. When we run into such, only the part prior to the first NUL goes into the concatenation (i.e. the directory entry name being encoded by a bunch of NM entries). We do stop when the amount collected so far + the claimed amount in the current NM entry exceed 254. So far, so good, but what we return as the total length is the sum of claimed sizes, not the actual amount collected. And that can grow pretty large - not unlimited, since you'd need to put CE entries in between to be able to get more than the maximum that could be contained in one isofs directory entry / continuation chunk and we stop once we'd encountered 32 CEs, but you can get about 8Kb easily. And that's what will be passed to readdir callback as the name length. 8Kb __copy_to_user() from a buffer allocated by __get_free_page() Cc: stable@vger.kernel.org # 0.98pl6+ (yes, really) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-07 22:52:39 -04:00
Al Viro	6a480a7842	ecryptfs: fix handling of directory opening First of all, trying to open them r/w is idiocy; it's guaranteed to fail. Moreover, assigning ->f_pos and assuming that everything will work is blatantly broken - try that with e.g. tmpfs as underlying layer and watch the fireworks. There may be a non-trivial amount of state associated with current IO position, well beyond the numeric offset. Using the single struct file associated with underlying inode is really not a good idea; we ought to open one for each ecryptfs directory struct file. Additionally, file_operations both for directories and non-directories are full of pointless methods; non-directories should not have ->iterate(), directories should not have ->flush(), ->fasync() and ->splice_read(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-04 14:04:13 -04:00
Al Viro	9ac3d3e846	nfs: switch to ->iterate_shared() aside of the usual care about seeding dcache from readdir, we need to be careful about the pagecache evictions here. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:53 -04:00
Al Viro	9cf843e3f4	lookup_open(): lock the parent shared unless O_CREAT is given Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:17 -04:00
Al Viro	6fbd07146d	lookup_open(): put the dentry fed to ->lookup() or ->atomic_open() into in-lookup hash Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:16 -04:00
Al Viro	12fa5e2404	lookup_open(): expand the call of real_lookup() ... and lose the duplicate IS_DEADDIR() - we'd already checked that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:16 -04:00
Al Viro	384f26e28f	atomic_open(): reorder and clean up a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:15 -04:00
Al Viro	1643b43fbd	lookup_open(): lift the "fallback to !O_CREAT" logics from atomic_open() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:15 -04:00
Al Viro	b3d58eaffb	atomic_open(): be paranoid about may_open() return value It should never return positives; however, with Linux S&M crowd involved, no bogosity is impossible. Results would be unpleasant... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:14 -04:00
Al Viro	0fb1ea0933	atomic_open(): delay open_to_namei_flags() until the method call nobody else needs that transformation. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:14 -04:00
Al Viro	fe9ec8291f	do_last(): take fput() on error after opening to out: make it conditional on *opened & FILE_OPENED; in addition to getting rid of exit_fput: thing, it simplifies atomic_open() cleanup on may_open() failure. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:13 -04:00
Al Viro	47f9dbd387	do_last(): get rid of duplicate ELOOP check may_open() will catch it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:13 -04:00
Al Viro	55db2fd936	atomic_open(): massage the create_error logics a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:12 -04:00
Al Viro	9d0728e16e	atomic_open(): consolidate "overridden ENOENT" in open-yourself cases Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:12 -04:00
Al Viro	5249e411b4	atomic_open(): don't bother with EEXIST check - it's done in do_last() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:51:11 -04:00
Al Viro	df889b3631	Merge branch 'for-linus' into work.lookups	2016-05-02 19:49:46 -04:00
Al Viro	ce8644fcad	lookup_open(): expand the call of vfs_create() Lift IS_DEADDIR handling up into the part common with atomic_open(), remove it from the latter. Collapse permission checks into the call of may_o_create(), getting it closer to atomic_open() case. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:33 -04:00
Al Viro	6ac087099e	path_openat(): take O_PATH handling out of do_last() do_last() and lookup_open() get simpler that way and so does O_PATH itself. As it bloody well should: we find what the pathname resolves to, same way as in stat() et al. and associate it with FMODE_PATH struct file. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:33 -04:00
Al Viro	3b0a3c1ac1	simple local filesystems: switch to ->iterate_shared() no changes needed (XFS isn't simple, but it has the same parallelism in the interesting parts exercised from CXFS). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:32 -04:00
Al Viro	4e82901cd6	dcache_{readdir,dir_lseek}() users: switch to ->iterate_shared no need to lock directory in dcache_dir_lseek(), while we are at it - per-struct file exclusion is enough. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:32 -04:00
Al Viro	3125d2650c	cifs: switch to ->iterate_shared() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:31 -04:00
Al Viro	d9b3dbdcfd	fuse: switch to ->iterate_shared() Switch dcache pre-seeding on readdir to d_alloc_parallel(); nothing else is needed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:31 -04:00
Al Viro	f50752eaa0	switch all procfs directories ->iterate_shared() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:30 -04:00
Al Viro	76aab3ab61	proc_sys_fill_cache(): switch to d_alloc_parallel() make it usable with directory locked shared Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:30 -04:00
Al Viro	3781764b5c	proc_fill_cache(): switch to d_alloc_parallel() ... making it usable with directory locked shared Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:29 -04:00
Al Viro	6192269444	introduce a parallel variant of ->iterate() New method: ->iterate_shared(). Same arguments as in ->iterate(), called with the directory locked only shared. Once all filesystems switch, the old one will be gone. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:29 -04:00
Al Viro	63b6df1413	give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() same as read() on regular files has, and for the same reason. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:28 -04:00
Al Viro	9902af79c0	parallel lookups: actual switch to rwsem ta-da! The main issue is the lack of down_write_killable(), so the places like readdir.c switched to plain inode_lock(); once killable variants of rwsem primitives appear, that'll be dealt with. lockdep side also might need more work Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:28 -04:00
Al Viro	d9171b9345	parallel lookups machinery, part 4 (and last) If we do run into an in-lookup match, we need to wait for it to cease being in-lookup. Fortunately, we do have unused space in in-lookup dentries - d_lru is never looked at until it stops being in-lookup. So we can stash a pointer to wait_queue_head from stack frame of the caller of ->lookup(). Some precautions are needed while waiting, but it's not that hard - we do hold a reference to dentry we are waiting for, so it can't go away. If it's found to be in-lookup the wait_queue_head is still alive and will remain so at least while ->d_lock is held. Moreover, the condition we are waiting for becomes true at the same point where everything on that wq gets woken up, so we can just add ourselves to the queue once. d_alloc_parallel() gets a pointer to wait_queue_head_t from its caller; lookup_slow() adjusted, d_add_ci() taught to use d_alloc_parallel() if the dentry passed to it happens to be in-lookup one (i.e. if it's been called from the parallel lookup). That's pretty much it - all that remains is to switch ->i_mutex to rwsem and have lookup_slow() take it shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:27 -04:00
Al Viro	94bdd655ca	parallel lookups machinery, part 3 We will need to be able to check if there is an in-lookup dentry with matching parent/name. Right now it's impossible, but as soon as we start locking directories shared such beasts will appear. Add a secondary hash for locating those. Hash chains go through the same space where d_alias will be once it's not in-lookup anymore. Search is done under the same bitlock we use for modifications - with the primary hash we can rely on d_rehash() into the wrong chain being the worst that could happen, but here the pointers are buggered once it's removed from the chain. On the other hand, the chains are not going to be long and normally we'll end up adding to the chain anyway. That allows us to avoid bothering with ->d_lock when doing the comparisons - everything is stable until removed from chain. New helper: d_alloc_parallel(). Right now it allocates, verifies that no hashed and in-lookup matches exist and adds to in-lookup hash. Returns ERR_PTR() for error, hashed match (in the unlikely case it's been found) or new dentry. In-lookup matches trigger BUG() for now; that will change in the next commit when we introduce waiting for ongoing lookup to finish. Note that in-lookup matches won't be possible until we actually go for shared locking. lookup_slow() switched to use of d_alloc_parallel(). Again, these commits are separated only for making it easier to review. All this machinery will start doing something useful only when we go for shared locking; it's just that the combination is too large for my taste. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:27 -04:00
Al Viro	84e710da2a	parallel lookups machinery, part 2 We'll need to verify that there's neither a hashed nor in-lookup dentry with desired parent/name before adding to in-lookup set. One possible solution would be to hold the parent's ->d_lock through both checks, but while the in-lookup set is relatively small at any time, dcache is not. And holding the parent's ->d_lock through something like __d_lookup_rcu() would suck too badly. So we leave the parent's ->d_lock alone, which means that we watch out for the following scenario: * we verify that there's no hashed match * existing in-lookup match gets hashed by another process * we verify that there's no in-lookup matches and decide that everything's fine. Solution: per-directory kinda-sorta seqlock, bumped around the times we hash something that used to be in-lookup or move (and hash) something in place of in-lookup. Then the above would turn into * read the counter * do dcache lookup * if no matches found, check for in-lookup matches * if there had been none of those either, check if the counter has changed; repeat if it has. The "kinda-sorta" part is due to the fact that we don't have much spare space in inode. There is a spare word (shared with i_bdev/i_cdev/i_pipe), so the counter part is not a problem, but spinlock is a different story. We could use the parent's ->d_lock, and it would be less painful in terms of contention, for __d_add() it would be rather inconvenient to grab; we could do that (using lock_parent()), but... Fortunately, we can get serialization on the counter itself, and it might be a good idea in general; we can use cmpxchg() in a loop to get from even to odd and smp_store_release() from odd to even. This commit adds the counter and updating logics; the readers will be added in the next commit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:26 -04:00
Al Viro	85c7f81041	beginning of transition to parallel lookups - marking in-lookup dentries marked as such when (would be) parallel lookup is about to pass them to actual ->lookup(); unmarked when * __d_add() is about to make it hashed, positive or not. * __d_move() (from d_splice_alias(), directly or via __d_unalias()) puts a preexisting dentry in its place * in caller of ->lookup() if it has escaped all of the above. Bug (WARN_ON, actually) if it reaches the final dput() or d_instantiate() while still marked such. As the result, we are guaranteed that for as long as the flag is set, dentry will * remain negative unhashed with positive refcount * never have its ->d_alias looked at * never have its ->d_lru looked at * never have its ->d_parent and ->d_name changed Right now we have at most one such for any given parent directory. With parallel lookups that restriction will weaken to * only exist when parent is locked shared * at most one with given (parent,name) pair (comparison of names is according to ->d_compare()) * only exist when there's no hashed dentry with the same (parent,name) Transition will take the next several commits; unfortunately, we'll only be able to switch to rwsem at the end of this series. The reason for not making it a single patch is to simplify review. New primitives: d_in_lookup() (a predicate checking if dentry is in the in-lookup state) and d_lookup_done() (tells the system that we are done with lookup and if it's still marked as in-lookup, it should cease to be such). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:51 -04:00
Al Viro	0568d705b0	__d_add(): don't drop/regain ->d_lock Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:26 -04:00
Al Viro	1936386ea9	lookup_slow(): bugger off on IS_DEADDIR() from the very beginning Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:26 -04:00
Al Viro	d2caaa0a77	nfs: missing wakeup in nfs_unblock_sillyrename() will be needed as soon as lookups are not serialized by ->i_mutex Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:25 -04:00
Al Viro	be5b82dbfe	make ext2_get_page() and friends work without external serialization Right now ext2_get_page() (and its analogues in a bunch of other filesystems) relies upon the directory being locked - the way it sets and tests Checked and Error bits would be racy without that. Switch to a slightly different scheme, _not_ setting Checked in case of failure. That way the logics becomes if Checked => OK else if Error => fail else if !validate => fail else => OK with validation setting Checked or Error on success and failure resp. and returning which one had happened. Equivalent to the current logics, but unlike the current logics not sensitive to the order of set_bit, test_bit getting reordered by CPU, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:25 -04:00
Al Viro	b9e1d435fd	ovl_lookup_real(): use lookup_one_len_unlocked() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:24 -04:00
Al Viro	383d4e8ab0	reconnect_one(): use lookup_one_len_unlocked() ... and explain the non-obvious logics in case when lookup yields a different dentry. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:24 -04:00
Al Viro	1ae1f3f647	reiserfs: open-code reiserfs_mutex_lock_safe() in reiserfs_unpack() ... and have it use inode_lock() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:23 -04:00
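
Note on 84e710da2a (parallel lookups machinery, part 2): the message describes a per-directory "kinda-sorta seqlock" whose counter goes from even to odd with cmpxchg() in a loop and back to even with smp_store_release(), and a reader that snapshots the counter, does the dcache lookup, checks for in-lookup matches, and repeats if the counter has changed. The following is a minimal userspace sketch of that idea using C11 atomics, not the kernel code. The names struct dir_seq, dir_seq_begin_update(), dir_seq_end_update(), find_match(), struct probe_ctx and the two probe helpers are all hypothetical, and the two lookups are reduced to callbacks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-directory counter:
 * even = no update in flight, odd = an update is in progress. */
struct dir_seq {
    atomic_uint seq;
};

/* Updater entry: move the counter from even to odd with a
 * compare-and-swap loop; returns the even value it started from. */
static unsigned dir_seq_begin_update(struct dir_seq *d)
{
    for (;;) {
        unsigned n = atomic_load_explicit(&d->seq, memory_order_relaxed);
        if (n & 1)
            continue;               /* another updater holds it; retry */
        if (atomic_compare_exchange_weak_explicit(&d->seq, &n, n + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return n;
    }
}

/* Updater exit: store-release the next even value. */
static void dir_seq_end_update(struct dir_seq *d, unsigned n)
{
    assert((n & 1) == 0);           /* n is the value begin_update returned */
    atomic_store_explicit(&d->seq, n + 2, memory_order_release);
}

enum match { MATCH_NONE, MATCH_HASHED, MATCH_IN_LOOKUP };

typedef bool (*probe_fn)(void *ctx);

/* Reader: snapshot the counter, probe the hashed set, then the
 * in-lookup set; a "no match" answer is trusted only if the counter
 * still equals the (even) snapshot afterwards. */
static enum match find_match(struct dir_seq *d, probe_fn hashed,
                             probe_fn in_lookup, void *ctx)
{
    for (;;) {
        /* Clearing the low bit makes a snapshot taken mid-update
         * fail the recheck below. */
        unsigned snap = atomic_load(&d->seq) & ~1u;

        if (hashed(ctx))
            return MATCH_HASHED;
        if (in_lookup(ctx))
            return MATCH_IN_LOOKUP;
        if (atomic_load(&d->seq) == snap)
            return MATCH_NONE;
        /* something was hashed in between; repeat both checks */
    }
}

/* Hypothetical probes for demonstration: they just report flags. */
struct probe_ctx {
    bool hashed;
    bool in_lookup;
};

static bool probe_hashed(void *ctx)
{
    return ((struct probe_ctx *)ctx)->hashed;
}

static bool probe_in_lookup(void *ctx)
{
    return ((struct probe_ctx *)ctx)->in_lookup;
}
```

The sketch keeps the updater on the counter alone, which is the point the message makes about not having room for a spinlock in the inode. The reader uses sequentially consistent loads for simplicity; that choice is an assumption of the sketch and says nothing about the barriers the kernel actually uses.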

589128 Commits
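
Note on be5b82dbfe (make ext2_get_page() and friends work without external serialization): the message spells out the decision order "if Checked => OK, else if Error => fail, else if !validate => fail, else => OK", with validation setting Checked on success and Error on failure, and never setting Checked on failure. A rough userspace sketch of that ordering follows. It is not the ext2 code: struct dir_page, the PG_* bit values, page_usable(), validate_page() and the example validator page_nonempty() are hypothetical, and the page flags are modelled as one atomic word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits modelling the Checked and Error page bits. */
enum {
    PG_CHECKED = 1u << 0,
    PG_ERROR   = 1u << 1
};

/* Hypothetical stand-in for a directory page. */
struct dir_page {
    atomic_uint flags;
    size_t len;
};

typedef bool (*validate_fn)(const struct dir_page *pg);

/* Run the validator and record the outcome: Checked on success,
 * Error on failure. Checked is never set for a failed page. */
static bool validate_page(struct dir_page *pg, validate_fn check)
{
    bool ok = check(pg);

    atomic_fetch_or(&pg->flags, ok ? PG_CHECKED : PG_ERROR);
    return ok;
}

/* Decision order from the commit message:
 * Checked => OK; else Error => fail; else the validator decides. */
static bool page_usable(struct dir_page *pg, validate_fn check)
{
    unsigned flags;

    assert(check != NULL);
    flags = atomic_load(&pg->flags);
    if (flags & PG_CHECKED)
        return true;
    if (flags & PG_ERROR)
        return false;
    return validate_page(pg, check);
}

/* Hypothetical validator for demonstration: empty pages are bad. */
static bool page_nonempty(const struct dir_page *pg)
{
    return pg->len > 0;
}
```

Because a page ends up with at most one of the two bits and the reader tests Checked first, two callers racing on the same page can at worst both run the validator and record the same result, which is why this ordering does not depend on the directory being locked.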