Commit Graph

508935 Commits

Author SHA1 Message Date
Al Viro
ce85dd58ad 9p: we are leaking glock.client_id in v9fs_file_getlock()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:28 -04:00
Al Viro
e494b6b5e1 9p: switch to ->read_iter/->write_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:28 -04:00
Al Viro
42b1ab979d 9p: get rid of v9fs_direct_file_read()
do it in ->direct_IO()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:27 -04:00
Al Viro
e1200fe68f 9p: switch p9_client_read() to passing struct iov_iter *
... and make it loop

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:27 -04:00
Al Viro
9565a54452 9p: get rid of v9fs_direct_file_write()
just handle it in ->direct_IO()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:26 -04:00
Al Viro
c711a6b111 9p: fold v9fs_file_write_internal() into the caller
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:26 -04:00
Al Viro
371098c6a6 9p: switch ->writepage() to direct use of p9_client_write()
Don't mess with kmap() - just use ITER_BVEC.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:26 -04:00
Al Viro
070b3656cf 9p: switch p9_client_write() to passing it struct iov_iter *
... and make it loop until it's done

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:25 -04:00
Al Viro
4f3b35c157 net/9p: switch the guts of p9_client_{read,write}() to iov_iter
... and have get_user_pages_fast() mapping fewer pages than requested
to generate a short read/write.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:25 -04:00
Al Viro
6e242a1cee nommu: use __vfs_read()
... instead of open-coding the call of ->read()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:56 -04:00
Al Viro
d0f88f8d5d acct: check FMODE_CAN_WRITE
it's not calling ->write() directly anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:55 -04:00
Al Viro
47e393622b aio_run_iocb(): kill dead check
We check if ->ki_pos is positive.  However, by that point we have
already done rw_verify_area(), which would have rejected such
unless the file had been one of /dev/mem, /dev/kmem and /proc/kcore.
All of which do not have vectored rw methods, so we would've bailed
out even earlier.

This check had been introduced before rw_verify_area() had been added there
- in fact, it was a subset of checks done on sync paths by rw_verify_area()
(back then the /dev/mem exception didn't exist at all).  The rest of checks
(mandatory locking, etc.) hadn't been added until later.  Unfortunately,
by the time the call of rw_verify_area() got added, the /dev/mem exception
had already appeared, so it wasn't obvious that the older explicit check
downstream had become dead code.  It *is* a dead code, though, since the few
files for which the exception applies do not have ->aio_{read,write}() or
->{read,write}_iter() and for them we won't reach that check anyway.

What's more, even if we ever introduce vectored methods for /dev/mem
and friends, they'll have to cope with negative positions anyway, since
readv(2) and writev(2) are using the same checks as read(2) and write(2) -
i.e. rw_verify_area().

Let's bury it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:55 -04:00
Al Viro
08397acdd0 ioctx_alloc(): remove pointless check
Way, way back kiocb used to be picked from arrays, so ioctx_alloc()
checked for multiplication overflow when calculating the size of
such array.  By the time fs/aio.c went into the tree (in 2002) they
were already allocated one-by-one by kmem_cache_alloc(), so that
check had already become pointless.  Let's bury it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:54 -04:00
Al Viro
23602adfee lustre: kill unused members of struct vvp_thread_info
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:54 -04:00
Al Viro
812408fb51 expand __fuse_direct_write() in both callers
it's actually shorter that way *and* later we'll want iocb in scope
of generic_write_check() caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:53 -04:00
Al Viro
1531626364 fuse: switch fuse_direct_io_file_operations to ->{read,write}_iter()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:53 -04:00
Al Viro
cfa86a7412 cuse: switch to iov_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:52 -04:00
Al Viro
39c853ebfe Merge branch 'for-davem' into for-next 2015-04-11 22:27:19 -04:00
Al Viro
fdc81f45e9 sg_start_req(): use import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:14 -04:00
Al Viro
451a2886b6 sg_start_req(): make sure that there's not too many elements in iovec
unfortunately, allowing an arbitrary 16bit value means a possibility of
overflow in the calculation of total number of pages in bio_map_user_iov() -
we rely on there being no more than PAGE_SIZE members of sum in the
first loop there.  If that sum wraps around, we end up allocating
too small array of pointers to pages and it's easy to overflow it in
the second loop.

X-Coverup: TINC (and there's no lumber cartel either)
Cc: stable@vger.kernel.org # way, way back
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:13 -04:00
Al Viro
8f7e885a4c blk_rq_map_user(): use import_single_range()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:13 -04:00
Al Viro
e272b89ff8 sg_io(): use import_iovec()
... and don't skip access_ok() validation.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:13 -04:00
Al Viro
17d17e7282 process_vm_access: switch to {compat_,}import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:12 -04:00
Al Viro
b353a1f7bb switch keyctl_instantiate_key_common() to iov_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:12 -04:00
Al Viro
0504c074b5 switch {compat_,}do_readv_writev() to {compat_,}import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:12 -04:00
Al Viro
32a56afa23 aio_setup_vectored_rw(): switch to {compat_,}import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:11 -04:00
Al Viro
345995fa48 vmsplice_to_user(): switch to import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:11 -04:00
Al Viro
d4fb392f4c kill aio_setup_single_vector()
identical to import_single_range()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:27:10 -04:00
Al Viro
36e9f6535f Merge branch 'iov_iter' into for-next 2015-04-11 22:26:51 -04:00
Al Viro
a96114fa1a aio: simplify arguments of aio_setup_..._rw()
We don't need req in either of those.  We don't need nr_segs in caller.
We don't really need len in caller either - iov_iter_count(&iter) will do.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:26:45 -04:00
Al Viro
4c185ce06d aio: lift iov_iter_init() into aio_setup_..._rw()
the only non-trivial detail is that we do it before rw_verify_area(),
so we'd better cap the length ourselves in aio_setup_single_rw()
case (for vectored case rw_copy_check_uvector() will do that for us).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:26:45 -04:00
Al Viro
ac15ac0669 lift iov_iter into {compat_,}do_readv_writev()
get it closer to matching {compat_,}rw_copy_check_uvector().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:26:45 -04:00
Al Viro
c0fec3a98b Merge branch 'iocb' into for-next 2015-04-11 22:24:41 -04:00
Andrew Elble
c1b8940b42 NFS: fix BUG() crash in notify_change() with patch to chown_common()
We have observed a BUG() crash in fs/attr.c:notify_change(). The crash
occurs during an rsync into a filesystem that is exported via NFS.

1.) fs/attr.c:notify_change() modifies the caller's version of attr.
2.) 6de0ec00ba ("VFS: make notify_change pass ATTR_KILL_S*ID to
    setattr operations") introduced a BUG() restriction such that "no
    function will ever call notify_change() with both ATTR_MODE and
    ATTR_KILL_S*ID set". Under some circumstances though, it will have
    assisted in setting the caller's version of attr to this very
    combination.
3.) 27ac0ffeac ("locks: break delegations on any attribute
    modification") introduced code to handle breaking
    delegations. This can result in notify_change() being re-called. attr
    _must_ be explicitly reset to avoid triggering the BUG() established
    in #2.
4.) The path that that triggers this is via fs/open.c:chmod_common().
    The combination of attr flags set here and in the first call to
    notify_change() along with a later failed break_deleg_wait()
    results in notify_change() being called again via retry_deleg
    without resetting attr.

Solution is to move retry_deleg in chmod_common() a bit further up to
ensure attr is completely reset.

There are other places where this seemingly could occur, such as
fs/utimes.c:utimes_common(), but the attr flags are not initially
set in such a way to trigger this.

Fixes: 27ac0ffeac ("locks: break delegations on any attribute modification")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:34 -04:00
J. Bruce Fields
3d330dc175 dcache: return -ESTALE not -EBUSY on distributed fs race
On a distributed filesystem it's possible for lookup to discover that a
directory it just found is already cached elsewhere in the directory
heirarchy.  The dcache won't let us keep the directory in both places,
so we have to move the dentry to the new location from the place we
previously had it cached.

If the parent has changed, then this requires all the same locks as we'd
need to do a cross-directory rename.  But we're already in lookup
holding one parent's i_mutex, so it's too late to acquire those locks in
the right order.

The (unreliable) solution in __d_unalias is to trylock() the required
locks and return -EBUSY if it fails.

I see no particular reason for returning -EBUSY, and -ESTALE is already
the result of some other lookup races on NFS.  I think -ESTALE is the
more helpful error return.  It also allows us to take advantage of the
logic Jeff Layton added in c6a9428401 "vfs: fix renameat to retry on
ESTALE errors" and ancestors, which hopefully resolves some of these
errors before they're returned to userspace.

I can reproduce these cases using NFS with:

	ssh root@$client '
		mount -olookupcache=pos '$server':'$export' /mnt/
		mkdir /mnt/TO
		mkdir /mnt/DIR
		touch /mnt/DIR/test.txt
		while true; do
			strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
		done
	'
	ssh root@$server '
		while true; do
			mv $export/DIR $export/TO/DIR
			mv $export/TO/DIR $export/DIR
		done
	'

It also helps to add some other concurrent use of the directory on the
client (e.g., "ls /mnt/TO").  And you can replace the server-side mv's
by client-side mv's that are repeatedly killed.  (If the client is
interrupted while waiting for the RENAME response then it's left with a
dentry that has to go under one parent or the other, but it doesn't yet
know which.)

Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:33 -04:00
Anton Altaparmakov
a632f55930 NTFS: Version 2.1.32 - Update file write from aio_write to write_iter.
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:33 -04:00
Anton Altaparmakov
171a02032b VFS: Add iov_iter_fault_in_multipages_readable()
simillar to iov_iter_fault_in_readable() but differs in that it is
not limited to faulting in the first iovec and instead faults in
"bytes" bytes iterating over the iovecs as necessary.

Also, instead of only faulting in the first and last page of the
range, all pages are faulted in.

This function is needed by NTFS when it does multi page file
writes.

Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:32 -04:00
Al Viro
e5b811e38a drop bogus check in file_open_root()
For one thing, LOOKUP_DIRECTORY will be dealt with in do_last().
For another, name can be an empty string, but not NULL - no callers
pass that and it would oops immediately if they would.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:32 -04:00
Al Viro
3f7036a071 switch security_inode_getattr() to struct path *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:32 -04:00
Al Viro
2247386243 constify tomoyo_realpath_from_path()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:31 -04:00
Al Viro
74008b365d whack-a-mole: there's no point doing set_fs(USER_DS) in sigframe setup
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:31 -04:00
Al Viro
a555ad450f whack-a-mole: no need to set_fs(USER_DS) in {start,flush}_thread()
flush_old_exec() has already done that.  Back on 2011 a bunch of
instances like that had been kicked out, but that hadn't taken
care of then-out-of-tree architectures, obviously, and they served
as reinfection vector...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:31 -04:00
Al Viro
9e7543e939 remove incorrect comment in lookup_one_len()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:30 -04:00
Al Viro
74eb8cc5a5 namei.c: fold do_path_lookup() into both callers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:24:30 -04:00
Al Viro
fd2f7cb5bc kill struct filename.separate
just make const char iname[] the last member and compare name->name with
name->iname instead of checking name->separate

We need to make sure that out-of-line name doesn't end up allocated adjacent
to struct filename refering to it; fortunately, it's easy to achieve - just
allocate that struct filename with one byte in ->iname[], so that ->iname[0]
will be inside the same object and thus have an address different from that
of out-of-line name [spotted by Boqun Feng <boqun.feng@gmail.com>]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:21:24 -04:00
Al Viro
01e97e6517 new helper: msg_data_left()
convert open-coded instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:53:35 -04:00
Al Viro
a2dd3793a1 Merge remote-tracking branch 'dh/afs' into for-davem 2015-04-11 15:51:09 -04:00
Al Viro
d8725c86ae get rid of the size argument of sock_sendmsg()
it's equal to iov_iter_count(&msg->msg_iter) in all cases

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:27:37 -04:00
Al Viro
6aa248145a switch kernel_sendmsg() and kernel_recvmsg() to iov_iter_kvec()
For kernel_sendmsg() that eliminates the need to play with setfs();
for kernel_recvmsg() it does *not* - a couple of callers are using
it with non-NULL ->msg_control, which would be treated as userland
address on recvmsg side of things.

In all cases we are really setting a kvec-backed iov_iter, though.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:02:34 -04:00
Al Viro
da18428498 net: switch importing msghdr from userland to {compat_,}import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:02:26 -04:00