Orangefs: merge with V4.4
Merge tag 'v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into current
Linux 4.4
Until now, orangefs_devreq_write_iter has just been a wrapper for
the old-fashioned orangefs_devreq_writev. Linux would call
.write_iter with "struct kiocb *iocb" and "struct iov_iter *iter"
and .write_iter would just:
return pvfs2_devreq_writev(iocb->ki_filp,
iter->iov,
iter->nr_segs,
&iocb->ki_pos);
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
This only changes the names of things, so there is no functional change.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Prefix public functions with "orangefs_" so they don't
pollute the global namespace.
This fixes a build issue on UML which also has block_signals().
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
This export_operations structure is never modified, so declare it as const.
Most other structures of this type are already const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
We have found a BUG on res->migration_pending when migrating lock
resources. The situation is as follows.
dlm_mark_lockres_migrating
res->migration_pending = 1;
__dlm_lockres_reserve_ast
dlm_lockres_release_ast returns with res->migration_pending still set
because other threads have reserved asts
wait until dlm_migration_can_proceed returns 1
>>>>>>> o2hb finds that the target has gone down and removes it
from domain_map
dlm_migration_can_proceed returns 1
dlm_mark_lockres_migrating returns -ESHUTDOWN with
res->migration_pending still set.
When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON
with res->migration_pending. So clear migration_pending when target is
down.
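The state handling described above can be modelled in userspace roughly as
follows. This is only a sketch: struct lockres_sketch, its two fields and
mark_migrating_sketch() are hypothetical stand-ins, not the dlm code, and the
wait for outstanding asts is reduced to a comment.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock resource; only two flags are modelled. */
struct lockres_sketch {
    int migration_pending;
    int migrating;
};

/*
 * Models the end of the mark-migrating path. If the target went down while
 * we were waiting, clear migration_pending before returning the error so a
 * later retry does not find stale state.
 */
int mark_migrating_sketch(struct lockres_sketch *res, bool target_alive)
{
    res->migration_pending = 1;

    /* ... wait here until outstanding asts have drained (omitted) ... */

    if (!target_alive) {
        res->migration_pending = 0;   /* the fix: do not leave it set */
        return -ESHUTDOWN;
    }

    res->migrating = 1;
    res->migration_pending = 0;
    return 0;
}
```

With the flag cleared on the error path, calling the function again after a
failed attempt starts from the same state as a first attempt.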
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When resizing, the code first extends the last gd. If a backup super
belongs in that gd, it calculates the new backup super and updates the
corresponding value.
But it currently doesn't consider the case where the backup super has
already been set up. In that case it still sets the bit in the gd bitmap
and then decrements bg_free_bits_count, which corrupts the gd and
triggers the BUG in ocfs2_block_group_set_bits:
BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
So check whether the backup super is done and then do the updates.
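A minimal userspace sketch of the "check before update" idea, assuming a toy
group descriptor with a 64-bit bitmap. struct group_sketch and
mark_backup_super_sketch() are made-up names for illustration and do not
reflect the ocfs2 on-disk layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy group descriptor: a 64-bit allocation bitmap plus a free-bit count. */
struct group_sketch {
    uint8_t  bitmap[8];
    uint16_t free_bits;
};

static bool test_bit_sketch(const uint8_t *map, unsigned int bit)
{
    return (map[bit / 8] >> (bit % 8)) & 1u;
}

/*
 * Mark the backup-super bit as used. Returns false and changes nothing if
 * the bit is already set, so the free count is never decremented twice
 * for the same bit.
 */
bool mark_backup_super_sketch(struct group_sketch *bg, unsigned int bit)
{
    if (test_bit_sketch(bg->bitmap, bit))
        return false;                       /* backup super already done */

    bg->bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
    bg->free_bits--;
    return true;
}
```

Calling the function twice for the same bit leaves free_bits decremented only
once, which is the invariant the BUG_ON above protects.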
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orangefs fails to build on 32-bit SMP configurations due to a simple
misspelling; this applies the obvious fix.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 575e946125 ("Orangefs: change pvfs2 filenames to orangefs")
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
This is an API consolidation only. The use of kmalloc + memset to 0
should be equivalent to kzalloc in this case.
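As a userspace analogy only (malloc/memset and calloc standing in for
kmalloc/memset and kzalloc), the two forms below return equally zeroed
memory. The function names are illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Old pattern: allocate, then clear by hand. */
void *alloc_then_clear_sketch(size_t size)
{
    void *p = malloc(size);

    if (p)
        memset(p, 0, size);
    return p;
}

/* Consolidated pattern: one call that returns zeroed memory. */
void *alloc_zeroed_sketch(size_t size)
{
    return calloc(1, size);
}
```

The single-call form also removes the chance of the size passed to memset
drifting away from the size passed to the allocator.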
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Merge tag 'nfsd-4.4-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fix from Bruce Fields:
"Just one fix for an NFSv4 callback bug introduced in 4.4"
* tag 'nfsd-4.4-1' of git://linux-nfs.org/~bfields/linux:
nfsd: don't hold ls_mutex across a layout recall
Pull btrfs fixes from Chris Mason:
"A couple of small fixes"
* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: check prepare_uptodate_page() error code earlier
Btrfs: check for empty bitmap list in setup_cluster_bitmaps
btrfs: fix misleading warning when space cache failed to load
Btrfs: fix transaction handle leak in balance
Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list
Writing to /proc/$pid/coredump_filter always returns -ESRCH because commit
774636e19e ("proc: convert to kstrto*()/kstrto*_from_user()") removed
the setting of ret after the get_proc_task call and incorrectly left it as
-ESRCH. Instead, return 0 when successful.
Example breakage:
echo 0 > /proc/self/coredump_filter
bash: echo: write error: No such process
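The control flow behind this bug can be sketched in plain C as follows.
filter_write_sketch() and its parameters are hypothetical; the task lookup
is reduced to a boolean and the parsing to an already-parsed value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Models a write handler that starts with ret = -ESRCH for the
 * "no such task" case. Without the "ret = 0" line, the success path
 * would fall through to "out" still carrying -ESRCH.
 */
long filter_write_sketch(bool task_found, unsigned long val,
                         unsigned long *filter, size_t count)
{
    int ret = -ESRCH;

    if (!task_found)
        goto out;

    ret = 0;            /* the fix: the lookup succeeded */
    *filter = val;
out:
    if (ret < 0)
        return ret;
    return (long)count;
}
```

On success the handler reports the number of bytes consumed, which is what
the shell's echo expects from write().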
Fixes: 774636e19e ("proc: convert to kstrto*()/kstrto*_from_user()")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All callers were outside of the file these functions were declared in,
so nothing was ever inlined anyway.
Further, this happens before I/O and any speedup by not having to do a
call will be dwarfed by the time it takes to talk to the server.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
There were previously MAX_ALIGNED_DEV_REQ_(UP|DOWN)SIZE macros which
evaluated to MAX_DEV_REQ_(UP|DOWN)SIZE+8. As it is unclear what they were
for, other than creating a situation where we accept more data than we
can parse, they are removed.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
We do need to serialize layout stateid morphing operations, but we
currently hold the ls_mutex across a layout recall which is pretty
ugly. It's also unnecessary -- once we've bumped the seqid and
copied it, we don't need to serialize the rest of the CB_LAYOUTRECALL
vs. anything else. Just drop the mutex once the copy is done.
This was causing a "workqueue leaked lock or atomic" warning and an
occasional deadlock.
There's more work to be done here but this fixes the immediate
regression.
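The locking pattern described above (bump and copy under the mutex, then
release it before the slow part) looks roughly like this in userspace. This
is a sketch with pthreads; struct layout_state_sketch and
begin_recall_sketch() are invented names, and the recall itself is left out.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical layout state: a mutex protecting a sequence id. */
struct layout_state_sketch {
    pthread_mutex_t lock;
    uint32_t seqid;
};

/*
 * Bump the seqid and take a private copy while holding the lock, then drop
 * the lock. The caller performs the (slow) recall using the returned copy
 * without holding the mutex.
 */
uint32_t begin_recall_sketch(struct layout_state_sketch *ls)
{
    uint32_t snapshot;

    pthread_mutex_lock(&ls->lock);
    snapshot = ++ls->seqid;
    pthread_mutex_unlock(&ls->lock);

    return snapshot;
}
```

Because the recall works from the snapshot, nothing it does afterwards needs
to be serialized against other users of the mutex.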
Fixes: cc8a55320b ("nfsd: serialize layout stateid morphing operations")
Cc: stable@vger.kernel.org
Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
prepare_pages() may end up calling prepare_uptodate_page() twice if our
write only spans a single page. But if the first call returns an error,
our page will be unlocked and it's not safe to call it again.
This bug goes all the way back to 2011, and it's not something commonly
hit.
While we're here, add a more explicit check for the page being truncated
away. The bare lock_page() alone is protected only by good thoughts and
i_mutex, which we're sure to regret eventually.
Reported-by: Dave Jones <dsj@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Dave Jones found a warning from kasan in setup_cluster_bitmap():
==================================================================
BUG: KASAN: stack-out-of-bounds in setup_cluster_bitmap+0xc4/0x5a0 at
addr ffff88039bef6828
Read of size 8 by task nfsd/1009
page:ffffea000e6fbd80 count:0 mapcount:0 mapping: (null)
index:0x0
flags: 0x8000000000000000()
page dumped because: kasan: bad access detected
CPU: 1 PID: 1009 Comm: nfsd Tainted: G W
4.4.0-rc3-backup-debug+ #1
ffff880065647b50 000000006bb712c2 ffff88039bef6640 ffffffffa680a43e
0000004559c00000 ffff88039bef66c8 ffffffffa62638d1 ffffffffa61121c0
ffff8803a5769de8 0000000000000296 ffff8803a5769df0 0000000000046280
Call Trace:
[<ffffffffa680a43e>] dump_stack+0x4b/0x6d
[<ffffffffa62638d1>] kasan_report_error+0x501/0x520
[<ffffffffa61121c0>] ? debug_show_all_locks+0x1e0/0x1e0
[<ffffffffa6263948>] kasan_report+0x58/0x60
[<ffffffffa6814b00>] ? rb_last+0x10/0x40
[<ffffffffa66f8af4>] ? setup_cluster_bitmap+0xc4/0x5a0
[<ffffffffa6262ead>] __asan_load8+0x5d/0x70
[<ffffffffa66f8af4>] setup_cluster_bitmap+0xc4/0x5a0
[<ffffffffa66f675a>] ? setup_cluster_no_bitmap+0x6a/0x400
[<ffffffffa66fcd16>] btrfs_find_space_cluster+0x4b6/0x640
[<ffffffffa66fc860>] ? btrfs_alloc_from_cluster+0x4e0/0x4e0
[<ffffffffa66fc36e>] ? btrfs_return_cluster_to_free_space+0x9e/0xb0
[<ffffffffa702dc37>] ? _raw_spin_unlock+0x27/0x40
[<ffffffffa666a1a1>] find_free_extent+0xba1/0x1520
Andrey noticed this was because we were doing list_first_entry on a list
that might be empty. Rework the tests a bit so we don't do that.
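For illustration, here is a self-contained model of why taking the first
entry of an empty list is unsafe, and of the guard that avoids it. The list
type and helper names are simplified stand-ins, not the kernel's list.h. On
an empty circular list head->next points back at the head itself, so
converting it to the containing entry yields a pointer to memory that is not
an entry at all.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on the kernel's. */
struct list_node_sketch {
    struct list_node_sketch *next, *prev;
};

struct bitmap_entry_sketch {
    int id;
    struct list_node_sketch link;
};

void list_init_sketch(struct list_node_sketch *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail_sketch(struct list_node_sketch *node,
                          struct list_node_sketch *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Return the first entry, or NULL when the list is empty. */
struct bitmap_entry_sketch *
first_entry_or_null_sketch(struct list_node_sketch *head)
{
    if (head->next == head)
        return NULL;        /* empty: head->next is the head, not an entry */

    return (struct bitmap_entry_sketch *)
        ((char *)head->next - offsetof(struct bitmap_entry_sketch, link));
}
```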
Signed-off-by: Chris Mason <clm@fb.com>
Reported-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Dave Jones <dsj@fb.com>
AV dislikes many parts of orangefs_devreq_writev. Besides making
orangefs_devreq_writev more easily readable and better commented,
this patch makes an effort to address some of the problems:
> The 5th is quietly ignored unless trailer_size is positive and
> status is zero. If trailer_size > 0 && status == 0, you verify that
> the length of the 5th segment is no more than trailer_size and copy
> it to vmalloc'ed buffer. Without bothering to zero the rest of that
> buffer out.
It was just wrong to allow a 5th segment that is not exactly equal to
trailer_size. Now that that's fixed, there's nothing to zero out in
the vmalloced buffer - it is exactly the right size to hold the
5th segment.
> Another API bogosity: when the 5th segment is present, successful writev()
> returns the sum of sizes of the first 4.
Added size of 5th segment to writev return...
> if concatenation of the first 4 segments is longer than
> 16 + sizeof(struct pvfs2_downcall_s) by no more than sizeof(long) => whine
> and proceed with garbage.
If 4th segment isn't exactly sizeof(struct pvfs2_downcall_s), whine and fail.
> if the 32bit value 4 bytes into op->downcall is zero and 64bit
> value following it is non-zero, the latter is interpreted as the size of
> trailer data.
The latter is what userspace claimed was the length of the trailer data.
The kernel module now compares it to the trailer iovec's iov_len as a
sanity check.
> if there's no trailer, the 5th segment (if present) is completely ignored.
Whine and fail if there should be no trailer, yet a 5th segment is present.
> if vmalloc fails, act as if status (32bit at offset 5 into
> op->downcall) had been -ENOMEM and don't look at the 5th segment at all.
Whine and fail with -ENOMEM.
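The segment checks listed above can be sketched in userspace like this.
Everything here is an assumption made for illustration: struct
downcall_sketch is a made-up fixed-size layout, the error code -EINVAL is
chosen arbitrarily (the text only says "whine and fail"), and the first
three segments are not inspected.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical downcall layout; the real structure is not shown here. */
struct downcall_sketch {
    int32_t type;
    int32_t status;
    int64_t trailer_size;
};

/*
 * Validate a 4- or 5-segment write and return the total number of bytes
 * accepted (including the trailer), or a negative errno.
 */
long check_segments_sketch(const struct iovec *iov, int count)
{
    struct downcall_sketch dc;
    bool wants_trailer;
    size_t total = 0;
    int i;

    if (count != 4 && count != 5)
        return -EINVAL;

    /* The 4th segment must be exactly one downcall. */
    if (iov[3].iov_len != sizeof(dc))
        return -EINVAL;
    memcpy(&dc, iov[3].iov_base, sizeof(dc));

    wants_trailer = dc.status == 0 && dc.trailer_size > 0;
    if (wants_trailer) {
        /* The trailer must be present and exactly the claimed size. */
        if (count != 5 || iov[4].iov_len != (size_t)dc.trailer_size)
            return -EINVAL;
    } else if (count == 5) {
        /* No trailer expected, yet a 5th segment was supplied. */
        return -EINVAL;
    }

    for (i = 0; i < count; i++)
        total += iov[i].iov_len;
    return (long)total;
}
```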
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Jan Stancek reported that I wrecked things for him by fixing things for
Vladimir :/
His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which
should not be possible, however my previous patch made this possible by
unconditionally checking signal_pending().
We cannot use current->state as was done previously, because that
variable can be changed by the very next instruction after the store
to it. We must instead pass the initial state along and use that.
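A rough model of "pass the initial state along": the wait helper receives
the mode the caller asked for and only honours a pending signal when that
mode is interruptible. The constants and bit_wait_sketch() are hypothetical
and the actual sleep is omitted.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical wait modes for this sketch. */
enum {
    WAIT_INTERRUPTIBLE_SKETCH   = 1,
    WAIT_UNINTERRUPTIBLE_SKETCH = 2,
};

/*
 * "mode" is the state the caller originally requested. It is passed in
 * explicitly rather than re-read from shared task state, which may already
 * have changed by the time we look at it.
 */
int bit_wait_sketch(int mode, bool signal_is_pending)
{
    /* ... the task would sleep here ... */

    if ((mode & WAIT_INTERRUPTIBLE_SKETCH) && signal_is_pending)
        return -EINTR;
    return 0;
}
```

An uninterruptible wait therefore never returns -EINTR, regardless of
whether a signal happens to be pending.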
Fixes: 68985633bc ("sched/wait: Fix signal handling in bit wait helpers")
Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Chris Mason <clm@fb.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Paul Turner <pjt@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: tglx@linutronix.de
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: hpa@zytor.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge misc fixes from Andrew Morton:
"17 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MIPS: fix DMA contiguous allocation
sh64: fix __NR_fgetxattr
ocfs2: fix SGID not inherited issue
mm/oom_kill.c: avoid attempting to kill init sharing same memory
drivers/base/memory.c: prohibit offlining of memory blocks with missing sections
tmpfs: fix shmem_evict_inode() warnings on i_blocks
mm/hugetlb.c: fix resv map memory leak for placeholder entries
mm: hugetlb: call huge_pte_alloc() only if ptep is null
kernel: remove stop_machine() Kconfig dependency
mm: kmemleak: mark kmemleak_init prototype as __init
mm: fix kerneldoc on mem_cgroup_replace_page
osd fs: __r4w_get_page rely on PageUptodate for uptodate
MAINTAINERS: make Vladimir co-maintainer of the memory controller
mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress
mm: fix swapped Movable and Reclaimable in /proc/pagetypeinfo
memcg: fix memory.high target
mm: hugetlb: fix hugepage memory leak caused by wrong reserve count
Pull block layer fixes from Jens Axboe:
"A set of fixes for the current series. This contains:
- A bunch of fixes for lightnvm, should be the last round for this
series. From Matias and Wenwei.
- A writeback detach inode fix from Ilya, also marked for stable.
- A block (though it says SCSI) fix for an OOPS in SCSI runtime power
management.
- Module init error path fixes for null_blk from Minfei"
* 'for-linus' of git://git.kernel.dk/linux-block:
null_blk: Fix error path in module initialization
lightnvm: do not compile in debugging by default
lightnvm: prevent gennvm module unload on use
lightnvm: fix media mgr registration
lightnvm: replace req queue with nvmdev for lld
lightnvm: comments on constants
lightnvm: check mm before use
lightnvm: refactor spin_unlock in gennvm_get_blk
lightnvm: put blks when luns configure failed
lightnvm: use flags in rrpc_get_blk
block: detach bdev inode from its wb in __blkdev_put()
SCSI: Fix NULL pointer dereference in runtime PM
Commit 8f1eb48758 ("ocfs2: fix umask ignored issue") introduced an
issue: the SGID of a subdirectory was not inherited from its parent
directory. This is because SGID is set in "inode->i_mode" in
ocfs2_get_init_inode(), but is later overwritten by "mode", which does
not have SGID set.
Fixes: 8f1eb48758 ("ocfs2: fix umask ignored issue")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 42cb14b110 ("mm: migrate dirty page without
clear_page_dirty_for_io etc") simplified the migration of a PageDirty
pagecache page: one stat needs moving from zone to zone and that's about
all.
It's convenient and safest for it to shift the PageDirty bit from old
page to new, just before updating the zone stats: before copying data
and marking the new PageUptodate. This is all done while both pages are
isolated and locked, just as before; and just as before, there's a
moment when the new page is visible in the radix_tree, but not yet
PageUptodate. What's new is that it may now be briefly visible as
PageDirty before it is PageUptodate.
When I scoured the tree to see if this could cause a problem anywhere,
the only places I found were two similar functions, both named
__r4w_get_page(), which look up a page with find_get_page() (not using
page lock), then
claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate.
I'm not sure whether that was right before, but now it might be wrong
(on rare occasions): only claim the page is uptodate if PageUptodate.
Or perhaps the page in question could never be migratable anyway?
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Boaz Harrosh <ooo@electrozaur.com>
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>