Commit Graph

60814 Commits

Author SHA1 Message Date
Linus Torvalds
29a8ea4fbe Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
 "1/ Fixes to the libnvdimm 'pfn' device that establishes a reserved
     area for storing a struct page array.

  2/ Fixes for dax operations on a raw block device to prevent pagecache
     collisions with dax mappings.

  3/ A fix for pfn_t usage in vm_insert_mixed that lead to a null
     pointer de-reference.

  These have received build success notification from the kbuild robot
  across 153 configs and pass the latest ndctl tests"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  phys_to_pfn_t: use phys_addr_t
  mm: fix pfn_t to page conversion in vm_insert_mixed
  block: use DAX for partition table reads
  block: revert runtime dax control of the raw block device
  fs, block: force direct-I/O for dax-enabled block devices
  devm_memremap_pages: fix vmem_altmap lifetime + alignment handling
  libnvdimm, pfn: fix restoring memmap location
  libnvdimm: fix mode determination for e820 devices
2016-02-01 15:21:20 -08:00
Chandan Rajendra
65bfa65807 Btrfs: btrfs_ioctl_clone: Truncate complete page after performing clone operation
In subpagesize-blocksize scenario, the "destination offset" argument passed to
the btrfs_ioctl_clone() can be aligned to sectorsize but may not be
necessarily aligned to the machine's page size. In such cases,
truncate_inode_pages_range() ends up zeroing out the partial page and future
read operations will return incorrect data. Hence this commit explicitly
rounds down the "destination offset" to the machine's page size.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
27772b68f6 Btrfs: Clean pte corresponding to page straddling i_size
When extending a file by either "truncate up" or by writing beyond i_size, the
page which had i_size needs to be marked "read only" so that future writes to
the page via mmap interface causes btrfs_page_mkwrite() to be invoked. If not,
a write performed after extending the file via the mmap interface will find
the page to be writaeable and continue writing to the page without invoking
btrfs_page_mkwrite() i.e. we end up writing to a file without reserving disk
space.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
5a2834f808 Btrfs: Fix block size returned to user space
btrfs_getattr() returns PAGE_CACHE_SIZE as the block size. Since
generic_fillattr() already does the right thing (by obtaining block size
from inode->i_blkbits), just remove the statement from btrfs_getattr.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
0c29ba993e Btrfs: Limit inline extents to root->sectorsize
cow_file_range_inline() limits the size of an inline extent to
PAGE_CACHE_SIZE. This breaks in subpagesize-blocksize scenarios. Fix this by
comparing against root->sectorsize.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
5f4dc8fc83 Btrfs: btrfs_submit_direct_hook: Handle map_length < bio vector length
In subpagesize-blocksize scenario, map_length can be less than the length of a
bio vector. Such a condition may cause btrfs_submit_direct_hook() to submit a
zero length bio. Fix this by comparing map_length against block size rather
than with bv_len.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
298cfd3683 Btrfs: Use (eb->start, seq) as search key for tree modification log
In subpagesize-blocksize a page can map multiple extent buffers and hence
using (page index, seq) as the search key is incorrect. For example, searching
through tree modification log tree can return an entry associated with the
first extent buffer mapped by the page (if such an entry exists), when we are
actually searching for entries associated with extent buffers that are mapped
at position 2 or more in the page.

Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
dbfdb6d1b3 Btrfs: Search for all ordered extents that could span across a page
In subpagesize-blocksize scenario it is not sufficient to search using the
first byte of the page to make sure that there are no ordered extents
present across the page. Fix this.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
d0b7da88f6 Btrfs: btrfs_page_mkwrite: Reserve space in sectorsized units
In subpagesize-blocksize scenario, if i_size occurs in a block which is not
the last block in the page, then the space to be reserved should be calculated
appropriately.

Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
9703fefe0b Btrfs: fallocate: Work with sectorsized blocks
While at it, this commit changes btrfs_truncate_page() to truncate sectorsized
blocks instead of pages. Hence the function has been renamed to
btrfs_truncate_block().

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:24:29 +01:00
Chandan Rajendra
2dabb32484 Btrfs: Direct I/O read: Work on sectorsized blocks
The direct I/O read's endio and corresponding repair functions work on
page sized blocks. This commit adds the ability for direct I/O read to work on
subpagesized blocks.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:23:47 +01:00
Chandan Rajendra
c40a3d38af Btrfs: Compute and look up csums based on sectorsized blocks
Checksums are applicable to sectorsize units. The current code uses
bio->bv_len units to compute and look up checksums. This works on machines
where sectorsize == PAGE_SIZE. This patch makes the checksum computation and
look up code to work with sectorsize units.

Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:23:47 +01:00
Chandan Rajendra
2e78c927d7 Btrfs: __btrfs_buffered_write: Reserve/release extents aligned to block size
Currently, the code reserves/releases extents in multiples of PAGE_CACHE_SIZE
units. Fix this by doing reservation/releases in block size units.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-01 19:23:47 +01:00
David Howells
a5b3a80b89 CacheFiles: Provide read-and-reset release counters for cachefilesd
Provide read-and-reset objects- and blocks-released counters for cachefilesd
to use to work out whether there's anything new that can be culled.

One of the problems cachefilesd has is that if all the objects in the cache
are pinned by inodes lying dormant in the kernel inode cache, there isn't
anything for it to cull.  In such a case, it just spins around walking the
filesystem tree and scanning for something to cull.  This eats up a lot of
CPU time.

By telling cachefilesd if there have been any releases, the daemon can
sleep until there is the possibility of something to do.

cachefilesd finds this information by the following means:

 (1) When the control fd is read, the kernel presents a list of values of
     interest.  "freleased=N" and "breleased=N" are added to this list to
     indicate the number of files released and number of blocks released
     since the last read call.  At this point the counters are reset.

 (2) POLLIN is signalled if the number of files released becomes greater
     than 0.

Note that by 'released' it just means that the kernel has released its
interest in those files for the moment, not necessarily that the files
should be deleted from the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-02-01 12:30:10 -05:00
Trond Myklebust
e5003b2f6a NFSv4.x: Fix NFS4ERR_RETRY_UNCACHED_REP in nfs4_callback_sequence
We need to initialize cb_sequenceres information when reporting a
NFS4ERR_RETRY_UNCACHED_REP error, since that will apply to the
next operation, not to the CB_SEQUENCE itself.

Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-02-01 12:06:24 -05:00
Linus Torvalds
dc799d0179 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "The timer departement delivers:

   - a regression fix for the NTP code along with a proper selftest
   - prevent a spurious timer interrupt in the NOHZ lowres code
   - a fix for user space interfaces returning the remaining time on
     architectures with CONFIG_TIME_LOW_RES=y
   - a few patches to fix COMPILE_TEST fallout"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick/nohz: Set the correct expiry when switching to nohz/lowres mode
  clocksource: Fix dependencies for archs w/o HAS_IOMEM
  clocksource: Select CLKSRC_MMIO where needed
  tick/sched: Hide unused oneshot timer code
  kselftests: timers: Add adjtimex SETOFFSET validity tests
  ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO
  itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper
  posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper
  timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper
  hrtimer: Handle remaining time proper for TIME_LOW_RES
  clockevents/tcb_clksrc: Prevent disabling an already disabled clock
2016-01-31 15:49:06 -08:00
Mike Krinkin
7ddc971f86 block: fix use-after-free in dio_bio_complete
kasan reported the following error when i ran xfstest:

[  701.826854] ==================================================================
[  701.826864] BUG: KASAN: use-after-free in dio_bio_complete+0x41a/0x600 at addr ffff880080b95f94
[  701.826870] Read of size 4 by task loop2/3874
[  701.826879] page:ffffea000202e540 count:0 mapcount:0 mapping:          (null) index:0x0
[  701.826890] flags: 0x100000000000000()
[  701.826895] page dumped because: kasan: bad access detected
[  701.826904] CPU: 3 PID: 3874 Comm: loop2 Tainted: G    B   W    L  4.5.0-rc1-next-20160129 #83
[  701.826910] Hardware name: LENOVO 23205NG/23205NG, BIOS G2ET95WW (2.55 ) 07/09/2013
[  701.826917]  ffff88008fadf800 ffff88008fadf758 ffffffff81ca67bb 0000000041b58ab3
[  701.826941]  ffffffff830d1e74 ffffffff81ca6724 ffff88008fadf748 ffffffff8161c05c
[  701.826963]  0000000000000282 ffff88008fadf800 ffffed0010172bf2 ffffea000202e540
[  701.826987] Call Trace:
[  701.826997]  [<ffffffff81ca67bb>] dump_stack+0x97/0xdc
[  701.827005]  [<ffffffff81ca6724>] ? _atomic_dec_and_lock+0xc4/0xc4
[  701.827014]  [<ffffffff8161c05c>] ? __dump_page+0x32c/0x490
[  701.827023]  [<ffffffff816b0d03>] kasan_report_error+0x5f3/0x8b0
[  701.827033]  [<ffffffff817c302a>] ? dio_bio_complete+0x41a/0x600
[  701.827040]  [<ffffffff816b1119>] __asan_report_load4_noabort+0x59/0x80
[  701.827048]  [<ffffffff817c302a>] ? dio_bio_complete+0x41a/0x600
[  701.827053]  [<ffffffff817c302a>] dio_bio_complete+0x41a/0x600
[  701.827057]  [<ffffffff81bd19c8>] ? blk_queue_exit+0x108/0x270
[  701.827060]  [<ffffffff817c32b0>] dio_bio_end_aio+0xa0/0x4d0
[  701.827063]  [<ffffffff817c3210>] ? dio_bio_complete+0x600/0x600
[  701.827067]  [<ffffffff81bd2806>] ? blk_account_io_completion+0x316/0x5d0
[  701.827070]  [<ffffffff81bafe89>] bio_endio+0x79/0x200
[  701.827074]  [<ffffffff81bd2c9f>] blk_update_request+0x1df/0xc50
[  701.827078]  [<ffffffff81c02c27>] blk_mq_end_request+0x57/0x120
[  701.827081]  [<ffffffff81c03670>] __blk_mq_complete_request+0x310/0x590
[  701.827084]  [<ffffffff812348d8>] ? set_next_entity+0x2f8/0x2ed0
[  701.827088]  [<ffffffff8124b34d>] ? put_prev_entity+0x22d/0x2a70
[  701.827091]  [<ffffffff81c0394b>] blk_mq_complete_request+0x5b/0x80
[  701.827094]  [<ffffffff821e2a33>] loop_queue_work+0x273/0x19d0
[  701.827098]  [<ffffffff811f6578>] ? finish_task_switch+0x1c8/0x8e0
[  701.827101]  [<ffffffff8129d058>] ? trace_hardirqs_on_caller+0x18/0x6c0
[  701.827104]  [<ffffffff821e27c0>] ? lo_read_simple+0x890/0x890
[  701.827108]  [<ffffffff8129dd60>] ? debug_check_no_locks_freed+0x350/0x350
[  701.827111]  [<ffffffff811f63b0>] ? __hrtick_start+0x130/0x130
[  701.827115]  [<ffffffff82a0c8f6>] ? __schedule+0x936/0x20b0
[  701.827118]  [<ffffffff811dd6bd>] ? kthread_worker_fn+0x3ed/0x8d0
[  701.827121]  [<ffffffff811dd4ed>] ? kthread_worker_fn+0x21d/0x8d0
[  701.827125]  [<ffffffff8129d058>] ? trace_hardirqs_on_caller+0x18/0x6c0
[  701.827128]  [<ffffffff811dd57f>] kthread_worker_fn+0x2af/0x8d0
[  701.827132]  [<ffffffff811dd2d0>] ? __init_kthread_worker+0x170/0x170
[  701.827135]  [<ffffffff82a1ea46>] ? _raw_spin_unlock_irqrestore+0x36/0x60
[  701.827138]  [<ffffffff811dd2d0>] ? __init_kthread_worker+0x170/0x170
[  701.827141]  [<ffffffff811dd2d0>] ? __init_kthread_worker+0x170/0x170
[  701.827144]  [<ffffffff811dd00b>] kthread+0x24b/0x3a0
[  701.827148]  [<ffffffff811dcdc0>] ? kthread_create_on_node+0x4c0/0x4c0
[  701.827151]  [<ffffffff8129d70d>] ? trace_hardirqs_on+0xd/0x10
[  701.827155]  [<ffffffff8116d41d>] ? do_group_exit+0xdd/0x350
[  701.827158]  [<ffffffff811dcdc0>] ? kthread_create_on_node+0x4c0/0x4c0
[  701.827161]  [<ffffffff82a1f52f>] ret_from_fork+0x3f/0x70
[  701.827165]  [<ffffffff811dcdc0>] ? kthread_create_on_node+0x4c0/0x4c0
[  701.827167] Memory state around the buggy address:
[  701.827170]  ffff880080b95e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  701.827172]  ffff880080b95f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  701.827175] >ffff880080b95f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  701.827177]                          ^
[  701.827179]  ffff880080b96000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  701.827182]  ffff880080b96080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  701.827183] ==================================================================

The problem is that bio_check_pages_dirty calls bio_put, so we must
not access bio fields after bio_check_pages_dirty.

Fixes: 9b81c84235 ("block: don't access bio->bi_error after bio_put()").
Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-30 22:02:10 -07:00
David S. Miller
53729eb174 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:

====================
pull request: bluetooth 2016-01-30

Here's a set of important Bluetooth fixes for the 4.5 kernel:

 - Two fixes to 6LoWPAN code (one fixing a potential crash)
 - Fix LE pairing with devices using both public and random addresses
 - Fix allocation of dynamic LE PSM values
 - Fix missing COMPATIBLE_IOCTL for UART line discipline

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-30 15:32:42 -08:00
Dan Williams
d1a5f2b4d8 block: use DAX for partition table reads
Avoid populating pagecache when the block device is in DAX mode.
Otherwise these page cache entries collide with the fsync/msync
implementation and break data durability guarantees.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-30 13:35:32 -08:00
Dan Williams
9f4736fe7c block: revert runtime dax control of the raw block device
Dynamically enabling DAX requires that the page cache first be flushed
and invalidated.  This must occur atomically with the change of DAX mode
otherwise we confuse the fsync/msync tracking and violate data
durability guarantees.  Eliminate the possibilty of DAX-disabled to
DAX-enabled transitions for now and revisit this for the next cycle.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-30 13:35:31 -08:00
Borislav Petkov
bc696ca05f x86/cpufeature: Replace the old static_cpu_has() with safe variant
So the old one didn't work properly before alternatives had run.
And it was supposed to provide an optimized JMP because the
assumption was that the offset it is jumping to is within a
signed byte and thus a two-byte JMP.

So I did an x86_64 allyesconfig build and dumped all possible
sites where static_cpu_has() was used. The optimization amounted
to all in all 12(!) places where static_cpu_has() had generated
a 2-byte JMP. Which has saved us a whopping 36 bytes!

This clearly is not worth the trouble so we can remove it. The
only place where the optimization might count - in __switch_to()
- we will handle differently. But that's not subject of this
patch.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1453842730-28463-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-30 11:22:18 +01:00
Linus Torvalds
d3f71ae711 Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "Dave had a small collection of fixes to the new free space tree code,
  one of which was keeping our sysfs files more up to date with feature
  bits as different things get enabled (lzo, raid5/6, etc).

  I should have kept the sysfs stuff for rc3, since we always manage to
  trip over something.  This time it was GFP_KERNEL from somewhere that
  is NOFS only.  Instead of rebasing it out I've put a revert in, and
  we'll fix it properly for rc3.

  Otherwise, Filipe fixed a btrfs DIO race and Qu Wenruo fixed up a
  use-after-free in our tracepoints that Dave Jones reported"

* 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Revert "btrfs: synchronize incompat feature bits with sysfs files"
  btrfs: don't use GFP_HIGHMEM for free-space-tree bitmap kzalloc
  btrfs: sysfs: check initialization state before updating features
  Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"
  btrfs: async-thread: Fix a use-after-free error for trace
  Btrfs: fix race between fsync and lockless direct IO writes
  btrfs: add free space tree to the cow-only list
  btrfs: add free space tree to lockdep classes
  btrfs: tweak free space tree bitmap allocation
  btrfs: tests: switch to GFP_KERNEL
  btrfs: synchronize incompat feature bits with sysfs files
  btrfs: sysfs: introduce helper for syncing bits with sysfs files
  btrfs: sysfs: add free-space-tree bit attribute
  btrfs: sysfs: fix typo in compat_ro attribute definition
2016-01-29 15:46:49 -08:00
Chris Mason
e410e34fad Revert "btrfs: synchronize incompat feature bits with sysfs files"
This reverts commit 14e46e0495.

This ends up doing sysfs operations from deep in balance (where we
should be GFP_NOFS) and under heavy balance load, we're making races
against sysfs internals.

Revert it for now while we figure things out.

Signed-off-by: Chris Mason <clm@fb.com>
2016-01-29 08:19:37 -08:00
Martin Brandenburg
99109822f5 orangefs: Fix revalidate.
Previously, it would update a live inode. This was fixed, but it did not
ever check that the inode attributes in the dcache are correct. This
checks all inode attributes and rejects any that are not correct, which
causes a lookup and thus a new getattr.

Perhaps inode_operations->permission should replace or augment some of
this.

There is no actual caching, and this does a rather excessive amount of
network operations back to the filesystem server.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-28 15:08:40 -05:00
Martin Brandenburg
394f647e3a orangefs: Util functions shouldn't operate on inode where it can be avoided.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-28 15:08:39 -05:00
Trond Myklebust
2370abdab5 NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE
NFS_LAYOUT_RETURN_BEFORE_CLOSE is being used to signal that a
layoutreturn is needed, either due to a layout recall or to a
layout error. Rename it to NFS_LAYOUT_RETURN_REQUESTED in order
to clarify its purpose.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-27 20:40:05 -05:00
Marcel Holtmann
d10d34aa7c Bluetooth: Add missing COMPATIBLE_IOCTL for UART line discipline
The HCIUARTGETDEVICE, HCIUARTSETFLAGS and HCIUARTGETFLAGS ioctl are
missing the COMPATIBLE_IOCTL declaration.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-01-27 10:48:26 -05:00
Chris Mason
e1c0ebad3f btrfs: don't use GFP_HIGHMEM for free-space-tree bitmap kzalloc
This was copied incorrectly from the __vmalloc call.

Signed-off-by: Chris Mason <clm@fb.com>
2016-01-27 07:05:49 -08:00
Chris Mason
d32a4e3434 Merge branch 'dev/fst-followup' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 2016-01-27 05:48:23 -08:00
David Sterba
bf6092066f btrfs: sysfs: check initialization state before updating features
If the mount phase is not finished, we can't update the sysfs files.

Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-01-27 05:40:10 -08:00
Herbert Xu
3095e8e366 eCryptfs: Use skcipher and shash
This patch replaces uses of ablkcipher and blkcipher with skcipher,
and the long obsolete hash interface with shash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:36:18 +08:00
Herbert Xu
1edb82d202 nfsd: Use shash
This patch replaces uses of the long obsolete hash interface with
shash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:36:13 +08:00
Herbert Xu
2731a944f6 f2fs: Use skcipher
This patch replaces uses of ablkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:35:55 +08:00
Herbert Xu
3f32a5bee0 ext4: Use skcipher
This patch replaces uses of ablkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:35:55 +08:00
Herbert Xu
9651ddbac8 cifs: Use skcipher
This patch replaces uses of blkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:35:53 +08:00
Trond Myklebust
13c13a6ad7 pNFS: Fix missing layoutreturn calls
The layoutreturn code currently relies on pnfs_put_lseg() to initiate the
RPC call when conditions are right. A problem arises when we want to
free the layout segment from inside an inode->i_lock section (e.g. in
pnfs_clear_request_commit()), since we cannot sleep.

The workaround is to move the actual call to pnfs_send_layoutreturn()
to pnfs_put_layout_hdr(), which doesn't have this restriction.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-26 23:12:11 -05:00
David Sterba
80ad623edd Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"
This reverts commit 6962491321. The
cleaner thread can block freezing when there's a snapshot cleaning in
progress and the other threads get suspended first. From the logs
provided by Martin we're waiting for reading extent pages:

kernel: PM: Syncing filesystems ... done.
kernel: Freezing user space processes ... (elapsed 0.015 seconds) done.
kernel: Freezing remaining freezable tasks ...
kernel: Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
kernel: btrfs-cleaner   D ffff88033dd13bc0     0   152      2 0x00000000
kernel: ffff88032ebc2e00 ffff88032e750000 ffff88032e74fa50 7fffffffffffffff
kernel: ffffffff814a58df 0000000000000002 ffffea000934d580 ffffffff814a5451
kernel: 7fffffffffffffff ffffffff814a6e8f 0000000000000000 0000000000000020
kernel: Call Trace:
kernel: [<ffffffff814a58df>] ? bit_wait+0x2c/0x2c
kernel: [<ffffffff814a5451>] ? schedule+0x6f/0x7c
kernel: [<ffffffff814a6e8f>] ? schedule_timeout+0x2f/0xd8
kernel: [<ffffffff81076f94>] ? timekeeping_get_ns+0xa/0x2e
kernel: [<ffffffff81077603>] ? ktime_get+0x36/0x44
kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
kernel: [<ffffffff814a590b>] ? bit_wait_io+0x2c/0x30
kernel: [<ffffffff814a5694>] ? __wait_on_bit+0x41/0x73
kernel: [<ffffffff8109eba8>] ? wait_on_page_bit+0x6d/0x72
kernel: [<ffffffff8105d718>] ? autoremove_wake_function+0x2a/0x2a
kernel: [<ffffffff811a02d7>] ? read_extent_buffer_pages+0x1bd/0x203
kernel: [<ffffffff8117d9e9>] ? free_root_pointers+0x4c/0x4c
kernel: [<ffffffff8117e831>] ? btree_read_extent_buffer_pages.constprop.57+0x5a/0xe9
kernel: [<ffffffff8117f4f3>] ? read_tree_block+0x2d/0x45
kernel: [<ffffffff8116782a>] ? read_block_for_search.isra.34+0x22a/0x26b
kernel: [<ffffffff811656c3>] ? btrfs_set_path_blocking+0x1e/0x4a
kernel: [<ffffffff8116919b>] ? btrfs_search_slot+0x648/0x736
kernel: [<ffffffff81170559>] ? btrfs_lookup_extent_info+0xb7/0x2c7
kernel: [<ffffffff81170ee5>] ? walk_down_proc+0x9c/0x1ae
kernel: [<ffffffff81171c9d>] ? walk_down_tree+0x40/0xa4
kernel: [<ffffffff8117375f>] ? btrfs_drop_snapshot+0x2da/0x664
kernel: [<ffffffff8104ff21>] ? finish_task_switch+0x126/0x167
kernel: [<ffffffff811850f8>] ? btrfs_clean_one_deleted_snapshot+0xa6/0xb0
kernel: [<ffffffff8117eaba>] ? cleaner_kthread+0x13e/0x17b
kernel: [<ffffffff8117e97c>] ? btrfs_item_end+0x33/0x33
kernel: [<ffffffff8104d256>] ? kthread+0x95/0x9d
kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16
kernel: [<ffffffff814a7b5f>] ? ret_from_fork+0x3f/0x70
kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16

As this affects a released kernel (4.4) we need a minimal fix for
stable kernels.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=108361
Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
CC: stable@vger.kernel.org # 4.4
CC: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-01-25 16:50:27 -08:00
Qu Wenruo
0a95b85137 btrfs: async-thread: Fix a use-after-free error for trace
Parameter of trace_btrfs_work_queued() can be freed in its workqueue.
So no one use use that pointer after queue_work().

Fix the user-after-free bug by move the trace line before queue_work().

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-01-25 16:50:26 -08:00
Filipe Manana
de0ee0edb2 Btrfs: fix race between fsync and lockless direct IO writes
An fsync, using the fast path, can race with a concurrent lockless direct
IO write and end up logging a file extent item that points to an extent
that wasn't written to yet. This is because the fast fsync path collects
ordered extents into a local list and then collects all the new extent
maps to log file extent items based on them, while the direct IO write
path creates the new extent map before it creates the corresponding
ordered extent (and submitting the respective bio(s)).

So fix this by making the direct IO write path create ordered extents
before the extent maps and make the fast fsync path collect any new
ordered extents after it collects the extent maps.
Note that making the fsync handler call inode_dio_wait() (after acquiring
the inode's i_mutex) would not work and lead to a deadlock when doing
AIO, as through AIO we end up in a path where the fsync handler is called
(through dio_aio_complete_work() -> dio_complete() -> vfs_fsync_range())
before the inode's dio counter is decremented (inode_dio_wait() waits
for this counter to have a value of zero).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-01-25 16:50:26 -08:00
Chris Mason
6b5aa88c86 Merge branch 'fix/fst-sysfs' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5
Signed-off-by: Chris Mason <clm@fb.com>
2016-01-25 16:43:13 -08:00
David Sterba
3e4c5efbb3 btrfs: add free space tree to the cow-only list
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-25 16:48:07 +01:00
David Sterba
6b20e0ad2e btrfs: add free space tree to lockdep classes
Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-25 16:48:06 +01:00
Trond Myklebust
810d82e683 NFSv4.x: Allow multiple callbacks in flight
Hook the callback channel into the same session management machinery
as we use for the forward channel.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-25 09:36:21 -05:00
Trond Myklebust
5f83d86cf5 NFSv4.x: Fix wraparound issues when validing the callback sequence id
We need to make sure that we don't allow args->csa_sequenceid == 0.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:49 -05:00
Trond Myklebust
80f9642724 NFSv4.x: Enforce the ca_maxresponsesize_cached on the back channel
We have no duplicate reply cache, so we always set the back channel
ca_maxresponsesize_cached to zero when negotiating the session.
That means we should always error out as soon as we see the server
set args->csa_cachethis.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:48 -05:00
Trond Myklebust
f74a834a0e NFSv4.x: CB_SEQUENCE should return NFS4ERR_DELAY if still executing
See RFC5661 Section 2.10.6.2: if retrying a request, and the old one is
still in progress, we must return NFS4ERR_DELAY as the reply to sequence.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:48 -05:00
Trond Myklebust
f4f58ed19b NFSv4.x: Remove hard coded slotids in callback channel
Instead, use the values encoded in the slot table itself.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-24 17:12:47 -05:00
Linus Torvalds
e2464688b5 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "This is the main pull request for MIPS for 4.5 plus some 4.4 fixes.

  The executive summary:

   - ATH79 platform improvments, use DT bindings for the ATH79 USB PHY.
   - Avoid useless rebuilds for zboot.
   - jz4780: Add NEMC, BCH and NAND device tree nodes
   - Initial support for the MicroChip's DT platform.  As all the device
     drivers are missing this is still of limited use.
   - Some Loongson3 cleanups.
   - The unavoidable whitespace polishing.
   - Reduce clock skew when synchronizing the CPU cycle counters on CPU
     startup.
   - Add MIPS R6 fixes.
   - Lots of cleanups across arch/mips as fallout from KVM.
   - Lots of minor fixes and changes for IEEE 754-2008 support to the
     FPU emulator / fp-assist software.
   - Minor Ralink, BCM47xx and bcm963xx platform support improvments.
   - Support SMP on BCM63168"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (84 commits)
  MIPS: zboot: Add support for serial debug using the PROM
  MIPS: zboot: Avoid useless rebuilds
  MIPS: BMIPS: Enable ARCH_WANT_OPTIONAL_GPIOLIB
  MIPS: bcm63xx: nvram: Remove unused bcm63xx_nvram_get_psi_size() function
  MIPS: bcm963xx: Update bcm_tag field image_sequence
  MIPS: bcm963xx: Move extended flash address to bcm_tag header file
  MIPS: bcm963xx: Move Broadcom BCM963xx image tag data structure
  MIPS: bcm63xx: nvram: Use nvram structure definition from header file
  MIPS: bcm963xx: Add Broadcom BCM963xx board nvram data structure
  MAINTAINERS: Add KVM for MIPS entry
  MIPS: KVM: Add missing newline to kvm_err()
  MIPS: Move KVM specific opcodes into asm/inst.h
  MIPS: KVM: Use cacheops.h definitions
  MIPS: Break down cacheops.h definitions
  MIPS: Use EXCCODE_ constants with set_except_vector()
  MIPS: Update trap codes
  MIPS: Move Cause.ExcCode trap codes to mipsregs.h
  MIPS: KVM: Make kvm_mips_{init,exit}() static
  MIPS: KVM: Refactor added offsetof()s
  MIPS: KVM: Convert EXPORT_SYMBOL to _GPL
  ...
2016-01-24 12:50:56 -08:00
Linus Torvalds
00e3f5cc30 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
 "The two main changes are aio support in CephFS, and a series that
  fixes several issues in the authentication key timeout/renewal code.

  On top of that are a variety of cleanups and minor bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: remove outdated comment
  libceph: kill off ceph_x_ticket_handler::validity
  libceph: invalidate AUTH in addition to a service ticket
  libceph: fix authorizer invalidation, take 2
  libceph: clear messenger auth_retry flag if we fault
  libceph: fix ceph_msg_revoke()
  libceph: use list_for_each_entry_safe
  ceph: use i_size_{read,write} to get/set i_size
  ceph: re-send AIO write request when getting -EOLDSNAP error
  ceph: Asynchronous IO support
  ceph: Avoid to propagate the invalid page point
  ceph: fix double page_unlock() in page_mkwrite()
  rbd: delete an unnecessary check before rbd_dev_destroy()
  libceph: use list_next_entry instead of list_entry_next
  ceph: ceph_frag_contains_value can be boolean
  ceph: remove unused functions in ceph_frag.h
2016-01-24 12:34:13 -08:00
Linus Torvalds
772950ed21 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull SMB3 fixes from Steve French:
 "A collection of CIFS/SMB3 fixes.

  It includes a couple bug fixes, a few for improved debugging of
  cifs.ko and some improvements to the way cifs does key generation.

  I do have some additional bug fixes I expect in the next week or two
  (to address a problem found by xfstest, and some fixes for SMB3.11
  dialect, and a couple patches that just came in yesterday that I am
  reviewing)"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cifs_dbg() outputs an uninitialized buffer in cifs_readdir()
  cifs: fix race between call_async() and reconnect()
  Prepare for encryption support (first part). Add decryption and encryption key generation. Thanks to Metze for helping with this.
  cifs: Allow using O_DIRECT with cache=loose
  cifs: Make echo interval tunable
  cifs: Check uniqueid for SMB2+ and return -ESTALE if necessary
  Print IP address of unresponsive server
  cifs: Ratelimit kernel log messages
2016-01-24 12:31:12 -08:00