Commit Graph

55722 Commits

Author SHA1 Message Date
Linus Torvalds
a36cf68651 SPI NOR changes:
Core changes:
   * Support non-uniform erase size
   * Support controllers with limited TX fifo size
 
  Driver changes:
   * m25p80: Re-issue a WREN command after each write access
   * cadence: Pass a proper dir value to dma_[un]map_single()
   * fsl-qspi: Check fsl_qspi_get_seqid() return val make sure 4B
     addressing opcodes are properly handled
   * intel-spi: Add a new PCI entry for Ice Lake
 
 NAND changes:
  Raw NAND core changes:
  - Two batchs of cleanups of the NAND API, including:
    * Deprecating a lot of interfaces (now replaced by ->exec_op()).
    * Moving code in separate drivers (JEDEC, ONFI), in private files
      (internals), in platform drivers, etc.
    * Functions/structures reordering.
    * Exclusive use of the nand_chip structure instead of the MTD one
      all across the subsystem.
  - Addition of the nand_wait_readrdy/rdy_op() helpers.
 
  Raw NAND controllers drivers changes:
  - Various coccinelle patches.
  - Marvell:
    * Use regmap_update_bits() for syscon access.
    * More documentation.
    * BCH failure path rework.
    * More layouts to be supported.
    * IRQ handler complete() condition fixed.
  - Fsl_ifc:
    * SRAM initialization fixed for newer controller versions.
  - Denali:
    * Fix licenses mismatch and use a SPDX tag.
    * Set SPARE_AREA_SKIP_BYTES register to 8 if unset.
  - Qualcomm:
    * Do not include dma-direct.h.
  - Docg4:
    * Removed.
  - Ams-delta:
    * Use of a GPIO lookup table
    * Internal machinery changes.
 
  Raw NAND chip drivers changes:
  - Toshiba:
    * Add support for Toshiba memory BENAND
    * Pass a single nand_chip object to the status helper.
  - ESMT:
    * New driver to retrieve the ECC requirements from the 5th ID byte.
 
 MTD changes:
  * physmap cleanups/fixe
  * gpio-addr-flash cleanups/fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCgA6FiEEKmCqpbOU668PNA69Ze02AX4ItwAFAlvNmJocHGJvcmlzLmJy
 ZXppbGxvbkBib290bGluLmNvbQAKCRBl7TYBfgi3AKr+D/4pH0gYSGDUstZsqUHW
 gZsPRU4jmOA+OCRbSxE03bOZOYR0UPgdoUYLfAKhZ5qxc7wHd3b47wykP0kUEviu
 lC8QhSs4DUA+EuMVPDVS4FlXRT0e3dMV7jh/IK6nInshD2YkaovyCqa6GvgwFEcM
 7BCizxRhtHV8+fo7GVQWuMLb9ZfbEvz42D0sYOu6hIsF1SnRDvHOvfdDUFEXpJoF
 a2mC9ove65ChEzc2iZ/r260iZ4aoJYb9XJRJKWxmYeITZSmLNmcrUyGqAaNQ7NRc
 AIuPXASbeHGjrIuEfXpKYW07AE5MV1nJFSk3v4u5FjgoohOoobPp7npk+Lz/qwe3
 y8/uW9qOQ/iEOsRnkvMGNu4Yjhw41L1a+J5wVvUvzmwHy1xMCrRHYB/8gXoZzemR
 A7XmCPwjAFVWv1WeKV2cs5MLW9yZq9QNMtGLlNs5OgFR6CccLk67/tHIkYLsJ/l9
 IDXFhd/YKhTeF361u6Iimmgb427TwM71P9N+OMpv4nk46DxurpxkgGle3nslzKNU
 DOFNFlMGiPIx3h4X96AKER6u7cxiOXnLGq+XeHa/y8tIrziy3jg03YoTh0RSwKxV
 J3EXwh1sFLaeebwWEgpE3Ag1LOxpRCqJ2ED71SPzS/DR/938HBVsEoPqUEHNf1in
 jwsUB3cRzNwf+XX68DRCVT5GFA==
 =7w4w
 -----END PGP SIGNATURE-----

Merge tag 'mtd/for-4.20' of git://git.infradead.org/linux-mtd

Pull mtd updates from Boris Brezillon:
 "SPI NOR core changes:
   - Support non-uniform erase size
   - Support controllers with limited TX fifo size

 Driver changes:
   - m25p80: Re-issue a WREN command after each write access
   - cadence: Pass a proper dir value to dma_[un]map_single()
   - fsl-qspi: Check fsl_qspi_get_seqid() return val make sure 4B
     addressing opcodes are properly handled
   - intel-spi: Add a new PCI entry for Ice Lake

 Raw NAND core changes:
   - Two batchs of cleanups of the NAND API, including:
      * Deprecating a lot of interfaces (now replaced by ->exec_op()).
      * Moving code in separate drivers (JEDEC, ONFI), in private files
        (internals), in platform drivers, etc.
      * Functions/structures reordering.
      * Exclusive use of the nand_chip structure instead of the MTD one
        all across the subsystem.
   - Addition of the nand_wait_readrdy/rdy_op() helpers.

 Raw NAND controllers drivers changes:
   - Various coccinelle patches.
   - Marvell:
      * Use regmap_update_bits() for syscon access.
      * More documentation.
      * BCH failure path rework.
      * More layouts to be supported.
      * IRQ handler complete() condition fixed.
   - Fsl_ifc:
      * SRAM initialization fixed for newer controller versions.
   - Denali:
      * Fix licenses mismatch and use a SPDX tag.
      * Set SPARE_AREA_SKIP_BYTES register to 8 if unset.
   - Qualcomm:
      * Do not include dma-direct.h.
   - Docg4:
      * Removed.
   - Ams-delta:
      * Use of a GPIO lookup table
      * Internal machinery changes.

 Raw NAND chip drivers changes:
   - Toshiba:
      * Add support for Toshiba memory BENAND
      * Pass a single nand_chip object to the status helper.
   - ESMT:
      * New driver to retrieve the ECC requirements from the 5th ID
        byte.

  MTD changes:
   - physmap cleanups/fixe
   - gpio-addr-flash cleanups/fixes"

* tag 'mtd/for-4.20' of git://git.infradead.org/linux-mtd: (93 commits)
  jffs2: free jffs2_sb_info through jffs2_kill_sb()
  mtd: spi-nor: fsl-quadspi: fix read error for flash size larger than 16MB
  mtd: spi-nor: intel-spi: Add support for Intel Ice Lake SPI serial flash
  mtd: maps: gpio-addr-flash: Convert to gpiod
  mtd: maps: gpio-addr-flash: Replace array with an integer
  mtd: maps: gpio-addr-flash: Use order instead of size
  mtd: spi-nor: fsl-quadspi: Don't let -EINVAL on the bus
  mtd: devices: m25p80: Make sure WRITE_EN is issued before each write
  mtd: spi-nor: Support controllers with limited TX FIFO size
  mtd: spi-nor: cadence-quadspi: Use proper enum for dma_[un]map_single
  mtd: spi-nor: parse SFDP Sector Map Parameter Table
  mtd: spi-nor: add support to non-uniform SFDP SPI NOR flash memories
  mtd: rawnand: marvell: fix the IRQ handler complete() condition
  mtd: rawnand: denali: set SPARE_AREA_SKIP_BYTES register to 8 if unset
  mtd: rawnand: r852: fix spelling mistake "card_registred" -> "card_registered"
  mtd: rawnand: toshiba: Pass a single nand_chip object to the status helper
  mtd: maps: gpio-addr-flash: Use devm_* functions
  mtd: maps: gpio-addr-flash: Fix ioremapped size
  mtd: maps: gpio-addr-flash: Replace custom printk
  mtd: physmap_of: Release resources on error
  ...
2018-10-23 01:09:22 +01:00
Linus Torvalds
6ab9e09238 for-4.20/block-20181021
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlvNQKgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgps+8D/9Iy6YIeoPwN10gYsqIh0P2fS3wKzL3kiww
 3vFsWO78PzgLxUlNmB7teLtNFc/R5mi8becZmAdvs9za5YFZk56o3Ifv1x+e+z00
 VY1/gxhiJD6suLeJ6lECnERGDaiWOZVRMo2TE17vxYGW6GGaa0Ts6PUUXmpla1u5
 WKctgt0Qv9WVNyiIdLdeHqzKJwsSSwNTt8fK7eFhy3x8e0CwJr+GtXckbbW3LFkY
 lug0npsTli3EmEPMovZhd25SjZmTk5GTM+ADZQ7Tnv5KXoDWB9jn6TcCSAi3G+5d
 5WUVwfnDyYJiH8qvlg5tRJ690muIy3xMOmpr7QBQ0YnR/LQ3EW+1CVfqD+qimgLH
 TXzlREXQpBP3YlxSDS5nddz4o5z84GZmC9B/43ujPaZKIQ6eBXYdkmQH7tPtSugm
 C6VGomR5tHotjxIiAsexh/5hAus+wW8bObKGTPTyINT0ub3XNclwCKLh26CgI9ie
 WvbS9g3j/KPvu/7s6weZpgD+cks0YdWe/XdXXxiHwsGI9h3J2aJna5RQt1rKWDm5
 wGCgbc/B8eSwiWx+GXlqdB9/Dy/bGXOnSTDnKpEVl1f5zNjeLwUKXbjvkMefWs4m
 jEIcquuDETORY+ZYEfa5YbmS4Lhskr0kzMVTVkZ++81tAWpSCU9Xh3IHrR8TNpt+
 J0oh0FHBDg==
 =LRTT
 -----END PGP SIGNATURE-----

Merge tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block

Pull block layer updates from Jens Axboe:
 "This is the main pull request for block changes for 4.20. This
  contains:

   - Series enabling runtime PM for blk-mq (Bart).

   - Two pull requests from Christoph for NVMe, with items such as;
      - Better AEN tracking
      - Multipath improvements
      - RDMA fixes
      - Rework of FC for target removal
      - Fixes for issues identified by static checkers
      - Fabric cleanups, as prep for TCP transport
      - Various cleanups and bug fixes

   - Block merging cleanups (Christoph)

   - Conversion of drivers to generic DMA mapping API (Christoph)

   - Series fixing ref count issues with blkcg (Dennis)

   - Series improving BFQ heuristics (Paolo, et al)

   - Series improving heuristics for the Kyber IO scheduler (Omar)

   - Removal of dangerous bio_rewind_iter() API (Ming)

   - Apply single queue IPI redirection logic to blk-mq (Ming)

   - Set of fixes and improvements for bcache (Coly et al)

   - Series closing a hotplug race with sysfs group attributes (Hannes)

   - Set of patches for lightnvm:
      - pblk trace support (Hans)
      - SPDX license header update (Javier)
      - Tons of refactoring patches to cleanly abstract the 1.2 and 2.0
        specs behind a common core interface. (Javier, Matias)
      - Enable pblk to use a common interface to retrieve chunk metadata
        (Matias)
      - Bug fixes (Various)

   - Set of fixes and updates to the blk IO latency target (Josef)

   - blk-mq queue number updates fixes (Jianchao)

   - Convert a bunch of drivers from the old legacy IO interface to
     blk-mq. This will conclude with the removal of the legacy IO
     interface itself in 4.21, with the rest of the drivers (me, Omar)

   - Removal of the DAC960 driver. The SCSI tree will introduce two
     replacement drivers for this (Hannes)"

* tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block: (204 commits)
  block: setup bounce bio_sets properly
  blkcg: reassociate bios when make_request() is called recursively
  blkcg: fix edge case for blk_get_rl() under memory pressure
  nvme-fabrics: move controller options matching to fabrics
  nvme-rdma: always have a valid trsvcid
  mtip32xx: fully switch to the generic DMA API
  rsxx: switch to the generic DMA API
  umem: switch to the generic DMA API
  sx8: switch to the generic DMA API
  sx8: remove dead IF_64BIT_DMA_IS_POSSIBLE code
  skd: switch to the generic DMA API
  ubd: remove use of blk_rq_map_sg
  nvme-pci: remove duplicate check
  drivers/block: Remove DAC960 driver
  nvme-pci: fix hot removal during error handling
  nvmet-fcloop: suppress a compiler warning
  nvme-core: make implicit seed truncation explicit
  nvmet-fc: fix kernel-doc headers
  nvme-fc: rework the request initialization code
  nvme-fc: introduce struct nvme_fcp_op_w_sgl
  ...
2018-10-22 17:46:08 +01:00
Boris Brezillon
042c1a5a60 NAND core changes:
- Two batchs of cleanups of the NAND API, including:
   * Deprecating a lot of interfaces (now replaced by ->exec_op()).
   * Moving code in separate drivers (JEDEC, ONFI), in private files
     (internals), in platform drivers, etc.
   * Functions/structures reordering.
   * Exclusive use of the nand_chip structure instead of the MTD one
     all across the subsystem.
 - Addition of the nand_wait_readrdy/rdy_op() helpers.
 
 Raw NAND controllers drivers changes:
 - Various coccinelle patches.
 - Marvell:
   * Use regmap_update_bits() for syscon access.
   * More documentation.
   * BCH failure path rework.
   * More layouts to be supported.
   * IRQ handler complete() condition fixed.
 - Fsl_ifc:
   * SRAM initialization fixed for newer controller versions.
 - Denali:
   * Fix licenses mismatch and use a SPDX tag.
   * Set SPARE_AREA_SKIP_BYTES register to 8 if unset.
 - Qualcomm:
   * Do not include dma-direct.h.
 - Docg4:
   * Removed.
 - Ams-delta:
   * Use of a GPIO lookup table
   * Internal machinery changes.
 
 Raw NAND chip drivers changes:
 - Toshiba:
   * Add support for Toshiba memory BENAND
   * Pass a single nand_chip object to the status helper.
 - ESMT:
   * New driver to retrieve the ECC requirements from the 5th ID byte.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEE9HuaYnbmDhq/XIDIJWrqGEe9VoQFAlvDdvgACgkQJWrqGEe9
 VoQ0zQf/QfXYjLDbC/1FXckEkEeVZPRbXQMf+ptdTo0inJefy9LMOvhRKmuABCxn
 wwPa99aqQ+aHESnU/s2d8FUrhNRUf35lmRIAP512lJk79tBfgo73l5WSc6UTg6hO
 w3KbigOvTi7wCKQYGDFOBZuoFFCvpNsIEtWFuodCbYG1GmeAaL7L2fhbri9JbcE2
 zPqmCv0w5BUrPR7i/taI5gD+KDtRdmCEyfb7wQoJy2XZVO0SKqGsBhDhYWqwCpck
 eP2UzQ4cFT8X1S0/AfUpbowc6cftdqIwW9VRY4inx3ALQ+sKxrkJ+0E3gL9lnZ+F
 p/is0gTdvDmnOjXfwGcL7GXGy0nN8Q==
 =EKv1
 -----END PGP SIGNATURE-----

Merge tag 'nand/for-4.20' of git://git.infradead.org/linux-mtd into mtd/next

NAND core changes:
- Two batchs of cleanups of the NAND API, including:
  * Deprecating a lot of interfaces (now replaced by ->exec_op()).
  * Moving code in separate drivers (JEDEC, ONFI), in private files
    (internals), in platform drivers, etc.
  * Functions/structures reordering.
  * Exclusive use of the nand_chip structure instead of the MTD one
    all across the subsystem.
- Addition of the nand_wait_readrdy/rdy_op() helpers.

Raw NAND controllers drivers changes:
- Various coccinelle patches.
- Marvell:
  * Use regmap_update_bits() for syscon access.
  * More documentation.
  * BCH failure path rework.
  * More layouts to be supported.
  * IRQ handler complete() condition fixed.
- Fsl_ifc:
  * SRAM initialization fixed for newer controller versions.
- Denali:
  * Fix licenses mismatch and use a SPDX tag.
  * Set SPARE_AREA_SKIP_BYTES register to 8 if unset.
- Qualcomm:
  * Do not include dma-direct.h.
- Docg4:
  * Removed.
- Ams-delta:
  * Use of a GPIO lookup table
  * Internal machinery changes.

Raw NAND chip drivers changes:
- Toshiba:
  * Add support for Toshiba memory BENAND
  * Pass a single nand_chip object to the status helper.
- ESMT:
  * New driver to retrieve the ECC requirements from the 5th ID byte.
2018-10-19 09:20:09 +02:00
Eric Sandeen
fa520c47ea fscache: Fix out of bound read in long cookie keys
fscache_set_key() can incur an out-of-bounds read, reported by KASAN:

 BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
 Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615

and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236

  BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
  BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
  Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466

This happens for any index_key_len which is not divisible by 4 and is
larger than the size of the inline key, because the code allocates exactly
index_key_len for the key buffer, but the hashing loop is stepping through
it 4 bytes (u32) at a time in the buf[] array.

Fix this by calculating how many u32 buffers we'll need by using
DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
buffer to hold the index_key, then using that same count as the hashing
index limit.

Fixes: ec0328e46d ("fscache: Maintain a catalogue of allocated cookies")
Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:32:21 +02:00
David Howells
1ff22883b0 fscache: Fix incomplete initialisation of inline key space
The inline key in struct rxrpc_cookie is insufficiently initialized,
zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
bytes will end up hashing uninitialized memory because the memcpy only
partially fills the last buf[] element.

Fix this by clearing fscache_cookie objects on allocation rather than using
the slab constructor to initialise them.  We're going to pretty much fill
in the entire struct anyway, so bringing it into our dcache writably
shouldn't incur much overhead.

This removes the need to do clearance in fscache_set_key() (where we aren't
doing it correctly anyway).

Also, we don't need to set cookie->key_len in fscache_set_key() as we
already did it in the only caller, so remove that.

Fixes: ec0328e46d ("fscache: Maintain a catalogue of allocated cookies")
Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
Reported-by: Eric Sandeen <sandeen@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:32:21 +02:00
Al Viro
169b803397 cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)
the victim might've been rmdir'ed just before the lock_rename();
unlike the normal callers, we do not look the source up after the
parents are locked - we know it beforehand and just recheck that it's
still the child of what used to be its parent.  Unfortunately,
the check is too weak - we don't spot a dead directory since its
->d_parent is unchanged, dentry is positive, etc.  So we sail all
the way to ->rename(), with hosting filesystems _not_ expecting
to be asked renaming an rmdir'ed subdirectory.

The fix is easy, fortunately - the lock on parent is sufficient for
making IS_DEADDIR() on child safe.

Cc: stable@vger.kernel.org
Fixes: 9ae326a690 (CacheFiles: A cache that backs onto a mounted filesystem)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:32:21 +02:00
Hou Tao
92e2921f7e jffs2: free jffs2_sb_info through jffs2_kill_sb()
When an invalid mount option is passed to jffs2, jffs2_parse_options()
will fail and jffs2_sb_info will be freed, but then jffs2_sb_info will
be used (use-after-free) and freeed (double-free) in jffs2_kill_sb().

Fix it by removing the buggy invocation of kfree() when getting invalid
mount options.

Fixes: 92abc475d8 ("jffs2: implement mount option parsing and compression overriding")
Cc: stable@kernel.org
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-16 10:34:28 +02:00
David Howells
f0a7d1883d afs: Fix clearance of reply
The recent patch to fix the afs_server struct leak didn't actually fix the
bug, but rather fixed some of the symptoms.  The problem is that an
asynchronous call that holds a resource pointed to by call->reply[0] will
find the pointer cleared in the call destructor, thereby preventing the
resource from being cleaned up.

In the case of the server record leak, the afs_fs_get_capabilities()
function in devel code sets up a call with reply[0] pointing at the server
record that should be altered when the result is obtained, but this was
being cleared before the destructor was called, so the put in the
destructor does nothing and the record is leaked.

Commit f014ffb025 removed the additional ref obtained by
afs_install_server(), but the removal of this ref is actually used by the
garbage collector to mark a server record as being defunct after the record
has expired through lack of use.

The offending clearance of call->reply[0] upon completion in
afs_process_async_call() has been there from the origin of the code, but
none of the asynchronous calls actually use that pointer currently, so it
should be safe to remove (note that synchronous calls don't involve this
function).

Fix this by the following means:

 (1) Revert commit f014ffb025.

 (2) Remove the clearance of reply[0] from afs_process_async_call().

Without this, afs_manage_servers() will suffer an assertion failure if it
sees a server record that didn't get used because the usage count is not 1.

Fixes: f014ffb025 ("afs: Fix afs_server struct leak")
Fixes: 08e0e7c82e ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-15 15:31:47 +02:00
Greg Kroah-Hartman
3a27203102 libnvdimm/dax 4.19-rc8
* Fix a livelock in dax_layout_busy_page() present since v4.18. The
   lockup triggers when truncating an actively mapped huge page out of a
   mapping pinned for direct-I/O.
 
 * Fix mprotect() clobbers of _PAGE_DEVMAP. Broken since v4.5 mprotect()
   clears this flag that is needed to communicate the liveness of device
   pages to the get_user_pages() path.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbwiZhAAoJEB7SkWpmfYgCYFoQAL8ED6c1bfGUPRsWSrTRChU0
 ungVZ/Vf1+2ERd3ivUXPQzahNtqH5EWvEVp0aboVpyJUoVllrztInVS2hxaGJE+e
 w7WnzaXh36MY0kvLpK+Ny1Cxk7qg2rXnmzOAPRVdSUoSvh0TXOn5HFX1i/OdI7WK
 wgJwXraCoyKP9aTItw7oHQy9S36bi1RJVUakOAoEpEx4Vn+fwFxLNIt34G5CRJ+k
 iflicM7CPngxlFzwfoiX9v3DhV7toexk1A4LAzzwypG0Aiqd5tW2FG1lwLMPncNk
 8FezBm9VjkMwzv6hj7nD9UfU2lbh3GqqGDW0cPX1DPSgDxr/4pOLtKcbYWHh6yas
 NtCXk37q90ey3GtD2wYBRkBNly6UWvHJ0d3srtO6ZSl1VN6JQu8rhVhQ6KnON24B
 NcWlEVf2brqf0uaW4byCVbdVfIDp96/qgEvCo1pq3olXwCdDyOBJjYxaBcnu5JDV
 YsItMCJ49AxS/qoCt3vam7vC5TGhfYHL5xJPaF06cdjYvgfqOIV3VQT1ujBx4cvh
 MBFRBKDc6oDiJFgkrdYqHwJfn5fCQVS180Oy5S0AFGsVAzsJalKBZBLx2f2RQn8c
 r+kczvvPjpczEeEqzaqsxTgjowo/75Q8PRXc2PbwQzNxfkHuKf+xxQpnUg0mN6Hf
 w8zPSaCcCs2Wo21Kd/ua
 =VXnU
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Dan writes:
  "libnvdimm/dax 4.19-rc8

   * Fix a livelock in dax_layout_busy_page() present since v4.18. The
     lockup triggers when truncating an actively mapped huge page out of
     a mapping pinned for direct-I/O.

   * Fix mprotect() clobbers of _PAGE_DEVMAP. Broken since v4.5
     mprotect() clears this flag that is needed to communicate the
     liveness of device pages to the get_user_pages() path."

* tag 'libnvdimm-fixes-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  mm: Preserve _PAGE_DEVMAP across mprotect() calls
  filesystem-dax: Fix dax_layout_busy_page() livelock
2018-10-14 08:34:31 +02:00
Richard Weinberger
f8ccb14fd6 ubifs: Fix WARN_ON logic in exit path
ubifs_assert() is not WARN_ON(), so we have to invert
the checks.
Randy faced this warning with UBIFS being a module, since
most users use UBIFS as builtin because UBIFS is the rootfs
nobody noticed so far. :-(
Including me.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 54169ddd38 ("ubifs: Turn two ubifs_assert() into a WARN_ON()")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-13 11:05:02 +02:00
Greg Kroah-Hartman
79fc170b1f Merge branch 'akpm'
Fixes from Andrew:

* akpm:
  fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters()
  mm/thp: fix call to mmu_notifier in set_pmd_migration_entry() v2
  mm/mmap.c: don't clobber partially overlapping VMA with MAP_FIXED_NOREPLACE
  ocfs2: fix a GCC warning
2018-10-13 09:31:19 +02:00
Khazhismel Kumykov
ac081c3be3 fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters()
On non-preempt kernels this loop can take a long time (more than 50 ticks)
processing through entries.

Link: http://lkml.kernel.org/r/20181010172623.57033-1-khazhy@google.com
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-13 09:31:03 +02:00
zhong jiang
1cff514a51 ocfs2: fix a GCC warning
Fix the following compile warning:

fs/ocfs2/dlmglue.c:99:30: warning: ‘lockdep_keys’ defined but not used [-Wunused-variable]
 static struct lock_class_key lockdep_keys[OCFS2_NUM_LOCK_TYPES];

Link: http://lkml.kernel.org/r/1536938148-32110-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-13 09:31:02 +02:00
Greg Kroah-Hartman
ed66c252d9 gfs2 4.19 fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbwLzVAAoJENW/n+sDE2U6ZQUP/3mcvCqDjU+YBC/RCuA0Hvxd
 4rtzRqY5jTt4HZCTJ9Qj8aARZ4vN/LjKQuQU8IYhzJQcXzeyDi1V2zH4xGmEDomZ
 o9LqcVUUJO1bS/aykLrbnSEEfjU+zeA3z60Jbj3OHkpL6H9u6q+AwqCZnsIcKBnI
 Z8lgUda6oC3L60xzxpdr6BoYLIMqucuJ0YMyVo6YsKT1PEoP30EQhAoqnnKVfxx7
 +l66OUK4Qw3z45auen9MIOXjn+y7BC0RjFANetsBlAS29PE5X9rXWQ5xnv+9uYCh
 DDXPiMjdkLm2Xv8d5U80QhVuO8vxkgn3lnLvE99cmMmQrU/6pS91q+azgc/98RDN
 ZYCReIihyPbug3+uRziYcBHnI+4c8LrM/Za0TsHDG6gA6ddiiNHvdnJE9Nbk85Le
 dS+jQwA5eiYtGc6styoyk52v6huGTI3oPbffyYTTbOLnT6bSaEv+k0zWWCgTqMv3
 sb0imETfsdd1O/7MxR2kAVtQPdSzhD5GMIbkmlRb8nwP+QXb+FBVtQFPsZaPZklR
 qF//eFXGcoWfM1XerOYgl6VVolvntSc+ivycSEsDGHNSU5zUy2JYNTxmkypBB6pb
 wslVyIvjW2bcnUhjGr9PDSg+yQjaAk+5EyiJmxtJaqCiSfjgTpI4YwNrkfQo5csI
 H/Sq1WeTLjo2KeRb7LpU
 =gGo4
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-4.19.fixes3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Andreas writes:
  "gfs2 4.19 fixes

   Fix iomap buffered write support for journaled files"

* tag 'gfs2-4.19.fixes3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fix iomap buffered write support for journaled files (2)
2018-10-13 09:06:27 +02:00
David Howells
f014ffb025 afs: Fix afs_server struct leak
Fix a leak of afs_server structs.  The routine that installs them in the
various lookup lists and trees gets a ref on leaving the function, whether
it added the server or a server already exists.  It shouldn't increment
the refcount if it added the server.

The effect of this that "rmmod kafs" will hang waiting for the leaked
server to become unused.

Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-12 17:36:40 +02:00
Andreas Gruenbacher
fee5150c48 gfs2: Fix iomap buffered write support for journaled files (2)
It turns out that the fix in commit 6636c3cc56 is bad; the assertion
that the iomap code no longer creates buffer heads is incorrect for
filesystems that set the IOMAP_F_BUFFER_HEAD flag.

Instead, what's happening is that gfs2_iomap_begin_write treats all
files that have the jdata flag set as journaled files, which is
incorrect as long as those files are inline ("stuffed").  We're handling
stuffed files directly via the page cache, which is why we ended up with
pages without buffer heads in gfs2_page_add_databufs.

Fix this by handling stuffed journaled files correctly in
gfs2_iomap_begin_write.

This reverts commit 6636c3cc5690c11631e6366cf9a28fb99c8b25bb.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-10-12 17:14:42 +02:00
David Howells
6b3944e42e afs: Fix cell proc list
Access to the list of cells by /proc/net/afs/cells has a couple of
problems:

 (1) It should be checking against SEQ_START_TOKEN for the keying the
     header line.

 (2) It's only holding the RCU read lock, so it can't just walk over the
     list without following the proper RCU methods.

Fix these by using an hlist instead of an ordinary list and using the
appropriate accessor functions to follow it with RCU.

Since the code that adds a cell to the list must also necessarily change,
sort the list on insertion whilst we're at it.

Fixes: 989782dcdc ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-12 13:18:57 +02:00
Greg Kroah-Hartman
4718dcad7d xfs: fixes for 4.19-rc7
Update for 4.19-rc7 to fix numerous file clone and deduplication issues.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbvo1jAAoJEK3oKUf0dfodurwP/3OGvJo11XjE5Wuh5tbcSOUm
 duJdJvN+b0gx8Lb1IL2lHaBEt/Pw3PKi+KPx6d+pd47aflCYNMiU7I1kzejxvq9I
 TBVdhxffAmNBU02VU/qmhucMnKfpVr/UX19f0sHZcfXbZkpIuHrSpI25MNSYqbEs
 bRoMDkgbuu377hCJu6tnBAUm38z4iZdfJaGPVmJmuVMn8JJE8KA3Une/daxg0/HY
 zbCMWP3SGJwqx7SyyeurQSkRS/8GG3LgbnYOz0FlaHfBd4JVJscHOZU08nrYmMAN
 9SOowvahaPkDcyO8c6gRkyE6NIbOsb79718g+HTm4eqzyChnh+jnFTzitcTMuK2c
 vjblXZSDKnN/P4UaEEzIAnQ1Eew4WhLrKuBr+3fCr2bKTGfj0iiX36CjEgk1k+Df
 t1DV/Hj6me6crZpWO7GQZVLnDsiJBzaOsIgpYnB3vcJyof/cgm+5bfGFedDZFdpV
 XTh0oBN8B7oQztd4MvcfQOGXSktrMtq+6RomO8kv77mVtHFnjnroKHq/wIft2nyI
 rs7u7DFmCLyB9Lbm8aoeMPo4+0zxTp9385HSJ+XznHcjGXxkxIIGhhdU3omKxANa
 j3UQB1+VRC4lHSluUXOW7vH0j1tX3gaS3z/yJ+gfcAdizUmdp0h0rMx8c9VR8SEJ
 xv/fpbrcKHyBwuNvKtBm
 =yj1P
 -----END PGP SIGNATURE-----

Merge tag 'xfs-fixes-for-4.19-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Dave writes:
  "xfs: fixes for 4.19-rc7

   Update for 4.19-rc7 to fix numerous file clone and deduplication issues."

* tag 'xfs-fixes-for-4.19-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix data corruption w/ unaligned reflink ranges
  xfs: fix data corruption w/ unaligned dedupe ranges
  xfs: update ctime and remove suid before cloning files
  xfs: zero posteof blocks when cloning above eof
  xfs: refactor clonerange preparation into a separate helper
2018-10-11 07:17:42 +02:00
Andreas Gruenbacher
dc480feb45 gfs2: Fix iomap buffered write support for journaled files
Commit 64bc06bb32 broke buffered writes to journaled files (chattr
+j): we'll try to journal the buffer heads of the page being written to
in gfs2_iomap_journaled_page_done.  However, the iomap code no longer
creates buffer heads, so we'll BUG() in gfs2_page_add_databufs.  Fix
that by creating buffer heads ourself when needed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-10-09 18:20:13 +02:00
Dan Williams
d7782145e1 filesystem-dax: Fix dax_layout_busy_page() livelock
In the presence of multi-order entries the typical
pagevec_lookup_entries() pattern may loop forever:

	while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
				min(end - index, (pgoff_t)PAGEVEC_SIZE),
				indices)) {
		...
		for (i = 0; i < pagevec_count(&pvec); i++) {
			index = indices[i];
			...
		}
		index++; /* BUG */
	}

The loop updates 'index' for each index found and then increments to the
next possible page to continue the lookup. However, if the last entry in
the pagevec is multi-order then the next possible page index is more
than 1 page away. Fix this locally for the filesystem-dax case by
checking for dax-multi-order entries. Going forward new users of
multi-order entries need to be similarly careful, or we need a generic
way to report the page increment in the radix iterator.

Fixes: 5fac7408d8 ("mm, fs, dax: handle layout changes to pinned dax...")
Cc: <stable@vger.kernel.org>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-08 11:38:44 -07:00
Dave Chinner
b39989009b xfs: fix data corruption w/ unaligned reflink ranges
When reflinking sub-file ranges, a data corruption can occur when
the source file range includes a partial EOF block. This shares the
unknown data beyond EOF into the second file at a position inside
EOF, exposing stale data in the second file.

XFS only supports whole block sharing, but we still need to
support whole file reflink correctly.  Hence if the reflink
request includes the last block of the souce file, only proceed with
the reflink operation if it lands at or past the destination file's
current EOF. If it lands within the destination file EOF, reject the
entire request with -EINVAL and make the caller go the hard way.

This avoids the data corruption vector, but also avoids disruption
of returning EINVAL to userspace for the common case of whole file
cloning.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-06 11:44:39 +10:00
Dave Chinner
dceeb47b0e xfs: fix data corruption w/ unaligned dedupe ranges
A deduplication data corruption is Exposed by fstests generic/505 on
XFS. It is caused by extending the block match range to include the
partial EOF block, but then allowing unknown data beyond EOF to be
considered a "match" to data in the destination file because the
comparison is only made to the end of the source file. This corrupts
the destination file when the source extent is shared with it.

XFS only supports whole block dedupe, but we still need to appear to
support whole file dedupe correctly.  Hence if the dedupe request
includes the last block of the souce file, don't include it in the
actual XFS dedupe operation. If the rest of the range dedupes
successfully, then report the partial last block as deduped, too, so
that userspace sees it as a successful dedupe rather than return
EINVAL because we can't dedupe unaligned blocks.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-06 11:44:19 +10:00
Ashish Samant
cbe355f57c ocfs2: fix locking for res->tracking and dlm->tracking_list
In dlm_init_lockres() we access and modify res->tracking and
dlm->tracking_list without holding dlm->track_lock.  This can cause list
corruptions and can end up in kernel panic.

Fix this by locking res->tracking and dlm->tracking_list with
dlm->track_lock instead of dlm->spinlock.

Link: http://lkml.kernel.org/r/1529951192-4686-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-05 16:32:05 -07:00
Jann Horn
f8a00cef17 proc: restrict kernel stack dumps to root
Currently, you can use /proc/self/task/*/stack to cause a stack walk on
a task you control while it is running on another CPU.  That means that
the stack can change under the stack walker.  The stack walker does
have guards against going completely off the rails and into random
kernel memory, but it can interpret random data from your kernel stack
as instruction pointers and stack pointers.  This can cause exposure of
kernel stack contents to userspace.

Restrict the ability to inspect kernel stacks of arbitrary tasks to root
in order to prevent a local attacker from exploiting racy stack unwinding
to leak kernel task stack contents.  See the added comment for a longer
rationale.

There don't seem to be any users of this userspace API that can't
gracefully bail out if reading from the file fails.  Therefore, I believe
that this change is unlikely to break things.  In the case that this patch
does end up needing a revert, the next-best solution might be to fake a
single-entry stack based on wchan.

Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
Fixes: 2ec220e27f ("proc: add /proc/*/stack")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-05 16:32:05 -07:00
Larry Chen
69eb7765b9 ocfs2: fix crash in ocfs2_duplicate_clusters_by_page()
ocfs2_duplicate_clusters_by_page() may crash if one of the extent's pages
is dirty.  When a page has not been written back, it is still in dirty
state.  If ocfs2_duplicate_clusters_by_page() is called against the dirty
page, the crash happens.

To fix this bug, we can just unlock the page and wait until the page until
its not dirty.

The following is the backtrace:

kernel BUG at /root/code/ocfs2/refcounttree.c:2961!
[exception RIP: ocfs2_duplicate_clusters_by_page+822]
__ocfs2_move_extent+0x80/0x450 [ocfs2]
? __ocfs2_claim_clusters+0x130/0x250 [ocfs2]
ocfs2_defrag_extent+0x5b8/0x5e0 [ocfs2]
__ocfs2_move_extents_range+0x2a4/0x470 [ocfs2]
ocfs2_move_extents+0x180/0x3b0 [ocfs2]
? ocfs2_wait_for_recovery+0x13/0x70 [ocfs2]
ocfs2_ioctl_move_extents+0x133/0x2d0 [ocfs2]
ocfs2_ioctl+0x253/0x640 [ocfs2]
do_vfs_ioctl+0x90/0x5f0
SyS_ioctl+0x74/0x80
do_syscall_64+0x74/0x140
entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Once we find the page is dirty, we do not wait until it's clean, rather we
use write_one_page() to write it back

Link: http://lkml.kernel.org/r/20180829074740.9438-1-lchen@suse.com
[lchen@suse.com: update comments]
  Link: http://lkml.kernel.org/r/20180830075041.14879-1-lchen@suse.com
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Larry Chen <lchen@suse.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-05 16:32:04 -07:00
Greg Kroah-Hartman
087f759a41 four small SMB3 fixes: one for stable, the others to address a more recent regression
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlu2jLMACgkQiiy9cAdy
 T1GLMQwAiUaUTtR0MN2ws+IHNnwFX5YnWFa1Cr/6Tgz9WBo5TrrvQcBGhTIUQok2
 7w2+I0rXPO7P7Tq2odWn66BT1/l0o2U10aB4OHYlS7zqfCAp6GuwKGtATivRqjsg
 ENn2mmBRuTmz+XtGmlVklmQpHjn4+MTR2GSLA57oo1fqX5YBeSxcq2UEciCGttRQ
 nAvn923NrPSbGRvL/eDDWGXas8gMATXnrNlN8st2AsxnoFC6CrzUuRM0s/IFgp6r
 Orh0jIlovlT2Q6LdND7n7xXnEVoyUTETG89IjRVwcBJ9LtXrJXpuWo7/sOr11cMy
 sGeUkILNlmkIdLtIlUaZcGaclq/yfeKdMxziBT8tWP6wXIH4LrGCzvZ9jDtrzygU
 EWL2nLsqMJNh0+wsYSZM/m9VFC1ReeB1AgIC2EHu/xzqWoTeL40i2OhKi/732FEX
 H+FlTZUoorhFoI96AMJMrqXjXyqZUZSQ1b5iJ7/jq888Vdey+m7HiMUtxv0+5yzC
 +lQksh8X
 =4PHD
 -----END PGP SIGNATURE-----

Merge tag '4.19-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Steve writes:
  "SMB3 fixes

   four small SMB3 fixes: one for stable, the others to address a more
   recent regression"

* tag '4.19-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: fix lease break problem introduced by compounding
  cifs: only wake the thread for the very last PDU in a compound
  cifs: add a warning if we try to to dequeue a deleted mid
  smb2: fix missing files in root share directory listing
2018-10-05 08:27:47 -07:00
Darrick J. Wong
7debbf015f xfs: update ctime and remove suid before cloning files
Before cloning into a file, update the ctime and remove sensitive
attributes like suid, just like we'd do for a regular file write.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-05 19:05:41 +10:00
Darrick J. Wong
410fdc72b0 xfs: zero posteof blocks when cloning above eof
When we're reflinking between two files and the destination file range
is well beyond the destination file's EOF marker, zero any posteof
speculative preallocations in the destination file so that we don't
expose stale disk contents.  The previous strategy of trying to clear
the preallocations does not work if the destination file has the
PREALLOC flag set.

Uncovered by shared/010.

Reported-by: Zorro Lang <zlang@redhat.com>
Bugzilla-id: https://bugzilla.kernel.org/show_bug.cgi?id=201259
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-05 19:04:27 +10:00
Darrick J. Wong
0d41e1d28c xfs: refactor clonerange preparation into a separate helper
Refactor all the reflink preparation steps into a separate helper
that we'll use to land all the upcoming fixes for insufficient input
checks.

This rework also moves the invalidation of the destination range to
the prep function so that it is done before the range is remapped.
This ensures that nobody can access the data in range being remapped
until the remap is complete.

[dgc: fix xfs_reflink_remap_prep() return value and caller check to
handle vfs_clone_file_prep_inodes() returning 0 to mean "nothing to
do". ]

[dgc: make sure length changed by vfs_clone_file_prep_inodes() gets
propagated back to XFS code that does the remapping. ]

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-05 19:04:22 +10:00
Greg Kroah-Hartman
010bd965f9 overlayfs fixes for 4.19-rc7
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCW7YOCQAKCRDh3BK/laaZ
 PAmGAQC1/3+mkfHXqyDx+zv4f7A348ZULW26EWFWwMwlEljWowD+PlRFBgUu5v5c
 194mCApZU7hl6o0u07aijjz6m81e6QE=
 =dM89
 -----END PGP SIGNATURE-----

Merge tag 'ovl-fixes-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Miklos writes:
  "overlayfs fixes for 4.19-rc7

   This update fixes a couple of regressions in the stacked file update
   added in this cycle, as well as some older bugs uncovered by
   syzkaller.

   There's also one trivial naming change that touches other parts of
   the fs subsystem."

* tag 'ovl-fixes-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix format of setxattr debug
  ovl: fix access beyond unterminated strings
  ovl: make symbol 'ovl_aops' static
  vfs: swap names of {do,vfs}_clone_file_range()
  ovl: fix freeze protection bypass in ovl_clone_file_range()
  ovl: fix freeze protection bypass in ovl_write_iter()
  ovl: fix memory leak on unlink of indexed file
2018-10-04 13:24:38 -07:00
Greg Kroah-Hartman
1b0350c355 XFS fixes for 4.19-rc6
Accumlated regression and bug fixes for 4.19-rc6, including:
 
 o make iomap correctly mark dirty pages for sub-page block sizes
 o fix regression in handling extent-to-btree format conversion errors
 o fix torn log wrap detection for new logs
 o various corrupt inode detection fixes
 o various delalloc state fixes
 o cleanup all the missed transaction cancel cases missed from changes merged
   in 4.19-rc1
 o fix lockdep false positive on transaction allocation
 o fix locking and reference counting on buffer log items
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbtVyHAAoJEK3oKUf0dfodc8YP/1hT+TdXZDBVPo9kXQwmrnra
 8P1J8IEuGj851PwYQobUhifdDh4eQ+PpcYEIfqTSGhtjTg7cqyQOvTo4uUKHLn50
 8mFXQb6JrAFYnjGjorOCde+lpoQrtj3oKASscs7RaL/W1RNffY74UGdiE0yOJnJs
 9IC7TpJT4ZzAgYBgC61whfYWmszLmRuKlVZWgXrM8hkMqliHdfrXeZk7MXTeiflz
 Q5+9eVnCCgTtC0TWgVAkFMvlZs7UtNMWIIn6zt+HQLB7Vms6q4ArpIT+fF45njwG
 94tyTcFT0AcIJ8OIi5i85fOXcDGjTFFf554Yf80PlWGlk3SbXPHFQao/ItDA4S1Z
 eCl2eurUOXlNXKuyzWoV5KjOBiiBicbOwl6eX0K996cGehhEdFOvBihvAGjJy5rm
 c4plTTy6bhYKZWr3JLuaPDlNWDMr/P/aUkiq9tFXyaMmVKNeyxd6UKzIokl5uUfi
 Ycik4Ik8zRgpFEUx5Clvb2W5qH9pqpR0WcFhrcj0mDnJa10TpBWesA0g2F+hM0Jc
 0mSuGxfQeJk7tuVcvsBpPNzgVEPKabvnYbOlGK7HvCSOPRkg0Fhmj+iJK721ccVl
 Ltiz0Y485eAGLKn9kOWKr47A5GAAFpuK29OrD2a8XizHPZ3Sr2CG5X1jI0gtMb8n
 BiBAxO6dojRQATDuTI6w
 =5ylA
 -----END PGP SIGNATURE-----

Merge tag 'xfs-fixes-for-4.19-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Dave writes:
  "XFS fixes for 4.19-rc6

   Accumlated regression and bug fixes for 4.19-rc6, including:

   o make iomap correctly mark dirty pages for sub-page block sizes
   o fix regression in handling extent-to-btree format conversion errors
   o fix torn log wrap detection for new logs
   o various corrupt inode detection fixes
   o various delalloc state fixes
   o cleanup all the missed transaction cancel cases missed from changes merged
     in 4.19-rc1
   o fix lockdep false positive on transaction allocation
   o fix locking and reference counting on buffer log items"

* tag 'xfs-fixes-for-4.19-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix error handling in xfs_bmap_extents_to_btree
  iomap: set page dirty after partial delalloc on mkwrite
  xfs: remove invalid log recovery first/last cycle check
  xfs: validate inode di_forkoff
  xfs: skip delalloc COW blocks in xfs_reflink_end_cow
  xfs: don't treat unknown di_flags2 as corruption in scrub
  xfs: remove duplicated include from alloc.c
  xfs: don't bring in extents in xfs_bmap_punch_delalloc_range
  xfs: fix transaction leak in xfs_reflink_allocate_cow()
  xfs: avoid lockdep false positives in xfs_trans_alloc
  xfs: refactor xfs_buf_log_item reference count handling
  xfs: clean up xfs_trans_brelse()
  xfs: don't unlock invalidated buf on aborted tx commit
  xfs: remove last of unnecessary xfs_defer_cancel() callers
  xfs: don't crash the vfs on a garbage inline symlink
2018-10-04 09:17:38 -07:00
Miklos Szeredi
1a8f8d2a44 ovl: fix format of setxattr debug
Format has a typo: it was meant to be "%.*s", not "%*s".  But at some point
callers grew nonprintable values as well, so use "%*pE" instead with a
maximized length.

Reported-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 3a1e819b4e ("ovl: store file handle of lower inode on copy up")
Cc: <stable@vger.kernel.org> # v4.12
2018-10-04 14:49:10 +02:00
Amir Goldstein
601350ff58 ovl: fix access beyond unterminated strings
KASAN detected slab-out-of-bounds access in printk from overlayfs,
because string format used %*s instead of %.*s.

> BUG: KASAN: slab-out-of-bounds in string+0x298/0x2d0 lib/vsprintf.c:604
> Read of size 1 at addr ffff8801c36c66ba by task syz-executor2/27811
>
> CPU: 0 PID: 27811 Comm: syz-executor2 Not tainted 4.19.0-rc5+ #36
...
>  printk+0xa7/0xcf kernel/printk/printk.c:1996
>  ovl_lookup_index.cold.15+0xe8/0x1f8 fs/overlayfs/namei.c:689

Reported-by: syzbot+376cea2b0ef340db3dd4@syzkaller.appspotmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 359f392ca5 ("ovl: lookup index entry for copy up origin")
Cc: <stable@vger.kernel.org> # v4.13
2018-10-04 14:49:10 +02:00
Greg Kroah-Hartman
73dec82d8d Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Al writes:
  "xattrs regression fix from Andreas; sat in -next for quite a while."

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  sysfs: Do not return POSIX ACL xattrs via listxattr
2018-10-03 04:21:23 -07:00
Steve French
7af929d6d0 smb3: fix lease break problem introduced by compounding
Fixes problem (discovered by Aurelien) introduced by recent commit:
commit b24df3e30c
("cifs: update receive_encrypted_standard to handle compounded responses")

which broke the ability to respond to some lease breaks
(lease breaks being ignored is a problem since can block
server response for duration of the lease break timeout).

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-02 18:54:09 -05:00
Ronnie Sahlberg
4e34feb5e9 cifs: only wake the thread for the very last PDU in a compound
For compounded PDUs we whould only wake the waiting thread for the
very last PDU of the compound.
We do this so that we are guaranteed that the demultiplex_thread will
not process or access any of those MIDs any more once the send/recv
thread starts processing.

Else there is a race where at the end of the send/recv processing we
will try to delete all the mids of the compound. If the multiplex
thread still has other mids to process at this point for this compound
this can lead to an oops.

Needed to fix recent commit:
commit 730928c8f4
("cifs: update smb2_queryfs() to use compounding")

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-02 18:53:57 -05:00
Ronnie Sahlberg
ddf83afb9f cifs: add a warning if we try to to dequeue a deleted mid
cifs_delete_mid() is called once we are finished handling a mid and we
expect no more work done on this mid.

Needed to fix recent commit:
commit 730928c8f4
("cifs: update smb2_queryfs() to use compounding")

Add a warning if someone tries to dequeue a mid that has already been
flagged to be deleted.
Also change list_del() to list_del_init() so that if we have similar bugs
resurface in the future we will not oops.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-02 18:12:31 -05:00
Aurelien Aptel
0595751f26 smb2: fix missing files in root share directory listing
When mounting a Windows share that is the root of a drive (eg. C$)
the server does not return . and .. directory entries. This results in
the smb2 code path erroneously skipping the 2 first entries.

Pseudo-code of the readdir() code path:

cifs_readdir(struct file, struct dir_context)
    initiate_cifs_search            <-- if no reponse cached yet
        server->ops->query_dir_first

    dir_emit_dots
        dir_emit                    <-- adds "." and ".." if we're at pos=0

    find_cifs_entry
        initiate_cifs_search        <-- if pos < start of current response
                                         (restart search)
        server->ops->query_dir_next <-- if pos > end of current response
                                         (fetch next search res)

    for(...)                        <-- loops over cur response entries
                                          starting at pos
        cifs_filldir                <-- skip . and .., emit entry
            cifs_fill_dirent
            dir_emit
	pos++

A) dir_emit_dots() always adds . & ..
   and sets the current dir pos to 2 (0 and 1 are done).

Therefore we always want the index_to_find to be 2 regardless of if
the response has . and ..

B) smb1 code initializes index_of_last_entry with a +2 offset

  in cifssmb.c CIFSFindFirst():
		psrch_inf->index_of_last_entry = 2 /* skip . and .. */ +
			psrch_inf->entries_in_buffer;

Later in find_cifs_entry() we want to find the next dir entry at pos=2
as a result of (A)

	first_entry_in_buffer = cfile->srch_inf.index_of_last_entry -
					cfile->srch_inf.entries_in_buffer;

This var is the dir pos that the first entry in the buffer will
have therefore it must be 2 in the first call.

If we don't offset index_of_last_entry by 2 (like in (B)),
first_entry_in_buffer=0 but we were instructed to get pos=2 so this
code in find_cifs_entry() skips the 2 first which is ok for non-root
shares, as it skips . and .. from the response but is not ok for root
shares where the 2 first are actual files

		pos_in_buf = index_to_find - first_entry_in_buffer;
                // pos_in_buf=2
		// we skip 2 first response entries :(
		for (i = 0; (i < (pos_in_buf)) && (cur_ent != NULL); i++) {
			/* go entry by entry figuring out which is first */
			cur_ent = nxt_dir_entry(cur_ent, end_of_smb,
						cfile->srch_inf.info_level);
		}

C) cifs_filldir() skips . and .. so we can safely ignore them for now.

Sample program:

int main(int argc, char **argv)
{
	const char *path = argc >= 2 ? argv[1] : ".";
	DIR *dh;
	struct dirent *de;

	printf("listing path <%s>\n", path);
	dh = opendir(path);
	if (!dh) {
		printf("opendir error %d\n", errno);
		return 1;
	}

	while (1) {
		de = readdir(dh);
		if (!de) {
			if (errno) {
				printf("readdir error %d\n", errno);
				return 1;
			}
			printf("end of listing\n");
			break;
		}
		printf("off=%lu <%s>\n", de->d_off, de->d_name);
	}

	return 0;
}

Before the fix with SMB1 on root shares:

<.>            off=1
<..>           off=2
<$Recycle.Bin> off=3
<bootmgr>      off=4

and on non-root shares:

<.>    off=1
<..>   off=4  <-- after adding .., the offsets jumps to +2 because
<2536> off=5       we skipped . and .. from response buffer (C)
<411>  off=6       but still incremented pos
<file> off=7
<fsx>  off=8

Therefore the fix for smb2 is to mimic smb1 behaviour and offset the
index_of_last_entry by 2.

Test results comparing smb1 and smb2 before/after the fix on root
share, non-root shares and on large directories (ie. multi-response
dir listing):

PRE FIX
=======
pre-1-root VS pre-2-root:
        ERR pre-2-root is missing [bootmgr, $Recycle.Bin]
pre-1-nonroot VS pre-2-nonroot:
        OK~ same files, same order, different offsets
pre-1-nonroot-large VS pre-2-nonroot-large:
        OK~ same files, same order, different offsets

POST FIX
========
post-1-root VS post-2-root:
        OK same files, same order, same offsets
post-1-nonroot VS post-2-nonroot:
        OK same files, same order, same offsets
post-1-nonroot-large VS post-2-nonroot-large:
        OK same files, same order, same offsets

REGRESSION?
===========
pre-1-root VS post-1-root:
        OK same files, same order, same offsets
pre-1-nonroot VS post-1-nonroot:
        OK same files, same order, same offsets

BugLink: https://bugzilla.samba.org/show_bug.cgi?id=13107
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Paulo Alcantara <palcantara@suse.deR>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2018-10-02 18:06:21 -05:00
Greg Kroah-Hartman
ef0f2584c2 Fixes for v4.19-rc7
- Fix failure-path memory leak in ramoops_init (nixiaoming)
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAluxB3cWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJidvEACZlqemGsSVQ4elXWCW9EPqyVSn
 lbPg5ONurG/51J6313Ankgn7PrOI7WRd3iAXUWYByoc0DXn/jgz0i/B1+MtSnH4g
 fMeZ3DMnwmuC4h8/50/xmxbjKj2+vW7tX+978wWbYnvYNC+UXf8CN4J4MwBFmNR3
 DGH+oVCx2MITbIYQ3u5FIUgRJl0sD15GtxHg/l5Ff78dtUJVvlnD6ZAa9/rDBCPu
 DD0IqjJ5BqTmGw98L2tG0I2SDSrC8TGAYdQlZK/k7vHUqCWP6QspCpQQy3x6bh+W
 QRL6gCNEPZIGB+uAVbueCo80zRrx1NltbkbO4n9zn9ItYxpvOXJwGT4peYYXwabO
 nn+cWwRAJGITp9BFKIk/V8p05rpRhw8oeOBHIzwylb3C6bqK3ijnkk9mNSmUnLPG
 FtzRs/cYEMHi7n7+aygm1lNHn98PAWGLmomXLyxIUtsSH1jvFWG+jxLiRih/Oah5
 qjSGw2r681vLKsj2tQV4hpFvWaV83t2cO1QOldWSSkVHKJO9+pglEUGz9gfCvBRz
 bCvA0aPNGmMGTO0faQMB/TG2pMtOK94UU6kBIALxlEeHrtdpoOaQN6g++R7sY3/y
 SWo8BeGlNMSAtkWtd6Y7xxIF5N4PDDc6iMyjHGM/KCptANVu03BWlfOn32tel9jR
 17wiJi6SjqQjHVp7jw==
 =bzfz
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v4.19-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Kees writes:
  "Pstore fixes for v4.19-rc7

   - Fix failure-path memory leak in ramoops_init (nixiaoming)"

* tag 'pstore-v4.19-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/ram: Fix failure-path memory leak in ramoops_init
2018-10-01 17:22:36 -07:00
Jens Axboe
c0aac682fa This is the 4.19-rc6 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAluw4MIACgkQONu9yGCS
 aT7+8xAAiYnc4khUsxeInm3z44WPfRX1+UF51frTNSY5C8Nn5nvRSnTUNLuKkkrz
 8RbwCL6UYyJxF9I/oZdHPsPOD4IxXkQY55tBjz7ZbSBIFEwYM6RJMm8mAGlXY7wq
 VyWA5MhlpGHM9DjrguB4DMRipnrSc06CVAnC+ZyKLjzblzU1Wdf2dYu+AW9pUVXP
 j4r74lFED5djPY1xfqfzEwmYRCeEGYGx7zMqT3GrrF5uFPqj1H6O5klEsAhIZvdl
 IWnJTU2coC8R/Sd17g4lHWPIeQNnMUGIUbu+PhIrZ/lDwFxlocg4BvarPXEdzgYi
 gdZzKBfovpEsSu5RCQsKWG4IGQxY7I1p70IOP9eqEFHZy77qT1YcHVAWrK1Y/bJd
 UA08gUOSzRnhKkNR3+PsaMflUOl9WkpyHECZu394cyRGMutSS50aWkavJPJ/o1Qi
 D/oGqZLLcKFyuNcchG+Met1TzY3LvYEDgSburqwqeUZWtAsGs8kmiiq7qvmXx4zV
 IcgM8ERqJ8mbfhfsXQU7hwydIrPJ3JdIq19RnM5ajbv2Q4C/qJCyAKkQoacrlKR4
 aiow/qvyNrP80rpXfPJB8/8PiWeDtAnnGhM+xySZNlw3t8GR6NYpUkIzf5TdkSb3
 C8KuKg6FY9QAS62fv+5KK3LB/wbQanxaPNruQFGe5K1iDQ5Fvzw=
 =dMl4
 -----END PGP SIGNATURE-----

Merge tag 'v4.19-rc6' into for-4.20/block

Merge -rc6 in, for two reasons:

1) Resolve a trivial conflict in the blk-mq-tag.c documentation
2) A few important regression fixes went into upstream directly, so
   they aren't in the 4.20 branch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

* tag 'v4.19-rc6': (780 commits)
  Linux 4.19-rc6
  MAINTAINERS: fix reference to moved drivers/{misc => auxdisplay}/panel.c
  cpufreq: qcom-kryo: Fix section annotations
  perf/core: Add sanity check to deal with pinned event failure
  xen/blkfront: correct purging of persistent grants
  Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
  selftests/powerpc: Fix Makefiles for headers_install change
  blk-mq: I/O and timer unplugs are inverted in blktrace
  dax: Fix deadlock in dax_lock_mapping_entry()
  x86/boot: Fix kexec booting failure in the SEV bit detection code
  bcache: add separate workqueue for journal_write to avoid deadlock
  drm/amd/display: Fix Edid emulation for linux
  drm/amd/display: Fix Vega10 lightup on S3 resume
  drm/amdgpu: Fix vce work queue was not cancelled when suspend
  Revert "drm/panel: Add device_link from panel device to DRM device"
  xen/blkfront: When purging persistent grants, keep them in the buffer
  clocksource/drivers/timer-atmel-pit: Properly handle error cases
  block: fix deadline elevator drain for zoned block devices
  ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
  drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
  ...

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-01 08:58:57 -06:00
Dave Chinner
e55ec4ddbe xfs: fix error handling in xfs_bmap_extents_to_btree
Commit 01239d77b9 ("xfs: fix a null pointer dereference in
xfs_bmap_extents_to_btree") attempted to fix a null pointer
dreference when a fuzzing corruption of some kind was found.
This fix was flawed, resulting in assert failures like:

XFS: Assertion failed: ifp->if_broot == NULL, file: fs/xfs/libxfs/xfs_bmap.c, line: 715
.....
Call Trace:
  xfs_bmap_extents_to_btree+0x6b9/0x7b0
  __xfs_bunmapi+0xae7/0xf00
  ? xfs_log_reserve+0x1c8/0x290
  xfs_reflink_remap_extent+0x20b/0x620
  xfs_reflink_remap_blocks+0x7e/0x290
  xfs_reflink_remap_range+0x311/0x530
  vfs_dedupe_file_range_one+0xd7/0xe0
  vfs_dedupe_file_range+0x15b/0x1a0
  do_vfs_ioctl+0x267/0x6c0

The problem is that the error handling code now asserts that the
inode fork is not in btree format before the error handling code
undoes the modifications that put the fork back in extent format.
Fix this by moving the assert back to after the xfs_iroot_realloc()
call that returns the fork to extent format, and clean up the jump
labels to be meaningful.

Also, returning ENOSPC when xfs_btree_get_bufl() fails to
instantiate the buffer that was allocated (the actual fix in the
commit mentioned above) is incorrect. This is a fatal error - only
an invalid block address or a filesystem shutdown can result in
failing to get a buffer here.

Hence change this to EFSCORRUPTED so that the higher layer knows
this was a corruption related failure and should not treat it as an
ENOSPC error.  This should result in a shutdown (via cancelling a
dirty transaction) which is necessary as we do not attempt to clean
up the (invalid) block that we have already allocated.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-01 08:11:07 +10:00
Kees Cook
bac6f6cda2 pstore/ram: Fix failure-path memory leak in ramoops_init
As reported by nixiaoming, with some minor clarifications:

1) memory leak in ramoops_register_dummy():
   dummy_data = kzalloc(sizeof(*dummy_data), GFP_KERNEL);
   but no kfree() if platform_device_register_data() fails.

2) memory leak in ramoops_init():
   Missing platform_device_unregister(dummy) and kfree(dummy_data)
   if platform_driver_register(&ramoops_driver) fails.

I've clarified the purpose of ramoops_register_dummy(), and added a
common cleanup routine for all three failure paths to call.

Reported-by: nixiaoming <nixiaoming@huawei.com>
Cc: stable@vger.kernel.org
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-09-30 10:15:41 -07:00
Greg Kroah-Hartman
9ba6873e16 filesystem-dax for 4.19-rc6
Fix a deadlock in the new for 4.19 dax_lock_mapping_entry() routine.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbsDBrAAoJEB7SkWpmfYgCi54QAIz8yFZI5+5+amG/L/F9mGe4
 sagcSPsk67EzzTDhnKASTlRmpm0+LWzQckY7o/fDRoM0VQVKjXVDke4VTDnHFg7W
 JfZMN24dg6Pcbq3CxuSXMOiWd8vXSnLL2Myin+fQ/kY1rxnIz2ZYNWxCQsLvdPiC
 VKJAbpYlcG41HZPPnRkMaRBxf2INUraSgyHFoehbgvlwLD7YUOzPh9strauutK5M
 xljv2d/yjfaW4U6DhQhUSo+sDYRLGDkbqQw6ZoVqbODA0IXdY6ytiCujLLD9xODg
 lDKF68jCX/+lFIURm8BRpX9iqHvfILC5el61a4bTxjJ6XUf+Ok5vgkeZFDfQKziC
 rLqm09NTQ5Xu0MJ8Ql+5cqAFqBMA7Uy1zF6l8DnGFCtMV/S0H/TgdXWLzHjRXQvE
 18ekLqTcRk5UmPXJYJ829ln0TKTd3zyuVgwuLuGAeO97m431y3K2Q74ncPahgE9+
 W0nduPFTmMikohcKah2P3mQWGtUAYWodQsEs+Y9gJPyoDic+fmjo+mI0xg7CeFL4
 kpfug45i8hdbnlHrHOJ6bz7fRq7CvaaRaI3gOvFfuN2TJVY8Qfs/8JD4HN8F7u+r
 zDPVnvkutaYV1uOOBU4nDzPJ+naVGlpOj1/tsMU4ikj3LbfkfW+gxsr6XGZYPU81
 qYjEfXm60ritFoAA5dVV
 =3l8a
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes2-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Dan writes:
  "filesystem-dax for 4.19-rc6

   Fix a deadlock in the new for 4.19 dax_lock_mapping_entry() routine."

* tag 'libnvdimm-fixes2-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Fix deadlock in dax_lock_mapping_entry()
2018-09-30 06:19:38 -07:00
Brian Foster
561295a325 iomap: set page dirty after partial delalloc on mkwrite
The iomap page fault mechanism currently dirties the associated page
after the full block range of the page has been allocated. This
leaves the page susceptible to delayed allocations without ever
being set dirty on sub-page block sized filesystems.

For example, consider a page fault on a page with one preexisting
real (non-delalloc) block allocated in the middle of the page. The
first iomap_apply() iteration performs delayed allocation on the
range up to the preexisting block, the next iteration finds the
preexisting block, and the last iteration attempts to perform
delayed allocation on the range after the prexisting block to the
end of the page. If the first allocation succeeds and the final
allocation fails with -ENOSPC, iomap_apply() returns the error and
iomap_page_mkwrite() fails to dirty the page having already
performed partial delayed allocation. This eventually results in the
page being invalidated without ever converting the delayed
allocation to real blocks.

This problem is reliably reproduced by generic/083 on XFS on ppc64
systems (64k page size, 4k block size). It results in leaked
delalloc blocks on inode reclaim, which triggers an assert failure
in xfs_fs_destroy_inode() and filesystem accounting inconsistency.

Move the set_page_dirty() call from iomap_page_mkwrite() to the
actor callback, similar to how the buffer head implementation works.
The actor callback is called iff ->iomap_begin() returns success, so
ensures the page is dirtied as soon as possible after an allocation.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:51:01 +10:00
Brian Foster
ec2ed0b5e9 xfs: remove invalid log recovery first/last cycle check
One of the first steps of log recovery is to check for the special
case of a zeroed log. If the first cycle in the log is zero or the
tail portion of the log is zeroed, the head is set to the first
instance of cycle 0. xlog_find_zeroed() includes a sanity check that
enforces that the first cycle in the log must be 1 if the last cycle
is 0. While this is true in most cases, the check is not totally
valid because it doesn't consider the case where the filesystem
crashed after a partial/out of order log buffer completion that
wraps around the end of the physical log.

For example, consider a filesystem that has completed most of the
first cycle of the log, reaches the end of the physical log and
splits the next single log buffer write into two in order to wrap
around the end of the log. If these I/Os are reordered, the second
(wrapped) I/O completes and the first happens to fail, the log is
left in a state where the last cycle of the log is 0 and the first
cycle is 2. This causes the xlog_find_zeroed() sanity check to fail
and prevents the filesystem from mounting. This situation has been
reproduced on particular systems via repeated runs of generic/475.

This is an expected state that log recovery already knows how to
deal with, however. Since the log is still partially zeroed, the
head is detected correctly and points to a valid tail. The
subsequent stale block detection clears blocks beyond the head up to
the tail (within a maximum range), with the express purpose of
clearing such out of order writes. As expected, this removes the out
of order cycle 2 blocks at the physical start of the log.

In other words, the only thing that prevents a clean mount and
recovery of the filesystem in this scenario is the specific (last ==
0 && first != 1) sanity check in xlog_find_zeroed(). Since the log
head/tail are now independently validated via cycle, log record and
CRC checks, this highly specific first cycle check is of dubious
value. Remove it and rely on the higher level validation to
determine whether log content is sane and recoverable.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:50:41 +10:00
Eric Sandeen
339e1a3fcd xfs: validate inode di_forkoff
Verify the inode di_forkoff, lifted from xfs_repair's
process_check_inode_forkoff().

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:50:13 +10:00
Christoph Hellwig
f5f3f959b7 xfs: skip delalloc COW blocks in xfs_reflink_end_cow
The iomap direct I/O code issues a single ->end_io call for the whole
I/O request, and if some of the extents cowered needed a COW operation
it will call xfs_reflink_end_cow over the whole range.

When we do AIO writes we drop the iolock after doing the initial setup,
but before the I/O completion.  Between dropping the lock and completing
the I/O we can have a racing buffered write create new delalloc COW fork
extents in the region covered by the outstanding direct I/O write, and
thus see delalloc COW fork extents in xfs_reflink_end_cow.  As
concurrent writes are fundamentally racy and no guarantees are given we
can simply skip those.

This can be easily reproduced with xfstests generic/208 in always_cow
mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:49:58 +10:00
Eric Sandeen
f369a13cea xfs: don't treat unknown di_flags2 as corruption in scrub
xchk_inode_flags2() currently treats any di_flags2 values that the
running kernel doesn't recognize as corruption, and calls
xchk_ino_set_corrupt() if they are set.  However, it's entirely possible
that these flags were set in some newer kernel and are quite valid,
but ignored in this kernel.

(Validators don't care one bit about unknown di_flags2.)

Call xchk_ino_set_warning instead, because this may or may not actually
indicate a problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:49:00 +10:00
YueHaibing
2863c2ebc4 xfs: remove duplicated include from alloc.c
Remove duplicated include xfs_alloc.h

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:48:21 +10:00
Christoph Hellwig
0065b54119 xfs: don't bring in extents in xfs_bmap_punch_delalloc_range
This function is only used to punch out delayed allocations on I/O
failure, which means we need to have read the extents earlier.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29 13:47:46 +10:00