Commit Graph

663473 Commits

Author SHA1 Message Date
NeilBrown
518662e0fc NFS: fix usage of mempools.
When passed GFP flags that allow sleeping (such as
GFP_NOIO), mempool_alloc() will never return NULL, it will
wait until memory is available.

This means that we don't need to handle failure, but that we
do need to ensure one thread doesn't call mempool_alloc()
twice on the one pool without queuing or freeing the first
allocation.  If multiple threads did this during times of
high memory pressure, the pool could be exhausted and a
deadlock could result.

pnfs_generic_alloc_ds_commits() attempts to allocate from
the nfs_commit_mempool while already holding an allocation
from that pool.  This is not safe.  So change
nfs_commitdata_alloc() to take a flag that indicates whether
failure is acceptable.

In pnfs_generic_alloc_ds_commits(), accept failure and
handle it as we currently do.  Else where, do not accept
failure, and do not handle it.

Even when failure is acceptable, we want to succeed if
possible.  That means both
 - using an entry from the pool if there is one
 - waiting for direct reclaim is there isn't.

We call mempool_alloc(GFP_NOWAIT) to achieve the first, then
kmem_cache_alloc(GFP_NOIO|__GFP_NORETRY) to achieve the
second.  Each of these can fail, but together they do the
best they can without blocking indefinitely.

The objects returned by kmem_cache_alloc() will still be freed
by mempool_free().  This is safe as mempool_alloc() uses
exactly the same function to allocate objects (since the mempool
was created with mempool_create_slab_pool()).  The object returned
by mempool_alloc() and kmem_cache_alloc() are indistinguishable
so mempool_free() will handle both identically, either adding to the
pool or calling kmem_cache_free().

Also, don't test for failure when allocating from
nfs_wdata_mempool.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:44:05 -04:00
Anna Schumaker
f6148713b2 NFS: Clean up nfs4_proc_get_lease_time()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker
e917f0d1ce NFS: Clean up _nfs4_proc_exchange_id()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker
c7ae763903 NFS: Clean up nfs4_proc_bind_one_conn_to_session()
Returning errors directly even lets us remove the goto

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker
3183783bbb NFS: Remove extra dprintk()s from nfs4namespace.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker
539fd1d1f4 NFS: Clean up nfs4_get_rootfh()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker
4fe6b366d9 NFS: Remove extra dprintk()s from nfs4client.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:35 -04:00
Anna Schumaker
1073d9b49a NFS: Clean up nfs4_init_server()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
2dc42c0d60 NFS: Clean up nfs4_set_client()
If we cut out the dprintk()s, then we can return error codes directly
and cut out the goto.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
8da0f93438 NFS: Clean up nfs4_check_server_scope()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
ddfa0d4860 NFS: Clean up nfs4_check_serverowner_major_id()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
14d1bbb0ca NFS: Create a common nfs4_match_client() function
This puts all the common code in a single place for the
walk_client_list() functions.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
5b6d3ff605 NFS: Clean up nfs4_check_serverowner_minor_id()
Once again, we can remove the function and compare integer values
directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
f251fd9e71 NFS: Clean up nfs4_match_clientids()
If we cut out the dprintk()s, then we don't even need this to be a
separate function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
5be1810a8d NFS: Clean up nfs42_layoutstat_done()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
e36d48e9e2 NFS: Remove extra dprintk()s from namespace.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:34 -04:00
Anna Schumaker
fe4f844d49 NFS: Clean up nfs_direct_commit_complete()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
beeb533801 NFS: Remove nfs_direct_readpage_release()
Just remove the function and have the caller use nfs_release_request()
instead.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
4cbb976821 NFS: Clean up extra dprintk()s in client.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
2844b6aecf NFS: Clean up nfs_init_client()
We always call nfs_mark_client_ready() even if nfs_create_rpc_client()
returns an error, so we can rearrange nfs_init_client() to mark the
client ready from a single place.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
36718a669e NFS: Remove extra dprintk()s from callback_xdr.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
3d0bfaa60d NFS: Clean up encode_cb_sequence_res()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
535ece2b8e NFS: Clean up decode_notify_lock_args()
Let's cut out the goto and return any errors immedately

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
1796549ad4 NFS: Clean up decode_cb_sequence_args()
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
c79d56d214 NFS: Clean up decode_layoutrecall_args()
Additionally, this change lets us cut out the goto by returning errors
immediately.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:33 -04:00
Anna Schumaker
135a4ea0d9 NFS: Clean up decode_recall_args()
Removing the dprintk() lets us simplify the function by returning status
codes directly, rather than using a goto.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:32 -04:00
Anna Schumaker
56938bb77a NFS: Clean up decode_getattr_args()
Removing the dprintk() lets us return the status value directly, rather
than jumping to a label if an error occurs.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:32 -04:00
Anna Schumaker
be55f1bca7 NFS: Remove extra dprintk()s from callback_proc.c
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:32 -04:00
Anna Schumaker
5694a4f848 NFS: Clean up nfs4_callback_layoutrecall()
In addition to removing the dprintk(), this patch also initializes "res"
to the default return value instead of doing this through an else
condition.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:32 -04:00
Anna Schumaker
1a916ce049 NFS: Clean up do_callback_layoutrecall()
Removing the dprintk()s lets us simplify the function by removing the
else condition entirely and returning the status of
initiate_{file,bulk}_draining() directly.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:39:32 -04:00
Tigran Mkrtchyan
a7878ca140 nfs: flexfilelayout: remove v3-only data server limitation
Flexfilelayout supports data servers which talk NFS v3 and v4.{0,1,2}.
However, this code path is disabled and v3 only servers are accepted.
This change removes this limitation.
Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:35:24 -04:00
Benjamin Coddington
b044f64513 NFS: switch back to to ->iterate()
NFS has some optimizations for readdir to choose between using READDIR or
READDIRPLUS based on workload, and which NFS operation to use is determined
by subsequent interactions with lookup, d_revalidate, and getattr.

Concurrent use of nfs_readdir() via ->iterate_shared() can cause those
optimizations to repeatedly invalidate the pagecache used to store
directory entries during readdir(), which causes some very bad performance
for directories with many entries (more than about 10000).

There's a couple ways to fix this in NFS, but no fix would be as simple as
going back to ->iterate() to serialize nfs_readdir(), and neither fix I
tested performed as well as going back to ->iterate().

The first required taking the directory's i_lock for each entry, with the
result of terrible contention.

The second way adds another flag to the nfs_inode, and so keeps the
optimizations working for large directories.  The difference from using
->iterate() here is that much more memory is consumed for a given workload
without any performance gain.

The workings of nfs_readdir() are such that concurrent users are serialized
within read_cache_page() waiting to retrieve pages of entries from the
server.  By serializing this work in iterate_dir() instead, contention for
cache pages is reduced.  Waiting processes can have an uncontended pass at
the entirety of the directory's pagecache once previous processes have
completed filling it.

v2 - Keep the bits needed for parallel lookup

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 13:33:09 -04:00
Linus Torvalds
4f7d029b9b Linux 4.11-rc7 2017-04-16 13:00:18 -07:00
Linus Torvalds
7395ca0f91 ARM: SoC fixes
Again, a batch that's been sitting a couple of weeks, mostly because I
 anticipated a bit more material but it didn't show up -- which is good.
 
 These are all your garden variety fixes for ARM platforms. Most visible issue
 fixed here is probably the SMP reset issue on OMAP, the rest are minor stuff.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJY877PAAoJEIwa5zzehBx3TaUQAKtE8CSmvdWIk5grVb4MBDfF
 rXQ2R7WJF17LtVhIGwz9BPCQXMK+/XcSuXZ8bvUIoMsmOFXutDao7m7COD0z4aNo
 CPWk7ceQRzIWvsmturla6aYGmpAbWzdDgQj61LnaFbQrzmJOHVvcJN685+L5NGkZ
 8GBrzNmrvVkWz5N+msnrZRIcKpSqGCXrkjUU1EfHgUgNNdTf6BQRQSzVEYBtILEb
 9bC3WdS1fosmSdbsBjcxHsAtWMyO64KqhA691+gGTR93wyOuCRgqv+/ucoFB1oi+
 yjuhUVHW7Y5/8KOi67+97PwoKpZYpUFHge1/5iPbgEN6SqhOLzPAALGCKT4ONEQz
 KmLl//kX1IJZTD/87OegSLDbpS7h2sRYS7FpwgAa1kyYYOHdsZsECzKfhEvgZNHX
 Gtl2XLmtELM9quKQ2X8xWvjU5grfSLkng9OhDJ6ZCFhGddvjtHegq8J5diFyq281
 qf4n/Gp/OLC9IoPUyZefn5ya77K8UZPNppqtiWTBbT1IuXWbPUVPKmZoCpq3Qwu+
 2fdcFa3sIfpzDg9vkTszFAUCFkRqH5Jzy8BfNIQtfSq+pxwyIRWh88a5CV2RvYFB
 uHSt5B88YR2PwoyhV2oFpYNkx3vVu9g+A35ClzIRTLhSMi+Ngh2+DcppHY+LficP
 m9Hes98dZOv92JgiSaoQ
 =C+Hz
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Again, a batch that's been sitting a couple of weeks, mostly because
  I anticipated a bit more material but it didn't show up -- which is
  good.

  These are all your garden variety fixes for ARM platforms.

  The most visible issue fixed here is probably the SMP reset issue on
  OMAP, the rest are minor stuff"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: allwinner: a64: add pmu0 regs for USB PHY
  ARM: OMAP2+: omap_device: Sync omap_device and pm_runtime after probe defer
  reset: add exported __reset_control_get, return NULL if optional
  ARM: orion5x: only call into phylib when available
  ARM: omap2+: Revert omap-smp.c changes resetting CPU1 during boot
  ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
  ARM: dts: ti: fix PCI bus dtc warnings
  ARM: dts: am335x-baltos: disable EEE for Atheros 8035 PHY
  ARM: dts: OMAP3: Fix MFG ID EEPROM
  ARM: sun8i: a33: add operating-points-v2 property to all nodes
  ARM: sun8i: a33: remove highest OPP to fix CPU crashes
2017-04-16 12:38:17 -07:00
Linus Torvalds
a86f106f48 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Four small fixes.

  Three of them fix the same error in NVMe, in loop, fc, and rdma
  respectively.  The last fix from Ming fixes a regression in this
  series, where our bvec gap logic was wrong and causes an oops on
  NVMe for certain conditions"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix bio_will_gap() for first bvec with offset
  nvme-fc: Fix sqsize wrong assignment based on ctrl MQES capability
  nvme-rdma: Fix sqsize wrong assignment based on ctrl MQES capability
  nvme-loop: Fix sqsize wrong assignment based on ctrl MQES capability
2017-04-16 12:05:09 -07:00
Olof Johansson
e2647b6de7 Regression fix for omap interconnect code for deferred probe.
Without this fix we can get PM related warnings for devices that
 use deferred probe. If necessary, this fix can wait for the
 v4.12 merge window no problem.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAljruuwRHHRvbnlAYXRv
 bWlkZS5jb20ACgkQG9Q+yVyrpXMWBBAA0kYBB9IA+OinjpgLBB9ltKX21HBXKAHn
 JCiygnR6KzDxnBzsPJk0v0GfRq7FixsEbstyDqfGXDpK5pOFTlgffGFeaMLQt+jU
 4PcjLtiXklS9j3jJyUS7SAAjh5sPmR8v5q+NO0ELGi5H2q3c7J7X7VojD9LCB9gm
 z/4t133EEPwRdTUghoqxTaB+11ROrbctr0eZUNIafytxsnX4kkpcVGuQxRBu740j
 rLXxd3lIJbqasJHj+4v/IkE5CT61OslqEEA8QDaP1oK4d6M4+JVGCBAPXIl6AZ1b
 vQEjUl1YMgU1QeF/cnQwf6n6fM1DqhjbdySotDRDlZvWExexTG1BXBcKr1mATfNp
 zSGJuAne/zG0AHcqfpTZD3VWbrO1iGw5RmifwwtcmbsAmKu6K7ezMx2QO4L8VBma
 N407KOdhcnzsGQnTn3iQNwK36b/lc6ph1DA02TeS41vYB6MBrIp7uugP30k7eOf2
 Smv3LClY0H9+db9/IMDyuap8Os3QlGEwXwnVyC67TE3dRP3Js5r7Fm3i9WGmV5Oy
 5pU79OXMYzphKUd/11QUSnQUO+SMNH7/fU/dSeqLQMfwe2a5oY4FDuM5Y8uFDoZ7
 2KPGLEGWFmn8kxuZvBw/nHiwPexe3y91dIfvA+S7kTQnqFGmM0IzlkMnfcZeV5Zu
 AaRkd2G7654=
 =07Xj
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.11/fixes-rc6-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Regression fix for omap interconnect code for deferred probe.
Without this fix we can get PM related warnings for devices that
use deferred probe. If necessary, this fix can wait for the
v4.12 merge window no problem.

* tag 'omap-for-v4.11/fixes-rc6-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: omap_device: Sync omap_device and pm_runtime after probe defer
  ARM: omap2+: Revert omap-smp.c changes resetting CPU1 during boot
  ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
  ARM: dts: ti: fix PCI bus dtc warnings
  ARM: dts: am335x-baltos: disable EEE for Atheros 8035 PHY
  ARM: dts: OMAP3: Fix MFG ID EEPROM

Signed-off-by: Olof Johansson <olof@lixom.net>
2017-04-16 11:52:26 -07:00
Linus Torvalds
11c994d9a5 Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
 "Unfortunately, the commit to fix the cgroup mount race in the previous
  pull request can lead to hangs.

  The original bug has been around for a while and isn't too likely to
  be triggered in usual use cases. Revert the commit for now"

* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  Revert "cgroup: avoid attaching a cgroup root to two different superblocks"
2017-04-16 11:48:10 -07:00
Linus Torvalds
032aaf3f9f TTY fix for 4.11-rc7
Here is a single tty core revert for a patch that was reported to cause
 problems.  The original issue is one that we have lived with for
 decades, so trying to scramble to fix the fix in time for 4.11-final
 does not make sense due to the fragility of the tty ldisc layer.  Just
 reverting it makes sense for now.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWPMjIQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yl+ywCfbEc9XQ2+eDebFSs673OaDUxApmIAoMe+4Qj5
 puBC7ThZvCwrwACsoBWd
 =rugQ
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty fix from Greg KH:
 "Here is a single tty core revert for a patch that was reported to
  cause problems.

  The original issue is one that we have lived with for decades, so
  trying to scramble to fix the fix in time for 4.11-final does not make
  sense due to the fragility of the tty ldisc layer. Just reverting it
  makes sense for now"

* tag 'tty-4.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  Revert "tty: don't panic on OOM in tty_set_ldisc()"
2017-04-16 11:35:34 -07:00
Linus Torvalds
48538861b9 While rewriting the function probe code, I stumbled over a long standing
bug. This bug has been there sinc function tracing was added way back
 when. But my new development depends on this bug being fixed, and it
 should be fixed regardless as it causes ftrace to disable itself when
 triggered, and a reboot is required to enable it again.
 
 The bug is that the function probe does not disable itself properly
 if there's another probe of its type still enabled. For example:
 
      # cd /sys/kernel/debug/tracing
      # echo schedule:traceoff > set_ftrace_filter
      # echo do_IRQ:traceoff > set_ftrace_filter
      # echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter
      # echo do_IRQ:traceoff > set_ftrace_filter
 
 The above registers two traceoff probes (one for schedule and one for
 do_IRQ, and then removes do_IRQ. But since there still exists one for
 schedule, it is not done properly. When adding do_IRQ back, the breakage
 in the accounting is noticed by the ftrace self tests, and it causes
 a warning and disables ftrace.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJY8ovvFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 nkAH/jfsXUWIbZ6J0A7+nmGiBdIVwLwG0ZOJClcxjnCSpsNs+FO/0w6ragtIYCi2
 Km+0s/slA5GOddG4Miga/dhtxGhDosyXnxqC+4GmD0maqJGLweJLbmiQ1xhra0hr
 XGDI+SXHM/n22zVkFEbkGXgxMvOHeR+X/sREZo3XmoXRLbc1QVtTEe/8TdlLXwE5
 5Fs07xSQqx4TS7oBxIjipHnbHL/gIktEo0HiEmq73++r42MztIMYZPoV+cXuim37
 C6xO4PxfPN0aRh9W5gdiMnbv6lummVBNQXwpMya0vTbxz/9WeUex8c+lcInQUJgA
 FhQWKaCGyi0UK4Pa2Pz/Dmxuti0=
 =LYLo
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.11-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace fix from Steven Rostedt:
 "While rewriting the function probe code, I stumbled over a long
  standing bug. This bug has been there sinc function tracing was added
  way back when. But my new development depends on this bug being fixed,
  and it should be fixed regardless as it causes ftrace to disable
  itself when triggered, and a reboot is required to enable it again.

  The bug is that the function probe does not disable itself properly if
  there's another probe of its type still enabled. For example:

     # cd /sys/kernel/debug/tracing
     # echo schedule:traceoff > set_ftrace_filter
     # echo do_IRQ:traceoff > set_ftrace_filter
     # echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter
     # echo do_IRQ:traceoff > set_ftrace_filter

  The above registers two traceoff probes (one for schedule and one for
  do_IRQ, and then removes do_IRQ.

  But since there still exists one for schedule, it is not done
  properly. When adding do_IRQ back, the breakage in the accounting is
  noticed by the ftrace self tests, and it causes a warning and disables
  ftrace"

* tag 'trace-v4.11-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix removing of second function probe
2017-04-16 10:01:34 -07:00
Tejun Heo
330c418638 Revert "cgroup: avoid attaching a cgroup root to two different superblocks"
This reverts commit bfb0b80db5.

Andrei reports CRIU test hangs with the patch applied.  The bug fixed
by the patch isn't too likely to trigger in actual uses.  Revert the
patch for now.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Link: http://lkml.kernel.org/r/20170414232737.GC20350@outlook.office365.com
2017-04-16 23:17:37 +09:00
Linus Torvalds
d5ff0814fd Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull nvdimm fixes from Dan Williams:
 "A small crop of lockdep, sleeping while atomic, and other fixes /
  band-aids in advance of the full-blown reworks targeting the next
  merge window. The largest change here is "libnvdimm: fix blk free
  space accounting" which deletes a pile of buggy code that better
  testing would have caught before merging. The next change that is
  borderline too big for a late rc is switching the device-dax locking
  from rcu to srcu, I couldn't think of a smaller way to make that fix.

  The __copy_user_nocache fix will have a full replacement in 4.12 to
  move those pmem special case considerations into the pmem driver. The
  "libnvdimm: band aid btt vs clear poison locking" commit admits that
  our error clearing support for btt went in broken, so we just disable
  it in 4.11 and -stable. A replacement / full fix is in the pipeline
  for 4.12

  Some of these would have been caught earlier had DEBUG_ATOMIC_SLEEP
  been enabled on my development station. I wonder if we should have:

      config DEBUG_ATOMIC_SLEEP
        default PROVE_LOCKING

  ...since I mistakenly thought I got both with PROVE_LOCKING=y.

  These have received a build success notification from the 0day robot,
  and some have appeared in a -next release with no reported issues"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions
  device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
  libnvdimm: band aid btt vs clear poison locking
  libnvdimm: fix reconfig_mutex, mmap_sem, and jbd2_handle lockdep splat
  libnvdimm: fix blk free space accounting
  acpi, nfit, libnvdimm: fix interleave set cookie calculation (64-bit comparison)
2017-04-15 14:07:03 -07:00
Linus Torvalds
403a39f8b0 SCSI fixes on 20170415
This is seven small fixes which are all for user visible issues that
 fortunately only occur in rare circumstances.  The most serious is the
 sr one in which QEMU can cause us to read beyond the end of a buffer
 (I don't think it's exploitable, but just in case).  The next is the
 sd capacity fix which means all non 512 byte sector drives greater
 than 2TB fail to be correctly sized.  The rest are either in new
 drivers (qedf) or on error legs.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJY8krDAAoJEAVr7HOZEZN4BpoQAKKwu5Y7jRS6JNoS94pT2L33
 Rz8dEQNdSmr7JvVQphoQDw7YQjYsGV+4f3FdGMrjMzU0Ew3QR1w2SPwK9ww3wUj6
 eHrIdnkd77FJgWyyy/wLTcf55YwGqhIKCSVv3tQmZN/xiuM7x12ZsW8Kb8ydAbKR
 824+VU49dSx3rpnzUPUI/+eb8u5MgLHoNpzDzRWb4Au+05LOkgthDMfC6GdmOnyx
 Asko/u4UNL69cT+xeYNmryMI+OHdqkRILMadefySJ8QTBfvZHsazYqb4YtWA0qf9
 1VAmuvQzToGC7waAHj2X0FSPEISUx+NehXgQjIOZ9aVD2L1XDFm1wRpI3CsIFRA1
 2IrgGYpR7rQ2Uj6z7VI+uU+0O/fs13Tctyv6C1wuqkjM+itBLQ5IGuG9shBPXazP
 UrD9R+SZfMsGXa1ABlcYJE7M+xEusoN2TKGj5yR5D9kt+a/JcqF2GEmY/KpNkEkT
 rmdqAgHqL/bvIc/qyfU4ZayR1MlGe38oPfm9D+vYqJk/V+drlWt66PkamRowfyej
 WufO4l4mB8bpR+HLjRXK4Nh0wHzh8WDgZUrRJXBmOqpig6v3HCJzUejvFTBjzdjx
 uOUlN41RGnOI1L6OpY6IAc2GPw7IwMHKXaIKL0OwyaK+lxvVMO1jRg3TieXzdSp1
 wF7GqfOw2WFVvX1n5xX3
 =vOpC
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is seven small fixes which are all for user visible issues that
  fortunately only occur in rare circumstances.

  The most serious is the sr one in which QEMU can cause us to read
  beyond the end of a buffer (I don't think it's exploitable, but just
  in case).

  The next is the sd capacity fix which means all non 512 byte sector
  drives greater than 2TB fail to be correctly sized.

  The rest are either in new drivers (qedf) or on error legs"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION
  scsi: aacraid: fix PCI error recovery path
  scsi: sd: Fix capacity calculation with 32-bit sector_t
  scsi: qla2xxx: Add fix to read correct register value for ISP82xx.
  scsi: qedf: Fix crash due to unsolicited FIP VLAN response.
  scsi: sr: Sanity check returned mode data
  scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable
2017-04-15 09:42:14 -07:00
Linus Torvalds
be84a46c7f Merge branch 'parisc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
 "Mikulas Patocka fixed a few bugs in our new pa_memcpy() assembler
  function, e.g. one bug made the kernel unbootable if source and
  destination address are the same"

* 'parisc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: fix bugs in pa_memcpy
2017-04-15 09:40:35 -07:00
Martin Brandenburg
1ec1688c53 orangefs: free superblock when mount fails
Otherwise lockdep says:

[ 1337.483798] ================================================
[ 1337.483999] [ BUG: lock held when returning to user space! ]
[ 1337.484252] 4.11.0-rc6 #19 Not tainted
[ 1337.484423] ------------------------------------------------
[ 1337.484626] mount/14766 is leaving the kernel with locks still held!
[ 1337.484841] 1 lock held by mount/14766:
[ 1337.485017]  #0:  (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520

Caught by xfstests generic/413 which tried to mount with the unsupported
mount option dax.  Then xfstests generic/422 ran sync which deadlocks.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-15 09:39:31 -07:00
Linus Torvalds
c0eb027e5a vfs: don't do RCU lookup of empty pathnames
Normal pathname lookup doesn't allow empty pathnames, but using
AT_EMPTY_PATH (with name_to_handle_at() or fstatat(), for example) you
can trigger an empty pathname lookup.

And not only is the RCU lookup in that case entirely unnecessary
(because we'll obviously immediately finalize the end result), it is
actively wrong.

Why? An empth path is a special case that will return the original
'dirfd' dentry - and that dentry may not actually be RCU-free'd,
resulting in a potential use-after-free if we were to initialize the
path lazily under the RCU read lock and depend on complete_walk()
finalizing the dentry.

Found by syzkaller and KASAN.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-15 09:34:52 -07:00
Mikulas Patocka
409c1b250e parisc: fix bugs in pa_memcpy
The patch 554bfeceb8 ("parisc: Fix access
fault handling in pa_memcpy()") reimplements the pa_memcpy function.
Unfortunatelly, it makes the kernel unbootable. The crash happens in the
function ide_complete_cmd where memcpy is called with the same source
and destination address.

This patch fixes a few bugs in pa_memcpy:

* When jumping to .Lcopy_loop_16 for the first time, don't skip the
  instruction "ldi 31,t0" (this bug made the kernel unbootable)
* Use the COND macro when comparing length, so that the comparison is
  64-bit (a theoretical issue, in case the length is greater than
  0xffffffff)
* Don't use the COND macro after the "extru" instruction (the PA-RISC
  specification says that the upper 32-bits of extru result are undefined,
  although they are set to zero in practice)
* Fix exception addresses in .Lcopy16_fault and .Lcopy8_fault
* Rename .Lcopy_loop_4 to .Lcopy_loop_8 (so that it is consistent with
  .Lcopy8_fault)

Cc: <stable@vger.kernel.org> # v4.9+
Fixes: 554bfeceb8 ("parisc: Fix access fault handling in pa_memcpy()")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-04-15 17:24:05 +02:00
Linus Torvalds
1bf4b1268e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "Just a small update to xpad driver to recognize yet another gamepad,
  and another change making sure userio.h is exported"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xpad - add support for Razer Wildcat gamepad
  uapi: add missing install of userio.h
2017-04-14 17:51:16 -07:00
Linus Torvalds
7e703eccf0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Things seem to be settling down as far as networking is concerned,
  let's hope this trend continues...

   1) Add iov_iter_revert() and use it to fix the behavior of
      skb_copy_datagram_msg() et al., from Al Viro.

   2) Fix the protocol used in the synthetic SKB we cons up for the
      purposes of doing a simulated route lookup for RTM_GETROUTE
      requests. From Florian Larysch.

   3) Don't add noop_qdisc to the per-device qdisc hashes, from Cong
      Wang.

   4) Don't call netdev_change_features with the team lock held, from
      Xin Long.

   5) Revert TCP F-RTO extension to catch more spurious timeouts because
      it interacts very badly with some middle-boxes. From Yuchung
      Cheng.

   6) Fix the loss of error values in l2tp {s,g}etsockopt calls, from
      Guillaume Nault.

   7) ctnetlink uses bit positions where it should be using bit masks,
      fix from Liping Zhang.

   8) Missing RCU locking in netfilter helper code, from Gao Feng.

   9) Avoid double frees and use-after-frees in tcp_disconnect(), from
      Eric Dumazet.

  10) Don't do a changelink before we register the netdevice in
      bridging, from Ido Schimmel.

  11) Lock the ipv6 device address list properly, from Rabin Vincent"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
  netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage
  netfilter: nft_hash: do not dump the auto generated seed
  drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201
  ipv6: Fix idev->addr_list corruption
  net: xdp: don't export dev_change_xdp_fd()
  bridge: netlink: register netdevice before executing changelink
  bridge: implement missing ndo_uninit()
  bpf: reference may_access_skb() from __bpf_prog_run()
  tcp: clear saved_syn in tcp_disconnect()
  netfilter: nf_ct_expect: use proper RCU list traversal/update APIs
  netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULL
  netfilter: make it safer during the inet6_dev->addr_list traversal
  netfilter: ctnetlink: make it safer when checking the ct helper name
  netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_find
  netfilter: ctnetlink: using bit to represent the ct event
  netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
  net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skb
  l2tp: don't mask errors in pppol2tp_getsockopt()
  l2tp: don't mask errors in pppol2tp_setsockopt()
  tcp: restrict F-RTO to work-around broken middle-boxes
  ...
2017-04-14 17:38:24 -07:00
Linus Torvalds
91174391bf Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of small fixes for x86:

   - fix locking in RDT to prevent memory leaks and freeing in use
     memory

   - prevent setting invalid values for vdso32_enabled which cause
     inconsistencies for user space resulting in application crashes.

   - plug a race in the vdso32 code between fork and sysctl which causes
     inconsistencies for user space resulting in application crashes.

   - make MPX signal delivery work in compat mode

   - make the dmesg output of traps and faults readable again"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/intel_rdt: Fix locking in rdtgroup_schemata_write()
  x86/debug: Fix the printk() debug output of signal_fault(), do_trap() and do_general_protection()
  x86/vdso: Plug race between mapping and ELF header setup
  x86/vdso: Ensure vdso32_enabled gets set to valid values only
  x86/signals: Fix lower/upper bound reporting in compat siginfo
2017-04-14 17:00:01 -07:00
Linus Torvalds
07c7016de7 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "Two small fixes for perf:

   - the move to support cross arch annotation introduced per arch
     initialization requirements, fullfill them for s/390 (Christian
     Borntraeger)

   - add the missing initialization to the LBR entries to avoid exposing
     random or stale data"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Avoid exposing wrong/stale data in intel_pmu_lbr_read_32()
  perf annotate s390: Fix perf annotate error -95 (4.10 regression)
2017-04-14 16:58:38 -07:00