linux

Author	SHA1	Message	Date
Trond Myklebust	fc7ff36747	pNFS: If we have to delay the layout callback, mark the layout for return If the client needs to delay the layout callback, then speed up the recall process by marking the remaining layout segments to be actively returned by the client. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:33:04 -05:00
Trond Myklebust	0654cc726f	NFSv4.1/pNFS: Add a helper to mark the layout as returned This ensures that we don't reuse the stateid if a layout return or implied layout return means that we've returned all layout segments Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:33:04 -05:00
Trond Myklebust	ab7d763e47	pNFS: Ensure nfs4_layoutget_prepare returns the correct error If we're unable to perform the layoutget due to an invalid open stateid or a bulk recall, ensure that we return the error so that the caller can decide on an appropriate action. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:33:03 -05:00
Trond Myklebust	4d0ac22109	pNFS/flexfiles: Ensure we record layoutstats even if RPC is terminated early Currently, we will only record the layoutstats correctly if the RPC call successfully obtains a slot. If we exit before that happens, then we may find ourselves starting the busy timer through the call in ff_layout_(read\|write)_prepare_layoutstats, but never stopping it. The same thing happens if we're doing DA-DS. The fix is to ensure that we catch these cases in the rpc_release() callback. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:41 -05:00
Trond Myklebust	37e9ed22b1	pNFS: Add flag to track if we've called nfs4_ff_layout_stat_io_start_read/write Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:41 -05:00
Trond Myklebust	7eeea16797	pNFS/flexfiles: Fix a statistics gathering imbalance When we replay a failed read, write or commit to the dataserver, we need to ensure that we call ff_layout_read_prepare_v3(), ff_layout_write_prepare_v3 or ff_layout_commit_prepare_v3() so that we reset the statistics. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:40 -05:00
Trond Myklebust	b9fc773ef5	pNFS/flexfiles: Don't mark the entire layout as failed, when returning it In pNFS/flexfiles, we want to return the layout without necessarily marking it as having completely failed. We therefore move the call to pnfs_layout_io_set_failed() out of pnfs_error_mark_layout_for_return(), and then ensura that pNFS/files layout calls it separately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:40 -05:00
Trond Myklebust	2e5b29f044	pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGET Fix a bug in which flexfiles clients are falling back to I/O through the MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set. The flexfiles client will always report errors through the LAYOUTRETURN and/or LAYOUTERROR mechanisms, so it should normally be safe for it to retry the LAYOUTGET until it fails or succeeds. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:40 -05:00
Peng Tao	141b9b59ed	pnfs/flexfiles: count io stat in rpc_count_stats callback If client ever restarts IO due to some errors, we'll endup mis-counting IO stats if we do the counting in .rpc_done callback. Move it to .rpc_count_stats callback that is only called when releasing RPC. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:39 -05:00
Peng Tao	c22eeb8697	pnfs/flexfiles: do not mark delay-like status as DS failure We just need to delay and retry in these cases. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:39 -05:00
Peng Tao	7c1e6e58e2	NFS41: map NFS4ERR_LAYOUTUNAVAILABLE to ENODATA Instead of mapping it to EIO that is a fatal error and fails application. We'll go inband after getting NFS4ERR_LAYOUTUNAVAILABLE. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:38 -05:00
Peng Tao	d6c843b96e	nfs: only remove page from mapping if launder_page fails Instead of dropping pages when write fails, only do it when we get fatal failure in launder_page write back. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:38 -05:00
Peng Tao	0bcbf039f6	nfs: handle request add failure properly When we fail to queue a read page to IO descriptor, we need to clean it up otherwise it is hanging around preventing nfs module from being removed. When we fail to queue a write page to IO descriptor, we need to clean it up and also save the failure status to open context. Then at file close, we can try to write pages back again and drop the page if it fails to writeback in .launder_page, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:37 -05:00
Peng Tao	2bff228857	nfs: centralize pgio error cleanup In case we fail during setting things up for read/write IO, set pg_error in IO descriptor and do the cleanup in nfs_pageio_add_request, where we clean up all pages that are still hanging around on the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:37 -05:00
Peng Tao	c18b96a1b8	nfs: clean up rest of reqs when failing to add one If we fail to set up things before sending anything over wire, we need to clean up the reqs that are still attached to the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:37 -05:00
Peng Tao	d600ad1f2b	NFS41: pop some layoutget errors to application For ERESTARTSYS/EIO/EROFS/ENOSPC/E2BIG in layoutget, we should just bail out instead of hiding the error and retrying inband IO. Change all the call sites to pop the error all the way up. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:36 -05:00
Trond Myklebust	d0379a5d06	pNFS/flexfiles: Support server-supplied layoutstats sampling period Some servers want to be able to control the frequency with which clients report layoutstats, for instance, in order to monitor QoS for a particular file or set of file. In order to support this, the flexfiles layout allows the server to pass this info as a hint in the layout payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 14:32:36 -05:00
Linus Torvalds	8513342170	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a bug in the algif_skcipher interface that can trigger a kernel WARN_ON from user-space. It does so by using the new skcipher interface which unlike the previous ablkcipher does not need to create extra geniv objects which is what was used to trigger the WARN_ON" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: algif_skcipher - Use new skcipher interface	2015-12-28 10:44:41 -08:00
Linus Torvalds	2c7143d4f5	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key handling bugfix from James Morris: "Fix a race between keyctl_read() and keyctl_revoke()" * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: KEYS: Fix race between read and revoke	2015-12-28 10:35:19 -08:00
Trond Myklebust	494f74a26c	NFS: Flush reclaim writes using FLUSH_COND_STABLE If there are already writes queued up for commit, then don't flush just this page even if it is a reclaim issue. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 13:34:59 -05:00
Trond Myklebust	b0ac1bd2bb	NFS: Background flush should not be low priority Background flush is needed in order to satisfy the global page limits. Don't subvert by reducing the priority. This should also address a write starvation issue that was reported by Neil Brown. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 13:30:42 -05:00
Trond Myklebust	1a093ceb05	NFSv4.1/pnfs: Fixup an lo->plh_block_lgets imbalance in layoutreturn Since commit `2d8ae84fbc`, nothing is bumping lo->plh_block_lgets in the layoutreturn path, so it should not be touched in nfs4_layoutreturn_release either. Fixes: `2d8ae84fbc` ("NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets...") Cc: stable@vger.kernel.org # 4.3+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 13:23:36 -05:00
Trond Myklebust	762674f86d	NFSv4: Don't perform cached access checks before we've OPENed the file Donald Buczek reports that a nfs4 client incorrectly denies execute access based on outdated file mode (missing 'x' bit). After the mode on the server is 'fixed' (chmod +x) further execution attempts continue to fail, because the nfs ACCESS call updates the access parameter but not the mode parameter or the mode in the inode. The root cause is ultimately that the VFS is calling may_open() before the NFS client has a chance to OPEN the file and hence revalidate the access and attribute caches. Al Viro suggests: >>> Make nfs_permission() relax the checks when it sees MAY_OPEN, if you know >>> that things will be caught by server anyway? >> >> That can work as long as we're guaranteed that everything that calls >> inode_permission() with MAY_OPEN on a regular file will also follow up >> with a vfs_open() or dentry_open() on success. Is this always the >> case? > > 1) in do_tmpfile(), followed by do_dentry_open() (not reachable by NFS since > it doesn't have ->tmpfile() instance anyway) > > 2) in atomic_open(), after the call of ->atomic_open() has succeeded. > > 3) in do_last(), followed on success by vfs_open() > > That's all. All calls of inode_permission() that get MAY_OPEN come from > may_open(), and there's no other callers of that puppy. Reported-by: Donald Buczek <buczek@molgen.mpg.de> Link: https://bugzilla.kernel.org/show_bug.cgi?id=109771 Link: http://lkml.kernel.org/r/1451046656-26319-1-git-send-email-buczek@molgen.mpg.de Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-12-28 13:22:20 -05:00
Oliver Freyermuth	f7d7f59ab1	USB: cp210x: add ID for ELV Marble Sound Board 1 Add the USB device ID for ELV Marble Sound Board 1. Signed-off-by: Oliver Freyermuth <o.freyermuth@googlemail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>	2015-12-28 19:07:35 +01:00
Pablo Neira Ayuso	5913beaf0d	netfilter: nfnetlink: pass down netns pointer to commit() and abort() callbacks Adapt callsites to avoid recurrent lookup of the netns pointer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-12-28 18:43:15 +01:00
Pablo Neira Ayuso	7b8002a151	netfilter: nfnetlink: pass down netns pointer to call() and call_rcu() Adapt callsites to avoid recurrent lookup of the netns pointer. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-12-28 18:41:41 +01:00
Pablo Neira Ayuso	f4c756b4ea	netfilter: nf_tables: remove check against removal of inactive objects The following sequence inside a batch, although not very useful, is valid: add table foo ... delete table foo This may be generated by some robot while applying some incremental upgrade, so remove the defensive checks against this. This patch keeps the check on the get/dump path by now, we have to replace the inactive flag by introducing object generations. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-12-28 18:37:20 +01:00
Pablo Neira Ayuso	5ebe0b0eec	netfilter: nf_tables: destroy basechain and rules on netdevice removal If the netdevice is destroyed, the resources that are attached should be released too as they belong to the device that is now gone. Suggested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-12-28 18:34:35 +01:00
Pablo Neira Ayuso	df05ef874b	netfilter: nf_tables: release objects on netns destruction We have to release the existing objects on netns removal otherwise we leak them. Chains are unregistered in first place to make sure no packets are walking on our rules and sets anymore. The object release happens by when we unregister the family via nft_release_afinfo() which is called from nft_unregister_afinfo() from the corresponding __net_exit path in every family. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-12-28 18:34:35 +01:00
Devesh Sharma	f41647ef06	RDMA/be2net: Remove open and close entry points Recently Dough Ledford reported a deadlock happening between ocrdma-load sequence and NetworkManager service issueing "open" on be2net interface. The deadlock happens when any be2net hook (e.g. open/close) is called in parallel to insmod ocrdma.ko. A. be2net is sending administrative open/close event to ocrdma holding device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net. So sequence of locks is rtnl_lock---> device_list lock B. When new ocrdma roce device gets registered, infiniband stack now takes rtnl_lock in ib_register_device() in GID initialization routines. So sequence of locks in this path is device_list lock ---> rtnl_lock. This improper locking sequence causes deadlock. In order to resolve the above deadlock condition, ocrdma intorduced a patch to stop listening to administrative open/close events generated from be2net driver. It now depends on link-state-change async-event generated from CNA. This change leaves behind dead code which used to generate administrative open/close events. This patch cleans-up all that dead code from be2net. Reported-by: Doug Ledford <dledford@redhat.com> CC: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-12-28 11:45:55 -05:00
Devesh Sharma	10a214dc99	RDMA/ocrdma: Depend on async link events from CNA Recently Dough Ledford reported a deadlock happening between ocrdma-load sequence and NetworkManager service issuing "open" on be2net interface. The deadlock happens when any be2net hook (e.g. open/close) is called in parallel to insmod ocrdma.ko. A. be2net is sending administrative open/close event to ocrdma holding device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net. So sequence of locks is rtnl_lock---> device_list lock B. When new ocrdma roce device gets registered, infiniband stack now takes rtnl_lock in ib_register_device() in GID initialization routines. So sequence of locks in this path is device_list lock ---> rtnl_lock. This improper locking sequence causes deadlock. With this patch we stop using administrative open and close events injected by be2net driver. These events were used to dispatch PORT_ACTIVE and PORT_ERROR events to the IB-stack. This patch implements a logic to receive async-link-events generated from CNA whenever link-state-change is detected. Now on, these async-events will be used to dispatch PORT_ACTIVE and PORT_ERROR events to IB-stack. Depending on async-events from CNA removes the need to hold device-list-mutex and thus breaks the busy-wait scenario. Reported-by: Doug Ledford <dledford@redhat.com> CC: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-12-28 11:45:54 -05:00
Devesh Sharma	36ac0db0db	RDMA/ocrdma: Dispatch only port event when port state changes Dispatch only port event to IB stack when port state changes. Don't explicitly modify qps to error. Let application listen to port events on async event queue or let QP fail with retry-exceeded completion error. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-12-28 11:45:54 -05:00
Devesh Sharma	c6002d5602	RDMA/ocrdma: Fix vlan-id assignment in qp parameters vlan-id is wrongly getting as 0 when PFC is enabled. Set vlan-id configured by user in QP parameters. In case vlan interface is not used, flash a warning to user to configure vlan and assign vlan-id as 0 in qp params. Fixes: `dbf727de74` ('IB/core: Use GID table in AH creation and dmac resolution') Cc: Matan Barak <matanb@mellanox.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-12-28 11:45:54 -05:00
Joerg Roedel	a639a8eecf	iommu/amd: Preallocate dma_ops apertures based on dma_mask Preallocate between 4 and 8 apertures when a device gets it dma_mask. With more apertures we reduce the lock contention of the domain lock significantly. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:54 +01:00
Joerg Roedel	7b5e25b84e	iommu/amd: Use trylock to aquire bitmap_lock First search for a non-contended aperture with trylock before spinning. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:54 +01:00
Joerg Roedel	5f6bed5005	iommu/amd: Make dma_ops_domain->next_index percpu Make this pointer percpu so that we start searching for new addresses in the range we last stopped and which is has a higher probability of being still in the cache. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:54 +01:00
Joerg Roedel	92d420ec02	iommu/amd: Relax locking in dma_ops path Remove the long holding times of the domain->lock and rely on the bitmap_lock instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:54 +01:00
Joerg Roedel	a73c156665	iommu/amd: Initialize new aperture range before making it visible Make sure the aperture range is fully initialized before it is visible to the address allocator. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:53 +01:00
Joerg Roedel	7bfa5bd270	iommu/amd: Build io page-tables with cmpxchg64 This allows to build up the page-tables without holding any locks. As a consequence it removes the need to pre-populate dma_ops page-tables. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:53 +01:00
Joerg Roedel	266a3bd28f	iommu/amd: Allocate new aperture ranges in dma_ops_alloc_addresses It really belongs there and not in __map_single. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:53 +01:00
Joerg Roedel	4eeca8c5e7	iommu/amd: Optimize dma_ops_free_addresses Don't flush the iommu tlb when we free something behind the current next_bit pointer. Update the next_bit pointer instead and let the flush happen on the next wraparound in the allocation path. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:53 +01:00
Joerg Roedel	ab7032bb9c	iommu/amd: Remove need_flush from struct dma_ops_domain The flushing of iommu tlbs is now done on a per-range basis. So there is no need anymore for domain-wide flush tracking. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:53 +01:00
Joerg Roedel	2a87442c5b	iommu/amd: Iterate over all aperture ranges in dma_ops_area_alloc This way we don't need to care about the next_index wrapping around in dma_ops_alloc_addresses. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:52 +01:00
Joerg Roedel	d41ab09896	iommu/amd: Flush iommu tlb in dma_ops_free_addresses Instead of setting need_flush, do the flush directly in dma_ops_free_addresses. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:52 +01:00
Joerg Roedel	ebaecb423b	iommu/amd: Rename dma_ops_domain->next_address to next_index It points to the next aperture index to allocate from. We don't need the full address anymore because this is now tracked in struct aperture_range. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:52 +01:00
Joerg Roedel	05ab49e005	iommu/amd: Remove 'start' parameter from dma_ops_area_alloc Parameter is not needed because the value is part of the already passed in struct dma_ops_domain. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:52 +01:00
Joerg Roedel	ccb50e03da	iommu/amd: Flush iommu tlb in dma_ops_aperture_alloc() Since the allocator wraparound happens in this function now, flush the iommu tlb there too. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:51 +01:00
Joerg Roedel	60e6a7cb44	iommu/amd: Retry address allocation within one aperture Instead of skipping to the next aperture, first try again in the current one. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:51 +01:00
Joerg Roedel	ae62d49c7a	iommu/amd: Move aperture_range.offset to another cache-line Moving it before the pte_pages array puts in into the same cache-line as the spin-lock and the bitmap array pointer. This should safe a cache-miss. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:51 +01:00
Joerg Roedel	a0f51447f4	iommu/amd: Add dma_ops_aperture_alloc() function Make this a wrapper around iommu_ops_area_alloc() for now and add more logic to this function later on. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-12-28 17:18:51 +01:00

... 112 113 114 115 116 ...

575451 Commits