The number of TLB lines was increased from 16 on Tegra30 to 32 on
Tegra114 and later. Parameterize the value so that the initial default
can be set accordingly.
On Tegra30, initializing the value to 32 would effectively disable the
TLB and hence cause massive latencies for memory accesses translated
through the SMMU. This is especially noticeable for isochronuous clients
such as display, whose FIFOs would continuously underrun.
Fixes: 8918465163 ("memory: Add NVIDIA Tegra memory controller support")
Signed-off-by: Thierry Reding <treding@nvidia.com>
This code is used both when creating a new page directory entry and when
tearing it down, with only the PDE value changing between both cases.
Factor the code out so that it can be reused.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: make commit message more accurate]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Extract the use count reference accounting into a separate function and
separate it from allocating the PTE.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: extract and write commit message]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rather than explicitly zeroing pages allocated via alloc_page(), add
__GFP_ZERO to the gfp mask to ask the allocator for zeroed pages.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Remove the unnecessary manipulation of the PageReserved flags in the
Tegra SMMU driver. None of this is required as the page(s) remain
private to the SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use the DMA API instead of calling architecture internal functions in
the Tegra SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Pass smmu_flush_ptc() the device address rather than struct page
pointer.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
smmu_flush_ptc() is used in two modes: one is to flush an individual
entry, the other is to flush all entries. We know at the call site
which we require. Split the function into smmu_flush_ptc_all() and
smmu_flush_ptc().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Drivers should not be using __cpuc_* functions nor outer_cache_flush()
directly. This change partly cleans up tegra-smmu.c.
The only difference between cache handling of the tegra variants is
Denver, which omits the call to outer_cache_flush(). This is due to
Denver being an ARM64 CPU, and the ARM64 architecture does not provide
this function. (This, in itself, is a good reason why these should not
be used.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: fix build failure on 64-bit ARM]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use kcalloc() to allocate the use-counter array for the page directory
entries/page tables. Using kcalloc() allows us to be provided with
zero-initialised memory from the allocators, rather than initialising
it ourselves.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Store the struct page pointer for the second level page tables, rather
than working back from the page directory entry. This is necessary as
we want to eliminate the use of physical addresses used with
arch-private functions, switching instead to use the streaming DMA API.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Fix the page table lookup in the unmap and iova_to_phys methods.
Neither of these methods should allocate a page table; a missing page
table should be treated the same as no mapping present.
More importantly, using as_get_pte() for an IOVA corresponding with a
non-present page table entry increments the use-count for the page
table, on the assumption that the caller of as_get_pte() is going to
setup a mapping. This is an incorrect assumption.
Fix both of these bugs by providing a separate helper which only looks
up the page table, but never allocates it. This is akin to pte_offset()
for CPU page tables.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add a pair of helpers to get the page directory and page table indexes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Factor out the common PTE setting code into a separate function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra SMMU unmap path has several problems:
1. as_pte_put() can perform a write-after-free
2. tegra_smmu_unmap() can perform cache maintanence on a page we have
just freed.
3. when a page table is unmapped, there is no CPU cache maintanence of
the write clearing the page directory entry, nor is there any
maintanence of the IOMMU to ensure that it sees the page table has
gone.
Fix this by getting rid of as_pte_put(), and instead coding the PTE
unmap separately from the PDE unmap, placing the PDE unmap after the
PTE unmap has been completed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
iova_to_phys() has several problems:
(a) iova_to_phys() is supposed to return 0 if there is no entry present
for the iova.
(b) if as_get_pte() fails, we oops the kernel by dereferencing a NULL
pointer. Really, we should not even be trying to allocate a page
table at all, but should only be returning the presence of the 2nd
level page table. This will be fixed in a subsequent patch.
Treat both of these conditions as "no mapping" conditions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
When a 'struct device_domain_info' is created as an alias
for another device, this struct will not be re-used when the
real device is encountered. Fix that to avoid duplicate
device_domain_info structures being added.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
For devices without an PCI alias there will be two
device_domain_info structures added. Prevent that by
checking if the alias is different from the device.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This struct contains all necessary information for the
function already. Also handle the info->dev == NULL case
while at it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The code in the locked section does not touch anything
protected by the dmar_global_lock. Remove it from there.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When this lock is held the device_domain_lock is also
required to make sure the device_domain_info does not vanish
while in use. So this lock can be removed as it gives no
additional protection.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move the code to attach/detach domains to iommus and vice
verce into a single function to make sure there are no
dangling references.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This makes domain attachment more synchronous with domain
deattachment. The domain<->iommu link is released in
dmar_remove_one_dev_info.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename this function and the ones further down its
call-chain to domain_context_clear_*. In particular this
means:
iommu_detach_dependent_devices -> domain_context_clear
iommu_detach_dev_cb -> domain_context_clear_one_cb
iommu_detach_dev -> domain_context_clear_one
These names match a lot better with its
domain_context_mapping counterparts.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename the function to dmar_remove_one_dev_info to match is
name better with its dmar_insert_one_dev_info counterpart.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename this function to dmar_insert_one_dev_info() to match
the name better with its counter part function
domain_remove_one_dev_info().
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Do the context-mapping of devices from a single place in the
call-path and clean up the other call-sites.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Just call domain_remove_one_dev_info() for all devices in
the domain instead of reimplementing the functionality.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We don't need to do an expensive search for domain-ids
anymore, as we keep track of per-iommu domain-ids.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This replaces the dmar_domain->iommu_bmp with a similar
reference count array. This allows us to keep track of how
many devices behind each iommu are attached to the domain.
This is necessary for further simplifications and
optimizations to the iommu<->domain attachment code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This field is now obsolete because all places use the
per-iommu domain-ids. Kill the remaining uses of this field
and remove it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is no reason for this special handling of the
si_domain. The per-iommu domain-id can be allocated
on-demand like for any other domain. So remove the
pre-allocation code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This function can figure out the domain-id to use itself
from the iommu_did array. This is more reliable over
different domain types and brings us one step further to
remove the domain->id field.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Get rid of the special cases for VM domains vs. non-VM
domains and simplify the code further to just handle the
hardware passthrough vs. page-table case.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is no reason to pass the translation type through
multiple layers. It can also be determined in the
domain_context_mapping_one function directly.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The special case for VM domains is not needed, as other
domains could be attached to the iommu in the same way. So
get rid of this special case.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This array is indexed by the domain-id and contains the
pointers to the domains attached to this iommu. Modern
systems support 65536 domain ids, so that this array has a
size of 512kb, per iommu.
This is a huge waste of space, as the array is usually
sparsely populated. This patch makes the array
two-dimensional and allocates the memory for the domain
pointers on-demand.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Instead of searching in the domain array for already
allocated domain ids, keep track of them explicitly.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the users fully converted to DMA API operations, it's dead, Jim.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the io-pgtable code now enforcing its own appropriate sync points,
the vestigial flush_pgtable callback becomes entirely redundant, so
remove it altogether.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the io-pgtable code now enforcing its own appropriate sync points,
the vestigial flush_pgtable callback becomes entirely redundant, so
remove it altogether.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With all current users now opted in to DMA API operations, make the
iommu_dev pointer mandatory, rendering the flush_pgtable callback
redundant for cache maintenance. However, since the DMA calls could be
nops in the case of a coherent IOMMU, we still need to ensure the page
table updates are fully synchronised against a subsequent page table
walk. In the unmap path, the TLB sync will usually need to do this
anyway, so just cement that requirement; in the map path which may
consist solely of cacheable memory writes (in the coherent case),
insert an appropriate barrier at the end of the operation, and obviate
the need to call flush_pgtable on every individual update for
synchronisation.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: slight clarification to tlb_sync comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, users of the LPAE page table code are (ab)using dma_map_page()
as a means to flush page table updates for non-coherent IOMMUs. Since
from the CPU's point of view, creating IOMMU page tables *is* passing
DMA buffers to a device (the IOMMU's page table walker), there's little
reason not to use the DMA API correctly.
Allow IOMMU drivers to opt into DMA API operations for page table
allocation and updates by providing their appropriate device pointer.
The expectation is that an LPAE IOMMU should have a full view of system
memory, so use streaming mappings to avoid unnecessary pressure on
ZONE_DMA, and treat any DMA translation as a warning sign.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A late change to the SMMUv3 architecture ensures that the OAS field
will be monotonically increasing, so we can assume that an unknown OAS
is at least 48-bit and use that, rather than fail the device probe.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The debug_read_tlb() uses the sprintf() functions directly on the buffer
allocated by buf = kmalloc(count), without taking into account the size
of the buffer, with the consequence corrupting the heap, depending on
the count requested by the user.
The patch fixes the issue replacing sprintf() by seq_printf().
Signed-off-by: Salva Peiró <speirofr@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Debugging domain ID leakage typically requires long running tests in
order to exhaust the domain ID space or kernel instrumentation to
track the setting and clearing of bits. A couple trivial intel-iommu
specific sysfs extensions make it much easier to expose the IOMMU
capabilities and current usage.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
iommu_load_old_irte() appears to leak the old_irte mapping after use.
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This makes sure it won't be possible to accidentally leak format
strings into iommu device names. Current name allocations are safe,
but this makes the "%s" explicit.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Printing "IOMMU is currently not supported for PCI" for
every PCI device probed on a DT-based system proves to be
both irritatingly noisy and confusing to users who have
misinterpreted it to mean they can no longer use VFIO device
assignment.
Since configuring DMA masks for PCI devices via
of_dma_configure() has not in fact changed anything with
regard to IOMMUs there really is nothing to warn about here;
shut it up.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix all the occurrences of the following check warning
generated with the checkpatch --strict option:
"CHECK: Alignment should match open parenthesis"
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Switch to using the BIT(x) macros in omap-iommu.h where
possible. This eliminates the following checkpatch check
warning:
"CHECK: Prefer using the BIT macro"
A couple of the warnings were ignored for better readability
of the bit-shift for the different values.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Switch to using the BIT(x) macros in omap-iopgtable.h where
possible. This eliminates the following checkpatch check
warning:
"CHECK: Prefer using the BIT macro"
A couple of macros that used zero bit shifting are defined
directly to avoid the above warning on one of the macros.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix couple of checkpatch warnings of the type,
"WARNING: Possible unnecessary 'out of memory' message"
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove the trailing semi-colon in the DEBUG_FOPS_RO macro
definition. This fixes the checking warning,
"WARNING: macros should not use a trailing semicolon"
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There are couple of unions defined in the structures
iotlb_entry and cr_regs. There are no usage/references
to some of these union fields in the code, so clean
them up and simplify the structures.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Protect the omap-pgtable.h header against double inclusion in
source code by using the standard include guard mechanism.
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The main OMAP IOMMU driver file has some helper functions used
by the OMAP IOMMU debugfs functionality, and there is already a
dedicated source file omap-iommu-debug.c dealing with these debugfs
routines. Move all these functions to the omap-iommu-debug.c file,
so that all the debugfs related routines are in one place.
The move required exposing some new functions and moving some
definitions to the internal omap-iommu.h header file.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The OMAP IOMMU driver has been adapted to the IOMMU framework
for a while now, and it does not support being built as a
module anymore. So, remove all the module references from the
OMAP IOMMU driver.
While at it, also relocate a comment around the subsys_initcall
to avoid a checkpatch strict warning about using a blank line
after function/struct/union/enum declarations.
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the grouping of multi-function devices a non-ATS
capable device might also end up in the same domain as an
IOMMUv2 capable device.
So handle this situation gracefully and don't consider it a
bug anymore.
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, we detect whether the SMMU has coherent page table walk
capability from the IDR0.CTTW field, and base our cache maintenance
decisions on that. In preparation for fixing the bogus DMA API usage,
however, we need to ensure that the DMA API agrees about this, which
necessitates deferring to the dma-coherent property in the device tree
for the final say.
As an added bonus, since systems exist where an external CTTW signal
has been tied off incorrectly at integration, allowing DT to override
it offers a neat workaround for coherency issues with such SMMUs.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
If the StreamIDs in a system can all be resolved by a single level-2
stream table (i.e. SIDSIZE < SPLIT), then we currently get our maths
wrong and allocate the largest strtab we support, thanks to unsigned
overflow in our calculation.
This patch fixes the issue by checking the SIDSIZE explicitly when
calculating the size of our first-level stream table.
Reported-by: Matt Evans <matt.evans@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The MSI memory attributes in the SMMUv3 driver are from an older
revision of the spec, which doesn't match the current implementations.
Out with the old, in with the new.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When an ARM SMMUv3 instance supports PRI, the driver registers
an interrupt handler, but fails to enable the generation of
such interrupt at the SMMU level.
This patches simply moves the enable flags to a variable that
gets updated by the PRI handling code before being written to the
SMMU register.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some AMD systems also have non-PCI devices which can do DMA.
Those can't be handled by the AMD IOMMU, as the hardware can
only handle PCI. These devices would end up with no dma_ops,
as neither the per-device nor the global dma_ops will get
set. SWIOTLB provides global dma_ops when it is active, so
make sure there are global dma_ops too when swiotlb is
disabled.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In passthrough mode (iommu=pt) all devices are identity
mapped. If a device does not support 64bit DMA it might
still need remapping. Make sure swiotlb is initialized to
provide this remapping.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since devices with IOMMUv2 functionality might be in the
same group as devices without it, allow those devices in
IOMMUv2 domains too.
Otherwise attaching the group with the IOMMUv2 device to the
domain will fail.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove the AMD IOMMU driver implementation for passthrough
mode and rely on the new iommu core features for that.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since the conversion to default domains the
iommu_attach_device function only works for devices with
their own group. But this isn't always true for current
IOMMUv2 capable devices, so use iommu_attach_group instead.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The iova library has use outside the intel-iommu driver, thus make it a
module.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Use EXPORT_SYMBOL_GPL() to export the iova library symbols. The symbols
include:
init_iova_domain();
iova_cache_get();
iova_cache_put();
iova_cache_init();
alloc_iova();
find_iova();
__free_iova();
free_iova();
put_iova_domain();
reserve_iova();
copy_reserved_iova();
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This is necessary to separate intel-iommu from the iova library.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Currently, allocating a size-aligned IOVA region quietly adjusts the
actual allocation size in the process, returning a rounded-up
power-of-two-sized allocation. This results in mismatched behaviour in
the IOMMU driver if the original size was not a power of two, where the
original size is mapped, but the rounded-up IOVA size is unmapped.
Whilst some IOMMUs will happily unmap already-unmapped pages, others
consider this an error, so fix it by computing the necessary alignment
padding without altering the actual allocation size. Also clean up by
making pad_size unsigned, since its callers always pass unsigned values
and negative padding makes little sense here anyway.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This continues the attempt to fix commit fb170fb4c5 ("iommu/vt-d:
Introduce helper functions to make code symmetric for readability").
The previous attempt in commit 7168440690 ("iommu/vt-d: Detach
domain *only* from attached iommus") overlooked the fact that
dmar_domain.iommu_bmp gets cleared for VM domains when devices are
detached:
intel_iommu_detach_device
domain_remove_one_dev_info
domain_detach_iommu
The domain is detached from the iommu, but the iommu is still attached
to the domain, for whatever reason. Thus when we get to domain_exit(),
we can't rely on iommu_bmp for VM domains to find the active iommus,
we must check them all. Without that, the corresponding bit in
intel_iommu.domain_ids doesn't get cleared and repeated VM domain
creation and destruction will run out of domain IDs. Meanwhile we
still can't call iommu_detach_domain() on arbitrary non-VM domains or
we risk clearing in-use domain IDs, as 7168440690 attempted to
address.
It's tempting to modify iommu_detach_domain() to test the domain
iommu_bmp, but the call ordering from domain_remove_one_dev_info()
prevents it being able to work as fb170fb4c5 seems to have intended.
Caching of unused VM domains on the iommu object seems to be the root
of the problem, but this code is far too fragile for that kind of
rework to be proposed for stable, so we simply revert this chunk to
its state prior to fb170fb4c5.
Fixes: fb170fb4c5 ("iommu/vt-d: Introduce helper functions to make
code symmetric for readability")
Fixes: 7168440690 ("iommu/vt-d: Detach domain *only* from attached
iommus")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: stable@vger.kernel.org # v3.17+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We check the ATS state (enabled/disabled) and fetch the PCI ATS Invalidate
Queue Depth in performance-sensitive paths. It's easy to cache these,
which removes dependencies on PCI.
Remember the ATS enabled state. When enabling, read the queue depth once
and cache it in the device_domain_info struct. This is similar to what
amd_iommu.c does.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Hisilicon SMMUv3 devices treat CMD_PREFETCH_CONFIG as a illegal command,
execute it will trigger GERROR interrupt. Although the gerror code manage
to turn the prefetch into a SYNC, and the system can continue to run
normally, but it's ugly to print error information.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
[will: extended binding documentation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Because we will choose the minimum value between STRTAB_L1_SZ_SHIFT and
IDR1.SIDSIZE, so enlarge STRTAB_L1_SZ_SHIFT will not impact the platforms
whose IDR1.SIDSIZE is smaller than old STRTAB_L1_SZ_SHIFT value.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm64 CPU architecture defines TCR[8:11] as holding the inner and
outer memory attributes for TTBR0.
This patch fixes the ARM SMMUv3 driver to pack these bits into the
context descriptor, rather than picking up the TTBR1 attributes as it
currently does.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
STRTAB_BASE_CFG.LOG2SIZE should be set to log2(entries), where entries
is the *total* number of entries in the stream table, not just the first
level.
This patch fixes the register setting, which was previously being set to
the size of the l1 thanks to a multi-use "size" variable.
Reported-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The element size of cfg->strtab is just one DWORD, so we should use a
multiply operation instead of a shift when calculating the level 1
index.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Four fixes have queued up to fix regressions introduced after v4.1:
* Don't fail IOMMU driver initialization when the add_device
call-back returns -ENODEV, as that just means that the device
is not translated by the IOMMU. This is pretty common on ARM.
* Two fixes for the ARM-SMMU driver for a wrong feature check
and to remove a redundant NULL check.
* A fix for the AMD IOMMU driver to fix a boot panic on systems
where the BIOS requests Unity Mappings in the IVRS table.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJVlBZTAAoJECvwRC2XARrj6ngQAI/fjz1cW4WYVDBGPffoHhtB
9BH72XQTyt2OaQPsiECnWkl4bAoJ0TmS6dvcYTf75znqClL1Eez/TqfATEPOSFwI
7N0qkkVc3OffvF3XnxksNNV4tLaojdIFNdxAVrrOOuWDeNKC4Rkvcx+Typ9Y7CxI
YR4+qdkPqjYVn13JVMvZDr6SLAnvfHPSIcW1CP3vQzH6w4mWJSmRMLd42Xel1Kb7
hvEDqlT6k6KJxBt3W601eo3sgqZ1AJTFiY4RFh0diHbHQlgg1PcsbWsL5QJMHozi
SSHFDCxag9NgHy97OTcGuDptD9F9fI4+t1ANtWULis7+sN5Bx5/xsG/VRJ9fpiMN
RNlcCMCufC89EHXdoPuAvOcoPmUHqv1CU9I6+DpOo9FQrGMoXDrdosApNaJZ73E/
qtgzJN0hueeBOvB7Hk+U+mI4BSzAtGguHoO+LzjrZBzoW5L9WWuznmHYriLE0bMm
uKnZFBEnXFe8DugQ3ta7PkyzIWsnD0O++NRueN9pSOLvOUpNk6Iddv4hER9QwwPA
RQOfsASEo1ResAd9SJGnPX1MQxXxl4OB/9R1Q648lQguAj7WhV1nn21cISgLjESC
nEKma+A7dGT6nOTm/wK+wokAgndOGlztMU9wJBK12ozxrhbO+0VP/oTjhhmvcWGb
DbpzhyeCpi1qLmsZe0x7
=NzGR
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pul IOMMU fixes from Joerg Roedel:
"Four fixes have queued up to fix regressions introduced after v4.1:
- Don't fail IOMMU driver initialization when the add_device
call-back returns -ENODEV, as that just means that the device is
not translated by the IOMMU. This is pretty common on ARM.
- Two fixes for the ARM-SMMU driver for a wrong feature check and to
remove a redundant NULL check.
- A fix for the AMD IOMMU driver to fix a boot panic on systems where
the BIOS requests Unity Mappings in the IVRS table"
* tag 'iommu-fixes-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Introduce protection_domain_init() function
iommu/arm-smmu: Delete an unnecessary check before the function call "free_io_pgtable_ops"
iommu/arm-smmu: Fix broken ATOS check
iommu: Ignore -ENODEV errors from add_device call-back
This function contains the common parts between the
initialization of dma_ops_domains and usual protection
domains. This also fixes a long-standing bug which was
uncovered by recent changes, in which the api_lock was not
initialized for dma_ops_domains.
Reported-by: George Wang <xuw2015@gmail.com>
Tested-by: George Wang <xuw2015@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The free_io_pgtable_ops() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 83a60ed8f0 ("iommu/arm-smmu: fix ARM_SMMU_FEAT_TRANS_OPS
condition") accidentally negated the ID0_ATOSNS predicate in the ATOS
feature check, causing the driver to attempt ATOS requests on SMMUv2
hardware without the ATOS feature implemented.
This patch restores the predicate to the correct value.
Cc: <stable@vger.kernel.org> # 4.0+
Reported-by: Varun Sethi <varun.sethi@freescale.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The -ENODEV error just means that the device is not
translated by an IOMMU. We shouldn't bail out of iommu
driver initialization when that happens, as this is a common
scenario on ARM.
Not returning -ENODEV in the drivers would be a bad idea, as
the IOMMU core would have no indication whether a device is
translated or not. This indication is not used at the
moment, but will probably be in the future.
Fixes: 19762d7 ("iommu: Propagate error in add_iommu_group")
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.2.
I've one other new driver from freescale on my radar, it's been posted
and reviewed, I'd just like to get someone to give it a last look, so
maybe I'll send it or maybe I'll leave it.
There is no major nouveau changes in here, Ben was working on
something big, and we agreed it was a bit late, there wasn't anything
else he considered urgent to merge.
There might be another msm pull for some bits that are waiting on
arm-soc, I'll see how we time it.
This touches some "of" stuff, acks are in place except for the fixes
to the build in various configs,t hat I just applied.
Summary:
New drivers:
- virtio-gpu:
KMS only pieces of driver for virtio-gpu in qemu.
This is just the first part of this driver, enough to run
unaccelerated userspace on. As qemu merges more we'll start
adding the 3D features for the virgl 3d work.
- amdgpu:
a new driver from AMD to driver their newer GPUs. (VI+)
It contains a new cleaner userspace API, and is a clean
break from radeon moving forward, that AMD are going to
concentrate on. It also contains a set of register headers
auto generated from AMD internal database.
core:
- atomic modesetting API completed, enabled by default now.
- Add support for mode_id blob to atomic ioctl to complete interface.
- bunch of Displayport MST fixes
- lots of misc fixes.
panel:
- new simple panels
- fix some long-standing build issues with bridge drivers
radeon:
- VCE1 support
- add a GPU reset counter for userspace
- lots of fixes.
amdkfd:
- H/W debugger support module
- static user-mode queues
- support killing all the waves when a process terminates
- use standard DECLARE_BITMAP
i915:
- Add Broxton support
- S3, rotation support for Skylake
- RPS booting tuning
- CPT modeset sequence fixes
- ns2501 dither support
- enable cmd parser on haswell
- cdclk handling fixes
- gen8 dynamic pte allocation
- lots of atomic conversion work
exynos:
- Add atomic modesetting support
- Add iommu support
- Consolidate drm driver initialization
- and MIC, DECON and MIPI-DSI support for exynos5433
omapdrm:
- atomic modesetting support (fixes lots of things in rewrite)
tegra:
- DP aux transaction fixes
- iommu support fix
msm:
- adreno a306 support
- various dsi bits
- various 64-bit fixes
- NV12MT support
rcar-du:
- atomic and misc fixes
sti:
- fix HDMI timing complaince
tilcdc:
- use drm component API to access tda998x driver
- fix module unloading
qxl:
- stability fixes"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (872 commits)
drm/nouveau: Pause between setting gpu to D3hot and cutting the power
drm/dp/mst: close deadlock in connector destruction.
drm: Always enable atomic API
drm/vgem: Set unique to "vgem"
of: fix a build error to of_graph_get_endpoint_by_regs function
drm/dp/mst: take lock around looking up the branch device on hpd irq
drm/dp/mst: make sure mst_primary mstb is valid in work function
of: add EXPORT_SYMBOL for of_graph_get_endpoint_by_regs
ARM: dts: rename the clock of MIPI DSI 'pll_clk' to 'sclk_mipi'
drm/atomic: Don't set crtc_state->enable manually
drm/exynos: dsi: do not set TE GPIO direction by input
drm/exynos: dsi: add support for MIC driver as a bridge
drm/exynos: dsi: add support for Exynos5433
drm/exynos: dsi: make use of array for clock access
drm/exynos: dsi: make use of driver data for static values
drm/exynos: dsi: add macros for register access
drm/exynos: dsi: rename pll_clk to sclk_clk
drm/exynos: mic: add MIC driver
of: add helper for getting endpoint node of specific identifiers
drm/exynos: add Exynos5433 decon driver
...
Some of these are for drivers/soc, where we're now putting
SoC-specific drivers these days. Some are for other driver subsystems
where we have received acks from the appropriate maintainers.
Some highlights:
- simple-mfd: document DT bindings and misc updates
- migrate mach-berlin to simple-mfd for clock, pinctrl and reset
- memory: support for Tegra132 SoC
- memory: introduce tegra EMC driver for scaling memory frequency
- misc. updates for ARM CCI and CCN busses
Conflicts:
arch/arm64/boot/dts/arm/juno-motherboard.dtsi
Trivial add/add conflict with our dt branch.
Resolution: take both sides.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVi4RRAAoJEFk3GJrT+8ZljIcQAIsqxM/o0drd90xTJ6ex9h0B
RmqVLTDgesHmBacJ+SBsa9/ybFIM1uErByftc1dmKankEQVXW3wcH7keQnoStPT2
zTEjadHgZ/ARYjV/oG5oohjfDZpO1kECVHL8O8RmcWxgzRB3az1IW2eD+dzrga/Y
R7K6D8rDHMADIUmv0e0DzvQEbSUYdCx3rBND1qZznwZDP3NoivLkOG5MTraccLbQ
ouCRoZtyNYD5Lxk+BHLBepnxAa0Ggc6IjEmiUv8fF2OYdu0OruMliT4rcAtOSmzg
2Y7pP85h8u0CxbJDkOyc+2BELyKo7Hv97XtDNNbRYABTMXdskRIadXt4Sh4mwFtM
nvlhB4ovbIX7noECJToEkSAgmStLSUwA3R6+DVdLbeQY4uSuXuTRhiWHMyQB6va9
CdjJDk2RE0dZ77c5ZoUnUDtBe4cULU/n4agpYkKMf/HcpnqMUwZzP4KZbbPMBpgL
0CVTt3YrEcjoU7g0SFHhOGPSgl4yIXKU2eHEscokyFYLrS5zRWepmUEmlSoaWn+W
p7pJE65TvOGf2xbaWI+UBeK/3ZG7XAP8qUfhsi7NS4bV6oFCk/foqsWAuru0H7OW
2Gk8fuF0qLgE1eFWQp8BHZ4IUeytoWbnGhhHXh8zH39SKAVncOiAGDNfuEP9CyXJ
fZFfruYrnz2emOwj2v9m
=02Gm
-----END PGP SIGNATURE-----
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Kevin Hilman:
"Some of these are for drivers/soc, where we're now putting
SoC-specific drivers these days. Some are for other driver subsystems
where we have received acks from the appropriate maintainers.
Some highlights:
- simple-mfd: document DT bindings and misc updates
- migrate mach-berlin to simple-mfd for clock, pinctrl and reset
- memory: support for Tegra132 SoC
- memory: introduce tegra EMC driver for scaling memory frequency
- misc. updates for ARM CCI and CCN busses"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (48 commits)
drivers: soc: sunxi: Introduce SoC driver to map SRAMs
arm-cci: Add aliases for PMU events
arm-cci: Add CCI-500 PMU support
arm-cci: Sanitise CCI400 PMU driver specific code
arm-cci: Abstract handling for CCI events
arm-cci: Abstract out the PMU counter details
arm-cci: Cleanup PMU driver code
arm-cci: Do not enable CCI-400 PMU by default
firmware: qcom: scm: Add HDCP Support
ARM: berlin: add an ADC node for the BG2Q
ARM: berlin: remove useless chip and system ctrl compatibles
clk: berlin: drop direct of_iomap of nodes reg property
ARM: berlin: move BG2Q clock node
ARM: berlin: move BG2CD clock node
ARM: berlin: move BG2 clock node
clk: berlin: prepare simple-mfd conversion
pinctrl: berlin: drop SoC stub provided regmap
ARM: berlin: move pinctrl to simple-mfd nodes
pinctrl: berlin: prepare to use regmap provided by syscon
reset: berlin: drop arch_initcall initialization
...
This time with bigger changes than usual:
* A new IOMMU driver for the ARM SMMUv3. This IOMMU is pretty
different from SMMUv1 and v2 in that it is configured through
in-memory structures and not through the MMIO register region.
The ARM SMMUv3 also supports IO demand paging for PCI devices
with PRI/PASID capabilities, but this is not implemented in
the driver yet.
* Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
* Introduction of default domains into the IOMMU core code. The
rationale behind this is to move functionalily out of the
IOMMU drivers to common code to get to a unified behavior
between different drivers.
The patches here introduce a default domain for iommu-groups
(isolation groups). A device will now always be attached to a
domain, either the default domain or another domain handled by
the device driver. The IOMMU drivers have to be modified to
make use of that feature. So long the AMD IOMMU driver is
converted, with others to follow.
* Patches for the Intel VT-d drvier to fix DMAR faults that
happen when a kdump kernel boots. When the kdump kernel boots
it re-initializes the IOMMU hardware, which destroys all
mappings from the crashed kernel. As this happens before
the endpoint devices are re-initialized, any in-flight DMA
causes a DMAR fault. These faults cause PCI master aborts,
which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old
kernel and keep the old mappings in place until the device
driver of the new kernel takes over. This emulates the the
behavior without an IOMMU to the best degree possible.
* A couple of other small fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJViSIWAAoJECvwRC2XARrjl+cP/2FXS7SWDq91VFiIZfXfPt8H
C5Ef3OGWCnMzn4MKE1ExkyDhC+AH6pF1s4zi3XfT6b8iOA+DUpa51rxJjixszt31
tQwmvB7hWu4mznGxSN7EA0Pm0l/v3tBAY5BvG598af0aNZFFJ6po+31MyQA5X67+
6xpqLbH/hm4IZhFBOEzZwxuWWsNxlJwwzKqeAjGyqeUhdruRYZiPHWQ17sDjwLM/
QcVvWBb7meOtKv1OCtpzC4sglSk3scbAfEHMEBuDt8cI6OD7/t2VzPXDWWZuXGqK
nRAxCT7NrXvyOnv0xwdn0j5p1FUGipVxvhsGWX7sJsh3UHWm8Q+5rRKFFVI9pm50
QcMjiIMazK5VwcAkDnLoDgSz4Zz6TfHXEOqSJ2vjTPt2VDP/J9zdM2iwHx2ujicI
mIkrtmsBprvAPx6e9jcqiS5L/Xy1y1xewXuGxa5F2XOjqdoXkPqaupjlyrWzrChA
MC8w67FdzjHDPCfIqfIWZpJQj4f1OFQGd3HS5HpkBACxIwCg85gRw4DEMfD/sirO
BL2VM0RO/bB5+4R0AY7UA2VszQvNMqedj1bA4vAbrnXqOh8BI/0GgeoWiBMXhyX1
qvT1jl+cxuCm5tgBOMUGYoRyF+//bH+l78jLsTYaWRtuVzFlkAX6idNvYYK0dmNt
tLII2IIZBk87P3pF4d6A
=Zicw
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This time with bigger changes than usual:
- A new IOMMU driver for the ARM SMMUv3.
This IOMMU is pretty different from SMMUv1 and v2 in that it is
configured through in-memory structures and not through the MMIO
register region. The ARM SMMUv3 also supports IO demand paging for
PCI devices with PRI/PASID capabilities, but this is not
implemented in the driver yet.
- Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
- Introduction of default domains into the IOMMU core code.
The rationale behind this is to move functionalily out of the IOMMU
drivers to common code to get to a unified behavior between
different drivers. The patches here introduce a default domain for
iommu-groups (isolation groups).
A device will now always be attached to a domain, either the
default domain or another domain handled by the device driver. The
IOMMU drivers have to be modified to make use of that feature. So
long the AMD IOMMU driver is converted, with others to follow.
- Patches for the Intel VT-d drvier to fix DMAR faults that happen
when a kdump kernel boots.
When the kdump kernel boots it re-initializes the IOMMU hardware,
which destroys all mappings from the crashed kernel. As this
happens before the endpoint devices are re-initialized, any
in-flight DMA causes a DMAR fault. These faults cause PCI master
aborts, which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old kernel
and keep the old mappings in place until the device driver of the
new kernel takes over. This emulates the the behavior without an
IOMMU to the best degree possible.
- A couple of other small fixes and cleanups"
* tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (69 commits)
iommu/amd: Handle large pages correctly in free_pagetable
iommu/vt-d: Don't disable IR when it was previously enabled
iommu/vt-d: Make sure copied over IR entries are not reused
iommu/vt-d: Copy IR table from old kernel when in kdump mode
iommu/vt-d: Set IRTA in intel_setup_irq_remapping
iommu/vt-d: Disable IRQ remapping in intel_prepare_irq_remapping
iommu/vt-d: Move QI initializationt to intel_setup_irq_remapping
iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
iommu/vt-d: Enable Translation only if it was previously disabled
iommu/vt-d: Don't disable translation prior to OS handover
iommu/vt-d: Don't copy translation tables if RTT bit needs to be changed
iommu/vt-d: Don't do early domain assignment if kdump kernel
iommu/vt-d: Allocate si_domain in init_dmars()
iommu/vt-d: Mark copied context entries
iommu/vt-d: Do not re-use domain-ids from the old kernel
iommu/vt-d: Copy translation tables from old kernel
iommu/vt-d: Detect pre enabled translation
iommu/vt-d: Make root entry visible for hardware right after allocation
iommu/vt-d: Init QI before root entry is allocated
iommu/vt-d: Cleanup log messages
...
Pull x86 core updates from Ingo Molnar:
"There were so many changes in the x86/asm, x86/apic and x86/mm topics
in this cycle that the topical separation of -tip broke down somewhat -
so the result is a more traditional architecture pull request,
collected into the 'x86/core' topic.
The topics were still maintained separately as far as possible, so
bisectability and conceptual separation should still be pretty good -
but there were a handful of merge points to avoid excessive
dependencies (and conflicts) that would have been poorly tested in the
end.
The next cycle will hopefully be much more quiet (or at least will
have fewer dependencies).
The main changes in this cycle were:
* x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
Gleixner)
- This is the second and most intrusive part of changes to the x86
interrupt handling - full conversion to hierarchical interrupt
domains:
[IOAPIC domain] -----
|
[MSI domain] --------[Remapping domain] ----- [ Vector domain ]
| (optional) |
[HPET MSI domain] ----- |
|
[DMAR domain] -----------------------------
|
[Legacy domain] -----------------------------
This now reflects the actual hardware and allowed us to distangle
the domain specific code from the underlying parent domain, which
can be optional in the case of interrupt remapping. It's a clear
separation of functionality and removes quite some duct tape
constructs which plugged the remap code between ioapic/msi/hpet
and the vector management.
- Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
injection into guests (Feng Wu)
* x86/asm changes:
- Tons of cleanups and small speedups, micro-optimizations. This
is in preparation to move a good chunk of the low level entry
code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
Brian Gerst)
- Moved all system entry related code to a new home under
arch/x86/entry/ (Ingo Molnar)
- Removal of the fragile and ugly CFI dwarf debuginfo annotations.
Conversion to C will reintroduce many of them - but meanwhile
they are only getting in the way, and the upstream kernel does
not rely on them (Ingo Molnar)
- NOP handling refinements. (Borislav Petkov)
* x86/mm changes:
- Big PAT and MTRR rework: making the code more robust and
preparing to phase out exposing direct MTRR interfaces to drivers -
in favor of using PAT driven interfaces (Toshi Kani, Luis R
Rodriguez, Borislav Petkov)
- New ioremap_wt()/set_memory_wt() interfaces to support
Write-Through cached memory mappings. This is especially
important for good performance on NVDIMM hardware (Toshi Kani)
* x86/ras changes:
- Add support for deferred errors on AMD (Aravind Gopalakrishnan)
This is an important RAS feature which adds hardware support for
poisoned data. That means roughly that the hardware marks data
which it has detected as corrupted but wasn't able to correct, as
poisoned data and raises an APIC interrupt to signal that in the
form of a deferred error. It is the OS's responsibility then to
take proper recovery action and thus prolonge system lifetime as
far as possible.
- Add support for Intel "Local MCE"s: upcoming CPUs will support
CPU-local MCE interrupts, as opposed to the traditional system-
wide broadcasted MCE interrupts (Ashok Raj)
- Misc cleanups (Borislav Petkov)
* x86/platform changes:
- Intel Atom SoC updates
... and lots of other cleanups, fixlets and other changes - see the
shortlog and the Git log for details"
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
x86/hpet: Use proper hpet device number for MSI allocation
x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
genirq: Prevent crash in irq_move_irq()
genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
iommu, x86: Properly handle posted interrupts for IOMMU hotplug
iommu, x86: Provide irq_remapping_cap() interface
iommu, x86: Setup Posted-Interrupts capability for Intel iommu
iommu, x86: Add cap_pi_support() to detect VT-d PI capability
iommu, x86: Avoid migrating VT-d posted interrupts
iommu, x86: Save the mode (posted or remapped) of an IRTE
iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
iommu: dmar: Provide helper to copy shared irte fields
iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
iommu: Add new member capability to struct irq_remap_ops
x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
...
Make sure that we are skipping over large PTEs while walking
the page-table tree.
Cc: stable@kernel.org
Fixes: 5c34c403b7 ("iommu/amd: Fix memory leak in free_pagetable")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Keep it enabled in kdump kernel to guarantee interrupt
delivery.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Walk over the copied entries and mark the present ones as
allocated.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we are booting into a kdump kernel and find IR enabled,
copy over the contents of the previous IR table so that
spurious interrupts will not be target aborted.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This way we can give the hardware the new IR table right
after it has been allocated and initialized.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move it to this function for now, so that the copy routines
for irq remapping take no effect yet.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
QI needs to be enabled when we program the irq remapping
table to hardware in the prepare phase later.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We need this to be detected already when we program the irq
remapping table pointer to hardware.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Do not touch the TE bit unless we know translation is
disabled.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
For all the copy-translation code to run, we have to keep
translation enabled in intel_iommu_init(). So remove the
code disabling it.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We can't change the RTT bit when translation is enabled, so
don't copy translation tables when we would change the bit
with our new root entry.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we copied over context tables from an old kernel, we
need to defer assignment of devices to domains until the
device driver takes over. So skip this part of
initialization when we copied over translation tables from
the old kernel.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This seperates the allocation of the si_domain from its
assignment to devices. It makes sure that the iommu=pt case
still works in the kdump kernel, when we have to defer the
assignment of devices to domains to device driver
initialization time.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Mark the context entries we copied over from the old kernel,
so that we don't detect them as present in other code paths.
This makes sure we safely overwrite old context entries when
a new domain is assigned.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Mark all domain-ids we find as reserved, so that there could
be no collision between domains from the previous kernel and
our domains in the IOMMU TLB.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If we are in a kdump kernel and find translation enabled in
the iommu, try to copy the translation tables from the old
kernel to preserve the mappings until the device driver
takes over.
This supports old and the extended root-entry and
context-table formats.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add code to detect whether translation is already enabled in
the IOMMU. Save this state in a flags field added to
struct intel_iommu.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In case there was an old root entry, make our new one
visible immediately after it was allocated.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
QI needs to be available when we write the root entry into
hardware because flushes might be necessary after this.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Give them a common prefix that can be grepped for and
improve the wording here and there.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull VT-d hardware workarounds from David Woodhouse:
"This contains a workaround for hardware issues which I *thought* were
never going to be seen on production hardware. I'm glad I checked
that before the 4.1 release...
Firstly, PASID support is so broken on existing chips that we're just
going to declare the old capability bit 28 as 'reserved' and change
the VT-d spec to move PASID support to another bit. So any existing
hardware doesn't support SVM; it only sets that (now) meaningless bit
28.
That patch *wasn't* imperative for 4.1 because we don't have PASID
support yet. But *even* the extended context tables are broken — if
you just enable the wider tables and use none of the new bits in them,
which is precisely what 4.1 does, you find that translations don't
work. It's this problem which I thought was caught in time to be
fixed before production, but wasn't.
To avoid triggering this issue, we now *only* enable the extended
context tables on hardware which also advertises "we have PASID
support and we actually tested it this time" with the new PASID
feature bit.
In addition, I've added an 'intel_iommu=ecs_off' command line
parameter to allow us to disable it manually if we need to"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: Only enable extended context tables if PASID is supported
iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register
Although the extended tables are theoretically a completely orthogonal
feature to PASID and anything else that *uses* the newly-available bits,
some of the early hardware has problems even when all we do is enable
them and use only the same bits that were in the old context tables.
For now, there's no motivation to support extended tables unless we're
going to use PASID support to do SVM. So just don't use them unless
PASID support is advertised too. Also add a command-line bailout just in
case later chips also have issues.
The equivalent problem for PASID support has already been fixed with the
upcoming VT-d spec update and commit bd00c606a ("iommu/vt-d: Change
PASID support to bit 40 of Extended Capability Register"), because the
problematic platforms use the old definition of the PASID-capable bit,
which is now marked as reserved and meaningless.
So with this change, we'll magically start using ECS again only when we
see the new hardware advertising "hey, we have PASID support and we
actually tested it this time" on bit 40.
The VT-d hardware architect has promised that we are not going to have
any reason to support ECS *without* PASID any time soon, and he'll make
sure he checks with us before changing that.
In the future, if hypothetical new features also use new bits in the
context tables and can be seen on implementations *without* PASID support,
we might need to add their feature bits to the ecs_enabled() macro.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Add a new interface irq_remapping_cap() to detect whether irq
remapping supports new features, such as VT-d Posted-Interrupts.
Export the function, so that KVM code can check this and use this
mechanism properly.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-10-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When the interrupt is configured in posted mode, the destination of
the interrupt is set in the Posted-Interrupts Descriptor and the
migration of these interrupts happens during vCPU scheduling.
We still update the cached irte, which will be used when changing back
to remapping mode, but we avoid writing the table entry as this would
overwrite the posted mode entry.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-7-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Interrupt chip callback to set the VCPU affinity for posted interrupts.
[ tglx: Use the helper function to copy from the remap irte instead of
open coding it. Massage the comment as well ]
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: joro@8bytes.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-5-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a new member 'capability' to struct irq_remap_ops for storing
information about available capabilities such as VT-d
Posted-Interrupts.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-2-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Without this patch only -ENOTSUPP is handled, but there are
other possible errors. Handle them too.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The iommu_group_alloc() and iommu_group_get_for_dev()
functions return error pointers, they never return NULL.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With device intialization done in the add_device call-back
now there is no reason for this function anymore.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
A device that might be used for HSA needs to be in a direct
mapped domain so that all DMA-API mappings stay alive when
the IOMMUv2 stack is used.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This enables allocation of DMA-API default domains from the
IOMMU core and switches allocation of domain dma-api domain
to the IOMMU core too.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement these two iommu-ops call-backs to make use of the
initialization and notifier features of the iommu core.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This function can be called by an IOMMU driver to request
that a device's default domain is direct mapped.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use the information exported by the IOMMU drivers to create
direct mapped regions in the default domains.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add two new functions to the IOMMU-API to allow the IOMMU
drivers to export the requirements for direct mapped regions
per device.
This is useful for exporting the information in Intel VT-d's
RMRR entries or AMD-Vi's unity mappings.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make use of the default domain and re-attach a device to it
when it is detached from another domain. Also enforce that a
device has to be in the default domain before it can be
attached to a different domain.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch changes the behavior of the iommu_attach_device
and iommu_detach_device functions. With this change these
functions only work on devices that have their own group.
For all other devices the iommu_group_attach/detach
functions must be used.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The default domain will be used (if supported by the iommu
driver) when the devices in the iommu group are not attached
to any other domain.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull Intel IOMMU fix from David Woodhouse:
"This fixes an oops when attempting to enable 1:1 passthrough mode for
devices on which VT-d translation was disabled anyway.
It's actually a long-standing bug but recent changes (commit
18436afdc1: "iommu/vt-d: Allow RMRR on graphics devices too") have
made it much easier to trigger with 'iommu=pt intel_iommu=igfx_off' on
the command line"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: Fix passthrough mode with translation-disabled devices