linux/arch
Alexey Kardashevskiy 2157e7b82f vfio: powerpc/spapr: Register memory and define IOMMU v2
The existing implementation accounts for the whole DMA window in
the locked_vm counter. This is going to be worse with multiple
containers and huge DMA windows. Also, real-time accounting would require
additional tracking of accounted pages due to the page size difference -
the IOMMU uses 4K pages and the system uses 4K or 64K pages.
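
The page size difference above can be made concrete with a little
arithmetic. The following is only an illustrative sketch, not kernel code:
the helper names pages_for_bytes() and tces_per_system_page() are
hypothetical, and the 2GB window size is taken from the default window
mentioned later in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of system pages needed to cover `bytes`,
 * rounded up. page_shift is 12 for 4K pages and 16 for 64K pages. */
uint64_t pages_for_bytes(uint64_t bytes, unsigned page_shift)
{
    assert(page_shift < 64);
    uint64_t page_size = UINT64_C(1) << page_shift;
    return (bytes + page_size - 1) >> page_shift;
}

/* Hypothetical helper: how many 4K IOMMU pages (TCEs) fall inside one
 * system page. With 64K system pages, several TCEs share a single
 * pinned page, which is why per-TCE accounting needs extra tracking. */
uint64_t tces_per_system_page(unsigned system_page_shift)
{
    assert(system_page_shift >= 12 && system_page_shift < 64);
    return UINT64_C(1) << (system_page_shift - 12);
}
```

With these helpers, a 2GB window covers 524288 system pages at 4K and
32768 at 64K, and a 64K system page is shared by 16 TCEs.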

Another issue is that the actual page pinning/unpinning happens on every
DMA map/unmap request. This does not affect performance much at present,
as we spend far more time switching context between
guest/userspace/host, but it will start to matter when we add in-kernel
DMA map/unmap acceleration.

This introduces a new IOMMU type for SPAPR - VFIO_SPAPR_TCE_v2_IOMMU.
The new IOMMU deprecates VFIO_IOMMU_ENABLE/VFIO_IOMMU_DISABLE and introduces
2 new ioctls to register/unregister DMA memory -
VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY -
which receive the user space address and size of a memory region which
needs to be pinned/unpinned and counted in locked_vm.
The new IOMMU splits physical page pinning and TCE table updates
into 2 different operations. It requires:
1) guest pages to be registered first
2) subsequent map/unmap requests to work only with pre-registered memory.
For the default single window case this means that the entire guest
(instead of 2GB) needs to be pinned before using VFIO.
When a huge DMA window is added, no additional pinning will be
required, otherwise it would be guest RAM + 2GB.
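
The two-step flow described above (register first, then map only within
registered memory) can be modelled in plain C. This is a minimal
userspace sketch under stated assumptions, not the kernel or VFIO
implementation: struct container, register_memory() and can_map() are
hypothetical names, and no pinning or locked_vm accounting is performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 8 /* arbitrary limit for this sketch */

/* Hypothetical model of one pre-registered memory region. */
struct mem_region {
    uint64_t vaddr;
    uint64_t size;
    bool used;
};

/* Hypothetical model of a container holding registered regions. */
struct container {
    struct mem_region regions[MAX_REGIONS];
};

/* Step 1: register a region. Returns 0 on success, -1 on failure. */
int register_memory(struct container *c, uint64_t vaddr, uint64_t size)
{
    assert(c != NULL);
    if (size == 0 || vaddr + size < vaddr) /* reject empty or wrapping ranges */
        return -1;
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (!c->regions[i].used) {
            c->regions[i].vaddr = vaddr;
            c->regions[i].size = size;
            c->regions[i].used = true;
            return 0;
        }
    }
    return -1; /* no free slot */
}

/* Step 2: a map request is accepted only if the whole range lies
 * inside a single registered region. */
bool can_map(const struct container *c, uint64_t vaddr, uint64_t size)
{
    assert(c != NULL);
    if (size == 0 || vaddr + size < vaddr)
        return false;
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        const struct mem_region *r = &c->regions[i];
        if (r->used && vaddr >= r->vaddr &&
            vaddr + size <= r->vaddr + r->size)
            return true;
    }
    return false;
}
```

In this model a map request made before any registration is rejected,
and a request that runs past the end of a registered region is rejected
as well.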

The new memory registration ioctls are not supported by
VFIO_SPAPR_TCE_IOMMU. Dynamic DMA window and in-kernel acceleration
will require memory to be preregistered in order to work.

Accounting is done per user process.

This advertises v2 SPAPR TCE IOMMU and restricts what the userspace
can do with v1 or v2 IOMMUs.

In order to support memory pre-registration, we need a way to track
the use of every registered memory region and only allow unregistration
if a region is not in use anymore. So we need a way to tell which
region a just-cleared TCE came from.

This adds a userspace view of the TCE table to the iommu_table struct.
It contains userspace addresses, one per TCE entry. The table is only
allocated when ownership of an IOMMU group is taken, which means
it is only used from outside of the powernv code (such as VFIO).
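
The tracking idea described above (remember the userspace address for
each TCE entry so that a cleared entry can be traced back to its region)
can be sketched as a small model. Everything here is an assumption made
for illustration: struct table, set_tce(), clear_tce() and
unregister_region() are hypothetical names and do not reflect the actual
iommu_table layout or kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TCES 16   /* arbitrary sizes for this sketch */
#define NUM_REGIONS 4

struct region {
    uint64_t ua;          /* userspace start address */
    uint64_t size;
    unsigned long mapped; /* number of TCEs currently using this region */
    int registered;
};

struct table {
    uint64_t userspace_view[NUM_TCES]; /* one address per TCE, 0 = clear */
    struct region regions[NUM_REGIONS];
};

/* Find the registered region containing the userspace address, if any. */
static struct region *find_region(struct table *t, uint64_t ua)
{
    for (size_t i = 0; i < NUM_REGIONS; i++) {
        struct region *r = &t->regions[i];
        if (r->registered && ua >= r->ua && ua - r->ua < r->size)
            return r;
    }
    return NULL;
}

/* Set a TCE: record the userspace address and mark the region in use. */
int set_tce(struct table *t, size_t entry, uint64_t ua)
{
    assert(t != NULL);
    if (entry >= NUM_TCES || ua == 0 || t->userspace_view[entry] != 0)
        return -1;
    struct region *r = find_region(t, ua);
    if (r == NULL)
        return -1; /* memory was not pre-registered */
    t->userspace_view[entry] = ua;
    r->mapped++;
    return 0;
}

/* Clear a TCE: the stored address tells us which region to release. */
int clear_tce(struct table *t, size_t entry)
{
    assert(t != NULL);
    if (entry >= NUM_TCES || t->userspace_view[entry] == 0)
        return -1;
    struct region *r = find_region(t, t->userspace_view[entry]);
    if (r != NULL)
        r->mapped--;
    t->userspace_view[entry] = 0;
    return 0;
}

/* Unregistration is refused while any TCE still refers to the region. */
int unregister_region(struct table *t, size_t idx)
{
    assert(t != NULL);
    if (idx >= NUM_REGIONS || !t->regions[idx].registered)
        return -1;
    if (t->regions[idx].mapped != 0)
        return -1;
    t->regions[idx].registered = 0;
    return 0;
}
```

In this model, unregistering a region fails while one of its pages is
still mapped and succeeds once the corresponding TCE has been cleared.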

As v2 IOMMU supports IODA2 and pre-IODA2 IOMMUs (which do not support
DDW API), this creates a default DMA window for IODA2 for consistency.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:55 +10:00
..
alpha alpha: forward declare struct pt_regs in processor.h 2015-04-17 09:03:53 -04:00
arc ARC changes for 4.1-rc1: 2015-04-24 07:55:54 -07:00
arm Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2015-05-10 11:16:48 -07:00
arm64 arm64: perf: Fix the pmu node name in warning message 2015-04-30 12:11:30 +01:00
avr32 Merge branch 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2015-04-15 13:53:55 -07:00
blackfin blackfin updates for Linux 4.1 2015-04-24 07:58:07 -07:00
c6x C6X Fixes for v4.1 2015-04-16 18:48:55 -04:00
cris CRIS changes for 4.1 2015-04-26 13:31:05 -07:00
frv Devicetree updates for 4.1: 2015-04-24 08:46:18 -07:00
hexagon Merge branch 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2015-04-15 13:53:55 -07:00
ia64 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
m32r m32r: make flush_cpumask non-volatile. 2015-05-09 11:09:29 -07:00
m68k Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu 2015-04-20 10:12:29 -07:00
metag Metag architecture changes for v4.1 2015-04-24 07:56:50 -07:00
microblaze microblaze: use asm-generic for seccomp.h 2015-04-17 09:04:10 -04:00
mips TTY/Serial patches for 4.1-rc1 2015-04-21 09:33:10 -07:00
mn10300 Devicetree updates for 4.1: 2015-04-24 08:46:18 -07:00
nios2 nios2 update for v4.1-rc1 2015-04-24 07:59:07 -07:00
openrisc Merge branch 'akpm' (patches from Andrew) 2015-04-15 16:39:15 -07:00
parisc parisc: Replace PT_NLEVELS with CONFIG_PGTABLE_LEVELS 2015-04-21 22:04:03 +02:00
powerpc vfio: powerpc/spapr: Register memory and define IOMMU v2 2015-06-11 15:16:55 +10:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2015-04-28 09:58:46 -07:00
score arch: Remove exec_domain from remaining archs 2015-04-12 21:03:30 +02:00
sh Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma 2015-04-24 09:49:37 -07:00
sparc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2015-04-21 23:21:34 -07:00
tile tile: properly use node_isset() on a nodemask_t 2015-04-28 10:36:45 -04:00
um Merge branch 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2015-04-15 13:53:55 -07:00
unicore32 Merge branch 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2015-04-15 13:53:55 -07:00
x86 Power management and ACPI fixes for v4.1-rc3 2015-05-07 15:58:00 -07:00
xtensa Xtensa changes and fixes for 4.1 2015-04-17 15:32:30 -04:00
.gitignore
Kconfig powerpc updates for 4.1 2015-04-16 13:53:32 -05:00