linux

Author	SHA1	Message	Date
Alexandre Courbot	e02d586da6	drm/nouveau/instmem/gk20a: add write barrier when releasing DMA object When using the DMA-API for instmem, we may obtain a write-combined mapping. For such cases, add a write barrier in gk20a_instobj_release_dma() to make sure that all writes have reached memory at this time. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:34 +10:00
Alexandre Courbot	b306712d92	drm/nouveau/instmem/gk20a: use DMA API CPU mapping Commit `69c4938249` ("drm/nouveau/instmem/gk20a: use direct CPU access") tried to be smart while using the DMA-API by managing the CPU mappings of buffers allocated with the DMA-API by itself. In doing so, it relied on dma_to_phys() which is an architecture-private function not available everywhere. This broke the build on several architectures. Since there is no reliable and portable way to obtain the physical address of a DMA-API buffer, stop trying to be smart and just use the CPU mapping that the DMA-API can provide. This means that buffers will be CPU-mapped for all their life as opposed to when we need them, but anyway using the DMA-API here is a fallback for when no IOMMU is available so we should not expect optimal behavior. This makes the IOMMU and DMA-API implementations of instmem diverge enough that we should maybe put them into separate files... Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-01-11 11:17:40 +10:00
Alexandre Courbot	338840eed1	drm/nouveau/instmem/gk20a: fix race conditions The LRU list used for recycling CPU mappings was handling concurrency very poorly. For instance, if an instobj was acquired twice before being released once, it would end up into the LRU list even though there is still a client accessing it. This patch fixes this by properly counting how many clients are currently using a given instobj. While at it, we also raise errors when inconsistencies are detected, and factorize some code. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-01-11 11:17:40 +10:00
Dave Airlie	10855aeb1e	drm/nouveau: fix build failures on all non ARM. gk20a is an ARM only GPU, so we can just do the correct thing on ARM but fail on other architectures. The other option was to use SWIOTLB as the define, which means phys_to_page exists, but this seems clearer. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-11 12:37:57 +10:00
Alexandre Courbot	68b566534c	drm/nouveau/instmem/gk20a: make use of the IOMMU bit Use the IOMMU bit specified in platform data instead of hardcoding it to the bit used by current Tegra GPUs. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-11-03 15:02:18 +10:00
Alexandre Courbot	69c4938249	drm/nouveau/instmem/gk20a: use direct CPU access The Great Nouveau Refactoring Take II brought us a lot of goodness, including acquire/release methods that are called before and after an instobj is modified. These functions can be used as synchronization points to manage CPU/GPU coherency if we modify an instobj using the CPU. This patch replaces the legacy and slow PRAMIN access for gk20a instmem with CPU mappings and writes. A LRU list is used to unmap unused mappings after a certain threshold (currently 1MB) of mapped instobjs is reached. This allows mappings to be reused most of the time. Accessing instobjs using the CPU requires to maintain the GPU L2 cache, which we do in the acquire/release functions. This triggers a lot of L2 flushes/invalidates, but most of them are performed on an empty cache (and thus return immediately), and overall context setup performance greatly benefits from this (from 250ms to 160ms on Jetson TK1 for a simple libdrm program). Making L2 management more explicit should allow us to grab some more performance in the future. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-11-03 15:02:18 +10:00
Ben Skeggs	43a70661ea	drm/nouveau/tegra: merge platform setup from nouveau drm The copyright header in nvkm/engine/device/platform.c has been replaced with the NVIDIA one from drm/nouveau_platform.c, as most of the actual code is now theirs. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:49 +10:00
Ben Skeggs	26c9e8effe	drm/nouveau/device: remove pci/platform_device from common struct Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:49 +10:00
Ben Skeggs	b7a2bc1886	drm/nouveau/imem: convert to new-style nvkm_subdev Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:44 +10:00
Ben Skeggs	d8e83994aa	drm/nouveau/imem: improve management of instance memory Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:36 +10:00
Ben Skeggs	47b2505efb	drm/nouveau/platform: remove subclassing of nvkm_device Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:30 +10:00
Ben Skeggs	00c5550710	drm/nouveau/imem: switch to subdev printk macros Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:23 +10:00
Ben Skeggs	d5c5bcf693	drm/nouveau/imem: switch to device pri macros Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:15 +10:00
Ben Skeggs	c44c06aeeb	drm/nouveau/imem: cosmetic changes This is purely preparation for upcoming commits, there should be no code changes here. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:08 +10:00
Ben Skeggs	9ace404b10	drm/nouveau/device: include core/device.h automatically for subdevs/engines Pretty much every subdev/engine is going to need access to nvkm_device shortly to touch registers and/or output messages. The odd placement of the includes is necessary to work around some inter-dependencies that currently exist. This will be fixed later. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-08-28 12:40:07 +10:00
Alexandre Courbot	df16896b86	drm/nouveau/instmem/gk20a: fix crash during error path If a memory allocation fails when using the DMA allocator, gk20a_instobj_dtor_dma() will be called on the failed instmem object. At this time, node->handle might not be NULL despite the call to dma_alloc_attrs() having failed. node->cpuaddr is the right member to check for such a failure, so use it instead. Reported-by: Vince Hsu <vinceh@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-04-14 17:00:49 +10:00
Alexandre Courbot	a7f6da6e75	drm/nouveau/instmem/gk20a: add IOMMU support Let GK20A's instmem take advantage of the IOMMU if it is present. Having an IOMMU means that instmem is no longer allocated using the DMA API, but instead obtained through page_alloc and made contiguous to the GPU by IOMMU mappings. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-04-14 17:00:45 +10:00
Alexandre Courbot	5dc240bcfe	drm/nouveau/instmem/gk20a: use DMA attributes instmem for GK20A is allocated using dma_alloc_coherent(), which provides us with a coherent CPU mapping that we never use because instmem objects are accessed through PRAMIN. Switch to dma_alloc_attrs() which gives us the option to dismiss that CPU mapping and free up some CPU virtual space. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-04-14 17:00:44 +10:00
Alexandre Courbot	a6ff85d386	drm/nouveau/instmem/gk20a: move memory allocation to instmem GK20A does not have dedicated RAM, thus having a RAM device for it does not make sense. Move the contiguous physical memory allocation to instmem. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2015-04-14 17:00:42 +10:00

19 Commits