radeon userptr support.
* 'drm-next-3.18' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: allow userptr write access under certain conditions
drm/radeon: add userptr flag to register MMU notifier v3
drm/radeon: add userptr flag to directly validate the BO to GTT
drm/radeon: add userptr flag to limit it to anonymous memory v2
drm/radeon: add userptr support v8
Conflicts:
drivers/gpu/drm/radeon/radeon_prime.c
Now that the PFP and ME synchronization is fixed, we
can enable this again reliably.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
It isn't necessary for command streams generated by the kernel (at least
not while we aren't storing ring or indirect buffers in VRAM).
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not doing this causes piglit hangs[0] on my Cape Verde card. No issues on
Bonaire and Kaveri though.
[0] Same symptoms as those fixed on CIK by 'drm/radeon: set VM base addr
using the PFP v2'.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We can easily return -ENOMEM here if kzalloc() fails.
v2: agd5f: drop the vm mutex
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.
It imposes several restrictions upon the memory being mapped:
1. It must be page aligned (both start/end addresses, i.e ptr and size).
2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).
3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.
4. The BO is only mapped readonly for now, so no write support.
5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.
Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
v7: add warning about it's availability in the API definition
v8: drop access_ok check, fix VM mapping bits
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v4)
Reviewed-by: Jérôme Glisse <jglisse@redhat.com> (v4)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That should allow us to allocate bigger BOs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the decision what to use into the common VM code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This closes a small window where the GPU might have accessed freed up memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Let's try to fix bugs related to this instead of just disabling it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Scales much better than scanning the address range linearly.
v2: store pfn instead of address
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't wait for the BO to be used again, just
update the PT on the next VM use.
v2: remove stray semicolon.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: fix rebase onto drm-fixes
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Calling radeon_vm_bo_find on the IB BO during CS
is illegal and can lead to an crash.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v3: completely rewritten. We now just remember which areas
of the PT to clear and do so on the next command submission.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=79980
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Userspace shouldn't be able to access them.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
And also domain to prefered_domains. That matches better
what those values represent.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Merge drm-fixes into drm-next.
Both i915 and radeon need this done for later patches.
Conflicts:
drivers/gpu/drm/drm_crtc_helper.c
drivers/gpu/drm/i915/i915_drv.h
drivers/gpu/drm/i915/i915_gem.c
drivers/gpu/drm/i915/i915_gem_execbuffer.c
drivers/gpu/drm/i915/i915_gem_gtt.c
This patch makes it possible to decide how many address
bits are spend on the page directory vs the page tables.
v2: remove unintended change
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch implements support for VRAM page table entry compression.
PTE construction is enhanced to identify physically contiguous page
ranges and mark them in the PTE fragment field. L1/L2 TLB support is
enabled for 64KB (SI/CIK) and 256KB (NI) PTE fragments, significantly
improving TLB utilization for VRAM allocations.
Linear store bandwidth is improved from 60GB/s to 125GB/s on Pitcairn.
Unigine Heaven 3.0 sees an average improvement from 24.7 to 27.7 FPS
on default settings at 1920x1200 resolution with vsync disabled.
See main comment in radeon_vm.c for a technical description.
v2 (chk): rebased and simplified.
v3 (chk): add missing hw setup
v4 (chk): rebased on current drm-fixes-3.15
v5 (chk): fix comments and commit text
Signed-off-by: Jay Cornwall <jay@jcornwall.me>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Take padding into account as well.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=75651
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Just move all fields into radeon_cs_reloc, removing unused/duplicated fields.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
No need to make it more complicated than necessary,
just allocate the page tables as normal BO and
flush whenever the address change.
v2: update comments and function name
v3: squash bug fixes, page directory and tables patch
v4: rebased on Mareks changes
Signed-off-by: Christian König <christian.koenig@amd.com>