Make sure the CSA is mapped.
v2: agd: rebase.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes a rare NULL pointer dereference in amdgpu_ttm_bind.
The issue was found by Nicolai Haehnle.
The patch was tested by Nicolai Haehnle.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
In fence waiting, it never return -EDEADLK yet, so drop this function
here.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- better atomic state debugging from Rob
- fence prep from gustavo
- sumits flushed out his backlog of pending dma-buf/fence patches from
various people
- drm_mm leak debugging plus trying to appease Kconfig (Chris)
- a few misc things all over
* tag 'topic/drm-misc-2016-11-10' of git://anongit.freedesktop.org/drm-intel: (35 commits)
drm: Make DRM_DEBUG_MM depend on STACKTRACE_SUPPORT
drm/i915: Restrict DRM_DEBUG_MM automatic selection
drm: Restrict stackdepot usage to builtin drm.ko
drm/msm: module param to dump state on error irq
drm/msm/mdp5: add atomic_print_state support
drm/atomic: add debugfs file to dump out atomic state
drm/atomic: add new drm_debug bit to dump atomic state
drm: add helpers to go from plane state to drm_rect
drm: add helper for printing to log or seq_file
drm: helper macros to print composite types
reservation: revert "wait only with non-zero timeout specified (v3)" v2
drm/ttm: fix ttm_bo_wait
dma-buf/fence: revert "don't wait when specified timeout is zero" (v2)
dma-buf/fence: make timeout handling in fence_default_wait consistent (v2)
drm/amdgpu: add the interface of waiting multiple fences (v4)
dma-buf: return index of the first signaled fence (v2)
MAINTAINERS: update Sync File Framework files
dma-buf/sw_sync: put fence reference from the fence creation
dma-buf/sw_sync: mark sync_timeline_create() static
drm: Add stackdepot include for DRM_DEBUG_MM
...
v2: agd: rebase and squash in all the previous optimizations and
changes so everything compiles.
v3: squash in Slava's 32bit build fix
v4: rebase on drm-next (fence -> dma_fence),
squash in Monk's ioctl update patch
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
[sumits: fix checkpatch warnings]
Link: http://patchwork.freedesktop.org/patch/msgid/1478290570-30982-2-git-send-email-alexander.deucher@amd.com
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYHmoCAAoJEHm+PkMAQRiG7RMIAI2i7Y5hpL5yCxK5AFaL4u/G
KxXfp1B1UanUTgjOmd7zGqtDYcFX9t7GTTUFixQ7/9Opr4PD9qbnatoDGSc3xjbT
msDgA1B78F1/Q3kHWfeGq32MihQ4mj5NwUCo+igUcUvvWG7mHgzErj/Nh5RoobQX
p/izdpTbrw3GX6xXB8olbG7XWHaVye/+TT3q6+gmgm8I/QEujcLeGoycE0zlhPN8
FG/JX76At/+ZM2Py7Oxo3k+oKL9CHrtOQYDp/wN0uslV5eYvvkZz0/M1HMOGZt+c
gZU5jzM17K7C4Nzo06WAuBU9wUBGc25m+cPicLlOmljnzfU+f50SKaDjZq3p7QI=
=2KUF
-----END PGP SIGNATURE-----
Backmerge tag 'v4.9-rc4' into drm-next
Linux 4.9-rc4
This is needed for nouveau development.
Pull request already again to get the s/fence/dma_fence/ stuff in and
allow everyone to resync. Otherwise really just misc stuff all over, and a
new bridge driver.
* tag 'topic/drm-misc-2016-10-27' of git://anongit.freedesktop.org/git/drm-intel:
drm/bridge: fix platform_no_drv_owner.cocci warnings
drm/bridge: fix semicolon.cocci warnings
drm: Print some debug/error info during DP dual mode detect
drm: mark drm_of_component_match_add dummy inline
drm/bridge: add Silicon Image SiI8620 driver
dt-bindings: add Silicon Image SiI8620 bridge bindings
video: add header file for Mobile High-Definition Link (MHL) interface
drm: convert DT component matching to component_match_add_release()
dma-buf: Rename struct fence to dma_fence
dma-buf/fence: add an lockdep_assert_held()
drm/dp: Factor out helper to distinguish between branch and sink devices
drm/edid: Only print the bad edid when aborting
drm/msm: add missing header dependencies
drm/msm/adreno: move function declarations to header file
drm/i2c/tda998x: mark symbol static where possible
doc: add missing docbook parameter for fence-array
drm: RIP mode_config->rotation_property
drm/msm/mdp5: Advertize 180 degree rotation
drm/msm/mdp5: Use per-plane rotation property
This way we can use parse_cs and still keep VM mode enabled.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's constant, so it doesn't make to much sense to keep it
with the variable data.
v2: update vce and uvd phys mode ring structures as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saves a bunch of CPU cycles when swapping things back in and
allows us to split the VM headers into a separate file.
v2: rename parameters
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's completely pointless to have two pointers to the
device in the same structure.
v2: rename function to amdgpu_ttm_adev, fix typos
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a flag noting that a BO must be created using linear VRAM
and set this flag on all in kernel users where appropriate.
Hopefully I haven't missed anything.
v2: add it in a few more places, fix CPU mapping.
v3: rename to VRAM_CONTIGUOUS, fix typo in CS code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only allocate address space when we really need it.
v2: fix a typo, add correct function description,
stop leaking the node in the error case.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to validate the offset to make sure that we don't write after the BO.
Additional to that a page should be enough and can make address space
handling much easier.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for 'cz_send_msg_to_smc_async' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for 'cz_write_smc_sram_dword' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous prototype for 'iceland_start_smc' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype for 'amdgpu_job_free_cb' [-Wmissing-prototypes]
....
In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't really need the GTT table any more most of the time. So bind it
only on demand.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1:
for gfx8, use CONTEXT_CONTROL package to dynamically
skip preamble CEIB and other load_xxx command in sequence.
v2:
support GFX7 as well.
remove cntxcntl in compute ring funcs because CPC doesn't
support this packet.
v3: fix reduntant judgement in cntxcntl.
v4: some cleanups, don't change cs_submit()
v5: keep old MESA supported & bump up KMS version.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Ack-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
job->ctx actually is a fence_context of the entity
it belongs to, naming it as ctx is too vague, and
we'll need add amdgpu_ctx into the job structure
later.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As last resort try to evict BOs from the current working set into other
memory domains. This effectively prevents command submission failures when
VM page tables have been swapped out.
v2: fix typos
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
All other errors can't be fixed by using a different memory domain.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The old mechanism used a per-submission limit that didn't take previous
submissions within the same time frame into account. It also filled VRAM
slowly when VRAM usage dropped due to a big eviction or buffer deallocation.
This new method establishes a configurable MBps limit that is obeyed when
VRAM usage is very high. When VRAM usage is not very high, it gives
the driver the freedom to fill it quickly. The result is more consistent
performance.
It can't keep the BO move rate low if lots of evictions are happening due
to VRAM fragmentation, or if a big buffer is being migrated.
The amdgpu.moverate parameter can be used to set a non-default limit.
Measurements can be done to find out which amdgpu.moverate setting gives
the best results.
Mainly APUs and cards with small VRAM will benefit from this. For F1 2015,
anything with 2 GB VRAM or less will benefit.
Some benchmark results - F1 2015 (Tonga 2GB):
Limit MinFPS AvgFPS
Old code: 14 32.6
128 MB/s: 28 41
64 MB/s: 15.5 43
32 MB/s: 28.7 43.4
8 MB/s: 27.8 44.4
8 MB/s: 21.9 42.8 (different run)
Random drops in Min FPS can still occur (due to fragmented VRAM?), but
the average FPS is much better. 8 MB/s is probably a good limit for this
game & the current VRAM management. The random FPS drops are still to be
tackled.
v2: use a spinlock
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make it more obvious what we are doing here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's useful for debugging.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We return the fence as part of the job structur anyway,
no need to do this twice.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Keep the time we don't have a fence associated with the resource smaller.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Same problem as with the VM page tables. The user fence address must be
determined before the job is scheduled, not when the IB is executed.
This fixes a security problem where user fences could be used to overwrite
any part of VRAM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's just overhead to do so and allocating a VMID
when we don't need one is actually a bit dangerous.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't need to validate them again if the eviction counter didn't changed.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When we pipeline evictions the page directory could already be
moving somewhere else when grab_id is called.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove the job reference counting and just properly destroy it from a
work item which blocks on any potential running timeout handler.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The driver shouldn't mess with the scheduler internals.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm_gem_object_lookup() has never required the drm_device for its file
local translation of the user handle to the GEM object. Let's remove the
unused parameter and save some space.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Fixup kerneldoc too.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We leaked the BO in the error pass, additional to that we only have
one user fence for all IBs in a job.
v2: remove white space changes
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
They are the same for all IBs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We only have one context for all IBs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use of the ctx pointer is not safe, because they are likely already
be assigned to another ctx when doing comparing.
v2: recreate from scratch, avoid all unnecessary changes.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk.Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ib.vm is a legacy way to get vm, after scheduler
implemented vm should be get from job, and all ibs
from one job share the same vm, no need to keep ib.vm
just move vm field to job.
this patch as well add job as paramter to ib_schedule
so it can get vm from job->vm.
v2: agd: sqaush in:
drm/amdgpu: check if ring emit_vm_flush exists in vm flush
No vm flush on engines that don't support VM.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=95195
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not needed any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this is to fix fatal page fault error that occured if:
job is signaled/released after its timeout work is already
put to the global queue (in this case the cancel_delayed_work
will return false), which will lead to NX-protection error
page fault during job_timeout_func.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add two callbacks to scheduler to maintain jobs, and invoked for
job timeout calculations. Now TDR measures time gap from
job is processed by hw.
v2:
fix typo
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Consolidate job initialization in one place rather than
duplicating it in multiple places.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Christian König <christian.koenig@amd.com.
Signed-off-by: Dave Airlie <airlied@redhat.com>
That avoids lock inversion between the BO reservation lock
and the anon_vma lock.
v2:
* Changed amdgpu_bo_list_entry.user_pages to an array of pointers
* Lock mmap_sem only for get_user_pages
* Added invalidation of unbound userpointer BOs
* Fixed memory leak and page reference leak
v3 (chk):
* Revert locking mmap_sem only for_get user_pages
* Revert adding invalidation of unbound userpointer BOs
* Sanitize and fix error handling
v4 (chk):
* Init userpages pointer everywhere.
* Fix error handling when get_user_pages() fails.
* Add invalidation of unbound userpointer BOs again.
v5 (chk):
* Add maximum number of tries.
v6 (chk):
* Fix error handling when we run out of tries.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v4)
Acked-by: Alex Deucher <alexander.deucher@amd.com>
We need them together with the next patch.
v2: Don't take bo reference twice
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to keep that for every IB.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a job_alloc_with_ib helper and proper job submission.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>