Commit Graph

56 Commits

Author SHA1 Message Date
Christian König
edf600dac6 drm/amd: cleanup remaining spaces and tabs v2
This is the result of running the following commands:
find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/[ \t]\+$//' {} \;
find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/[ \t]\+$//' {} \;
find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/ \+\t/\t/' {} \;
find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/ \+\t/\t/' {} \;

v2: drop changes to DAL and internal headers

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-11 12:31:20 -04:00
Nils Wallménius
62250a910a drm/amd/scheduler: Mark amdgpu_sched_ops const
This marks the struct amdgpu_sched_ops const and
adjusts amd_sched_init to take a const pointer
for the ops param. The ops member of
struct amd_gpu_scheduler is also changed to const.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-04 20:20:05 -04:00
Monk Liu
b6723c8da5 drm/amdgpu: use ref to keep job alive
this is to fix fatal page fault error that occured if:
job is signaled/released after its timeout work is already
put to the global queue (in this case the cancel_delayed_work
will return false), which will lead to NX-protection error
page fault during job_timeout_func.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:20:07 -04:00
Monk Liu
0de2479c95 drm/amdgpu: rework TDR in scheduler (v2)
Add two callbacks to scheduler to maintain jobs, and invoked for
job timeout calculations. Now TDR measures time gap from
job is processed by hw.

v2:
fix typo

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:19:57 -04:00
Monk Liu
cccd9bce97 drm/amdgpu: get rid of incorrect TDR
original time out detect routine is incorrect, cuz it measures
the gap from job scheduled, but we should only measure the
gap from processed by hw.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:19:42 -04:00
Monk Liu
4835096b07 drm/amdgpu: put job to list before done
the mirror_list will be used for later time out detect
feature.  This is needed to properly detect a GPU
timeout with the scheduler.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:17:53 -04:00
Monk Liu
e472d2588e drm/amdgpu: delay job free to when it's finished (v2)
for those jobs submitted through scheduler, do not
free it immediately after scheduled, instead free it
in global workqueue by its sched fence signaling
callback function.

v2:
call uf's bo_undef after job_run()
call job's sync free after job_run()
no static inline __amdgpu_job_free() anymore, just use
kfree(job) to replace it.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:17:41 -04:00
Monk Liu
e686941a32 drm/amdgpu: use sched_job_init to initialize sched_job
Consolidate job initialization in one place rather than
duplicating it in multiple places.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02 15:12:59 -04:00
Chunming Zhou
d033a6de80 drm/amd: abstract kernel rq and normal rq to priority of run queue
Allows us to set priorities in the scheduler.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
2015-12-02 15:54:33 -05:00
Christian König
393a0bd437 drm/amdgpu: optimize scheduler fence handling
We only need to wait for jobs to be scheduled when
the dependency is from the same scheduler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-11-23 12:19:58 -05:00
Christian König
e284022163 drm/amdgpu: fix incorrect mutex usage v3
Before this patch the scheduler fence was created when we push the job
into the queue, so we could only get the fence after pushing it.

The mutex now was necessary to prevent the thread pushing the jobs to
the hardware from running faster than the thread pushing the jobs into
the queue.

Otherwise the thread pushing jobs into the queue would have accessed
possible freed up memory when it tries to get a reference to the fence.

So what you get in the end is thread A:
mutex_lock(&job->lock);
...
Kick of thread B.
...
mutex_unlock(&job->lock);

And thread B:
mutex_lock(&job->lock);
....
mutex_unlock(&job->lock);
kfree(job);

I'm actually not sure if I'm still up to date on this, but this usage
pattern used to be not allowed with mutexes. See here as well
https://lwn.net/Articles/575460/.

v2: remove unrelated changes, fix missing owner
v3: rebased, add more commit message

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-11-16 11:05:58 -05:00
Chunming Zhou
f5617f9dde drm/amd: add kmem cache for sched fence
Change-Id: I45bb8ff10ef05dc3b15e31a77fbcf31117705f11
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-16 11:05:51 -05:00
Junwei Zhang
2440ff2c91 drm/amdgpu: add timer to fence to detect scheduler lockup
Change-Id: I67e987db0efdca28faa80b332b75571192130d33
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: David Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-14 16:16:42 -04:00
Christian König
4f839a243d drm/amdgpu: more scheduler cleanups v2
Embed the scheduler into the ring structure instead of allocating it.
Use the ring name directly instead of the id.

v2: rebased, whitespace cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
2015-09-23 17:23:39 -04:00
Christian König
9b398fa5c2 drm/amdgpu: rename fence->scheduler to sched v2
Just to be consistent with the other members.

v2: rename the ring member as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1)
Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
2015-09-23 17:23:37 -04:00
Christian König
0f75aee751 drm/amdgpu: cleanup entity init
Reorder the fields and properly return the kfifo_alloc error code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
2015-09-23 17:23:37 -04:00
Junwei Zhang
4c7eb91cae drm/amdgpu: refine the job naming for amdgpu_job and amdgpu_sched_job
Use consistent naming across functions.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: David Zhou <david1.zhou@amd.com>
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
2015-09-23 17:23:36 -04:00
Christian König
1886d1a9ca drm/amdgpu: remove process_job callback from the scheduler
Just free the resources immediately after submitting the job.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-09-23 17:23:33 -04:00
Christian König
258f3f99d5 drm/amdgpu: move scheduler fence callback into fence v2
And call the processed callback directly after submitting the job.

v2: split adding error handling into separate patch.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-09-23 17:23:32 -04:00
Christian König
e61235db62 drm/amdgpu: add scheduler dependency callback v2
This way the scheduler doesn't wait in it's work thread any more.

v2: fix race conditions

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-08-28 15:04:17 -04:00
Christian König
c2b6bd7e91 drm/amdgpu: fix wait queue handling in the scheduler
Freeing up a queue after signalling it isn't race free.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-26 17:55:07 -04:00
Christian König
bd755d0870 drm/amdgpu: remove extra parameters from scheduler callbacks
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-26 17:54:10 -04:00
Christian König
062c7fb3eb drm/amdgpu: remove entity idle timeout v2
Removing the entity from scheduling can deadlock the whole system.
Wait forever till the remaining IBs are scheduled.

v2: fix comment as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
2015-08-26 17:52:18 -04:00
Chunming Zhou
f38fdfddfa drm/amdgpu: add priv data to sched
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-25 10:52:18 -04:00
Chunming Zhou
84f76ea6b0 drm/amdgpu: add owner for sched fence
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-25 10:51:32 -04:00
Christian König
c14692f0a7 drm/amdgpu: remove entity reference from sched fence
Entity don't live as long as scheduler fences.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:50:42 -04:00
Christian König
6c859274f3 drm/amdgpu: fix and cleanup amd_sched_entity_push_job
Calling schedule() is probably the worse things we can do.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:49:57 -04:00
Christian König
69f7dd652c drm/amdgpu: remove unused parameters to amd_sched_create
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:47:41 -04:00
Christian König
1fca766b24 drm/amdgpu: remove sched_lock
It isn't protecting anything.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:46:46 -04:00
Christian König
b034b572f2 drm/amdgpu: remove prepare_job callback
Not used any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:46:02 -04:00
Christian König
aef4852eed drm/amdgpu: fix entity wakeup race condition
That actually didn't worked at all.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:42:30 -04:00
Christian König
9788ec4032 drm/amdgpu: remove some more unused entity members v2
None of them are used any more.

v2: fix type in error message

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-25 10:40:55 -04:00
Christian König
c746ba2223 drm/amdgpu: rework scheduler submission handling.
Remove active_hw_rq and it's protecting queue_lock, they are unused.

User 32bit atomic for hw_rq_count, 64bits for counting to three is a bit
overkill.

Cleanup the function name and remove incorrect comments.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
2015-08-25 10:39:31 -04:00
Christian König
ce882e6dc2 drm/amdgpu: remove v_seq handling from the scheduler v2
Simply not used any more. Only keep 32bit atomic for fence sequence numbering.

v2: trivial rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
2015-08-25 10:39:16 -04:00
Christian König
2b184d8dbc drm/amdgpu: use a spinlock instead of a mutex for the rq
More appropriate and fixes some nasty lockdep warnings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-20 17:03:47 -04:00
Chunming Zhou
bb977d3711 drm/amdgpu: abstract amdgpu_job for scheduler
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-20 17:00:35 -04:00
Christian König
432a4ff8b7 drm/amdgpu: cleanup sheduler rq handling v2
Rework run queue implementation, especially remove the odd list handling.

v2: cleanup the code only, no algorithem change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-17 16:51:22 -04:00
Chunming Zhou
1c8f805af9 drm/amdgpu: fix unnecessary wake up
decrease CPU extra overhead.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-17 16:51:20 -04:00
Christian König
5b232c2a71 drm/amdgpu: remove scheduler fence list v2
Unused and missing proper locking.

v2: add locking comment to commit message.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
2015-08-17 16:51:16 -04:00
Christian König
05caae8515 drm/amdgpu: remove amd_sched_wait_emit v2
Not used any more.

v2: remove amd_sched_emit as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-17 16:51:15 -04:00
Chunming Zhou
f556cb0cae drm/amd: add scheduler fence implementation (v2)
scheduler fence is based on kernel fence framework.

v2: squash in Christian's build fix

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-17 16:51:07 -04:00
Chunming Zhou
953e8fd4e7 drm/amdgpu: use amd_sched_job in its backend ops
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-17 16:51:06 -04:00
Christian König
6f0e54a964 drm/amdgpu: cleanup and fix scheduler fence handling v2
v2: rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-17 16:51:05 -04:00
Christian König
91404fb208 drm/amdgpu: merge amd_sched_entity and amd_context_entity v2
Avoiding a couple of casts.

v2: rename c_entity to entity as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-17 16:51:05 -04:00
Christian König
ddf94d33d6 drm/amdgpu: remove unused parent entity
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-17 16:51:03 -04:00
Chunming Zhou
4cef92670b drm/amdgpu: process sched job exactly triggered by fence signal
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
2015-08-17 16:51:03 -04:00
Chunming Zhou
80de5913cf Revert "drm/amdgpu: return new seq_no for amd_sched_push_job"
This reverts commit d1d33da8eb86b8ca41dd9ed95738030df5267b95.

Reviewed-by: Christian K?nig <christian.koenig@amd.com>

Conflicts:
	drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
	drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
2015-08-17 16:51:02 -04:00
Christian König
0e89d0c16b drm/amdgpu: stop leaking the ctx id into the scheduler v2
Id's are for the IOCTL ABI only.

v2: remove tgid as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-08-17 16:51:01 -04:00
Jammy Zhou
27f6642d06 drm/amdgpu: add amd_sched_next_queued_seq function
This function is used to get the next queued sequence number

Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-17 16:50:58 -04:00
Jammy Zhou
63ad8d5882 drm/amdgpu: make last_handled_seq atomic
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-17 16:50:57 -04:00