linux

Author	SHA1	Message	Date
Chunming Zhou	a340c7bcf1	drm/amdgpu: add dep_sync for amdgpu job The fence in dep_sync cannot be optimized. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Tested and Reviewed-by: Roger.He <Hongbo.He@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-24 18:14:49 -04:00
Chunming Zhou	15d73ce6f9	drm/amdgpu: skip all jobs of guilty vm If the vm is guilty of a GPU reset, skips all its jobs. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-24 18:13:17 -04:00
Monk Liu	7225f8736c	drm/amdgpu:use job* to replace voluntary that way we can know which job cause hang and can do per sched reset/recovery instead of all sched. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-24 17:40:38 -04:00
Monk Liu	4fbf87e2fe	drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset (v2) because we don't want to do sriov-gpu-reset under certain cases, so just split those two funtion and don't invoke sr-iov one from bare-metal one. V2: remove debugfs_gpu_reset routine on SRIOV case. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-24 17:40:37 -04:00
Chunming Zhou	b9bf33d5ac	drm/amdgpu: make pipeline sync be in same place v2 v2: directly return for 'if' case. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-24 17:40:35 -04:00
Chunming Zhou	df83d1ebc9	drm/amdgpu: add sched sync for amdgpu job v2 this is an improvement for previous patch, the sched_sync is to store fence that could be skipped as scheduled, when job is executed, we didn't need pipeline_sync if all fences in sched_sync are signalled, otherwise insert pipeline_sync still. v2: handle error when adding fence to sync failed. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-24 17:40:35 -04:00
Chunming Zhou	30514decb2	drm/amdgpu: fix dependency issue The problem is that executing the jobs in the right order doesn't give you the right result because consecutive jobs executed on the same engine are pipelined. In other words job B does it buffer read before job A has written it's result. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-05-10 13:23:53 -04:00
Chunming Zhou	6c98d31ee8	drm/amdgpu: fix no-vmid job [ 132.036658] amdgpu 0000:22:00.0: VM IB without ID [ 132.036709] [drm:amdgpu_job_run [amdgpu]] ERROR Error scheduling IBs (-22) [ 132.036755] [drm:amd_sched_main [amdgpu]] ERROR Failed to run job! root cause is fence is signaled during sync transfer. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-04-28 17:32:55 -04:00
Junwei Zhang	50ddc75e32	drm/amd/amdgpu: remove the uncessary parameter for ib scheduler Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-01-27 12:20:37 -05:00
Chris Wilson	f54d186700	dma-buf: Rename struct fence to dma_fence I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing! (v2...: rebase, rerun spatch) v3: Compile on msm, spotted a manual fixup that I broke. v4: Try again for msm, sorry Daniel coccinelle script: @@ @@ - struct fence + struct dma_fence @@ @@ - struct fence_ops + struct dma_fence_ops @@ @@ - struct fence_cb + struct dma_fence_cb @@ @@ - struct fence_array + struct dma_fence_array @@ @@ - enum fence_flag_bits + enum dma_fence_flag_bits @@ @@ ( - fence_init + dma_fence_init \| - fence_release + dma_fence_release \| - fence_free + dma_fence_free \| - fence_get + dma_fence_get \| - fence_get_rcu + dma_fence_get_rcu \| - fence_put + dma_fence_put \| - fence_signal + dma_fence_signal \| - fence_signal_locked + dma_fence_signal_locked \| - fence_default_wait + dma_fence_default_wait \| - fence_add_callback + dma_fence_add_callback \| - fence_remove_callback + dma_fence_remove_callback \| - fence_enable_sw_signaling + dma_fence_enable_sw_signaling \| - fence_is_signaled_locked + dma_fence_is_signaled_locked \| - fence_is_signaled + dma_fence_is_signaled \| - fence_is_later + dma_fence_is_later \| - fence_later + dma_fence_later \| - fence_wait_timeout + dma_fence_wait_timeout \| - fence_wait_any_timeout + dma_fence_wait_any_timeout \| - fence_wait + dma_fence_wait \| - fence_context_alloc + dma_fence_context_alloc \| - fence_array_create + dma_fence_array_create \| - to_fence_array + to_dma_fence_array \| - fence_is_array + dma_fence_is_array \| - trace_fence_emit + trace_dma_fence_emit \| - FENCE_TRACE + DMA_FENCE_TRACE \| - FENCE_WARN + DMA_FENCE_WARN \| - FENCE_ERR + DMA_FENCE_ERR ) ( ... ) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk	2016-10-25 14:40:39 +02:00
Baoyou Xie	761c2e8205	drm/amdgpu: mark symbols static where possible We get a few warnings when building kernel with W=1: drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for 'cz_send_msg_to_smc_async' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for 'cz_write_smc_sram_dword' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous prototype for 'iceland_start_smc' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype for 'amdgpu_job_free_cb' [-Wmissing-prototypes] .... In fact, these functions are only used in the file in which they are declared and don't need a declaration, but can be made static. So this patch marks these functions with 'static'. Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-14 15:10:37 -04:00
Monk Liu	3aecd24c65	drm/amdgpu: change job->ctx field name job->ctx actually is a fence_context of the entity it belongs to, naming it as ctx is too vague, and we'll need add amdgpu_ctx into the job structure later. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-12 18:12:17 -04:00
Christian König	22a77cf6d8	drm/amdgpu: cleanup hw reference handling in the IB tests Reference should be taken when we make the assignment, not anywhere else. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-14 16:46:05 -04:00
Chunming Zhou	fd53be302f	drm/amdgpu: add a bool to specify if needing vm flush V2 which avoids job->vm_pd_addr be changed. V2: pass job structure to amdgpu_vm_grab_id and amdgpu_vm_flush directly. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:16 -04:00
Chunming Zhou	c7c5fbcdc3	drm/amdgpu: put old hw fence of job if gpu reset Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:12 -04:00
Christian König	595a9cd68c	drm/amdgpu: remove fence parameter from amd_sched_job_init We return the fence as part of the job structur anyway, no need to do this twice. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:11 -04:00
Christian König	a5fb4ec29c	drm/amdgpu: earlier free SA resources Keep the time we don't have a fence associated with the resource smaller. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:10 -04:00
Christian König	a79a5bdcef	drm/amdgpu: shorten amdgpu_job_free_resources The fence and the sync object are not hardware resources. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:10 -04:00
Christian König	b5f5acbc87	drm/amdgpu: fix user fence handling once more Same problem as with the VM page tables. The user fence address must be determined before the job is scheduled, not when the IB is executed. This fixes a security problem where user fences could be used to overwrite any part of VRAM. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 15:06:09 -04:00
Christian König	1fbb2e9299	drm/amdgpu: use a fence array for VMID management Just wait for any fence to become available, instead of waiting for the last entry of the LRU. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:23 -04:00
Christian König	3542023896	drm/amdgpu: add optional ring to amdgpu_sync_is_idle Check if the sync object is idle depending on the ring a submission works with. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:21 -04:00
Christian König	a7e7a93e57	drm/amdgpu: remove amdgpu_sync_wait Stop hiding bugs, instead print a proper error when the scheduler doesn't handle all dependencies. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:21 -04:00
Christian König	6fc1367582	drm/amdgpu: generalize the scheduler fence Make it two events, one for the job being scheduled and one when it is finished. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:20 -04:00
Chunming Zhou	1974e30eb1	drm/amdgpu: add gpu reset to timeout handler so that we could actually reset the GPU when it hangs. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:51:12 -04:00
Christian König	c5f74f7802	drm/amdgpu: fix and cleanup job destruction Remove the job reference counting and just properly destroy it from a work item which blocks on any potential running timeout handler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:50:54 -04:00
Christian König	0e51a772e2	drm/amdgpu: properly abstract scheduler timeout handling The driver shouldn't mess with the scheduler internals. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:50:53 -04:00
Christian König	1e24e31f22	drm/amdgpu: remove use_shed hack in job cleanup Remembering the code path in a variable to cleanup differently is usually not a good idea at all. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:50:52 -04:00
Christian König	1ab0d211f3	drm/amdgpu: fix coding style in amdgpu_job_free Ther should be a new line between code and decleration. Also use amdgpu_ib_free() instead of releasing the member manually. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:50:52 -04:00
Christian König	7392c329ee	drm/amdgpu: remove begin_job/finish_job Completely pointless and confusing to use a callback to call into the same code file. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-07-07 14:50:50 -04:00
Christian König	758ac17f96	drm/amdgpu: fix and cleanup user fence handling v2 We leaked the BO in the error pass, additional to that we only have one user fence for all IBs in a job. v2: remove white space changes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-11 13:30:32 -04:00
Christian König	d88bf583bd	drm/amdgpu: move VM fields into job They are the same for all IBs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-11 13:30:31 -04:00
Christian König	92f250989b	drm/amdgpu: move the context from the IBs into the job We only have one context for all IBs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-11 13:30:31 -04:00
Monk Liu	c5637837ba	drm/amdgpu: keep vm in job instead of ib (v2) ib.vm is a legacy way to get vm, after scheduler implemented vm should be get from job, and all ibs from one job share the same vm, no need to keep ib.vm just move vm field to job. this patch as well add job as paramter to ib_schedule so it can get vm from job->vm. v2: agd: sqaush in: drm/amdgpu: check if ring emit_vm_flush exists in vm flush No vm flush on engines that don't support VM. bug: https://bugs.freedesktop.org/show_bug.cgi?id=95195 Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-11 12:31:16 -04:00
Nils Wallménius	62250a910a	drm/amd/scheduler: Mark amdgpu_sched_ops const This marks the struct amdgpu_sched_ops const and adjusts amd_sched_init to take a const pointer for the ops param. The ops member of struct amd_gpu_scheduler is also changed to const. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-04 20:20:05 -04:00
Monk Liu	b6723c8da5	drm/amdgpu: use ref to keep job alive this is to fix fatal page fault error that occured if: job is signaled/released after its timeout work is already put to the global queue (in this case the cancel_delayed_work will return false), which will lead to NX-protection error page fault during job_timeout_func. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-02 15:20:07 -04:00
Monk Liu	0de2479c95	drm/amdgpu: rework TDR in scheduler (v2) Add two callbacks to scheduler to maintain jobs, and invoked for job timeout calculations. Now TDR measures time gap from job is processed by hw. v2: fix typo Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-02 15:19:57 -04:00
Monk Liu	e472d2588e	drm/amdgpu: delay job free to when it's finished (v2) for those jobs submitted through scheduler, do not free it immediately after scheduled, instead free it in global workqueue by its sched fence signaling callback function. v2: call uf's bo_undef after job_run() call job's sync free after job_run() no static inline __amdgpu_job_free() anymore, just use kfree(job) to replace it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-02 15:17:41 -04:00
Monk Liu	e686941a32	drm/amdgpu: use sched_job_init to initialize sched_job Consolidate job initialization in one place rather than duplicating it in multiple places. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-02 15:12:59 -04:00
Monk Liu	676d8c24f3	drm/amdgpu: use sched fence if possible when preemption feature lands, the SA bo should rely on sched fence, because hw fence will be invalid after its job preempted or skipped. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-17 11:54:53 -04:00
Monk Liu	73cfa5f5ce	drm/amdgpu: move ib.fence to job.fence Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-17 11:54:11 -04:00
Monk Liu	cc55c45db5	drm/amdgpu: give a fence param to ib_free thus amdgpu_ib_free() can hook sched fence to SA manager in later patches. BTW: for amdgpu_free_job(), it should only fence_put() the fence of the last ib once, so fix it as well in this patch. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-17 11:53:34 -04:00
Christian König	336d1f5efe	drm/amdgpu: remove HW fence owner Not used any more since we now always use the sheduler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2016-03-08 11:01:47 -05:00
Christian König	4ff37a83f1	drm/amdgpu: fix VM faults caused by vm_grab_id() v4 The owner must be per ring as long as we don't support sharing VMIDs per process. Also move the assigned VMID and page directory address into the IB structure. v3: assign the VMID to all IBs, not just the first one. v4: use correct pointer for owner Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-29 11:33:46 -05:00
Christian König	20874179a2	drm/amdgpu: nuke the kernel context Not used any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-12 15:41:58 -05:00
Christian König	2bd9ccfa75	drm/amdgpu: use per VM entity for page table updates (v2) Updates from different VMs can be processed independently. v2: agd: rebase on upstream Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-12 15:35:16 -05:00
Christian König	e86f9ceee1	drm/amdgpu: move sync into job object No need to keep that for every IB. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:24 -05:00
Christian König	0856cab1a6	drm/amdgpu: rename amdgpu_sched.c to amdgpu_job.c That's probably a better matching name. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-10 14:17:23 -05:00

47 Commits