drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)

An arbitrary amount of time can pass between spin_unlock and fence_wait_any_timeout, so we need to ensure that nobody frees the fences from under us. A stress test (rapidly starting and killing hundreds of glxgears instances) ran into a deadlock in fence_wait_any_timeout after about an hour, and this race condition appears to be a plausible cause. v2: agd: rebase on upstream Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org
2024-11-07 20:51:47 +00:00 · 2016-02-05 10:59:43 -05:00 · 2016-02-05 10:59:43 -05:00 · a8d81b3626
commit a8d81b3626
parent ca19852884
1 changed files with 4 additions and 1 deletions
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
@ -354,12 +354,15 @@ int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager,

 		for (i = 0, count = 0; i < AMDGPU_MAX_RINGS; ++i)
 			if (fences[i])
-				fences[count++] = fences[i];
+				fences[count++] = fence_get(fences[i]);

 		if (count) {
 			spin_unlock(&sa_manager->wq.lock);
 			t = fence_wait_any_timeout(fences, count, false,
 						   MAX_SCHEDULE_TIMEOUT);
+			for (i = 0; i < count; ++i)
+				fence_put(fences[i]);
+
 			r = (t > 0) ? 0 : t;
 			spin_lock(&sa_manager->wq.lock);
 		} else {