linux

Author	SHA1	Message	Date
Chris Wilson	f4d49692ad	drm/i915/gt: Mark up the racy read of execlists->context_tag Since we are using bitops on context_tag to allow us to reserve and release inflight tags concurrently, the scan for the next bit is intentionally racy. [ 516.446854] BUG: KCSAN: data-race in execlists_schedule_in.isra.0 [i915] / execlists_schedule_out [i915] [ 516.446874] [ 516.446886] write (marked) to 0xffff8881f7644048 of 8 bytes by interrupt on cpu 2: [ 516.447076] execlists_schedule_out+0x538/0x6a0 [i915] [ 516.447263] process_csb+0x10b/0x3d0 [i915] [ 516.447449] execlists_submission_tasklet+0x30/0x170 [i915] [ 516.447468] tasklet_action_common.isra.0+0x42/0x90 [ 516.447484] __do_softirq+0xc8/0x206 [ 516.447498] irq_exit+0xcd/0xe0 [ 516.447516] do_IRQ+0x44/0xc0 [ 516.447535] ret_from_intr+0x0/0x1c [ 516.447550] cpuidle_enter_state+0x199/0x400 [ 516.447572] cpuidle_enter+0x50/0x90 [ 516.447587] do_idle+0x197/0x1e0 [ 516.447600] cpu_startup_entry+0x14/0x20 [ 516.447619] start_secondary+0xf9/0x130 [ 516.447643] secondary_startup_64+0xa4/0xb0 [ 516.447655] [ 516.447671] read to 0xffff8881f7644048 of 8 bytes by task 460 on cpu 1: [ 516.447863] execlists_schedule_in.isra.0+0x3cf/0x5a0 [i915] [ 516.448064] execlists_dequeue+0xf8f/0x1690 [i915] [ 516.448252] __execlists_submission_tasklet+0x48/0x60 [i915] [ 516.448440] execlists_submit_request+0x2e2/0x310 [i915] [ 516.448634] submit_notify+0x8f/0xc8 [i915] [ 516.448820] __i915_sw_fence_complete+0x61/0x420 [i915] [ 516.449005] i915_sw_fence_complete+0x58/0x80 [i915] [ 516.449208] i915_sw_fence_commit+0x16/0x20 [i915] [ 516.449399] __i915_request_queue+0x60/0x70 [i915] [ 516.449590] i915_gem_do_execbuffer+0x33f1/0x4a00 [i915] [ 516.449782] i915_gem_execbuffer2_ioctl+0x2a2/0x550 [i915] [ 516.449800] drm_ioctl_kernel+0xe9/0x130 [ 516.449814] drm_ioctl+0x27d/0x45e [ 516.449827] ksys_ioctl+0x89/0xb0 [ 516.449842] __x64_sys_ioctl+0x42/0x60 [ 516.449864] do_syscall_64+0x6e/0x2c0 [ 516.449878] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200511075722.13483-1-chris@chris-wilson.co.uk	2020-05-11 11:01:12 +01:00
Gustavo A. R. Silva	f1e79c7e18	drm/i915: Replace zero-length array with flexible-array The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200507185408.GA14561@embeddedor	2020-05-09 12:59:23 +01:00
Chris Wilson	16dc224f1c	drm/i915: Replace the hardcoded I915_FENCE_TIMEOUT Expose the hardcoded timeout for unsignaled foreign fences as a Kconfig option, primarily to allow brave systems to disable the timeout and solely rely on correct signaling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200509105021.12542-1-chris@chris-wilson.co.uk	2020-05-09 12:57:57 +01:00
Chris Wilson	fcae496153	drm/i915: Prevent using semaphores to chain up to external fences The downside of using semaphores is that we lose metadata passing along the signaling chain. This is particularly nasty when we need to pass along a fatal error such as EFAULT or EDEADLK. For fatal errors we want to scrub the request before it is executed, which means that we cannot preload the request onto HW and have it wait upon a semaphore. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-3-chris@chris-wilson.co.uk	2020-05-08 21:02:33 +01:00
Lionel Landwerlin	3136deb7ba	drm/i915: Peel dma-fence-chains for await To allow faster engine to engine synchronization, peel the layer of dma-fence-chain to expose potential i915 fences so that the i915_request code can emit HW semaphore wait/signal operations in the ring which is faster than waking up the host to submit unblocked workloads after interrupt notification. This is similar to the peeling we do for e.g. dma_fence_array. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200508185448.29709-1-chris@chris-wilson.co.uk	2020-05-08 20:57:56 +01:00
Chris Wilson	e41627db6f	drm/i915/gt: Improve precision on defer_request assert The kernel_context does not use initial-breadcrumbs, so when we ask if its requests have started we do so by comparing against the completion seqno of the previous request. This is very imprecise, not precise enough for the defer_request assertion. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1847 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508104220.9872-1-chris@chris-wilson.co.uk	2020-05-08 15:09:11 +01:00
Chris Wilson	ac938052e5	drm/i915: Pull waiting on an external dma-fence into its routine As a means for a small code consolidation, but primarily to start thinking more carefully about internal-vs-external linkage, pull the pair of i915_sw_fence_await_dma_fence() calls into a common routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-2-chris@chris-wilson.co.uk	2020-05-08 12:39:05 +01:00
Chris Wilson	2045d666ae	drm/i915: Ignore submit-fences on the same timeline While we ordinarily do not skip submit-fences due to the accompanying hook that we want to callback on execution, a submit-fence on the same timeline is meaningless. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-1-chris@chris-wilson.co.uk	2020-05-08 12:38:54 +01:00
Mika Kuoppala	972282c4cf	drm/i915/gen12: Add aux table invalidate for all engines All engines, exception being blitter as it does not care about the form, can access compressed surfaces. So we need to add forced aux table invalidates for those engines. v2: virtual instance masking (Chris) v3: bug on if not found (Chris) References: `d248b371f7` ("drm/i915/gen12: Invalidate aux table entries forcibly") References bspec#43904, hsdes#1809175790 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200507142045.8668-1-mika.kuoppala@linux.intel.com	2020-05-07 20:18:28 +01:00
Chris Wilson	eec39e441c	drm/i915: Remove wait priority boosting Upon waiting a request (when asked), we gave that request a small priority boost, not enough for it to cause preemption, but enough for it to be scheduled next before all equals. We also used that bit to give new clients a small priority boost, similar to FQ_CODEL, such that we favoured short interactive tasks ahead of long running streams. However, this is causing lots of complications with timeslicing where we both want to honour the boost and yet ignore it. Those complications cause unexpected user behaviour (tasks not being timesliced and run concurrently as epxected), and the easiest way to resolve that is to remove the boost. Hopefully, we can find a compromise again if we need to, but in theory timeslicing itself and future more advanced schedulers should give us the interactivity boost we seek. Testcase: igt/gem_exec_schedule/lateslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk	2020-05-07 20:08:58 +01:00
Chris Wilson	6b6cd2ebd8	drm/i915: Mark concurrent submissions with a weak-dependency We recorded the dependencies for WAIT_FOR_SUBMIT in order that we could correctly perform priority inheritance from the parallel branches to the common trunk. However, for the purpose of timeslicing and reset handling, the dependency is weak -- as we the pair of requests are allowed to run in parallel and not in strict succession. The real significance though is that this allows us to rearrange groups of WAIT_FOR_SUBMIT linked requests along the single engine, and so can resolve user level inter-batch scheduling dependencies from user semaphores. Fixes: `c81471f5e9` ("drm/i915: Copy across scheduler behaviour flags across submit fences") Testcase: igt/gem_exec_fence/submit Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507155109.8892-1-chris@chris-wilson.co.uk	2020-05-07 19:49:21 +01:00
Mika Kuoppala	d248b371f7	drm/i915/gen12: Invalidate aux table entries forcibly Aux table invalidation can fail on update. So next access may cause memory access to be into stale entry. Proposed workaround is to invalidate entries between all batchbuffers. v2: correct register address (Yang) v3: respect the order (Chris) References bspec#43904, hsdes#1809175790 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Yang A Shi <yang.a.shi@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506165310.1239-1-mika.kuoppala@linux.intel.com	2020-05-07 07:44:42 +01:00
Mika Kuoppala	0c7c0c8e6f	drm/i915/gen12: Flush L3 Flush TDL,L3 and EUs Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-3-mika.kuoppala@linux.intel.com	2020-05-07 07:44:41 +01:00
Mika Kuoppala	32d7171ee2	drm/i915/gen12: Fix HDC pipeline flush HDC pipeline flush is bit on the first dword of the PIPE_CONTROL, not the second. Make it so. v2: function naming (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-2-mika.kuoppala@linux.intel.com	2020-05-07 07:44:41 +01:00
Mika Kuoppala	f02ac414ba	Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" This reverts commit `62037ffff2`. L3 ro cache invalidation is part of the dword0 of pipe control. Also it is not relevant to this gen. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-1-mika.kuoppala@linux.intel.com	2020-05-07 07:44:40 +01:00
Chris Wilson	24fe5f2ab2	drm/i915: Propagate error from completed fences We need to preserve fatal errors from fences that are being terminated as we hook them up. Fixes: `ef46884975` ("drm/i915: Propagate fence errors") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200506162136.3325-1-chris@chris-wilson.co.uk	2020-05-06 18:23:14 +01:00
Matt Roper	9b2383a7ac	drm/i915/icp: Add Wa_14010685332 We need to toggle a SDE chicken bit on and then off as the final step when disabling interrupts in preparation for runtime suspend. Bspec: 33450 Bspec: 8402 Cc: Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501213701.371443-1-matthew.d.roper@intel.com Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>	2020-05-05 14:26:46 -07:00
Chris Wilson	977253df64	drm/i915/gt: Stop holding onto the pinned_default_state As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is we only need our bare protocontext. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk	2020-05-05 21:12:33 +01:00
Chris Wilson	b68be5c623	drm/i915/execlists: Record the active CCID from before reset If we cannot trust the reset will flush out the CS event queue such that process_csb() reports an accurate view of HW, we will need to search the active and pending contexts to determine which was actually running at the time we issued the reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505084629.31365-1-chris@chris-wilson.co.uk	2020-05-05 12:05:40 +01:00
Stanislav Lisovskiy	f136c58a0d	drm/i915: Added required new PCode commands We need a new PCode request commands and reply codes to be added as a prepartion patch for QGV points restricting for new SAGV support. v2: - Extracted those changes into separate patch (Ville Syrjälä) v3: - Moved new PCode masks to another place from PCode commands(Ville) v4: - Moved new PCode masks to correspondent PCode command, with identation(Ville) - Changed naming to ICL_ instead of GEN11_ to fit more nicely into existing definition style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505102247.32452-5-stanislav.lisovskiy@intel.com	2020-05-05 13:59:55 +03:00
Imre Deak	054318c7e3	drm/i915/tgl+: Fix interrupt handling for DP AUX transactions Unmask/enable AUX interrupts on all ports on TGL+. So far the interrupts worked only on port A, which meant each transaction on other ports took 10ms. Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504075828.20348-1-imre.deak@intel.com	2020-05-05 11:59:48 +03:00
Chris Wilson	25fd6de315	drm/i915/gt: Small tidy of gen8+ breadcrumb emission Use a local to shrink a line under 80 columns, and refactor the common emit_xcs_breadcrumb() wrapper of ggtt-write. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180507.6017-1-chris@chris-wilson.co.uk	2020-05-05 09:16:59 +01:00
Chris Wilson	8757797ff9	drm/i915/selftests: Repeat the rps clock frequency measurement Repeat the measurement of the clock frequency a few times and use the median to try and reduce the systematic measurement error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-6-chris@chris-wilson.co.uk	2020-05-04 18:21:28 +01:00
Chris Wilson	0065e5f5cc	drm/i915/display: Warn if the FBC is still writing to stolen on removal If the FBC is still writing into stolen, it will overwrite any future users of that stolen region. Check before release, just to ease any concerns -- we can remove it again later if it is barking up the wrong tree. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1635 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200503180034.20010-1-chris@chris-wilson.co.uk	2020-05-04 17:11:51 +01:00
Sultan Alsawaf	690d22dafa	drm/i915: Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled In commit `5a7d202b15`, a logical AND was erroneously changed to an OR, causing WaIncreaseLatencyIPCEnabled to be enabled unconditionally for kabylake and coffeelake, even when IPC is disabled. Fix the logic so that WaIncreaseLatencyIPCEnabled is only used when IPC is enabled. Fixes: `5a7d202b15` ("drm/i915: Drop WaIncreaseLatencyIPCEnabled/1140 for cnl") Cc: stable@vger.kernel.org # 5.3.x+ Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430214654.51314-1-sultan@kerneltoast.com	2020-05-04 18:55:41 +03:00
Ville Syrjälä	2dd43144e8	drm/i915: Streamline the artihmetic All these ROUNDING_FACTORs and whatnot are making this thing hard to read. Get rid of them. And let's massage some of the fractions to give us less questionable intermediate results and perhaps less divisions. Also looks like a good helping of 64bit math stuff is needed to avoid some of overflows present in the current code. There might still be a few overflows, namely when calculating link_clks_available/samples_room (would require a huge hblank though), and potentially when calculating hblank_rise (not sure how large link_clks_active can get). It looks like we're still not calculating exactly what the spec says since we truncate tu_data and tu_line early. But I'm too lazy to figure out if we could avoid that. v2: Fix typo in commit msg (Uma) Remove ROUNDING_FACTOR define (Uma) s/5link_clk+5cdclk/5*(link_clk+cdclk)/ (Chris) Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-3-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Uma Shankar <uma.shankar@intel.com>	2020-05-04 18:44:53 +03:00
Ville Syrjälä	41ee86d6ee	drm/i915: Rename variables to be consistent with bspec Since the code seems insistent on using the variable names from the bspec formulat, let's be consistent and use those names for all the things. For some reason 'link_clk' and 'lanes' were left out in the code until now. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-2-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com>	2020-05-04 18:44:53 +03:00
Ville Syrjälä	d19b29be65	drm/i915: Nuke mode.vrefresh usage mode.vrefresh is rounded to the nearest integer. You don't want to use it anywhere that requires precision. Also I want to nuke it. vtotalvrefresh == 1000clock/htotal, so let's use the latter. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Uma Shankar <uma.shankar@intel.com>	2020-05-04 18:44:52 +03:00
Ville Syrjälä	dab3aff7b1	drm/i915: Remove cnl pre-prod workarounds Remove all the stepping dependent cnl workarounds. Bspec lists more steppings than this so presumably these are classed as pre-production. And this is cnl after all so no one should really care anyway. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430125822.21985-2-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2020-05-04 18:44:52 +03:00
Ville Syrjälä	25444ca6cb	drm/i915/fbc: Require linear fb stride to be multiple of 512 bytes on gen9/glk Display WA #1105 says that FBC requires PLANE_STRIDE to be a multiple of 512 bytes on gen9 and glk. This is definitely true for glk as certain tests (such as igt/kms_big_fb/linear-16bpp-rotate-0) are now failing when the display resolution results in a plane stride which is not a multiple of 512 bytes. Curiously I was not able to reproduce this on a KBL. First I suspected that our use of the FBC override stride explain this, but after trying to use the override stride on glk the test still failed. I did try both the old CHICKEN_MISC_4 way and the new FBC_STRIDE way, neither had any effect on the result. Anyways, we need this at least on glk. But let's trust the spec and apply the w/a for all gen9 as well, despite being unable to reproduce the problem. v2: s/FBC_CHICKEN/FBC_STRIDE/ in commit msg Cc: José Roberto de Souza <jose.souza@intel.com> Fixes: `691f7ba58d` ("drm/i915/display/fbc: Make fences a nice-to-have for GEN9+") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-2-ville.syrjala@linux.intel.com Reviewed-by: Matt Roper <matthew.d.roper@intel.com>	2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy	9ff79708c5	drm/i915: Rename bw_state to new_bw_state That is a preparation patch before next one where we introduce old_bw_state and a bunch of other changes as well. In a review comment it was suggested to split out at least that renaming into a separate patch, what is done here. v2: Removed spurious space Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423075902.21892-8-stanislav.lisovskiy@intel.com	2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy	ecab0f3d05	drm/i915: Track active_pipes in bw_state We need to calculate SAGV mask also in a non-modeset commit, however currently active_pipes are only calculated for modesets in global atomic state, thus now we will be tracking those also in bw_state in order to be able to properly access global data. v2: - Removed pre/post plane SAGV updates from modeset(Ville) - Now tracking active pipes in intel_can_enable_sagv(Ville) v3: - lock global state if active_pipes change as well(Ville) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430195634.7666-1-stanislav.lisovskiy@intel.com	2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy	9728889f42	drm/i915: Use bw state for per crtc SAGV evaluation Future platforms require per-crtc SAGV evaluation and serializing global state when those are changed from different commits. v2: - Add has_sagv check to intel_crtc_can_enable_sagv so that it sets bit in reject mask. - Use bw_state in intel_pre/post_plane_enable_sagv instead of atomic state v3: - Fixed rebase conflict, now using intel_atomic_crtc_state_for_each_plane_state in order to call it from atomic check v4: - Use fb modifier from plane state v5: - Make intel_has_sagv static again(Ville) - Removed unnecessary NULL assignments(Ville) - Removed unnecessary SAGV debug(Ville) - Call intel_compute_sagv_mask only for modesets(Ville) - Serialize global state only if sagv results change, but not mask itself(Ville) v6: - use lock global state instead of serialize(Ville) v7: - use both global state lock and serialize depending on if we need to change only global state or access hw (Ville) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430191757.18206-1-stanislav.lisovskiy@intel.com	2020-05-04 18:44:52 +03:00
Chris Wilson	e3d291301f	drm/i915/gem: Implement legacy MI_STORE_DATA_IMM The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk	2020-05-04 15:15:04 +01:00
Chris Wilson	f5b62bdbb6	drm/i915/gem: Specify address type for chained reloc batches It is required that a chained batch be in the same address domain as its parent, and also that must be specified in the command for earlier gen as it is not inferred from the chaining until gen6. Fixes: `964a9b0f61` ("drm/i915/gem: Use chained reloc batches") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk	2020-05-04 14:28:48 +01:00
Chris Wilson	378974f7f9	drm/i915: Allow some leniency in PCU reads Extend the timeout for pcode reads to 20ms as they should not be performed along critical paths, and succeeding after a short delay is better than failing entirely. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1800 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-1-chris@chris-wilson.co.uk	2020-05-04 11:12:37 +01:00
Chris Wilson	6983dafa31	drm/i915/gem: Lazily acquire the device wakeref for freeing objects We only need the device wakeref on freeing the objects if we have to unbind the object from the global GTT, or otherwise update device information. If the objects are clean, we never need the wakeref, so avoid taking until required. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk	2020-05-04 11:12:37 +01:00
Chris Wilson	389b7f00c7	drm/i915/gt: Sanitize RPS interrupts upon resume Currently we clear and disable the RPS pm interrupts on module load, and presume that they remain disabled forevermore. However, the mask is cleared on suspend and so after resume they may start showing up again unexepectedly. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1811 Fixes: `8e99299a04` ("drm/i915/gt: Track use of RPS interrupts in flags") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200502173512.32353-1-chris@chris-wilson.co.uk	2020-05-03 08:24:36 +01:00
Chris Wilson	6f576d6277	drm/i915/gem: Try an alternate engine for relocations If at first we don't succeed, try try again. Not all engines may support the MI ops we need to perform asynchronous relocation patching, and so we end up falling back to a synchronous operation that has a liability of blocking. However, Tvrtko pointed out we don't need to use the same engine to perform the relocations as we are planning to execute the execbuf on, and so if we switch over to a working engine, we can perform the relocation asynchronously. The user execbuf will be queued after the relocations by virtue of fencing. This patch creates a new context per execbuf requiring asynchronous relocations on an unusable engines. This is perhaps a bit excessive and can be ameliorated by a small context cache, but for the moment we only need it for working around a little used engine on Sandybridge, and only if relocations are actually required to an active batch buffer. Now we just need to teach the relocation code to handle physical addressing for gen2/3, and we should then have universal support! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_reloc/basic-spin # snb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk	2020-05-01 22:56:16 +01:00
Chris Wilson	0e97fbb080	drm/i915/gem: Use a single chained reloc batches for a single execbuf As we can now keep chaining together a relocation batch to process any number of relocations, we can keep building that relocation batch for all of the target vma. This avoiding emitting a new request into the ring for each target, consuming precious ring space and a potential stall. v2: Propagate the failure from submitting the relocation batch. Testcase: igt/gem_exec_reloc/basic-wide-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk	2020-05-01 22:56:15 +01:00
Chris Wilson	964a9b0f61	drm/i915/gem: Use chained reloc batches The ring is a precious resource: we anticipate to only use a few hundred bytes for a request, and only try to reserve that before we start. If we go beyond our guess in building the request, then instead of waiting at the start of execbuf before we hold any locks or other resources, we may trigger a wait inside a critical region. One example is in using gpu relocations, where currently we emit a new MI_BB_START from the ring every time we overflow a page of relocation entries. However, instead of insert the command into the precious ring, we can chain the next page of relocation entries as MI_BB_START from the end of the previous. v2: Delay the emit_bb_start until after all the chained vma synchronisation is complete. Since the buffer pool batches are idle, this _should_ be a no-op, but one day we may some fancy async GPU bindings for new vma! v3: Use pool/batch consitently, once we start thinking in terms of the batch vma, use batch->obj. v4: Explain the magic number 4. Tvrtko spotted that we lose propagation of the error for failing to submit the relocation request; that's easier to fix up in the next patch. Testcase: igt/gem_exec_reloc/basic-many-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk	2020-05-01 22:56:15 +01:00
Chris Wilson	9f909e215f	drm/i915: Implement vm_ops->access for gdb access into mmaps gdb uses ptrace() to peek and poke bytes of the target's address space. The driver must implement an vm_ops->access() handler or else gdb will be unable to inspect the pointer and report it as out-of-bounds. Worse than useless as it causes immediate suspicion of the valid GTT pointer, distracting the poor programmer trying to find his bug. v2: Write-protect readonly objects (Matthew). Testcase: igt/gem_mmap_gtt/ptrace Testcase: igt/gem_mmap_offset/ptrace Suggested-by: Kristian H. Kristensen <hoegsberg@google.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501145120.18830-1-chris@chris-wilson.co.uk	2020-05-01 17:30:47 +01:00
Chris Wilson	a211da9c77	drm/i915/gt: Make timeslicing an explicit engine property In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `8ee36e048c` ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk	2020-05-01 15:17:33 +01:00
Chris Wilson	3b55cdeb8f	drm/i915/pmu: Keep a reference to module while active While a perf event is open, keep a reference to the module so we don't remove the driver internals mid-sampling. Testcase: igt/perf_pmu/module-unload Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430183324.23984-1-chris@chris-wilson.co.uk	2020-05-01 09:24:34 +01:00
Chris Wilson	16e8745967	drm/i915/gt: Move the batch buffer pool from the engine to the gt Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk	2020-04-30 19:12:02 +01:00
Joonas Lahtinen	230982d8d8	drm/i915: Update DRIVER_DATE to 20200430 Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-04-30 11:13:21 +03:00
Joonas Lahtinen	8b46ed57f3	Merge tag 'gvt-next-2020-04-22' of https://github.com/intel/gvt-linux into drm-intel-next-queued gvt-next-2020-04-22 - remove non-upstream xen support bits (Christoph) - guest context shadow copy optimization (Yan) - guest context tracking for shadow skip optimization (Yan) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> From: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422051230.GH11247@zhen-hp.sh.intel.com	2020-04-30 10:53:21 +03:00
Zbigniew Kempczyński	79eb8c7f01	drm/i915/selftests: Add tiled blits selftest Extend coverage of the blitter client by exercising conversion to and from tiled sources. In the process we perform spot checks to verify that the tiling/detiling is being applied correctly, along with position invariance of the tiling parameters. Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200430064957.14942-1-chris@chris-wilson.co.uk	2020-04-30 08:31:12 +01:00
Chris Wilson	de3b4d9361	drm/i915/gt: Restore aggressive post-boost downclocking We reduced the clocks slowly after a boost event based on the observation that the smoothness of animations suffered. However, since reducing the evalution intervals, we should be able to respond to the rapidly fluctuating workload of a simple desktop animation and so restore the more aggressive downclocking. References: `2a8862d2f3` ("drm/i915: Reduce the RPS shock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk	2020-04-30 00:57:38 +01:00
Chris Wilson	3f88dde6ee	drm/i915/gt: Apply the aggressive downclocking to parking We treat parking as a manual RPS timeout event, and downclock the GPU for the next unpark and batch execution. However, having restored the aggressive downclocking and observed that we have very light workloads whose only interaction is through the manual parking events, carry over the aggressive downclocking to the fake RPS events. References: `21abf0bf16` ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk	2020-04-30 00:57:37 +01:00

1 2 3 4 5 ...

915709 Commits