Commit Graph

915709 Commits

Author SHA1 Message Date
Chris Wilson
f4d49692ad drm/i915/gt: Mark up the racy read of execlists->context_tag
Since we are using bitops on context_tag to allow us to reserve and
release inflight tags concurrently, the scan for the next bit is
intentionally racy.

[  516.446854] BUG: KCSAN: data-race in execlists_schedule_in.isra.0 [i915] / execlists_schedule_out [i915]
[  516.446874]
[  516.446886] write (marked) to 0xffff8881f7644048 of 8 bytes by interrupt on cpu 2:
[  516.447076]  execlists_schedule_out+0x538/0x6a0 [i915]
[  516.447263]  process_csb+0x10b/0x3d0 [i915]
[  516.447449]  execlists_submission_tasklet+0x30/0x170 [i915]
[  516.447468]  tasklet_action_common.isra.0+0x42/0x90
[  516.447484]  __do_softirq+0xc8/0x206
[  516.447498]  irq_exit+0xcd/0xe0
[  516.447516]  do_IRQ+0x44/0xc0
[  516.447535]  ret_from_intr+0x0/0x1c
[  516.447550]  cpuidle_enter_state+0x199/0x400
[  516.447572]  cpuidle_enter+0x50/0x90
[  516.447587]  do_idle+0x197/0x1e0
[  516.447600]  cpu_startup_entry+0x14/0x20
[  516.447619]  start_secondary+0xf9/0x130
[  516.447643]  secondary_startup_64+0xa4/0xb0
[  516.447655]
[  516.447671] read to 0xffff8881f7644048 of 8 bytes by task 460 on cpu 1:
[  516.447863]  execlists_schedule_in.isra.0+0x3cf/0x5a0 [i915]
[  516.448064]  execlists_dequeue+0xf8f/0x1690 [i915]
[  516.448252]  __execlists_submission_tasklet+0x48/0x60 [i915]
[  516.448440]  execlists_submit_request+0x2e2/0x310 [i915]
[  516.448634]  submit_notify+0x8f/0xc8 [i915]
[  516.448820]  __i915_sw_fence_complete+0x61/0x420 [i915]
[  516.449005]  i915_sw_fence_complete+0x58/0x80 [i915]
[  516.449208]  i915_sw_fence_commit+0x16/0x20 [i915]
[  516.449399]  __i915_request_queue+0x60/0x70 [i915]
[  516.449590]  i915_gem_do_execbuffer+0x33f1/0x4a00 [i915]
[  516.449782]  i915_gem_execbuffer2_ioctl+0x2a2/0x550 [i915]
[  516.449800]  drm_ioctl_kernel+0xe9/0x130
[  516.449814]  drm_ioctl+0x27d/0x45e
[  516.449827]  ksys_ioctl+0x89/0xb0
[  516.449842]  __x64_sys_ioctl+0x42/0x60
[  516.449864]  do_syscall_64+0x6e/0x2c0
[  516.449878]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200511075722.13483-1-chris@chris-wilson.co.uk
2020-05-11 11:01:12 +01:00
Gustavo A. R. Silva
f1e79c7e18 drm/i915: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507185408.GA14561@embeddedor
2020-05-09 12:59:23 +01:00
Chris Wilson
16dc224f1c drm/i915: Replace the hardcoded I915_FENCE_TIMEOUT
Expose the hardcoded timeout for unsignaled foreign fences as a Kconfig
option, primarily to allow brave systems to disable the timeout and
solely rely on correct signaling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200509105021.12542-1-chris@chris-wilson.co.uk
2020-05-09 12:57:57 +01:00
Chris Wilson
fcae496153 drm/i915: Prevent using semaphores to chain up to external fences
The downside of using semaphores is that we lose metadata passing
along the signaling chain. This is particularly nasty when we
need to pass along a fatal error such as EFAULT or EDEADLK. For
fatal errors we want to scrub the request before it is executed,
which means that we cannot preload the request onto HW and have
it wait upon a semaphore.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-3-chris@chris-wilson.co.uk
2020-05-08 21:02:33 +01:00
Lionel Landwerlin
3136deb7ba drm/i915: Peel dma-fence-chains for await
To allow faster engine to engine synchronization, peel the layer of
dma-fence-chain to expose potential i915 fences so that the
i915_request code can emit HW semaphore wait/signal operations in the
ring which is faster than waking up the host to submit unblocked
workloads after interrupt notification.

This is similar to the peeling we do for e.g. dma_fence_array.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508185448.29709-1-chris@chris-wilson.co.uk
2020-05-08 20:57:56 +01:00
Chris Wilson
e41627db6f drm/i915/gt: Improve precision on defer_request assert
The kernel_context does not use initial-breadcrumbs, so when we ask if
its requests have started we do so by comparing against the completion
seqno of the previous request. This is very imprecise, not precise
enough for the defer_request assertion.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1847
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508104220.9872-1-chris@chris-wilson.co.uk
2020-05-08 15:09:11 +01:00
Chris Wilson
ac938052e5 drm/i915: Pull waiting on an external dma-fence into its routine
As a means for a small code consolidation, but primarily to start
thinking more carefully about internal-vs-external linkage, pull the
pair of i915_sw_fence_await_dma_fence() calls into a common routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-2-chris@chris-wilson.co.uk
2020-05-08 12:39:05 +01:00
Chris Wilson
2045d666ae drm/i915: Ignore submit-fences on the same timeline
While we ordinarily do not skip submit-fences due to the accompanying
hook that we want to callback on execution, a submit-fence on the same
timeline is meaningless.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-1-chris@chris-wilson.co.uk
2020-05-08 12:38:54 +01:00
Mika Kuoppala
972282c4cf drm/i915/gen12: Add aux table invalidate for all engines
All engines, exception being blitter as it does not
care about the form, can access compressed surfaces.

So we need to add forced aux table invalidates
for those engines.

v2: virtual instance masking (Chris)
v3: bug on if not found (Chris)

References: d248b371f7 ("drm/i915/gen12: Invalidate aux table entries forcibly")
References bspec#43904, hsdes#1809175790
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507142045.8668-1-mika.kuoppala@linux.intel.com
2020-05-07 20:18:28 +01:00
Chris Wilson
eec39e441c drm/i915: Remove wait priority boosting
Upon waiting a request (when asked), we gave that request a small
priority boost, not enough for it to cause preemption, but enough for it
to be scheduled next before all equals. We also used that bit to give
new clients a small priority boost, similar to FQ_CODEL, such that we
favoured short interactive tasks ahead of long running streams.

However, this is causing lots of complications with timeslicing where we
both want to honour the boost and yet ignore it. Those complications
cause unexpected user behaviour (tasks not being timesliced and run
concurrently as epxected), and the easiest way to resolve that is to
remove the boost. Hopefully, we can find a compromise again if we need
to, but in theory timeslicing itself and future more advanced schedulers
should give us the interactivity boost we seek.

Testcase: igt/gem_exec_schedule/lateslice
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk
2020-05-07 20:08:58 +01:00
Chris Wilson
6b6cd2ebd8 drm/i915: Mark concurrent submissions with a weak-dependency
We recorded the dependencies for WAIT_FOR_SUBMIT in order that we could
correctly perform priority inheritance from the parallel branches to the
common trunk. However, for the purpose of timeslicing and reset
handling, the dependency is weak -- as we the pair of requests are
allowed to run in parallel and not in strict succession.

The real significance though is that this allows us to rearrange
groups of WAIT_FOR_SUBMIT linked requests along the single engine, and
so can resolve user level inter-batch scheduling dependencies from user
semaphores.

Fixes: c81471f5e9 ("drm/i915: Copy across scheduler behaviour flags across submit fences")
Testcase: igt/gem_exec_fence/submit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507155109.8892-1-chris@chris-wilson.co.uk
2020-05-07 19:49:21 +01:00
Mika Kuoppala
d248b371f7 drm/i915/gen12: Invalidate aux table entries forcibly
Aux table invalidation can fail on update. So
next access may cause memory access to be into stale entry.

Proposed workaround is to invalidate entries between
all batchbuffers.

v2: correct register address (Yang)
v3: respect the order (Chris)

References bspec#43904, hsdes#1809175790
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Yang A Shi <yang.a.shi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506165310.1239-1-mika.kuoppala@linux.intel.com
2020-05-07 07:44:42 +01:00
Mika Kuoppala
0c7c0c8e6f drm/i915/gen12: Flush L3
Flush TDL,L3 and EUs

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-3-mika.kuoppala@linux.intel.com
2020-05-07 07:44:41 +01:00
Mika Kuoppala
32d7171ee2 drm/i915/gen12: Fix HDC pipeline flush
HDC pipeline flush is bit on the first dword of
the PIPE_CONTROL, not the second. Make it so.

v2: function naming (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-2-mika.kuoppala@linux.intel.com
2020-05-07 07:44:41 +01:00
Mika Kuoppala
f02ac414ba Revert "drm/i915/tgl: Include ro parts of l3 to invalidate"
This reverts commit 62037ffff2.

L3 ro cache invalidation is part of the dword0 of pipe
control. Also it is not relevant to this gen.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-1-mika.kuoppala@linux.intel.com
2020-05-07 07:44:40 +01:00
Chris Wilson
24fe5f2ab2 drm/i915: Propagate error from completed fences
We need to preserve fatal errors from fences that are being terminated
as we hook them up.

Fixes: ef46884975 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506162136.3325-1-chris@chris-wilson.co.uk
2020-05-06 18:23:14 +01:00
Matt Roper
9b2383a7ac drm/i915/icp: Add Wa_14010685332
We need to toggle a SDE chicken bit on and then off as the final
step when disabling interrupts in preparation for runtime suspend.

Bspec: 33450
Bspec: 8402
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501213701.371443-1-matthew.d.roper@intel.com
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
2020-05-05 14:26:46 -07:00
Chris Wilson
977253df64 drm/i915/gt: Stop holding onto the pinned_default_state
As we only restore the default context state upon banning a context, we
only need enough of the state to run the ring and nothing more. That is
we only need our bare protocontext.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk
2020-05-05 21:12:33 +01:00
Chris Wilson
b68be5c623 drm/i915/execlists: Record the active CCID from before reset
If we cannot trust the reset will flush out the CS event queue such that
process_csb() reports an accurate view of HW, we will need to search the
active and pending contexts to determine which was actually running at
the time we issued the reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200505084629.31365-1-chris@chris-wilson.co.uk
2020-05-05 12:05:40 +01:00
Stanislav Lisovskiy
f136c58a0d drm/i915: Added required new PCode commands
We need a new PCode request commands and reply codes
to be added as a prepartion patch for QGV points
restricting for new SAGV support.

v2: - Extracted those changes into separate patch
      (Ville Syrjälä)

v3: - Moved new PCode masks to another place from
      PCode commands(Ville)

v4: - Moved new PCode masks to correspondent PCode
      command, with identation(Ville)
    - Changed naming to ICL_ instead of GEN11_
      to fit more nicely into existing definition
      style.

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200505102247.32452-5-stanislav.lisovskiy@intel.com
2020-05-05 13:59:55 +03:00
Imre Deak
054318c7e3 drm/i915/tgl+: Fix interrupt handling for DP AUX transactions
Unmask/enable AUX interrupts on all ports on TGL+. So far the interrupts
worked only on port A, which meant each transaction on other ports took
10ms.

Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504075828.20348-1-imre.deak@intel.com
2020-05-05 11:59:48 +03:00
Chris Wilson
25fd6de315 drm/i915/gt: Small tidy of gen8+ breadcrumb emission
Use a local to shrink a line under 80 columns, and refactor the common
emit_xcs_breadcrumb() wrapper of ggtt-write.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504180507.6017-1-chris@chris-wilson.co.uk
2020-05-05 09:16:59 +01:00
Chris Wilson
8757797ff9 drm/i915/selftests: Repeat the rps clock frequency measurement
Repeat the measurement of the clock frequency a few times and use the
median to try and reduce the systematic measurement error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-6-chris@chris-wilson.co.uk
2020-05-04 18:21:28 +01:00
Chris Wilson
0065e5f5cc drm/i915/display: Warn if the FBC is still writing to stolen on removal
If the FBC is still writing into stolen, it will overwrite any future
users of that stolen region. Check before release, just to ease any
concerns -- we can remove it again later if it is barking up the wrong
tree.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1635
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200503180034.20010-1-chris@chris-wilson.co.uk
2020-05-04 17:11:51 +01:00
Sultan Alsawaf
690d22dafa drm/i915: Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled
In commit 5a7d202b15, a logical AND was erroneously changed to an OR,
causing WaIncreaseLatencyIPCEnabled to be enabled unconditionally for
kabylake and coffeelake, even when IPC is disabled. Fix the logic so
that WaIncreaseLatencyIPCEnabled is only used when IPC is enabled.

Fixes: 5a7d202b15 ("drm/i915: Drop WaIncreaseLatencyIPCEnabled/1140 for cnl")
Cc: stable@vger.kernel.org # 5.3.x+
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430214654.51314-1-sultan@kerneltoast.com
2020-05-04 18:55:41 +03:00
Ville Syrjälä
2dd43144e8 drm/i915: Streamline the artihmetic
All these ROUNDING_FACTORs and whatnot are making this thing hard to
read. Get rid of them. And let's massage some of the fractions to
give us less questionable intermediate results and perhaps less
divisions.

Also looks like a good helping of 64bit math stuff is needed to
avoid some of overflows present in the current code. There
might still be a few overflows, namely when calculating
link_clks_available/samples_room (would require a huge hblank
though), and potentially when calculating hblank_rise (not sure
how large link_clks_active can get).

It looks like we're still not calculating exactly what the spec says
since we truncate tu_data and tu_line early. But I'm too lazy to
figure out if we could avoid that.

v2: Fix typo in commit msg (Uma)
    Remove ROUNDING_FACTOR define (Uma)
    s/5*link_clk+5*cdclk/5*(link_clk+cdclk)/ (Chris)

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-3-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
2020-05-04 18:44:53 +03:00
Ville Syrjälä
41ee86d6ee drm/i915: Rename variables to be consistent with bspec
Since the code seems insistent on using the variable names from the
bspec formulat, let's be consistent and use those names for all
the things. For some reason 'link_clk' and 'lanes' were left out
in the code until now.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
2020-05-04 18:44:53 +03:00
Ville Syrjälä
d19b29be65 drm/i915: Nuke mode.vrefresh usage
mode.vrefresh is rounded to the nearest integer. You don't want to use
it anywhere that requires precision. Also I want to nuke it.
vtotal*vrefresh == 1000*clock/htotal, so let's use the latter.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
2020-05-04 18:44:52 +03:00
Ville Syrjälä
dab3aff7b1 drm/i915: Remove cnl pre-prod workarounds
Remove all the stepping dependent cnl workarounds. Bspec lists
more steppings than this so presumably these are classed as
pre-production. And this is cnl after all so no one should
really care anyway.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430125822.21985-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-05-04 18:44:52 +03:00
Ville Syrjälä
25444ca6cb drm/i915/fbc: Require linear fb stride to be multiple of 512 bytes on gen9/glk
Display WA #1105 says that FBC requires PLANE_STRIDE to be a multiple
of 512 bytes on gen9 and glk.

This is definitely true for glk as certain tests (such as
igt/kms_big_fb/linear-16bpp-rotate-0) are now failing when the
display resolution results in a plane stride which is not a
multiple of 512 bytes.

Curiously I was not able to reproduce this on a KBL. First I
suspected that our use of the FBC override stride explain this,
but after trying to use the override stride on glk the test
still failed. I did try both the old CHICKEN_MISC_4 way and
the new FBC_STRIDE way, neither had any effect on the result.

Anyways, we need this at least on glk. But let's trust the spec
and apply the w/a for all gen9 as well, despite being unable to
reproduce the problem.

v2: s/FBC_CHICKEN/FBC_STRIDE/ in commit msg

Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: 691f7ba58d ("drm/i915/display/fbc: Make fences a nice-to-have for GEN9+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-2-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy
9ff79708c5 drm/i915: Rename bw_state to new_bw_state
That is a preparation patch before next one where we
introduce old_bw_state and a bunch of other changes
as well.
In a review comment it was suggested to split out
at least that renaming into a separate patch, what
is done here.

v2: Removed spurious space

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423075902.21892-8-stanislav.lisovskiy@intel.com
2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy
ecab0f3d05 drm/i915: Track active_pipes in bw_state
We need to calculate SAGV mask also in a non-modeset
commit, however currently active_pipes are only calculated
for modesets in global atomic state, thus now we will be
tracking those also in bw_state in order to be able to
properly access global data.

v2: - Removed pre/post plane SAGV updates from modeset(Ville)
    - Now tracking active pipes in intel_can_enable_sagv(Ville)

v3: - lock global state if active_pipes change as well(Ville)

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430195634.7666-1-stanislav.lisovskiy@intel.com
2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy
9728889f42 drm/i915: Use bw state for per crtc SAGV evaluation
Future platforms require per-crtc SAGV evaluation
and serializing global state when those are changed
from different commits.

v2: - Add has_sagv check to intel_crtc_can_enable_sagv
      so that it sets bit in reject mask.
    - Use bw_state in intel_pre/post_plane_enable_sagv
      instead of atomic state

v3: - Fixed rebase conflict, now using
      intel_atomic_crtc_state_for_each_plane_state in
      order to call it from atomic check
v4: - Use fb modifier from plane state

v5: - Make intel_has_sagv static again(Ville)
    - Removed unnecessary NULL assignments(Ville)
    - Removed unnecessary SAGV debug(Ville)
    - Call intel_compute_sagv_mask only for modesets(Ville)
    - Serialize global state only if sagv results change, but
      not mask itself(Ville)

v6: - use lock global state instead of serialize(Ville)
v7: - use both global state lock and serialize depending on
      if we need to change only global state or access hw
      (Ville)

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Ville Syrjälä <ville.syrjala@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430191757.18206-1-stanislav.lisovskiy@intel.com
2020-05-04 18:44:52 +03:00
Chris Wilson
e3d291301f drm/i915/gem: Implement legacy MI_STORE_DATA_IMM
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but
left them writing to a physical address. The notes suggest that the
primary reason would be so that the writes were cache coherent, as the
CPU cache uses physical tagging. As such we did not implement the
legacy variant of MI_STORE_DATA_IMM and so left all the relocations
synchronous -- but with a small function to convert from the vma address
into the physical address, we can implement asynchronous relocs on these
older arches, fixing up a few tests that require them.

In order to be able to test the legacy paths, refactor the gpu
relocations so that we can hook them up to a selftest.

v2: Use an array of offsets not enum labels for the selftest
v3: Refactor the common igt_hexdump()

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
2020-05-04 15:15:04 +01:00
Chris Wilson
f5b62bdbb6 drm/i915/gem: Specify address type for chained reloc batches
It is required that a chained batch be in the same address domain as its
parent, and also that must be specified in the command for earlier gen
as it is not inferred from the chaining until gen6.

Fixes: 964a9b0f61 ("drm/i915/gem: Use chained reloc batches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
2020-05-04 14:28:48 +01:00
Chris Wilson
378974f7f9 drm/i915: Allow some leniency in PCU reads
Extend the timeout for pcode reads to 20ms as they should not be
performed along critical paths, and succeeding after a short delay is
better than failing entirely.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1800
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-1-chris@chris-wilson.co.uk
2020-05-04 11:12:37 +01:00
Chris Wilson
6983dafa31 drm/i915/gem: Lazily acquire the device wakeref for freeing objects
We only need the device wakeref on freeing the objects if we have to
unbind the object from the global GTT, or otherwise update device
information. If the objects are clean, we never need the wakeref, so
avoid taking until required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk
2020-05-04 11:12:37 +01:00
Chris Wilson
389b7f00c7 drm/i915/gt: Sanitize RPS interrupts upon resume
Currently we clear and disable the RPS pm interrupts on module load, and
presume that they remain disabled forevermore. However, the mask is
cleared on suspend and so after resume they may start showing up again
unexepectedly.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1811
Fixes: 8e99299a04 ("drm/i915/gt: Track use of RPS interrupts in flags")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi@etezian.org>
Reviewed-by: Andi Shyti <andi@etezian.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200502173512.32353-1-chris@chris-wilson.co.uk
2020-05-03 08:24:36 +01:00
Chris Wilson
6f576d6277 drm/i915/gem: Try an alternate engine for relocations
If at first we don't succeed, try try again.

Not all engines may support the MI ops we need to perform asynchronous
relocation patching, and so we end up falling back to a synchronous
operation that has a liability of blocking. However, Tvrtko pointed out
we don't need to use the same engine to perform the relocations as we
are planning to execute the execbuf on, and so if we switch over to a
working engine, we can perform the relocation asynchronously. The user
execbuf will be queued after the relocations by virtue of fencing.

This patch creates a new context per execbuf requiring asynchronous
relocations on an unusable engines. This is perhaps a bit excessive and
can be ameliorated by a small context cache, but for the moment we only
need it for working around a little used engine on Sandybridge, and only
if relocations are actually required to an active batch buffer.

Now we just need to teach the relocation code to handle physical
addressing for gen2/3, and we should then have universal support!

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_reloc/basic-spin # snb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk
2020-05-01 22:56:16 +01:00
Chris Wilson
0e97fbb080 drm/i915/gem: Use a single chained reloc batches for a single execbuf
As we can now keep chaining together a relocation batch to process any
number of relocations, we can keep building that relocation batch for
all of the target vma. This avoiding emitting a new request into the
ring for each target, consuming precious ring space and a potential
stall.

v2: Propagate the failure from submitting the relocation batch.

Testcase: igt/gem_exec_reloc/basic-wide-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk
2020-05-01 22:56:15 +01:00
Chris Wilson
964a9b0f61 drm/i915/gem: Use chained reloc batches
The ring is a precious resource: we anticipate to only use a few hundred
bytes for a request, and only try to reserve that before we start. If we
go beyond our guess in building the request, then instead of waiting at
the start of execbuf before we hold any locks or other resources, we
may trigger a wait inside a critical region. One example is in using gpu
relocations, where currently we emit a new MI_BB_START from the ring
every time we overflow a page of relocation entries. However, instead of
insert the command into the precious ring, we can chain the next page of
relocation entries as MI_BB_START from the end of the previous.

v2: Delay the emit_bb_start until after all the chained vma
synchronisation is complete. Since the buffer pool batches are idle, this
_should_ be a no-op, but one day we may some fancy async GPU bindings
for new vma!

v3: Use pool/batch consitently, once we start thinking in terms of the
batch vma, use batch->obj.
v4: Explain the magic number 4.

Tvrtko spotted that we lose propagation of the error for failing to
submit the relocation request; that's easier to fix up in the next
patch.

Testcase: igt/gem_exec_reloc/basic-many-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk
2020-05-01 22:56:15 +01:00
Chris Wilson
9f909e215f drm/i915: Implement vm_ops->access for gdb access into mmaps
gdb uses ptrace() to peek and poke bytes of the target's address space.
The driver must implement an vm_ops->access() handler or else gdb will
be unable to inspect the pointer and report it as out-of-bounds.
Worse than useless as it causes immediate suspicion of the valid GTT
pointer, distracting the poor programmer trying to find his bug.

v2: Write-protect readonly objects (Matthew).

Testcase: igt/gem_mmap_gtt/ptrace
Testcase: igt/gem_mmap_offset/ptrace
Suggested-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501145120.18830-1-chris@chris-wilson.co.uk
2020-05-01 17:30:47 +01:00
Chris Wilson
a211da9c77 drm/i915/gt: Make timeslicing an explicit engine property
In order to allow userspace to rely on timeslicing to reorder their
batches, we must support preemption of those user batches. Declare
timeslicing as an explicit property that is a combination of having the
kernel support and HW support.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk
2020-05-01 15:17:33 +01:00
Chris Wilson
3b55cdeb8f drm/i915/pmu: Keep a reference to module while active
While a perf event is open, keep a reference to the module so we don't
remove the driver internals mid-sampling.

Testcase: igt/perf_pmu/module-unload
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430183324.23984-1-chris@chris-wilson.co.uk
2020-05-01 09:24:34 +01:00
Chris Wilson
16e8745967 drm/i915/gt: Move the batch buffer pool from the engine to the gt
Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.

v2: Hook up to debugfs/drop_caches to clear the cache on demand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
2020-04-30 19:12:02 +01:00
Joonas Lahtinen
230982d8d8 drm/i915: Update DRIVER_DATE to 20200430
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-04-30 11:13:21 +03:00
Joonas Lahtinen
8b46ed57f3 Merge tag 'gvt-next-2020-04-22' of https://github.com/intel/gvt-linux into drm-intel-next-queued
gvt-next-2020-04-22

- remove non-upstream xen support bits (Christoph)
- guest context shadow copy optimization (Yan)
- guest context tracking for shadow skip optimization (Yan)

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422051230.GH11247@zhen-hp.sh.intel.com
2020-04-30 10:53:21 +03:00
Zbigniew Kempczyński
79eb8c7f01 drm/i915/selftests: Add tiled blits selftest
Extend coverage of the blitter client by exercising conversion to and
from tiled sources. In the process we perform spot checks to verify that
the tiling/detiling is being applied correctly, along with position
invariance of the tiling parameters.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430064957.14942-1-chris@chris-wilson.co.uk
2020-04-30 08:31:12 +01:00
Chris Wilson
de3b4d9361 drm/i915/gt: Restore aggressive post-boost downclocking
We reduced the clocks slowly after a boost event based on the
observation that the smoothness of animations suffered. However, since
reducing the evalution intervals, we should be able to respond to the
rapidly fluctuating workload of a simple desktop animation and so
restore the more aggressive downclocking.

References: 2a8862d2f3 ("drm/i915: Reduce the RPS shock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk
2020-04-30 00:57:38 +01:00
Chris Wilson
3f88dde6ee drm/i915/gt: Apply the aggressive downclocking to parking
We treat parking as a manual RPS timeout event, and downclock the GPU
for the next unpark and batch execution. However, having restored the
aggressive downclocking and observed that we have very light workloads
whose only interaction is through the manual parking events, carry over
the aggressive downclocking to the fake RPS events.

References: 21abf0bf16 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk
2020-04-30 00:57:37 +01:00