Commit Graph

78437 Commits

Author SHA1 Message Date
Tvrtko Ursulin
74388ca483 drm/i915: Use Transparent Hugepages when IOMMU is enabled
Usage of Transparent Hugepages was disabled in 9987da4b5d
("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it
appears majority of performance regressions reported with an enabled IOMMU
can be almost eliminated by turning them on, lets just do that.

To err on the side of safety we keep the current default in cases where
IOMMU is not active, and only when it is default to the "huge=within_size"
mode. Although there probably would be wins to enable them throughout,
more extensive testing across benchmarks and platforms would need to be
done.

With the patch and IOMMU enabled my local testing on a small Skylake part
shows OglVSTangent regression being reduced from ~14% (IOMMU on versus
IOMMU off) to ~2% (same comparison but with THP on).

More detailed testing done in the below referenced Gitlab issue by Eero:

Skylake GT4e:

Performance drops from enabling IOMMU:

    30-35% SynMark CSDof
    20-25% Unigine Heaven, MemBW GPU write, SynMark VSTangent
    ~20% GLB Egypt  (1/2 screen window)
    10-15% GLB T-Rex (1/2 screen window)
    8-10% GfxBench T-Rex, MemBW GPU blit
    7-8% SynMark DeferredAA + TerrainFly* + ZBuffer
    6-7% GfxBench Manhattan 3.0 + 3.1, SynMark TexMem128 & CSCloth
    5-6% GfxBench CarChase, Unigine Valley
    3-5% GfxBench Vulkan & GL AztecRuins + ALU2, MemBW GPU texture,
         SynMark Fill*, Deferred, TerrainPan*
    1-2% Most of the other tests

With the patch drops become:

    20-25% SynMark TexMem*
    15-20% GLB Egypt (1/2 screen window)
    10-15% GLB T-Rex (1/2 screen window)
    4-7% GfxBench T-Rex, GpuTest Triangle
    1-8% GfxBench ALU2 (offscreen 1%, onscreen 8%)
    3% GfxBench Manhattan 3.0, SynMark CSDof
    2-3% Unigine Heaven + Valley, MemBW GPU texture
    1-3 GfxBench Manhattan 3.1 + CarChase + Vulkan & GL AztecRuins

Broxton:

Performance drops from IOMMU, without patch:

    30% MemBW GPU write
    25% SynMark ZBuffer + Fill*
    20% MemBW GPU blit
    15% MemBW GPU blend, GpuTest Triangle
    10-15% MemBW GPU texture
    10% GLB Egypt, Unigine Heaven (had hangs), SynMark TerrainFly*
    7-9% GLB T-Rex, GfxBench Manhattan 3.0 + T-Rex,
         SynMark Deferred* + TexMem*
    6-8% GfxBench CarChase, Unigine Valley,
         SynMark CSCloth + ShMapVsm + TerrainPan*
    5-6% GfxBench Manhattan 3.1 + GL AztecRuins,
         SynMark CSDof + TexFilterTri
    2-4% GfxBench ALU2, SynMark DrvRes + GSCloth + ShMapPcf + Batch[0-5] +
         TexFilterAniso, GpuTest GiMark + 32-bit Julia

And with patch:

    15-20% MemBW GPU texture
    10% SynMark TexMem*
    8-9% GLB Egypt (1/2 screen window)
    4-5% GLB T-Rex (1/2 screen window)
    3-6% GfxBench Manhattan 3.0, GpuTest FurMark,
         SynMark Deferred + TexFilterTri
    3-4% GfxBench Manhattan 3.1 + T-Rex, SynMark VSInstancing
    2-4% GpuTest Triangle, SynMark DeferredAA
    2-3% Unigine Heaven + Valley
    1-3% SynMark Terrain*
    1-2% GfxBench CarChase, SynMark TexFilterAniso + ZBuffer

Tigerlake-H:

    20-25% MemBW GPU texture
    15-20% GpuTest Triangle
    13-15% SynMark TerrainFly* + DeferredAA + HdrBloom
    8-10% GfxBench Manhattan 3.1, SynMark TerrainPan* + DrvRes
    6-7% GfxBench Manhattan 3.0, SynMark TexMem*
    4-8% GLB onscreen Fill + T-Rex + Egypt (more in onscreen than
         offscreen versions of T-Rex/Egypt)
    4-6% GfxBench CarChase + GLES AztecRuins + ALU2, GpuTest 32-bit Julia,
         SynMark CSDof + DrvState
    3-5% GfxBench T-Rex + Egypt, Unigine Heaven + Valley, GpuTest Plot3D
    1-7% Media tests
    2-3% MemBW GPU blit
    1-3% Most of the rest of 3D tests

With the patch:

    6-8% MemBW GPU blend => the only regression in these tests (compared
         to IOMMU without THP)
    4-6% SynMark DrvState (not impacted) + HdrBloom (improved)
    3-4% GLB T-Rex
    ~3% GLB Egypt, SynMark DrvRes
    1-3% GfxBench T-Rex + Egypt, SynMark TexFilterTri
    1-2% GfxBench CarChase + GLES AztecRuins, Unigine Valley,
        GpuTest Triangle
    ~1% GfxBench Manhattan 3.0/3.1, Unigine Heaven

Perf of several tests actually improved with IOMMU + THP, compared to no
IOMMU / no THP:

    10-15% SynMark Batch[0-3]
    5-10% MemBW GPU texture, SynMark ShMapVsm
    3-4% SynMark Fill* + Geom*
    2-3% SynMark TexMem512 + CSCloth
    1-2% SynMark TexMem128 + DeferredAA

As a summary across all platforms, these are the benchmarks where enabling
THP on top of IOMMU enabled brings regressions:

 * Skylake GT4e:
   20-25% SynMark TexMem*
   (whereas all MemBW GPU tests either improve or are not affected)

 * Broxton J4205:
   7% MemBW GPU texture
   2-3% SynMark TexMem*

 * Tigerlake-H:
   7% MemBW GPU blend

Other benchmarks show either lowering of regressions or improvements.

v2:
 * Add Kconfig dependency to transparent hugepages and some help text.
 * Move to helper for easier handling of kernel build options.

v3:
 * Drop Kconfig. (Daniel)

v4:
 * Add some benchmark results to commit message.

v5:
 * Add explicit regression summary to commit message. (Eero)

References: b901bb8932 ("drm/i915/gemfs: enable THP")
References: 9987da4b5d ("drm/i915: Disable THP until we have a GPU read BW W/A")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/430
Co-developed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909114448.508493-1-tvrtko.ursulin@linux.intel.com
2021-09-10 14:29:19 +01:00
xinhui pan
70982eef4d drm/ttm: Fix a deadlock if the target BO is not idle during swap
The ret value might be -EBUSY, caller will think lru lock is still
locked but actually NOT. So return -ENOSPC instead. Otherwise we hit
list corruption.

ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0,
caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will
be stuck as we actually did not free any BO memory. This usually happens
when the fence is not signaled for a long time.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: ebd59851c7 ("drm/ttm: move swapout logic around v3")
Link: https://patchwork.freedesktop.org/patch/msgid/20210907040832.1107747-1-xinhui.pan@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-09-10 16:18:53 +10:00
Dave Airlie
de04744d65 Merge tag 'drm-misc-next-fixes-2021-09-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next-fixes for v5.15:
- Fix ttm_bo_move_memcpy() when ttm_resource is subclassed.
- Small fixes to panfrost, mgag200, vc4.
- Small ttm compilation fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/41ff5e54-0837-2226-a182-97ffd11ef01e@linux.intel.com
2021-09-09 13:35:54 +10:00
Dave Airlie
06b224d516 Merge tag 'amd-drm-next-5.15-2021-09-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-09-01:

amdgpu:
- Misc cleanups, typo fixes
- EEPROM fix
- Add some new PCI IDs
- Scatter/Gather display support for Yellow Carp
- PCIe DPM fix for RKL platforms
- RAS fix

amdkfd:
- SVM fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210901214015.4488-1-alexander.deucher@amd.com
2021-09-09 13:33:49 +10:00
Colin Ian King
058d7d6260 drm/i915: clean up inconsistent indenting
There is a statement that is indented one character too deeply,
clean this up.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902215737.55570-1-colin.king@canonical.com
2021-09-08 19:13:26 +02:00
Matthew Auld
f503eb0cf2 drm/i915/selftests: fixup igt_shrink_thp
Since the object might still be active here, the shrink_all will simply
ignore it, which blows up in the test, since the pages will still be
there. Currently THP is disabled which should result in the test being
skipped, but if we ever re-enable THP we might start seeing the failure.
Fix this by forcing I915_SHRINK_ACTIVE.

v2: Some machine in the shard runs doesn't seem to have any available
swap when running this test. Try to handle this.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210906091729.2093312-1-matthew.auld@intel.com
2021-09-08 09:41:26 +01:00
Matthew Auld
502d0609fc drm/i915/gtt: add some flushing for the 64K GTT path
If we need to mark the PDE as operating in 64K GTT mode, we should be
paranoid and flush the extra writes, like we already do for the PTEs. On
some platforms the clflush can apparently add the just the right amount
of magical delay to force the GPU to see the updated entry.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903155317.1854012-1-matthew.auld@intel.com
2021-09-08 09:35:37 +01:00
Ayaz A Siddiqui
3f027d6166 drm/i915/gt: Add separate MOCS table for Gen12 devices other than TGL/RKL
MOCS table of TGL/RKL has MOCS[1] set to L3_UC.
While for other gen12 devices we need to set MOCS[1] as L3_WB,
So adding a new MOCS table for other gen 12 devices eg. ADL.

Fixes: cfbe5291a1 ("drm/i915/gt: Initialize unused MOCS entries with device specific values")
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mattrope: fix whitespace error]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210907171639.1221287-1-ayaz.siddiqui@intel.com
2021-09-07 19:33:42 -07:00
ravitejax
f5392e5f8e drm/i915/adl_s: Remove require_force_probe protection
Removing force probe protection from ADLS platform. Did
not observe warnings, errors, flickering or any visual
defects while doing ordinary tasks like browsing and
editing documents in a two monitor setup.

For more info drm-tip idle run results :
https://intel-gfx-ci.01.org/tree/drm-tip/bat-all.html?

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: ravitejax <ravitejax.goud.talla@intel.com>
Reviewed-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903182034.668467-1-ravitejax.gpud.talla@intel.com
2021-09-07 10:52:12 +02:00
Daniel Vetter
dcc5d82063 drm/i915: Stop rcu support for i915_address_space
The full audit is quite a bit of work:

- i915_dpt has very simple lifetime (somehow we create a display pagetable vm
  per object, so its _very_ simple, there's only ever a single vma in there),
  and uses i915_vm_close(), which internally does a i915_vm_put(). No rcu.

  Aside: wtf is i915_dpt doing in the intel_display.c garbage collector as a new
  feature, instead of added as a separate file with some clean-ish interface.

  Also, i915_dpt unfortunately re-introduces some coding patterns from
  pre-dma_resv_lock conversion times.

- i915_gem_proto_ctx is fully refcounted and no rcu, all protected by
  fpriv->proto_context_lock.

- i915_gem_context is itself rcu protected, and that might leak to anything it
  points at. Before

	commit cf977e1861
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Wed Dec 2 11:21:40 2020 +0000

	    drm/i915/gem: Spring clean debugfs

  and

	commit db80a1294c
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Mon Jan 18 11:08:54 2021 +0000

	    drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects

  we had a bunch of debugfs files that relied on rcu protecting everything, but
  those are gone now. The main one was removed even earlier with

  There doesn't seem to be anything left that's actually protecting
  stuff now that the ctx->vm itself is invariant. See

	commit ccbc1b9794
	Author: Jason Ekstrand <jason@jlekstrand.net>
	Date:   Thu Jul 8 10:48:30 2021 -0500

	    drm/i915/gem: Don't allow changing the VM on running contexts (v4)

  Note that we drop the vm refcount before the final release of the gem context
  refcount, so this is all very dangerous even without rcu. Note that aside from
  later on creating new engines (a defunct feature) and debug output we're never
  looked at gem_ctx->vm for anything functional, hence why this is ok.
  Fingers crossed.

  Preceeding patches removed all vestiges of rcu use from gem_ctx->vm
  derferencing to make it clear it's really not used.

  The gem_ctx->rcu protection was introduced in

	commit a4e7ccdac3
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Fri Oct 4 14:40:09 2019 +0100

	    drm/i915: Move context management under GEM

  The commit message is somewhat entertaining because it fails to
  mention this fact completely, and compensates that by an in-commit
  changelog entry that claims that ctx->vm is protected by ctx->mutex.
  Which was the case _before_ this commit, but no longer after it.

- intel_context holds a full reference. Unfortunately intel_context is also rcu
  protected and the reference to the ->vm is dropped before the
  rcu barrier - only the kfree is delayed. So again we need to check
  whether that leaks anywhere on the intel_context->vm. RCU is only
  used to protect intel_context sitting on the breadcrumb lists, which
  don't look at the vm anywhere, so we are fine.

  Nothing else relies on rcu protection of intel_context and hence is
  fully protected by the kref refcount alone, which protects
  intel_context->vm in turn.

  The breadcrumbs rcu usage was added in

	commit c744d50363
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Thu Nov 26 14:04:06 2020 +0000

	    drm/i915/gt: Split the breadcrumb spinlock between global and contexts

  its parent commit added the intel_context rcu protection:

	commit 14d1eaf088
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Thu Nov 26 14:04:05 2020 +0000

	    drm/i915/gt: Protect context lifetime with RCU

  given some credence to my claim that I've actually caught them all.

- drm_i915_gem_object's shares_resv_from pointer has a full refcount to the
  dma_resv, which is a sub-refcount that's released after the final
  i915_vm_put() has been called. Safe.

  Aside: Maybe we should have a struct dma_resv_shared which is just dma_resv +
  kref as a stand-alone thing. It's a pretty useful pattern which other drivers
  might want to copy.

  For a bit more context see

	commit 4d8151ae53
	Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
	Date:   Tue Jun 1 09:46:41 2021 +0200

	    drm/i915: Don't free shared locks while shared

- the fpriv->vm_xa was relying on rcu_read_lock for lookup, but that
  was updated in a prep patch too to just be a spinlock-protected
  lookup.

- intel_gt->vm is set at driver load in intel_gt_init() and released
  in intel_gt_driver_release(). There seems to be some issue that
  in some error paths this is called twice, but otherwise no rcu to be
  found anywhere. This was added in the below commit, which
  unfortunately doesn't explain why this complication exists.

	commit e6ba764802
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Sat Dec 21 16:03:24 2019 +0000

	    drm/i915: Remove i915->kernel_context

  The proper fix most likely for this is to start using drmm_ at large
  scale, but that's also huge amounts of work.

- i915_vma->vm is some real pain, because rcu is rcu protected, at
  least in the vma lookup in the context lookup cache in
  eb_lookup_vma(). This was added in

	commit 4ff4b44cbb
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Fri Jun 16 15:05:16 2017 +0100

	    drm/i915: Store a direct lookup from object handle to vma

  This was changed to a radix tree from the hashtable in, but with the
  locking unchanged, in

	commit d1b48c1e71
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Wed Aug 16 09:52:08 2017 +0100

	    drm/i915: Replace execbuf vma ht with an idr

  In

	commit 93159e1235
	Author: Chris Wilson <chris@chris-wilson.co.uk>
	Date:   Mon Mar 23 09:28:41 2020 +0000

	    drm/i915/gem: Avoid gem_context->mutex for simple vma lookup

  the locking was changed from dev->struct_mutex to rcu, which added
  the requirement to rcu protect i915_vma. Somehow this was missed in
  review (or I'm completely blind).

  Irrespective of all that the vma lookup cache rcu_read_lock grabs a
  full reference of the vma and the rcu doesn't leak further. So no
  impact on i915_address_space from that.

  I have not found any other rcu use for i915_vma, but given that it
  seems broken I also didn't bother to do a careful in-depth audit.

Alltogether there's nothing left in-tree anymore which requires that a
pointer deref to an i915_address_space is safe undre rcu_read_lock
only.

rcu protection of i915_address_space was introduced in

commit b32fa81115
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jun 20 19:37:05 2019 +0100

    drm/i915/gtt: Defer address space cleanup to an RCU worker

by mixing up a bugfixing (i915_address_space needs to be released from
a worker) with enabling rcu support. The commit message also seems
somewhat confused, because it talks about cleanup of WC pages
requiring sleep, while the code and linked bugzilla are about a
requirement to take dev->struct_mutex (which yes sleeps but it's a
much more specific problem). Since final kref_put can be called from
pretty much anywhere (including hardirq context through the
scheduler's i915_active cleanup) we need a worker here. Hence that
part must be kept.

Ideally all these reclaim workers should have some kind of integration
with our shrinkers, but for some of these it's rather tricky. Anyway,
that's a preexisting condition in the codeebase that we wont fix in
this patch here.

We also remove the rcu_barrier in ggtt_cleanup_hw added in

commit 60a4233a49
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 29 14:24:12 2019 +0100

    drm/i915: Flush the i915_vm_release before ggtt shutdown

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-11-daniel.vetter@ffwll.ch
2021-09-06 11:11:56 +02:00
Daniel Vetter
8431515218 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
We don't need the absolute speed of rcu for this. And
i915_address_space in general dont need rcu protection anywhere else,
after we've made gem contexts and engines a lot more immutable.

Note that this semantically reverts

commit aabbe344dc
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Aug 30 19:03:25 2019 +0100

    drm/i915: Use RCU for unlocked vm_idr lookup

except we have the conversion from idr to xarray in between.

v2: kref_get_unless_zero is no longer required (Maarten)

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-10-daniel.vetter@ffwll.ch
2021-09-06 11:11:45 +02:00
Daniel Vetter
9ec8795e7d drm/i915: Drop __rcu from gem_context->vm
It's been invariant since

    commit ccbc1b9794
    Author: Jason Ekstrand <jason@jlekstrand.net>
    Date:   Thu Jul 8 10:48:30 2021 -0500

        drm/i915/gem: Don't allow changing the VM on running contexts (v4)

this just completes the deed. I've tried to split out prep work for
more careful review as much as possible, this is what's left:

- get_ppgtt gets simplified since we don't need to grab a temporary
  reference - we can rely on the temporary reference for the gem_ctx
  while we inspect the vm. The new vm_id still needs a full
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

- A pile of selftests can now just look at ctx->vm instead of
  rcu_dereference_protected( , true) or similar things.

- All callers of i915_gem_context_vm also disappear.

- I've changed the hugepage selftest to set scrub_64K without any
  locking, because when we inspect that setting we're also not taking
  any locks either. It works because it's a selftests that's careful
  (single threaded gives you nice ordering) and not a live driver
  where races can happen from anywhere.

These can only be split up further if we have some intermediate state
with a bunch more rcu_dereference_protected(ctx->vm, true), just to
shut up lockdep and sparse.

The conversion to __rcu happened in

commit a4e7ccdac3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 4 14:40:09 2019 +0100

    drm/i915: Move context management under GEM

Note that we're not breaking the actual bugfix in there: The real
bugfix is pushing the i915_vm_relase onto a separate worker, to avoid
locking inversion issues. The rcu conversion was just thrown in for
entertainment value on top (no vm lookup isn't even close to anything
that's a hotpath where removing the single spinlock can be measured).

v2: Rebase over the change to move the i915_vm_put() into
i915_gem_context_release().

v3: Trivial conflict against repainted shed.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-9-daniel.vetter@ffwll.ch
2021-09-06 11:11:32 +02:00
Daniel Vetter
0483a30187 drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
Since

commit ccbc1b9794
Author: Jason Ekstrand <jason@jlekstrand.net>
Date:   Thu Jul 8 10:48:30 2021 -0500

    drm/i915/gem: Don't allow changing the VM on running contexts (v4)

the gem_ctx->vm can't change anymore. Plus we always set the
intel_context->vm, so might as well use the helper we have for that.

This makes it very clear that we always overwrite intel_context->vm
for userspace contexts, since the default is gt->vm, which is
explicitly reserved for kernel context use. It would be good to split
things up a bit further and avoid any possibility for an accident
where we run kernel stuff in userspace vm or the other way round.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-8-daniel.vetter@ffwll.ch
2021-09-06 11:04:35 +02:00
Daniel Vetter
a82a9979de drm/i915: Add i915_gem_context_is_full_ppgtt
And use it anywhere we have open-coded checks for ctx->vm that really
only check for full ppgtt.

Plus for paranoia add a GEM_BUG_ON that checks it's really only set
when we have full ppgtt, just in case. gem_context->vm is different
since it's NULL in ggtt mode, unlike intel_context->vm or gt->vm,
which is always set.

v2: 0day found a testcase that I missed.

v3: Repaint shed (Jon, Tvrtko)

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-7-daniel.vetter@ffwll.ch
2021-09-06 10:53:04 +02:00
Daniel Vetter
24fad29e52 drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
Consolidates the "which is the vm my execbuf runs in" code a bit. We
do some get/put which isn't really required, but all the other users
want the refcounting, and I figured doing a function just for this
getparam to avoid 2 atomis is a bit much.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-6-daniel.vetter@ffwll.ch
2021-09-06 10:52:03 +02:00
Daniel Vetter
c6d04e48d2 drm/i915: Rename i915_gem_context_get_vm_rcu to i915_gem_context_get_eb_vm
The important part isn't so much that this does an rcu lookup - that's
more an implementation detail, which will also be removed.

The thing that makes this different from other functions is that it's
gettting you the vm that batchbuffers will run in for that gem
context, which is either a full ppgtt stored in gem->ctx, or the ggtt.

We'll make more use of this function later on.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-5-daniel.vetter@ffwll.ch
2021-09-06 10:46:04 +02:00
Daniel Vetter
e1068a9e80 drm/i915: Drop code to handle set-vm races from execbuf
Changing the vm from a finalized gem ctx is no longer possible, which
means we don't have to check for that anymore.

I was pondering whether to keep the check as a WARN_ON, but things go
boom real bad real fast if the vm of a vma is wrong. Plus we'd need to
also get the ggtt vm for !full-ppgtt platforms. Ditching it all seemed
like a better idea.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
References: ccbc1b9794 ("drm/i915/gem: Don't allow changing the VM on running contexts (v4)")
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-4-daniel.vetter@ffwll.ch
2021-09-06 10:45:56 +02:00
Daniel Vetter
8cf97637ff drm/i915: Keep gem ctx->vm alive until the final put
The comment added in

    commit b81dde7194
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Tue May 21 22:11:29 2019 +0100

        drm/i915: Allow userspace to clone contexts on creation

and moved in

    commit 27dbae8f36
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Wed Nov 6 09:13:12 2019 +0000

        drm/i915/gem: Safely acquire the ctx->vm when copying

suggested that i915_address_space were at least intended to be managed
through SLAB_TYPESAFE_BY_RCU:

                * This ppgtt may have be reallocated between
                * the read and the kref, and reassigned to a third
                * context. In order to avoid inadvertent sharing
                * of this ppgtt with that third context (and not
                * src), we have to confirm that we have the same
                * ppgtt after passing through the strong memory
                * barrier implied by a successful
                * kref_get_unless_zero().

But extensive git history search has not brough any such reuse to
light.

What has come to light though is that ever since

commit 2850748ef8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 4 14:39:58 2019 +0100

    drm/i915: Pull i915_vma_pin under the vm->mutex

(yes this commit is earlier) the final i915_vma_put call has been
moved from i915_gem_context_free (now called _release) to
context_close, which means it's not actually safe anymore to access
the ctx->vm pointer without lock helds, because it might disappear at
any moment. Note that superficially things all still work, because the
i915_address_space is RCU protected since

    commit b32fa81115
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Thu Jun 20 19:37:05 2019 +0100

        drm/i915/gtt: Defer address space cleanup to an RCU worker

except the very clever macro above (which is designed to protected
against object reuse due to SLAB_TYPESAFE_BY_RCU or similar tricks)
results in an endless loop if the refcount of the ctx->vm ever
permanently drops to 0. Which it totally now can.

Fix that by moving the final i915_vm_put to where it should be.

Note that i915_gem_context is rcu protected, but _only_ the final
kfree. This means anyone who chases a pointer to a gem ctx solely
under the protection can pretty only call kref_get_unless_zero(). This
seems to be pretty much the case, aside from a bunch of cases that
consult the scheduling information without any further protection.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-3-daniel.vetter@ffwll.ch
2021-09-06 10:45:48 +02:00
Daniel Vetter
c238980efd drm/i915: Release ctx->syncobj on final put, not on ctx close
gem context refcounting is another exercise in least locking design it
seems, where most things get destroyed upon context closure (which can
race with anything really). Only the actual memory allocation and the
locks survive while holding a reference.

This tripped up Jason when reimplementing the single timeline feature
in

commit 00dae4d3d3
Author: Jason Ekstrand <jason@jlekstrand.net>
Date:   Thu Jul 8 10:48:12 2021 -0500

    drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)

We could fix the bug by holding ctx->mutex in execbuf and clear the
pointer (again while holding the mutex) context_close, but it's
cleaner to just make the context object actually invariant over its
_entire_ lifetime. This way any other ioctl that's potentially racing,
but holding a full reference, can still rely on ctx->syncobj being
an immutable pointer. Which without this change, is not the case.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Fixes: 00dae4d3d3 ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)")
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-2-daniel.vetter@ffwll.ch
2021-09-06 10:45:40 +02:00
Daniel Vetter
75eefd8258 drm/i915: Release i915_gem_context from a worker
The only reason for this really is the i915_gem_engines->fence
callback engines_notify(), which exists purely as a fairly funky
reference counting scheme for that. Otherwise all other callers are
from process context, and generally fairly benign locking context.

Unfortunately untangling that requires some major surgery, and we have
a few i915_gem_context reference counting bugs that need fixing, and
they blow in the current hardirq calling context, so we need a
stop-gap measure.

Put a FIXME comment in when this should be removable again.

v2: Fix mock_context(), noticed by intel-gfx-ci.

Acked-by: Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-1-daniel.vetter@ffwll.ch
2021-09-06 10:45:29 +02:00
Stephen Rothwell
1645cca9da drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers"
After merging the drm tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

In file included from drivers/gpu/drm/i915/i915_debugfs.c:39:
drivers/gpu/drm/i915/gt/intel_gt_requests.h:9:10: fatal error: stddef.h: No such file or directory
    9 | #include <stddef.h>
      |          ^~~~~~~~~~

Caused by commit

  564f963eabd1 ("isystem: delete global -isystem compile option")

from the kbuild tree interacting with commit

  b97060a99b ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")

Fixes: b97060a99b ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210820123348.6535a87e@canb.auug.org.au
2021-09-06 09:31:23 +02:00
Linus Torvalds
b250e6d141 Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - Add -s option (strict mode) to merge_config.sh to make it fail when
   any symbol is redefined.

 - Show a warning if a different compiler is used for building external
   modules.

 - Infer --target from ARCH for CC=clang to let you cross-compile the
   kernel without CROSS_COMPILE.

 - Make the integrated assembler default (LLVM_IAS=1) for CC=clang.

 - Add <linux/stdarg.h> to the kernel source instead of borrowing
   <stdarg.h> from the compiler.

 - Add Nick Desaulniers as a Kbuild reviewer.

 - Drop stale cc-option tests.

 - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
   to handle symbols in inline assembly.

 - Show a warning if 'FORCE' is missing for if_changed rules.

 - Various cleanups

* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
  kbuild: redo fake deps at include/ksym/*.h
  kbuild: clean up objtool_args slightly
  modpost: get the *.mod file path more simply
  checkkconfigsymbols.py: Fix the '--ignore' option
  kbuild: merge vmlinux_link() between ARCH=um and other architectures
  kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
  kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
  kbuild: remove stale *.symversions
  kbuild: remove unused quiet_cmd_update_lto_symversions
  gen_compile_commands: extract compiler command from a series of commands
  x86: remove cc-option-yn test for -mtune=
  arc: replace cc-option-yn uses with cc-option
  s390: replace cc-option-yn uses with cc-option
  ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
  sparc: move the install rule to arch/sparc/Makefile
  security: remove unneeded subdir-$(CONFIG_...)
  kbuild: sh: remove unused install script
  kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
  kbuild: Switch to 'f' variants of integrated assembler flag
  kbuild: Shuffle blank line to improve comment meaning
  ...
2021-09-03 15:33:47 -07:00
Linus Torvalds
3de18c865f Merge branch 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb updates from Konrad Rzeszutek Wilk:
 "A new feature called restricted DMA pools. It allows SWIOTLB to
  utilize per-device (or per-platform) allocated memory pools instead of
  using the global one.

  The first big user of this is ARM Confidential Computing where the
  memory for DMA operations can be set per platform"

* 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits)
  swiotlb: use depends on for DMA_RESTRICTED_POOL
  of: restricted dma: Don't fail device probe on rmem init failure
  of: Move of_dma_set_restricted_buffer() into device.c
  powerpc/svm: Don't issue ultracalls if !mem_encrypt_active()
  s390/pv: fix the forcing of the swiotlb
  swiotlb: Free tbl memory in swiotlb_exit()
  swiotlb: Emit diagnostic in swiotlb_exit()
  swiotlb: Convert io_default_tlb_mem to static allocation
  of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS
  swiotlb: add overflow checks to swiotlb_bounce
  swiotlb: fix implicit debugfs declarations
  of: Add plumbing for restricted DMA pool
  dt-bindings: of: Add restricted DMA pool
  swiotlb: Add restricted DMA pool initialization
  swiotlb: Add restricted DMA alloc/free support
  swiotlb: Refactor swiotlb_tbl_unmap_single
  swiotlb: Move alloc_size to swiotlb_find_slots
  swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
  swiotlb: Update is_swiotlb_active to add a struct device argument
  swiotlb: Update is_swiotlb_buffer to add a struct device argument
  ...
2021-09-03 10:34:44 -07:00
Sreedhar Telukuntla
fb1e95bc27 drm/i915/gt: Initialize L3CC table in mocs init
Initialize the L3CC table as part of mocs initialization to program
LNCFCMOCSx registers so that the mocs settings are available for
selection for subsequent memory transactions in the driver load path.

We need to keep L3CC initialization in intel_mocs_init_engine() also
so that in execlists submission, these registers can be rewritten
during engine reset.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Sreedhar Telukuntla <sreedhar.telukuntla@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903092153.535736-6-ayaz.siddiqui@intel.com
2021-09-03 20:17:24 +05:30
Ayaz A Siddiqui
cfbe5291a1 drm/i915/gt: Initialize unused MOCS entries with device specific values
Historically we've initialized all undefined/reserved entries in
a platform's MOCS table to the contents of table entry #1 (i.e.,
I915_MOCS_PTE).
Going forward, we can't assume that table entry #1 will always
contain suitable values to use for undefined/reserved table
indices. We'll allow a platform-specific table index to be
selected at table initialization time in these cases.

This new mechanism to select L3 WB entry will be applicable for
all the Gen12+ platforms except TGL and RKL.

Since TGL and RLK are already in production so their mocs settings
are intact to avoid ABI break.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903092153.535736-5-ayaz.siddiqui@intel.com
2021-09-03 20:17:23 +05:30
Ayaz A Siddiqui
c6b248489d drm/i915/gt: Set BLIT_CCTL reg to un-cached
Blitter commands which do not have MOCS fields rely on
cacheability of BlitterCacheControlRegister which was mapped
to index 0 by default.Once we changed the MOCS value of
index 0 to L3 WB, tests like gem_linear_blits started failing
due to a change in cacheability from UC to WB.

Program and place the BlitterCacheControlRegister in
build_aux_regs().

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903092153.535736-4-ayaz.siddiqui@intel.com
2021-09-03 20:17:22 +05:30
Ayaz A Siddiqui
d79a1d7131 drm/i915/gt: Set CMD_CCTL to UC for Gen12 Onward
Cache-control registers for Command Stream(CMD_CCTL) are used
to set catchability for memory writes and reads outputted by
Command Streamers on Gen12 onward platforms.

These registers need to point un-cached(UC) MOCS index.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903092153.535736-3-ayaz.siddiqui@intel.com
2021-09-03 20:17:21 +05:30
Ayaz A Siddiqui
b62aa57e3c drm/i915/gt: Add support of mocs propagation
Now there are lots of Command and registers that require mocs index
programming.
So propagating mocs_index from mocs to gt so that it can be
used directly without having platform-specific checks.

V2:
Changed 'i915_mocs_index_gt' to anonymous structure.

Cc: CQ Tang<cq.tang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903092153.535736-2-ayaz.siddiqui@intel.com
2021-09-03 20:17:20 +05:30
Linus Torvalds
23852bec53 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
 "This is quite a small cycle, no major series stands out. The HNS and
  rxe drivers saw the most activity this cycle, with rxe being broken
  for a good chunk of time. The significant deleted line count is due to
  a SPDX cleanup series.

  Summary:

   - Various cleanup and small features for rtrs

   - kmap_local_page() conversions

   - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns

   - Cache the IB subnet prefix

   - Rework how CRC is calcuated in rxe

   - Clean reference counting in iwpm's netlink

   - Pull object allocation and lifecycle for user QPs to the uverbs
     core code

   - Several small hns features and continued general code cleanups

   - Fix the scatterlist confusion of orig_nents/nents introduced in an
     earlier patch creating the append operation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits)
  RDMA/mlx5: Relax DCS QP creation checks
  RDMA/hns: Delete unnecessary blank lines.
  RDMA/hns: Encapsulate the qp db as a function
  RDMA/hns: Adjust the order in which irq are requested and enabled
  RDMA/hns: Remove RST2RST error prints for hw v1
  RDMA/hns: Remove dqpn filling when modify qp from Init to Init
  RDMA/hns: Fix QP's resp incomplete assignment
  RDMA/hns: Fix query destination qpn
  RDMA/hfi1: Convert to SPDX identifier
  IB/rdmavt: Convert to SPDX identifier
  RDMA/hns: Bugfix for incorrect association between dip_idx and dgid
  RDMA/hns: Bugfix for the missing assignment for dip_idx
  RDMA/hns: Bugfix for data type of dip_idx
  RDMA/hns: Fix incorrect lsn field
  RDMA/irdma: Remove the repeated declaration
  RDMA/core/sa_query: Retry SA queries
  RDMA: Use the sg_table directly and remove the opencoded version from umem
  lib/scatterlist: Fix wrong update of orig_nents
  lib/scatterlist: Provide a dedicated function to support table append
  RDMA/hns: Delete unused hns bitmap interface
  ...
2021-09-02 14:47:21 -07:00
Linus Torvalds
89b6b8cd92 Merge tag 'vfio-v5.15-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - Fix dma-valid return WAITED implementation (Anthony Yznaga)

 - SPDX license cleanups (Cai Huoqing)

 - Split vfio-pci-core from vfio-pci and enhance PCI driver matching to
   support future vendor provided vfio-pci variants (Yishai Hadas, Max
   Gurtovoy, Jason Gunthorpe)

 - Replace duplicated reflck with core support for managing first open,
   last close, and device sets (Jason Gunthorpe, Max Gurtovoy, Yishai
   Hadas)

 - Fix non-modular mdev support and don't nag about request callback
   support (Christoph Hellwig)

 - Add semaphore to protect instruction intercept handler and replace
   open-coded locks in vfio-ap driver (Tony Krowiak)

 - Convert vfio-ap to vfio_register_group_dev() API (Jason Gunthorpe)

* tag 'vfio-v5.15-rc1' of git://github.com/awilliam/linux-vfio: (37 commits)
  vfio/pci: Introduce vfio_pci_core.ko
  vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on'
  vfio: Use select for eventfd
  PCI / VFIO: Add 'override_only' support for VFIO PCI sub system
  PCI: Add 'override_only' field to struct pci_device_id
  vfio/pci: Move module parameters to vfio_pci.c
  vfio/pci: Move igd initialization to vfio_pci.c
  vfio/pci: Split the pci_driver code out of vfio_pci_core.c
  vfio/pci: Include vfio header in vfio_pci_core.h
  vfio/pci: Rename ops functions to fit core namings
  vfio/pci: Rename vfio_pci_device to vfio_pci_core_device
  vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h
  vfio/pci: Rename vfio_pci.c to vfio_pci_core.c
  vfio/ap_ops: Convert to use vfio_register_group_dev()
  s390/vfio-ap: replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification
  s390/vfio-ap: r/w lock for PQAP interception handler function pointer
  vfio/type1: Fix vfio_find_dma_valid return
  vfio-pci/zdev: Remove repeated verbose license text
  vfio: platform: reset: Convert to SPDX identifier
  vfio: Remove struct vfio_device_ops open/release
  ...
2021-09-02 13:41:33 -07:00
Thomas Hellström
450cede7f3 drm/i915/gem: Fix the mman selftest
Using the I915_MMAP_TYPE_FIXED mmap type requires the TTM backend, so
for that mmap type, use __i915_gem_object_create_user() instead of
i915_gem_object_create_internal(), as we really want to tests objects
mmap-able by user-space.

This also means that the out-of-space error happens at object creation
and returns -ENXIO rather than -ENOSPC, so fix the code up to expect
that on out-of-offset-space errors.

Finally only use I915_MMAP_TYPE_FIXED for LMEM and SMEM for now if
testing on LMEM-capable devices. For stolen LMEM, we still take the
same path as for integrated, as that haven't been moved over to TTM yet,
and user-space should not be able to create out of stolen LMEM anyway.

v2:
 - Check the presence of the obj->ops->mmap_offset callback rather than
   hardcoding the supported mmap regions in can_mmap() (Maarten Lankhorst)

Fixes: 7961c5b60f ("drm/i915: Add TTM offset argument to mmap.")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210831122931.157536-1-thomas.hellstrom@linux.intel.com
2021-09-02 14:43:14 +02:00
Alex Sierra
d6043581e1 drm/amdkfd: drop process ref count when xnack disable
During svm restore pages interrupt handler, kfd_process ref count was
never dropped when xnack was disabled. Therefore, the object was never
released.

Fixes: 2383f56bbe ("drm/amdkfd: page table restore through svm API")
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-01 16:55:09 -04:00
Linus Torvalds
477f70cd2a Merge tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "Highlights:

   - i915 has seen a lot of refactoring and uAPI cleanups due to a
     change in the upstream direction going forward

     This has all been audited with known userspace, but there may be
     some pitfalls that were missed.

   - i915 now uses common TTM to enable discrete memory on DG1/2 GPUs

   - i915 enables Jasper and Elkhart Lake by default and has preliminary
     XeHP/DG2 support

   - amdgpu adds support for Cyan Skillfish

   - lots of implicit fencing rules documented and fixed up in drivers

   - msm now uses the core scheduler

   - the irq midlayer has been removed for non-legacy drivers

   - the sysfb code now works on more than x86.

  Otherwise the usual smattering of stuff everywhere, panels, bridges,
  refactorings.

  Detailed summary:

  core:
   - extract i915 eDP backlight into core
   - DP aux bus support
   - drm_device.irq_enabled removed
   - port drivers to native irq interfaces
   - export gem shadow plane handling for vgem
   - print proper driver name in framebuffer registration
   - driver fixes for implicit fencing rules
   - ARM fixed rate compression modifier added
   - updated fb damage handling
   - rmfb ioctl logging/docs
   - drop drm_gem_object_put_locked
   - define DRM_FORMAT_MAX_PLANES
   - add gem fb vmap/vunmap helpers
   - add lockdep_assert(once) helpers
   - mark drm irq midlayer as legacy
   - use offset adjusted bo mapping conversion

  vgaarb:
   - cleanups

  fbdev:
   - extend efifb handling to all arches
   - div by 0 fixes for multiple drivers

  udmabuf:
   - add hugepage mapping support

  dma-buf:
   - non-dynamic exporter fixups
   - document implicit fencing rules

  amdgpu:
   - Initial Cyan Skillfish support
   - switch virtual DCE over to vkms based atomic
   - VCN/JPEG power down fixes
   - NAVI PCIE link handling fixes
   - AMD HDMI freesync fixes
   - Yellow Carp + Beige Goby fixes
   - Clockgating/S0ix/SMU/EEPROM fixes
   - embed hw fence in job
   - rework dma-resv handling
   - ensure eviction to system ram

  amdkfd:
   - uapi: SVM address range query added
   - sysfs leak fix
   - GPUVM TLB optimizations
   - vmfault/migration counters

  i915:
   - Enable JSL and EHL by default
   - preliminary XeHP/DG2 support
   - remove all CNL support (never shipped)
   - move to TTM for discrete memory support
   - allow mixed object mmap handling
   - GEM uAPI spring cleaning
       - add I915_MMAP_OBJECT_FIXED
       - reinstate ADL-P mmap ioctls
       - drop a bunch of unused by userspace features
       - disable and remove GPU relocations
   - revert some i915 misfeatures
   - major refactoring of GuC for Gen11+
   - execbuffer object locking separate step
   - reject caching/set-domain on discrete
   - Enable pipe DMC loading on XE-LPD and ADL-P
   - add PSF GV point support
   - Refactor and fix DDI buffer translations
   - Clean up FBC CFB allocation code
   - Finish INTEL_GEN() and friends macro conversions

  nouveau:
   - add eDP backlight support
   - implicit fence fix

  msm:
   - a680/7c3 support
   - drm/scheduler conversion

  panfrost:
   - rework GPU reset

  virtio:
   - fix fencing for planes

  ast:
   - add detect support

  bochs:
   - move to tiny GPU driver

  vc4:
   - use hotplug irqs
   - HDMI codec support

  vmwgfx:
   - use internal vmware device headers

  ingenic:
   - demidlayering irq

  rcar-du:
   - shutdown fixes
   - convert to bridge connector helpers

  zynqmp-dsub:
   - misc fixes

  mgag200:
   - convert PLL handling to atomic

  mediatek:
   - MT8133 AAL support
   - gem mmap object support
   - MT8167 support

  etnaviv:
   - NXP Layerscape LS1028A SoC support
   - GEM mmap cleanups

  tegra:
   - new user API

  exynos:
   - missing unlock fix
   - build warning fix
   - use refcount_t"

* tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm: (1318 commits)
  drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box
  drm/amd/display: Remove duplicate dml init
  drm/amd/display: Update bounding box states (v2)
  drm/amd/display: Update number of DCN3 clock states
  drm/amdgpu: disable GFX CGCG in aldebaran
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amdgpu: Add support for RAS XGMI err query
  drm/amdkfd: Account for SH/SE count when setting up cu masks.
  drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain
  drm/amdgpu: drop redundant cancel_delayed_work_sync call
  drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
  drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
  drm/amdkfd: map SVM range with correct access permission
  drm/amdkfd: check access permisson to restore retry fault
  drm/amdgpu: Update RAS XGMI Error Query
  drm/amdgpu: Add driver infrastructure for MCA RAS
  drm/amd/display: Add Logging for HDMI color depth information
  drm/amd/amdgpu: consolidate PSP TA init shared buf functions
  drm/amd/amdgpu: add name field back to ras_common_if
  drm/amdgpu: Fix build with missing pm_suspend_target_state module export
  ...
2021-09-01 11:26:46 -07:00
Daniele Ceraolo Spurio
5db1856781 drm/i915/guc: drop guc_communication_enabled
The function is only used from within GEM_BUG_ON(), which is causing
warnings with Wunneeded-internal-declaration in some builds. Since the
function is a simple wrapper around a CT function, we can just call the
CT function directly instead.

Fixes: 1fb12c5871 ("drm/i915/guc: skip disabling CTBs before sanitizing the GuC")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210823163137.19770-1-daniele.ceraolospurio@intel.com
2021-08-31 12:46:36 -07:00
Jiawei Gu
7884d0e9e3 drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode
Enable pp_num_states, pp_cur_state, pp_force_state, pp_table sysfs under
SRIOV 1-VF scenario.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:41 -04:00
Philip Yang
d7eff46c21 drm/amdgpu: fix fdinfo race with process exit
Get process vm root BO ref in case process is exiting and root BO is
freed, to avoid NULL pointer dereference backtrace:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
Call Trace:
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
seq_show+0x12c/0x180
seq_read+0x153/0x410
vfs_read+0x91/0x140[ 3427.206183]  ksys_read+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:40 -04:00
xinhui pan
703677d934 drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Fall through to handle the error instead of return.

Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")
Cc: stable@vger.kernel.org
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:08 -04:00
Guchun Chen
f7d6779df6 drm/amdgpu: stop scheduler when calling hw_fini (v2)
This gurantees no more work on the ring can be submitted
to hardware in suspend/resume case, otherwise a potential
race will occur and the ring will get no chance to stay
empty before suspend.

v2: Call drm_sched_resubmit_job before drm_sched_start to
restart jobs from the pending list.

Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-31 14:19:47 -04:00
John Clements
156872b07e drm/amdgpu: Clear RAS interrupt status on aldebaran
Resolve incorrect register address

Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:19:37 -04:00
Michael Strauss
e5b310f900 drm/amd/display: Initialize lt_settings on instantiation
[WHY]
lt_settings' pointers remain uninitialized but nonzero if display fails
to light up with no DPCD/EDID info populated, leading to a hang on access

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:19:30 -04:00
Angus Wang
0e62b094a8 drm/amd/display: cleanup idents after a revert
[WHY]
The change has caused high idle memory clock speed and power
consumption at some resolutions and frame rates for Navi10

[HOW]
Reverted change "drm/amd/display: Fixed Intermittent blue
screen on OLED panel"

Reviewed-by: Aric Cyr  <aric.cyr@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Angus Wang <angus.wang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:19:20 -04:00
Anson Jacob
03388a347f drm/amd/display: Fix memory leak reported by coverity
Free memory allocated if any of the previous allocations failed.

>>>     CID 1487129:  Resource leaks  (RESOURCE_LEAK)
>>>     Variable "vpg" going out of scope leaks the storage it points to.

Addresses-Coverity-ID: 1487129: ("Resource leaks")

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:19:00 -04:00
Thomas Hellström
efcefc7127 drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource
The code was making a copy of a struct ttm_resource. However,
recently the struct ttm_resources were allowed to be subclassed and
also were allowed to be malloced, hence the driver could end up assuming
the copy we handed it was subclassed and worse, the original could have
been freed at this point.

Fix this by using the original struct ttm_resource before it is
potentially freed in ttm_bo_move_sync_cleanup()

v2: Base on drm-misc-next-fixes rather than drm-tip.

Reported-by: Ben Skeggs <skeggsb@gmail.com>
Reported-by: Dave Airlie <airlied@gmail.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: <stable@vger.kernel.org>
Fixes: 3bf3710e37 ("drm/ttm: Add a generic TTM memcpy move for page-based iomem")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210831071536.80636-1-thomas.hellstrom@linux.intel.com
2021-08-31 10:48:26 +02:00
Linus Torvalds
7d6e3fa87e Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "Updates to the interrupt core and driver subsystems:

  Core changes:

   - The usual set of small fixes and improvements all over the place,
     but nothing stands out

  MSI changes:

   - Further consolidation of the PCI/MSI interrupt chip code

   - Make MSI sysfs code independent of PCI/MSI and expose the MSI
     interrupts of platform devices in the same way as PCI exposes them.

  Driver changes:

   - Support for ARM GICv3 EPPI partitions

   - Treewide conversion to generic_handle_domain_irq() for all chained
     interrupt controllers

   - Conversion to bitmap_zalloc() throughout the irq chip drivers

   - The usual set of small fixes and improvements"

* tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  platform-msi: Add ABI to show msi_irqs of platform devices
  genirq/msi: Move MSI sysfs handling from PCI to MSI core
  genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
  irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
  irqdomain: Export irq_domain_disconnect_hierarchy()
  irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
  irqchip/apple-aic: Fix irq_disable from within irq handlers
  pinctrl/rockchip: drop the gpio related codes
  gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
  gpio/rockchip: support next version gpio controller
  gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
  gpio/rockchip: add driver for rockchip gpio
  dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
  pinctrl/rockchip: add pinctrl device to gpio bank struct
  pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
  pinctrl/rockchip: always enable clock for gpio controller
  genirq: Fix kernel doc indentation
  EDAC/altera: Convert to generic_handle_domain_irq()
  powerpc: Bulk conversion to generic_handle_domain_irq()
  nios2: Bulk conversion to generic_handle_domain_irq()
  ...
2021-08-30 14:38:37 -07:00
Colin Ian King
f5d8e16488 drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum"
There are three identical spelling mistakes in dev_err messages.
Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:42 -04:00
Koba Ko
b3dc549986 drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
Due to high latency in PCIE clock switching on RKL platforms,
switching the PCIE clock dynamically at runtime can lead to HDMI/DP
audio problems. On newer asics this is handled in the SMU firmware.
For SMU7-based asics, disable PCIE clock switching to avoid the issue.

AMD provide a parameter to disable PICE_DPM.

modprobe amdgpu ppfeaturemask=0xfff7bffb

It's better to contorl PCIE_DPM in amd gpu driver,
switch PCI_DPM by determining intel RKL platform for SMU7-based asics.

Fixes: 1a31474cdb ("drm/amd/pm: workaround for audio noise issue")
Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:27 -04:00
Lang Yu
50c6dedeb1 drm/amdgpu: show both cmd id and name when psp cmd failed
To cover the corner case that people want to know the ID
of an UNKNOWN CMD.

Suggested-by: John Clements <john.clements@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:21 -04:00
Aaron Liu
3ca001aff0 drm/amd/display: setup system context for APUs
Scatter/gather is APU feature starting from carrizo.
adev->apu_flags is not used for all APUs.
adev->flags & AMD_IS_APU can be used for all APUs.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-30 15:09:03 -04:00
Nicholas Kazlauskas
c5d3c9a093 drm/amdgpu: Enable S/G for Yellow Carp
Missing code for Yellow Carp to enable scatter gather - follows how
DCN21 support was added.

Tested that 8k framebuffer allocation and display can now succeed after
applying the patch.

v2: Add hookup in DM

Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-30 15:08:38 -04:00
Kees Cook
4a9bd6db19 drm/amd/pm: And destination bounds checking to struct copy
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

The "Board Parameters" members of the structs:
	struct atom_smc_dpm_info_v4_5
	struct atom_smc_dpm_info_v4_6
	struct atom_smc_dpm_info_v4_7
	struct atom_smc_dpm_info_v4_10
are written to the corresponding members of the corresponding PPTable_t
variables, but they lack destination size bounds checking, which means
the compiler cannot verify at compile time that this is an intended and
safe memcpy().

Since the header files are effectively immutable[1] and a struct_group()
cannot be used, nor a common struct referenced by both sides of the
memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
perform the bounds checking at compile time. Replace the open-coded
memcpy()s with amdgpu_memcpy_trailing() which includes enough context
for the bounds checking.

"objdump -d" shows no object code changes.

[1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d78f@amd.com

Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Feifei Xu <Feifei.Xu@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Jiawei Gu <Jiawei.Gu@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:34 -04:00