Commit Graph

78370 Commits

Author SHA1 Message Date
John Clements
3907c49218 drm/amdgpu: Add driver infrastructure for MCA RAS
Add MCA specific IP blocks targetting RAS features

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:18 -04:00
Praful Swarnakar
3341d30d1c drm/amd/display: Add Logging for HDMI color depth information
[Why]
Recent HDMI2.0 HF1-1 V-Swing testing showed that logging deep color
status helps in validation of testcase.

[How]
Add logging based on various color depths and pixel encoding
formats.

Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com>
Reviewed-by: Hersen Wu <hersenwu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:12 -04:00
Candice Li
30acef3c4a drm/amd/amdgpu: consolidate PSP TA init shared buf functions
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:05 -04:00
Candice Li
355e3e4ccc drm/amd/amdgpu: add name field back to ras_common_if
Adding name field back to ras_common_if to work around error
injection failure with amdgpuras tool.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:57 -04:00
Borislav Petkov
a47f6a5806 drm/amdgpu: Fix build with missing pm_suspend_target_state module export
Building a randconfig here triggered:

  ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.

The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for
CONFIG_PM_SLEEP which is defined like this:

  config PM_SLEEP
          def_bool y
          depends on SUSPEND || HIBERNATE_CALLBACKS

and that randconfig has:

  # CONFIG_SUSPEND is not set
  CONFIG_HIBERNATE_CALLBACKS=y

leading to the module export missing.

Change the ifdeffery to depend directly on CONFIG_SUSPEND.

Fixes: 5706cb3c91 ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-24 15:35:50 -04:00
Christophe JAILLET
a5f61dd412 drm/radeon: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below.

It has been compile tested.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:47 -04:00
Christophe JAILLET
8a1d1bdb84 drm/amdgpu: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below.

It has been compile tested.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:41 -04:00
Mukul Joshi
f270921a17 drm/amdkfd: CWSR with sw scheduler on Aldebaran and Arcturus
Program trap handler settings to enable CWSR with software scheduler
on Aldebaran and Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:33 -04:00
Shashank Sharma
7301757ea1 drm/amdgpu/OLAND: clip the ref divider max value
This patch limits the ref_div_max value to 100, during the
calculation of PLL feedback reference divider. With current
value (128), the produced fb_ref_div value generates unstable
output at particular frequencies. Radeon driver limits this
value at 100.

On Oland, when we try to setup mode 2048x1280@60 (a bit weird,
I know), it demands a clock of 221270 Khz. It's been observed
that the PLL calculations using values 128 and 100 are vastly
different, and look like this:

+------------------------------------------+
|Parameter    |AMDGPU        |Radeon       |
|             |              |             |
+-------------+----------------------------+
|Clock feedback              |             |
|divider max  |  128         |   100       |
|cap value    |              |             |
|             |              |             |
|             |              |             |
+------------------------------------------+
|ref_div_max  |              |             |
|             |  42          |  20         |
|             |              |             |
|             |              |             |
+------------------------------------------+
|ref_div      |  42          |  20         |
|             |              |             |
+------------------------------------------+
|fb_div       |  10326       |  8195       |
+------------------------------------------+
|fb_div       |  1024        |  163        |
+------------------------------------------+
|fb_dev_p     |  4           |  9          |
|frac fb_de^_p|              |             |
+----------------------------+-------------+

With ref_div_max value clipped at 100, AMDGPU driver can also
drive videmode 2048x1280@60 (221Mhz) and produce proper output
without any blanking and distortion on the screen.

PS: This value was changed from 128 to 100 in Radeon driver also, here:
4b21ce1b4b

V1:
Got acks from:
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

V2:
- Restricting the changes only for OLAND, just to avoid any regression
  for other cards.
- Changed unsigned -> unsigned int to make checkpatch quiet.

V3: Apply the change on SI family (not only oland) (Christian)

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Eddy Qin <Eddy.Qin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:25 -04:00
Eric Yang
234b4fd917 drm/amd/display: refactor riommu invalidation wa
[Why]
A cleaner solution, only done once on boot.

[How]
Remove previous workaround and configure an extra
vmid one time on boot

Reviewed-by: Kazlauskas Nicholas <Nicholas.Kazlauskas@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:13 -04:00
Maor Gottlieb
90e7a6de62 lib/scatterlist: Provide a dedicated function to support table append
RDMA is the only in-kernel user that uses __sg_alloc_table_from_pages to
append pages dynamically. In the next patch. That mode will be extended
and that function will get more parameters. So separate it into a unique
function to make such change more clear.

Link: https://lore.kernel.org/r/20210824142531.3877007-2-maorg@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-24 15:21:14 -03:00
Borislav Petkov
c41a4e877a drm/amdgpu: Fix build with missing pm_suspend_target_state module export
Building a randconfig here triggered:

  ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.

The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for
CONFIG_PM_SLEEP which is defined like this:

  config PM_SLEEP
          def_bool y
          depends on SUSPEND || HIBERNATE_CALLBACKS

and that randconfig has:

  # CONFIG_SUSPEND is not set
  CONFIG_HIBERNATE_CALLBACKS=y

leading to the module export missing.

Change the ifdeffery to depend directly on CONFIG_SUSPEND.

Fixes: 5706cb3c91 ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-24 11:57:44 -04:00
Nathan Chancellor
fb43ebc83e drm/i915/selftest: Fix use of err in igt_reset_{fail, nop}_engine()
Clang warns:

In file included from drivers/gpu/drm/i915/gt/intel_reset.c:1514:
drivers/gpu/drm/i915/gt/selftest_hangcheck.c:465:62: warning: variable
'err' is uninitialized when used here [-Wuninitialized]
        pr_err("[%s] Create context failed: %d!\n", engine->name, err);
                                                                  ^~~
...
drivers/gpu/drm/i915/gt/selftest_hangcheck.c:580:62: warning: variable
'err' is uninitialized when used here [-Wuninitialized]
        pr_err("[%s] Create context failed: %d!\n", engine->name, err);
                                                                  ^~~
...
2 warnings generated.

This appears to be a copy and paste issue. Use ce directly using the %pe
specifier to pretty print the error code so that err is not used
uninitialized in these functions.

Fixes: 3a7b72665e ("drm/i915/selftest: Bump selftest timeouts for hangcheck")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210813171158.2665823-1-nathan@kernel.org
(cherry picked from commit ac5a2dff42)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-08-24 17:23:10 +03:00
Dan Carpenter
2c772cf5fe drm/i915/gt: Potential error pointer dereference in pinned_context()
If the intel_engine_create_pinned_context() function returns an error
pointer, then dereferencing "ce" will Oops.  Use "vm" instead of
"ce->vm".

Fixes: cf58602164 ("drm/i915/gt: Pipelined page migration")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210813113600.GC30697@kili
(cherry picked from commit ff12ce2c9c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-08-24 17:23:03 +03:00
Matt Roper
3070d934a0 drm/i915/adl_p: Also disable underrun recovery with MSO
One of the cases that the bspec lists for when underrun recovery must be
disabled is "COG;" that note actually refers to eDP multi-segmented
operation (MSO).  Let's ensure the this additional restriction is
honored by the driver.

Bspec: 50351
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: ba3b049f47 ("drm/i915/adl_p: Allow underrun recovery when possible")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210816204112.2960624-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit c00e14cd4d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-08-24 17:22:57 +03:00
Kees Cook
37bf34e10c drm/i915: Use designated initializers for init/exit table
The kernel builds with -Werror=designated-init, and __designated_init
is used by CONFIG_GCC_PLUGIN_RANDSTRUCT for automatically selected (all
function pointer) structures. Include the field names in the init/exit
table. Avoids warnings like:

drivers/gpu/drm/i915/i915_module.c:59:4: error: positional initialization of field in 'struct' declared with 'designated_init' attribute [-Werror=designated-init]

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Fixes: a04ea6ae7c ("drm/i915: Use a table for i915_init/exit (v2)")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210817233357.2379455-1-keescook@chromium.org
(cherry picked from commit 90fd2194a0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-08-24 17:22:52 +03:00
Nathan Chancellor
c626f3864b drm/exynos: Always initialize mapping in exynos_drm_register_dma()
In certain randconfigs, clang warns:

drivers/gpu/drm/exynos/exynos_drm_dma.c:121:19: warning: variable
'mapping' is uninitialized when used here [-Wuninitialized]
                priv->mapping = mapping;
                                ^~~~~~~
drivers/gpu/drm/exynos/exynos_drm_dma.c:111:16: note: initialize the
variable 'mapping' to silence this warning
                void *mapping;
                             ^
                              = NULL
1 warning generated.

This occurs when CONFIG_EXYNOS_IOMMU is enabled and both
CONFIG_ARM_DMA_USE_IOMMU and CONFIG_IOMMU_DMA are disabled, which makes
the code look like

  void *mapping;

  if (0)
    mapping = arm_iommu_create_mapping()
  else if (0)
    mapping = iommu_get_domain_for_dev()

  ...
  priv->mapping = mapping;

Add an else branch that initializes mapping to the -ENODEV error pointer
so that there is no more warning and the driver does not change during
runtime.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2021-08-22 01:56:39 +09:00
Xiyu Yang
8c27cc5b90 drm/exynos: Convert from atomic_t to refcount_t on g2d_cmdlist_userptr->refcount
refcount_t type and corresponding API can protect refcounters from
accidental underflow and overflow and further use-after-free situations.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2021-08-22 01:56:38 +09:00
Wei Yongjun
b74a29fac6 drm/exynos: g2d: fix missing unlock on error in g2d_runqueue_worker()
Add the missing unlock before return from function g2d_runqueue_worker()
in the error handling case.

Fixes: 445d3bed75 ("drm/exynos: use pm_runtime_resume_and_get()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2021-08-22 01:56:35 +09:00
Michel Dänzer
32bc8f8373 drm/amdgpu: Cancel delayed work when GFXOFF is disabled
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 13:35:42 -04:00
Christian König
2a7b9a8437 drm/amdgpu: use the preferred pin domain after the check
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.

Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-20 13:35:35 -04:00
Michel Dänzer
90a9266269 drm/amdgpu: Cancel delayed work when GFXOFF is disabled
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 12:09:44 -04:00
Christian König
9deb0b3dcf drm/amdgpu: use the preferred pin domain after the check
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.

Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-20 12:09:32 -04:00
Evan Quan
8ac1696b1d drm/amd/pm: a quick fix for "divided by zero" error
Considering Arcturus is a dedicated ASIC for computing, it
will be more proper to drop the support for fan speed reading
and setting. That's on the TODO list.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: Rui Teng <rui.teng@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 12:06:14 -04:00
Dave Airlie
daa7772d47 Merge tag 'amd-drm-fixes-5.14-2021-08-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.14-2021-08-18:

amdgpu:
- vega10 SMU workload fix
- DCN VM fix
- DCN 3.01 watermark fix

amdkfd:
- SVM fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818225137.4070-1-alexander.deucher@amd.com
2021-08-20 15:13:56 +10:00
Dave Airlie
f5b27f7f8d Merge tag 'mediatek-drm-fixes-5.14-2' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes for Linux 5.14-2

1. Fix AAL output size setting.
2. Delete component in remove function.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819001635.14803-1-chunkuang.hu@kernel.org
2021-08-20 10:15:04 +10:00
Dave Airlie
5ce5cef019 Merge tag 'drm-intel-fixes-2021-08-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Expand a tweaked display workaround for all PCHs. (Anshuman)
- Fix eDP MSO pipe sanity checks for ADL-P. (Jani)
- Remove superfluous EXPORT_SYMBOL(). (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YR137zkSAIbun1Ed@intel.com
2021-08-20 09:43:31 +10:00
Dave Airlie
b88aefc51c Merge branch 'linux-5.14' of git://github.com/skeggsb/linux into drm-fixes
- Ampere display fixes
- Fix longstanding MM race issue by removing unused code.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv5jtUFkHsGe-pf-=RceDOgKygjPnCi=6d5vCLM_f5aeMQ@mail.gmail.com
2021-08-20 09:18:14 +10:00
Dave Airlie
e213bd1e72 Merge tag 'drm-misc-fixes-2021-08-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:

 * UAPI: Return results for failed drm_wait_vblank_ioctl()
 * ttm: Fix debugfs initialization

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YR1c7cG1IaL+g8EN@linux-uq9g.fritz.box
2021-08-20 06:01:34 +10:00
Alexey Dobriyan
c0891ac15f isystem: ship and use stdarg.h
Ship minimal stdarg.h (1 type, 4 macros) as <linux/stdarg.h>.
stdarg.h is the only userspace header commonly used in the kernel.

GPL 2 version of <stdarg.h> can be extracted from
http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar.gz

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-08-19 09:02:55 +09:00
Alexey Dobriyan
39f75da7bc isystem: trim/fixup stdarg.h and other headers
Delete/fixup few includes in anticipation of global -isystem compile
option removal.

Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition
of uintptr_t error (one definition comes from <stddef.h>, another from
<linux/types.h>).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-08-19 09:02:55 +09:00
Zhan Liu
37717b8c9f drm/amd/display: Use DCN30 watermark calc for DCN301
[why]
dcn301_calculate_wm_and_dl() causes flickering when external monitor is
connected.

This issue has been fixed before by commit 0e4c0ae59d
("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however
part of the fix was gone after commit 2cbcb78c9e ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next").

[how]
Use dcn30_calculate_wm_and_dlg() instead as in the original fix.

Fixes: 2cbcb78c9e ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next")

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Oliver Logush <oliver.logush@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:30:00 -04:00
Lukas Bulwahn
36a7aee027 drm: amdgpu: remove obsolete reference to config CHASH
Commit 04ed8459f3 ("drm/amdgpu: remove chash") removes the chash
architecture and its corresponding config CHASH.

There is still a reference to CHASH in the config DRM_AMDGPU in
./drivers/gpu/drm/Kconfig.

Remove this obsolete reference to config CHASH.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:26:10 -04:00
Colin Ian King
c94126c4aa drm/amd/pm: Fix spelling mistake "firwmare" -> "firmware"
There is a spelling mistake in a dev_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:26:10 -04:00
YuBiao Wang
691191a2f4 drm/amd/amdgpu:flush ttm delayed work before cancel_sync
[Why]
In some cases when we unload driver, warning call trace
will show up in vram_mgr_fini which claims that LRU is not empty, caused
by the ttm bo inside delay deleted queue.

[How]
We should flush delayed work to make sure the delay deleting is done.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:23:00 -04:00
Candice Li
ce97f37be8 drm/amd: consolidate TA shared memory structures
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:53 -04:00
Hawking Zhang
f2bd514d85 drm/amdgpu: increase max xgmi physical node for aldebaran
aldebaran supports up to 16 xgmi physical nodes.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:46 -04:00
Evan Quan
42447deb88 drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily
We have a S3 issue on that SKU with BACO enabled. Will bring back this
when that root caused.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:25 -04:00
Zhan Liu
65c7e943ea drm/amd/display: Use DCN30 watermark calc for DCN301
[why]
dcn301_calculate_wm_and_dl() causes flickering when external monitor is
connected.

This issue has been fixed before by commit 0e4c0ae59d
("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however
part of the fix was gone after commit 2cbcb78c9e ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next").

[how]
Use dcn30_calculate_wm_and_dlg() instead as in the original fix.

Fixes: 2cbcb78c9e ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next")

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Oliver Logush <oliver.logush@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:25 -04:00
Zhigang Luo
424f2b2e26 drm/amdgpu: correct MMSCH 1.0 version
MMSCH 1.0 doesn't have major/minor version, only verison.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:25 -04:00
Jonathan Kim
44357a1bd5 drm/amdgpu: get extended xgmi topology data
The TA has a limit to the amount of data that can be retrieved from
GET_TOPOLOGY.  For setups that exceed this limit, the xGMI topology
needs to be re-initialized and data needs to be re-fetched from the
extended link records by setting a flag in the shared command buffer.

The number of hops and the number of links must be accumulated by the
driver. Other data points are all fetched from the first request.
Because the TA has already exceeded its link record limit, it
cannot hold bidirectional information.  Otherwise the driver would
have to do more than two fetches so the driver has to reflect the
topology information in the opposite direction.

v2: squashed with internal reviewed fix

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:24 -04:00
Colin Ian King
d03a493f67 drm/mgag200: Fix uninitialized variable delta
The variable delta is not initialized and this will cause unexpected
behaviour with the comparison of tmpdelta < delta. Fix this by setting
it to 0xffffffff. This matches the behaviour as in the similar function
mgag200_pixpll_compute_g200se_04.

v2:
	* move fix up by one line to align style with other functions
	* add additional tags from similar patch

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Fixes: 2545ac9603 ("drm/mgag200: Abstract pixel PLL via struct mgag200_pll")
Addresses-Coverity: ("Uninitialized scalar variable")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817163204.494166-1-colin.king@canonical.com
2021-08-18 20:47:59 +02:00
Jani Nikula
e3e86f4138 drm/i915/dp: remove superfluous EXPORT_SYMBOL()
The symbol isn't needed outside of i915.ko.

Fixes: b30edfd8d0 ("drm/i915: Switch to LTTPR non-transparent mode link training")
Fixes: 264613b406 ("drm/i915: Disable LTTPR support when the DPCD rev < 1.4")
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210816071737.2917-1-jani.nikula@intel.com
(cherry picked from commit d8959fb338)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-18 07:23:57 -04:00
Jani Nikula
baa2152dae drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P
ADL-P supports stream splitter on pipe B in addition to pipe A. Update
the sanity check in intel_ddi_mso_get_config() to reflect this, and
remove the check in intel_ddi_mso_configure() as redundant with
encoder->pipe_mask. Abstract the splitter pipe mask to a single point of
truth while at it to avoid similar mistakes in the future.

Fixes: 7bc188cc2c ("drm/i915/adl_p: enable MSO on pipe B")
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Swati Sharma <swati2.sharma@intel.com>
Tested-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812132354.10885-1-jani.nikula@intel.com
(cherry picked from commit f6864b27d6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-18 07:23:54 -04:00
Anshuman Gupta
b8441b288d drm/i915: Tweaked Wa_14010685332 for all PCHs
dispcnlunit1_cp_xosc_clkreq clock observed to be active on TGL-H platform
despite Wa_14010685332 original sequence,
thus blocks entry to deeper s0ix state.

The Tweaked Wa_14010685332 sequence fixes this issue, therefore use tweaked
Wa_14010685332 sequence for every PCH since PCH_CNP.

v2:
- removed RKL from comment and simplified condition. [Rodrigo]

Fixes: b896898c73 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 platforms")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810113112.31739-2-anshuman.gupta@intel.com
(cherry picked from commit 8b46cc6577)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-08-18 07:23:50 -04:00
Ben Skeggs
59f216cf04 drm/nouveau: rip out nvkm_client.super
No longer required now that userspace can't touch anything that might
need it, and should fix DRM MM operations racing with each other, and
the random hangs/crashes that come with that.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:15 +10:00
Ben Skeggs
148a865378 drm/nouveau: block a bunch of classes from userspace
Long ago, there had been plans for making use of a bunch of these APIs
from userspace and there's various checks in place to stop misbehaving.

Countless other projects have occurred in the meantime, and the pieces
didn't finish falling into place for that to happen.

They will (hopefully) in the not-too-distant future, but it won't look
quite as insane.  The super checks are causing problems right now, and
are going to be removed.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:13 +10:00
Ben Skeggs
50c4a64491 drm/nouveau/fifo/nv50-: rip out dma channels
I honestly don't even know why...  These have never been used.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:11 +10:00
Ben Skeggs
e78b1b545c drm/nouveau/kms/nv50: workaround EFI GOP window channel format differences
Should fix some initial modeset failures on (at least) Ampere boards.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:08 +10:00
Ben Skeggs
6eaa1f3c59 drm/nouveau/disp: power down unused DP links during init
When booted with multiple displays attached, the EFI GOP driver on (at
least) Ampere, can leave DP links powered up that aren't being used to
display anything.  This confuses our tracking of SOR routing, with the
likely result being a failed modeset and display engine hang.

Fix this by (ab?)using the DisableLT IED script to power-down the link,
restoring HW to a state the driver expects.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2021-08-18 19:00:04 +10:00