If one GTT BO has been evicted/swapped out, it should sit in CPU domain.
TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node
for sysMem.
Now when we update mapping for such invalidated BOs, we might walk out
of bounds of struct ttm_resource.
Three possible fix:
1) Let sysMem manager alloc struct ttm_range_mgr_node, like
ttm_range_manager does.
2) Pass pages_addr to update_mapping function too, but need memset
pages_addr[] to zero when unpopulate.
3) Init amdgpu_res_cursor directly.
bug is detected by kfence.
==================================================================
BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0
Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167):
amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu]
amdgpu_vm_bo_update+0x282/0xa40 [amdgpu]
amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu]
amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu]
amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu]
drm_ioctl_kernel+0xf3/0x180 [drm]
drm_ioctl+0x2cb/0x550 [drm]
amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]
kfence-#167 [0x000000008e11c055-0x000000001f676b3e
ttm_sys_man_alloc+0x35/0x80 [ttm]
ttm_resource_alloc+0x39/0x50 [ttm]
ttm_bo_swapout+0x252/0x5a0 [ttm]
ttm_device_swapout+0x107/0x180 [ttm]
ttm_global_swapout+0x6f/0x130 [ttm]
ttm_tt_populate+0xb1/0x2a0 [ttm]
ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm]
ttm_mem_evict_first+0x59d/0x9c0 [ttm]
ttm_bo_mem_space+0x39f/0x400 [ttm]
ttm_bo_validate+0x13c/0x340 [ttm]
ttm_bo_init_reserved+0x269/0x540 [ttm]
amdgpu_bo_create+0x1d1/0xa30 [amdgpu]
amdgpu_bo_create_user+0x40/0x80 [amdgpu]
amdgpu_gem_object_create+0x71/0xc0 [amdgpu]
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu]
kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
kfd_ioctl+0x461/0x690 [amdgpu]
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
An early pull for v5.15 (there'll be more coming in a week or two),
consisting of the drm/scheduler conversion and a couple other small
series that one was based one. Mostly sending this now because IIUC
danvet wanted it in drm-next so he could rebase on it. (Daniel, if
you disagree then speak up, and I'll instead include this in the main
pull request once that is ready.)
This also has a core patch to drop drm_gem_object_put_locked() now
that the last use of it is removed.
[airlied: add NULL to drm_sched_init]
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGumRk7H88bqV=H9Fb1SM0zPBo5B7NsCU3jFFKBYxf5k+Q@mail.gmail.com
drm-misc-next for v5.15:
UAPI Changes:
- Add modifiers for arm fixed rate compression.
Cross-subsystem Changes:
- Assorted dt binding fixes.
- Convert ssd1307fb to json-schema.
- Update a lot of irc channels to point to OFTC, as everyone moved there.
- Fix the same divide by zero for asilantfb, kyro, rivafb.
Core Changes:
- Document requirements for new atomic properties.
- Add drm_gem_fb_(begin/end)_cpu_access helpers, and use them in some drivers.
- Document drm_property_enum.value for bitfields.
- Add explicit _NO_ for MIPI_DSI flags that disable features.
- Assorted documentation fixes.
- Update fb_damage handling, and move drm_plane_enable_fb_damage_clips to core.
- Add logging and docs to RMFB ioctl.
- Assorted small fixes to dp_mst, master handling.
- Clarify drm lease usage.
Driver Changes:
- Assorted small fixes to panfrost, hibmc, bridge/nwl-dsi, rockchip, vc4.
- More drm -> linux irq conversions.
- Add support for some Logic Technologies and Multi-Inno panels.
- Expose phy-functionality for drm/rockchip, to allow controlling from the media subsystem.
- Add support for 2 AUO panels.
- Add damage handling to ssd1307fb.
- Improve FIFO handling on mxsfb.
- Assorted small fixes to vmwgfx, and bump version to 2.19 for the new ioctls.
- Improve sony acx424akp backlight handling.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a753221a-e23e-0dc4-7ca6-8c1b179738d0@linux.intel.com
The vc4_hdmi_audio_prepare function and the functions it's calling have
in several occurences multiple dereferences of either the sample rate or
the number of channels.
It turns out that these variables are also passed through the hdmi codec
parameters structure. Convert all the users to use this structure, and
if it's used multiple times use a variable to store it instead of
dereferencing it every time.
Reviewed-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210707093632.1468127-1-maxime@cerno.tech
We make the following changes to the documentation of drm leases to
make it easier to reason about their usage. In particular, we clarify
the lifetime and locking rules of lease fields in drm_master:
1. Make it clear that &drm_device.mode_config.idr_mutex protects the
lease idr and list structures for drm_master. The lessor field itself
doesn't need to be protected as it doesn't change after it's set in
drm_lease_create.
2. Add descriptions for the lifetime of lessors and leases.
3. Add an overview DOC: section in drm-uapi.rst that defines the
terminology for drm leasing, and explains how leases work and why
they're used.
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728102739.441543-1-desmondcheongzx@gmail.com
Fix a bug in smu_cmn_send_msg_without_waiting() in
that this function does not need to take the
smu->message_lock mutex in order to send a message
down to the SMU. The mutex is acquired by the
caller of this function instead.
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Changfeng Zhu <Changfeng.Zhu@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Fixes: 5810323ba6 ("drm/amd/pm: Fix a bug communicating with the SMU (v5)")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fence driver was enabled per ring when sw init on per IP block before.
Change to enable all the fence driver at the same time after
amdgpu_device_ip_init finished.
Rename some function related to fence to make it reasonable for read.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On platforms that support multiple backlights, register
each one separately. This lets us manage them independently
rather than registering a single backlight and applying the
same settings to both.
v2: fix typo:
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This converts the internal backlight in the Sony ACX424AKP
driver to do it the canonical way:
- Assign the panel->backlight during probe.
- Let the panel framework handle the backlight.
- Make the backlight .set_brightness() turn the backlight
off completely if blank.
- Fix some dev_err_probe() use cases along the way.
Tested on the U8500 HREF520 reference design.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210715092808.1100106-1-linus.walleij@linaro.org
We've gotten a number of reports about backlight control not
working on panels which indicate that they use aux backlight
control. A recent patch:
commit 2d73eabe29
Author: Camille Cho <Camille.Cho@amd.com>
Date: Thu Jul 8 18:28:37 2021 +0800
drm/amd/display: Only set default brightness for OLED
[Why]
We used to unconditionally set backlight path as AUX for panels capable
of backlight adjustment via DPCD in set default brightness.
[How]
This should be limited to OLED panel only since we control backlight via
PWM path for SDR mode in LCD HDR panel.
Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Camille Cho <Camille.Cho@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Changes some other code to only use aux for backlight control on
OLED panels. The commit message seems to indicate that PWM should
be used for SDR mode on HDR panels. Do something similar for
backlight control in general. This may need to be revisited if and
when HDR started to get used.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=213715
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The customized OD settings can be divided into two parts: those
committed ones and non-committed ones.
- For those changes which had been fed to SMU before S3/S4/Runpm
suspend kicked, they are committed changes. They should be properly
restored and fed to SMU on S3/S4/Runpm resume.
- For those non-committed changes, they are restored only without feeding
to SMU.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This version brings along following fixed:
- Guard DST_Y_PREFETCH register overflow in DCN21
- Add missing DCN21 IP parameter
- Fix PSR command version
- Add ETW logging for AUX failures
- Add ETW log to dmub_psr_get_state
- Fixed EdidUtility build errors
- Fix missing reg offset for the dmcub test debug registers
- Adding update authentication interface
- Remove unused functions of opm state query support
- Always wait for update lock status
- Refactor riommu invalidation wa
- Ensure dentist display clock update finished in DCN20
Reviewed-by: Hsieh Mike <Mike.Hsieh@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
We don't check DENTIST_DISPCLK_CHG_DONE to ensure dentist
display clockis updated to target value. In some scenarios with large
display clock margin, it will deliver unfinished display clock and cause
issues like display black screen.
[How]
Checking DENTIST_DISPCLK_CHG_DONE to ensure display clock
has been update to target value before driver do other clock related
actions.
Reviewed-by: Cyr Aric <aric.cyr@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Dale Zhao <dale.zhao@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
It has been decided that opm state query support will be dropped.
Therefore link encryption enabled and save current encryption states
won't be used anymore and there are no foreseeable usages in the future.
We will remove these two interfaces for clean up.
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
Previously to toggle authentication, we need to remove and
add the same display back with modified adjustment.
This method will toggle DTM state without actual hardware changes.
This is not per design and would cause potential issues in the long run.
[how]
We are creating a dedicated interface that does the same thing as
remove and add back the display without changing DTM state.
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[HOW]
Added #ifdefs and refactored various parts of dc to
allow dc_link to be built by AMD EDID UTILITY
[WHY]
dc_dsc was refactored moving some of the code that AMD EDID UTILITY needed
to dc_link, so now dc_link needs to be included by AMD EDID UTILITY
Squash in DCN config fix (Alex)
Reviewed-by: Leung Martin <Martin.Leung@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Mark Morra <MarkAlbert.Morra@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The code was trying to keep a strict limit on the amount of mob
memory that was used in the guest by making it match the host
settings. There's technically no reason to do that (guests can
certainly use more than the host can have resident in renderers
at the same time).
In particular this is problematic because our userspace is not
great at handling OOM conditions and running out of MOB space
results in GL apps crashing, e.g. gnome-shell likes to allocate
huge surfaces (~61MB for the desktop on 2560x1600 with two workspaces)
and running out of memory there means that the gnome-shell crashes
on startup taking us back to the login and resulting in a system
where one can not login in graphically anymore.
Instead of letting the userspace crash we can extend available
MOB space, we just don't want to use all of the RAM for graphics,
so we're going to limit it to half of RAM.
With the addition of some extra logging this should make the
"guest has been configured with not enough graphics memory"
errors a lot easier to diagnose in cases where the automatic
expansion of MOB space fails.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723165153.113198-3-zackr@vmware.com
The code was using the old DRM logging functions, which made it
hard to figure out what was coming from vmwgfx. The newer logging
helpers include the driver name in the logs and make it explicit
which driver they're coming from. This allows us to standardize
our logging a bit and clean it up in the process.
vmwgfx is a little special because technically the hardware it's
running on can be anything from the last 12 years or so which is
why we need to include capabilities in the logs in the first
place or otherwise we'd have no way of knowing what were
the capabilities of the platform the guest was running in.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723165153.113198-2-zackr@vmware.com
The macro has been accounting for DRM_COMMAND_BASE for a long time
now so there's no reason to still be duplicating it. Plus we were
leaving the name undefined which meant that all the DRM ioctl
warnings/errors were always listing "null" ioctl at the culprit.
This fixes the undefined ioctl name and removes duplicated code.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723165153.113198-1-zackr@vmware.com
The drm/scheduler provides additional prioritization on top of that
provided by however many number of ringbuffers (each with their own
priority level) is supported on a given generation. Expose the
additional levels of priority to userspace and map the userspace
priority back to ring (first level of priority) and schedular priority
(additional priority levels within the ring).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-13-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
This was only used to detect userspace including the same bo multiple
times in a submit. But ww_mutex can already tell us this.
When we drop struct_mutex around the submit ioctl, we'd otherwise need
to lock the bo before adding it to the bo_list. But since ww_mutex can
already tell us this, it is simpler just to remove the bo_list.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210728010632.2633470-11-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
For existing adrenos, there is one or more ringbuffer, depending on
whether preemption is supported. When preemption is supported, each
ringbuffer has it's own priority. A submitqueue (which maps to a
gl context or vk queue in userspace) is mapped to a specific ring-
buffer at creation time, based on the submitqueue's priority.
Each ringbuffer has it's own drm_gpu_scheduler. Each submitqueue
maps to a drm_sched_entity. And each submit maps to a drm_sched_job.
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/4
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210728010632.2633470-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>