We can directly calculate sdma doorbell indexes in the process doorbell
pages through the doorbell_index structure in amdgpu_device, so no need
to cache them in kgd2kfd_shared_resources any more. This alleviates the
adaptation needs when new SDMA configurations are introduced.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reserved doorbells for SDMA IH and VCN were not properly masked out
when allocating doorbells for CP user queues. This patch fixed that.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
At least on i965g and i965gm, performing a device reset clobbers the IER
resulting in loss of interrupts thereafter. So, run the irq_postinstall
hook to restore them.
v2: Ville pointed out that he already attempted to solve this problem by
reinstalling the interrupts in intel_reset_finish() (part of the display
handling around reset). However, reinstalling the irq clobbers the
i915->irq_mask which we need for handling MI_USER_INTERRUPTS, and does
so too late to handle any interrupts generated from resuming the rings.
The simple solution to both is to pull the interrupt reenabling from
afterwards to around the device reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190218153106.16768-1-chris@chris-wilson.co.uk
Linux 5.0-rc7
* tag 'v5.0-rc7': (1667 commits)
Linux 5.0-rc7
Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK
Input: st-keyscan - fix potential zalloc NULL dereference
Input: apanel - switch to using brightness_set_blocking()
powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
efi/arm: Revert "Defer persistent reservations until after paging_init()"
arm64, mm, efi: Account for GICv3 LPI tables in static memblock reserve table
sunrpc: fix 4 more call sites that were using stack memory with a scatterlist
include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
Compiler Attributes: add support for __copy (gcc >= 9)
lib/crc32.c: mark crc32_le_base/__crc32c_le_base aliases as __pure
auxdisplay: ht16k33: fix potential user-after-free on module unload
x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls
i2c: bcm2835: Clear current buffer pointers and counts after a transfer
i2c: cadence: Fix the hold bit setting
drm: Use array_size() when creating lease
dm thin: fix bug where bio that overwrites thin block ignores FUA
Revert "exec: load_script: don't blindly truncate shebang string"
Revert "gfs2: read journal in large chunks to locate the head"
net: ethernet: freescale: set FEC ethtool regs version
...
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Some clients, such as mesa, may only emit minimal incremental batches
that rely on the logical context state from previous batches. They know
that recovery is impossible after a hang as their required GPU state is
lost, and that each in flight and subsequent batch will hang (resetting
the context image back to default perpetuating the problem).
To avoid getting into the state in the first place, we can allow clients
to opt out of automatic recovery and elect to ban any guilty context
following a hang. This prevents the continual stream of hangs and allows
the client to recreate their context and rebuild the state from scratch.
v2: Prefer calling it recoverable rather than unrecoverable.
References: https://lists.freedesktop.org/archives/mesa-dev/2019-February/215431.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> # for mesa
Link: https://patchwork.freedesktop.org/patch/msgid/20190218105821.17293-1-chris@chris-wilson.co.uk
GVT implements a homogeneous vGPU as host GPU so max vGPU display pipes
can't exceed HW. The inconsistency definition has potential risks which
could cause array indexing overflow.
Remove the unnecessary define of INTEL_GVT_MAX_PIPE and align with i915.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Add a tracepoint for pipe crc. Makes life much simpler when staring at
traces when hunting for fifo underruns and other issues which cause
corrupted frames. We'll add the tracepoint before filtering out any
potentially bogus crcs during modeset (should actually verify if that
filtering is even correct anymore...)
v2: s/crcs[5]/*crcs/ in the function argument because something
in the macros wants to do sizeof(crcs) and gcc likes to
warn us it's not an actual array so the size may not be
as expected. The silly bugger even does that for 'crcs[]'
causing us to lose any helpful syntactic hint that we
are in fact dealing with an array (kbuild test robot)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190206204910.13965-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Given a master fd we can then override the priority of the context
in another fd.
Using these overrides was recommended by Christian instead of trying
to submit from a master fd, and I am adding a way to override a
single context instead of the entire process so we can only upgrade
a single Vulkan queue and not effectively the entire process.
Reused the flags field as it was checked to be 0 anyways, so nothing
used it. This is source-incompatible (due to the name change), but
ABI compatible.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise we interpret the file private data as drm & amdgpu data
while it might not be, possibly allowing one to get memory corruption.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I don't see another way to figure out if a ring is initialized if
the hardware block might not be initialized.
Entities have been fixed up to handle num_rqs = 0.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some blocks in amdgpu can have 0 rqs.
Job creation already fails with -ENOENT when entity->rq is NULL,
so jobs cannot be pushed. Without a rq there is no scheduler to
pop jobs, and rq selection already does the right thing with a
list of length 0.
So the operations we need to fix are:
- Creation, do not set rq to rq_list[0] if the list can have length 0.
- Do not flush any jobs when there is no rq.
- On entity destruction handle the rq = NULL case.
- on set_priority, do not try to change the rq if it is NULL.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Passing an object_count of sufficient size will make
object_count * 4 wrap around to be very small, then a later function
will happily iterate off the end of the object_ids array. Using
array_size() will saturate at SIZE_MAX, the kmalloc() will fail and
we'll return an -ENOMEM to the norty userspace.
Fixes: 62884cd386 ("drm: Add four ioctls for managing drm mode object leases [v7]")
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Dave Airlie <airlied@redhat.com>
Currently we try to stop the engine by programming the ring registers to
be disabled before we perform the reset. Sometimes, we see the context
image also have invalid ring registers, which one presumes may be
actually caused by us doing so. Lets risk not doing programming the
ring to zero on the first attempt to avoid preserving that corruption
into the context image, leaving the w/a in place for subsequent
reset attempts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190213232047.8486-1-chris@chris-wilson.co.uk
drm/i915 is tracking all wakeref owners with a cookie in order to
identify leaks. To that end, each rpm acquisition ops->get_power is
assigned a cookie which should be passed to ops->put_power to signify
its release (and removal from the list of wakeref owners). As snd/hda is
already using a bool to track current status of display_power extending
that to an unsigned long to hold the boolean cookie is a trivial
extension, and will quell all doubt that snd/hda is the cause of the
device runtime pm leaks.
v2: Keep using the power abstraction for local wakeref tracking.
v3: BUILD_BUG_ON impedance mismatch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190213152109.16997-1-chris@chris-wilson.co.uk
The prepare_fb call always happens on new_plane_state.
The drm_atomic_helper_cleanup_planes checks to see if
plane state pointer has changed when deciding to call cleanup_fb on
either the new_plane_state or the old_plane_state.
For a non-async atomic commit the state pointer is swapped, so this
helper calls prepare_fb on the new_plane_state and cleanup_fb on the
old_plane_state. This makes sense, since we want to prepare the
framebuffer we are going to use and cleanup the the framebuffer we are
no longer using.
For the async atomic update helpers this differs. The async atomic
update helpers perform in-place updates on the existing state. They call
drm_atomic_helper_cleanup_planes but the state pointer is not swapped.
This means that prepare_fb is called on the new_plane_state and
cleanup_fb is called on the new_plane_state (not the old).
In the case where old_plane_state->fb == new_plane_state->fb then
there should be no behavioral difference between an async update
and a non-async commit. But there are issues that arise when
old_plane_state->fb != new_plane_state->fb.
The first is that the new_plane_state->fb is immediately cleaned up
after it has been prepared, so we're using a fb that we shouldn't
be.
The second occurs during a sequence of async atomic updates and
non-async regular atomic commits. Suppose there are two framebuffers
being interleaved in a double-buffering scenario, fb1 and fb2:
- Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
- Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
- Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
We call cleanup_fb on fb2 twice in this example scenario, and any
further use will result in use-after-free.
The simple fix to this problem is to block framebuffer changes
in the drm_atomic_helper_async_check function for now.
v2: Move check by itself, add a FIXME (Daniel)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: <stable@vger.kernel.org> # v4.14+
Fixes: fef9df8b59 ("drm/atomic: initial support for asynchronous plane update")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/275364/
This prevents us from accessing extended registers in tools like
umr. The register access functions already check if the offset
is beyond the BAR size and use the indirect accessors with locking
so this is safe.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Missing firmware declaration caused firmware requirement to
not be noted by the module and may cause firmware to not
be available in initrd.
Fixes: bc4b539e38 "drm/amdgpu: remove old CI DPM implementation"
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When ring hang happens amdgpu_dm_commit_planes during flip is holding
the BO reserved and then stack waiting for fences to signal in
reservation_object_wait_timeout_rcu (which won't signal because there
was a hnag). Then when we try to shutdown display block during reset
recovery from drm_atomic_helper_suspend we also try to reserve the BO
from dm_plane_helper_cleanup_fb ending in deadlock.
Also remove useless WARN_ON
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CP_RB_DOORBELL_RANGE_LOWER/UPPER and CP_MEC_DOORBELL_RANGE_LOWER/UPPER
are used for waking up an idle scheduler and for power gating support.
Usually the first few doorbells in pci doorbell bar are used for RB
and all leftover for MEC. This patch fixes the incorrect settings.
Theoretically, gfx ring doorbells should come before all MEC doorbells
to be consistent with the design. However, since the doorbell
allocations are agreed by all and we are not free to change them, also
considering the kernel MEC ring doorbells which are before gfx ring
doorbells are not used often, we compromise by leaving the doorbell
allocations unchanged.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Temporarily removing eviction fences to avoid triggering them by
accident is no longer necessary due to the fence_owner logic in
amdgpu_sync_resv.
As a result the ef_list usage of amdgpu_amdkfd_remove_eviction_fence
and amdgpu_amdkfd_add_eviction_fence are no longer needed.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use FENCE_OWNER_KFD to synchronize PT/PD initialization and clearing
of page table entries. This avoids triggering KFD eviction fences on
the PD reservation objects of compute VMs.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sriov's gpu_recover inside xgpu_ai_mailbox_flr_work would cause duplicate recover in TDR.
TDR's gpu_recover would be triggered by amdgpu_job_timedout,
that could avoid vk-cts failure by unexpected recover.
Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>