Alex writes:
This is the radeon drm-next request. Big changes include:
- support for dpm on CIK parts
- support for ASPM on CIK parts
- support for berlin GPUs
- major ring handling cleanup
- remove the old 3D blit code for bo moves in favor of CP DMA or sDMA
- lots of bug fixes
[airlied: fix up a bunch of conflicts from drm_order removal]
* 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux: (898 commits)
drm/radeon/dpm: make sure dc performance level limits are valid (CI)
drm/radeon/dpm: make sure dc performance level limits are valid (BTC-SI) (v2)
drm/radeon: gcc fixes for extended dpm tables
drm/radeon: gcc fixes for kb/kv dpm
drm/radeon: gcc fixes for ci dpm
drm/radeon: gcc fixes for si dpm
drm/radeon: gcc fixes for ni dpm
drm/radeon: gcc fixes for trinity dpm
drm/radeon: gcc fixes for sumo dpm
drm/radeonn: gcc fixes for rv7xx/eg/btc dpm
drm/radeon: gcc fixes for rv6xx dpm
drm/radeon: gcc fixes for radeon_atombios.c
drm/radeon: enable UVD interrupts on CIK
drm/radeon: fix init ordering for r600+
drm/radeon/dpm: only need to reprogram uvd if uvd pg is enabled
drm/radeon: check the return value of uvd_v1_0_start in uvd_v1_0_init
drm/radeon: split out radeon_uvd_resume from uvd_v4_2_resume
radeon kms: fix uninitialised hotplug work usage in r100_irq_process()
drm/radeon/audio: set up the sads on DCE3.2 asics
drm/radeon: fix handling of variable sized arrays for router objects
...
Conflicts:
drivers/gpu/drm/i915/i915_dma.c
drivers/gpu/drm/i915/i915_gem_dmabuf.c
drivers/gpu/drm/i915/intel_pm.c
drivers/gpu/drm/radeon/cik.c
drivers/gpu/drm/radeon/ni.c
drivers/gpu/drm/radeon/r600.c
Newer asics have a lot of vram so it's less of an
issue to waste a little more space for the gart
page table. This gives us some additional gart space
before having to migrate to non-gart system ram
for games, etc. where we use up most of vram.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For optimus and powerxpress muxless we really want the GPU
driver deciding when to power up/down the GPU, not userspace.
This adds the ability for a driver to dynamically power up/down
the GPU and remove the switcheroo from controlling it, the
switcheroo reports the dynamic state to userspace also.
It also adds 2 power domains, one for machine where the power
switch is controlled outside the GPU D3 state, so the powerdown
ordering is done correctly, and the second for the hdmi audio
device to make sure it can resume for PCI config space accesses.
v1.1: fix build with switcheroo off
v2: add power domain support for radeon and v1 nvidia dsms
v2.1: fix typo in off case
v3: add audio power domain for hdmi audio + misc audio fixes
v4: use PCI_SLOT macro, drop power reference on hdmi audio resume
failure also.
Signed-off-by: Dave Airlie <airlied@redhat.com>
When we reset the GPU, we need to properly tear
down power management before reseting the GPU and then
set it back up again after reset. Add the missing
radeon_pm_[suspend|resume] calls to the gpu reset
function.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The doorbell aperture is a PCI BAR whose pages can be
mapped to compute resources for things like wptrs
for userspace queues.
This patch maps the BAR and sets up a simple allocator
to allocate pages from the BAR.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
UVD ring can't use scratch thus it does need writeback buffer to keep
a valid address or radeon_ring_backup will trigger a kernel fault.
It's ok to not unpin the write back buffer on suspend as it leave in
gtt and thus does not need eviction.
v2: Fix the uvd case.
Reported and tracked by Wojtek <wojtask9@wp.pl>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This narrows the scope of the apple re-POST hack added in:
drm/radeon: re-POST the asic on Apple hardware when booted via EFI
That patch prevents UVD from working on macs when booted in EFI
mode. The original patch fixed macbook2,1 systems which were
r5xx and hence have no UVD. Limit the hack to those systems to
prevent UVD breakage on newer systems.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=63935
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Matthew Garrett <matthew.garrett@nebula.com>
Newer asics have variable numbers of crtcs. Use that
rather than the asic family to determine which crtcs
to check. This avoids checking non-existent crtcs or
missing crtcs on certain asics.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This is to allow debugging of userspace program not freeing buffer
after, which is basicly a memory leak. This print the list of all
gem object along with their size and placement (VRAM,GTT,CPU) and
with the pid of the task that created them.
agd5f: add warning fix
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Golden registers are arrays of register settings from the
hw team that need to be initialized at asic startup.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a per-asic MC (memory controller) mask which holds the
mak address mask the asic is capable of. Use this when
calculating the vram and gtt locations rather using asic
specific functions or limiting everything to 32 bits.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
More drm-next bits for radeon. Just bug fixes.
* 'drm-next-3.9' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: properly validate the atpx interface
drm/radeon: switch get_gpu_clock() to a callback (v2)
drm/radeon: add a asic callback to get the xclk
drm/radeon: Avoid NULL pointer dereference from atom_index_iio() allocation failure
drm/radeon: remove overzealous warning in hdmi handling
drm/radeon: fix multi-head power profile stability on BTC+ asics
Smatch anlysis:
drivers/gpu/drm/radeon/atom.c:1242 atom_index_iio() error: potential null
dereference 'ctx->iio'. (kzalloc returns null)
Also cleaned up some checks before calls to kfree(). kfree(NULL) is OK.
Cc: David Airlie <airlied@linux.ie>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Michel Dänzer" <michel.daenzer@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex writes:
- CS ioctl cleanup and unification. Unification of a lot of functionality
that was duplicated across multiple generates of hardware.
- Add support for Oland GPUs
- Deprecate UMS support. Mesa and the ddx dropped support for UMS and
apparently very few people still use it since the UMS CS ioctl was broken
for several kernels and no one reported it. It was fixed in 3.8/stable.
- Rework GPU reset. Use the status registers to determine what blocks
to reset. This better matches the recommended reset programming model.
This also allows us to properly reset blocks besides GFX and DMA.
- Switch the VM set page code to use an IB rather than the ring. This
fixes overflow issues when doing large page table updates using a small
ring like DMA.
- Several small cleanups and bug fixes.
* 'drm-next-3.9' of git://people.freedesktop.org/~agd5f/linux: (38 commits)
drm/radeon/dce6: fix display powergating
drm/radeon: add Oland pci ids
drm/radeon: radeon-asic updates for Oland
drm/radeon: add ucode loading support for Oland
drm/radeon: fill in gpu init for Oland
drm/radeon: add Oland chip family
drm/radeon: switch back to using the DMA ring for VM PT updates
drm/radeon: use IBs for VM page table updates v2
drm/radeon: don't reset the MC on IGPs/APUs
drm/radeon: use the reset mask to determine if rings are hung
drm/radeon: halt engines before disabling MC (si)
drm/radeon: halt engines before disabling MC (cayman/TN)
drm/radeon: halt engines before disabling MC (evergreen)
drm/radeon: halt engines before disabling MC (6xx/7xx)
drm/radeon: use status regs to determine what to reset (si)
drm/radeon: use status regs to determine what to reset (cayman)
drm/radeon: use status regs to determine what to reset (evergreen)
drm/radeon: use status regs to determine what to reset (6xx/7xx)
drm/radeon: rework GPU reset on cayman/TN
drm/radeon: rework GPU reset on cayman/TN
...
Originally 'efi_enabled' indicated whether a kernel was booted from
EFI firmware. Over time its semantics have changed, and it now
indicates whether or not we are booted on an EFI machine with
bit-native firmware, e.g. 64-bit kernel with 64-bit firmware.
The immediate motivation for this patch is the bug report at,
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
which details how running a platform driver on an EFI machine that is
designed to run under BIOS can cause the machine to become
bricked. Also, the following report,
https://bugzilla.kernel.org/show_bug.cgi?id=47121
details how running said driver can also cause Machine Check
Exceptions. Drivers need a new means of detecting whether they're
running on an EFI machine, as sadly the expression,
if (!efi_enabled)
hasn't been a sufficient condition for quite some time.
Users actually want to query 'efi_enabled' for different reasons -
what they really want access to is the list of available EFI
facilities.
For instance, the x86 reboot code needs to know whether it can invoke
the ResetSystem() function provided by the EFI runtime services, while
the ACPI OSL code wants to know whether the EFI config tables were
mapped successfully. There are also checks in some of the platform
driver code to simply see if they're running on an EFI machine (which
would make it a bad idea to do BIOS-y things).
This patch is a prereq for the samsung-laptop fix patch.
Cc: David Airlie <airlied@linux.ie>
Cc: Corentin Chary <corentincj@iksaif.net>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Peter Jones <pjones@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Steve Langasek <steve.langasek@canonical.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
vga-switcheroo with apple-gmux does not switch correctly on my system. The PCI
configuration space is not restored correctly, resulting in MSI not working after switch.
Only useful item in dmesg is:
[ 33.922807] radeon 0000:01:00.0: Refused to change power state, currently in D3
I did some testing, dumping the difference in ms between first succesful switch
from D3 to D0, and it seems that there is slightly more than 20 ms difference when
the device is re-enabled through vga-switcheroo.
So bump the re-enable d3 delay to 20 ms to handle this, which fixes msi not working
on my system after switcheroo-ing. Default d3_delay value is PCI_PM_D3_WAIT, 10 ms.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Modeset path seems to conflict sometimes with the memory management
leading to kernel deadlock. This move modesetting reset after GPU
acceleration reset.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
radeon_fence_wait_empty_locked should not trigger GPU reset as no
place where it's call from would benefit from such thing and it
actually lead to a kernel deadlock in case the reset is triggered
from pm codepath. Instead force ring completion in place where it
makes sense or return early in others.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Force all fence to signal if GPU reset failed so no process get stuck
on waiting fence.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
With the new per-crtc locking mutliple set-cursor calls could happen
in parallel. Out of sheer paranoia I've opted for an irqsave spinlock.
But if there's indeed an access from interrupt contexts to these regs
it's already broken with the old code, so this can likely just be
reduced to a normal spinlock. Otoh the pageflip completion happens
from the vblank irq handler ...
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GART and VRAM size limits need to be a power of two.
Fix values greater than 1GB and simplify those checks a bit.
v2: also fix radeon_vram_limit usage, and simplify test even more.
v3: agd5f: fix spelling as noticed by Klaus Schnass
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The actual set up and assignment of VM page tables
is done on the fly in radeon_gart.c.
v2: update vm size comments
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Let's allow GCC to optimize better.
This exposed some five unused functions, but this patch doesn't remove them.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move binding onto the ring, simplifying handling a bit.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
It was only used for dynpm, but has been replaced with
a better implementation using fences. Remove it.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
radeon_ring_restore is freeing the memory for the saved
ring data. We need to remember that, otherwise we try to
restore the ring data again on the next try. Additional
to that it shouldn't try the reset infinitely if we have
saved ring data.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
It seems some of those IGP dislike non dma32 page despite what
documentation says. Fix regression since we allowed non dma32
pages. It seems it only affect some revision of those IGP chips
as we don't know which one just force dma32 for all of them.
https://bugzilla.redhat.com/show_bug.cgi?id=785375
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Returns a snapshot of the GPU clock counter. Needed
for certain OpenGL extensions.
v2: agd5f
- address Jerome's comments
- add function documentation
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adds documentation to most of the functions in
radeon_device.c
v2: split out general descriptions as per Christian's
comments.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Just store the index in the ring structure.
Idea taken from one of Jerome's wip rptr patches.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Try to save whatever is on the rings when
we encounter an lockup.
v2: Fix spelling error. Free saved ring data if reset fails.
Add documentation for the new functions.
v3: Some more spelling fixes
v4: It doesn't make sense to save anything if all fences
are signaled
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Making it easier to control when it is executed.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
GPU reset need to be exclusive, one happening at a time. For this
add a rw semaphore so that any path that trigger GPU activities
have to take the semaphore as a reader thus allowing concurency.
The GPU reset path take the semaphore as a writer ensuring that
no concurrent reset take place.
v2: init rw semaphore
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Try to remove or replace the cs_mutex with a
vm_mutex where it is still needed.
v2: fix locking order
v3: rebased on drm-next
Signed-off-by: Christian König <deathsimple@vodafone.de>
The spinlock was actually there to protect the
rptr, but rptr was read outside of the locked area.
Also we don't really need a spinlock here, an
atomic should to quite fine since we only need to
prevent it from being reentrant.
v2: Keep the spinlock....
v3: Back to an atomic again after finding & fixing the real bug.
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
It is a rw_semaphore now and only write locked
while changing the clock. Also the lock is renamed
to better reflect what it is protecting.
v2: Keep the ttm_vm_ops on IGPs
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
This adds prime->fd and fd->prime support to radeon.
It passes the sg object to ttm and then populates
the gart entries using it.
Compile tested only.
v2: stub kmap + use new helpers + add reimporting
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All radeon_gem_init() does is initialize the gem objects
list. radeon_device.c does this explicitly. r600+ calls
radeon_gem_init() so the list gets initialized twice. Older
asics don't call it at all and rely on the the init in
radeon_device.c. Just call radeon_gem_init() in radeon_device.c
and remove the explicit calls from all the newer asics.
All asics call radeon_gem_fini() in their fini pathes. That
could possibly be cleaned up too.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This changes the API as a clean-up. Instead of passing multiple
function pointers at each time, introduce a new struct holding the
whole callback functions and pass it to the registration.
The same struct will be used for the upcoming audio client
registration, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
It isn't necessary any more and the suballocator seems to perform
even better.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Directly use the suballocator to get small chunks of memory.
It's equally fast and doesn't crash when we encounter a GPU reset.
v2: rebased on new SA interface.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some callers illegal called fence_wait_next/empty
while holding the ring emission mutex. So don't
relock the mutex in that cases, and move the actual
locking into the fence code.
v2: Don't try to unlock the mutex if it isn't locked.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Using 64bits fence sequence we can directly compare sequence
number to know if a fence is signaled or not. Thus the fence
list became useless, so does the fence lock that mainly
protected the fence list.
Things like ring.ready are no longer behind a lock, this should
be ok as ring.ready is initialized once and will only change
when facing lockup. Worst case is that we return an -EBUSY just
after a successfull GPU reset, or we go into wait state instead
of returning -EBUSY (thus delaying reporting -EBUSY to fence
wait caller).
v2: Remove left over comment, force using writeback on cayman and
newer, thus not having to suffer from possibly scratch reg
exhaustion
v3: Rebase on top of change to uint64 fence patch
v4: Change DCE5 test to force write back on cayman and newer but
also any APU such as PALM or SUMO family
v5: Rebase on top of new uint64 fence patch
v6: Just break if seq doesn't change any more. Use radeon_fence
prefix for all function names. Even if it's now highly optimized,
try avoiding polling to often.
v7: We should never poll the last_seq from the hardware without
waking the sleeping threads, otherwise we might lose events.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>