Skip the "manual" pageflip completion checks via polling and
guessing in the vblank handler radeon_crtc_handle_vblank() on
asics which are known to reliably support hw pageflip completion
irqs. Those pflip irqs are a more reliable and race-free method
of handling pageflip completion detection, whereas the "classic"
polling method has some small races in combination with dpm on,
and with the reworked pageflip implementation since Linux 3.16.
On old asics without pflip irqs, the classic method is used.
On asics with known good pflip irqs, only pflip irqs are used
by default, but a new module parameter "use_pflipirqs" allows to
override this in case we encounter asics in the wild with
unreliable or faulty pflip irqs. A module parameter of 0 allows
to use the classic method only in such a case. A parameter of 1
allows to use both classic method and pflip irqs as additional
band-aid to avoid some small races which could happen with the
classic method alone. The setting 1 gives Linux 3.16 behaviour.
Hw pflip irqs are available since R600.
Tested on DCE-4, AMD Cedar - FirePro 2270.
v2: agd5f: only enable pflip interrupts on DCE4+ as they are not
reliable on older asics.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adjust the previous tweak for hawaii to return 3 if the new firmware is used.
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Older firmware didn't support the new nop packet.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Return 2 so we can be sure the kernel has the necessary
changes for acceleration to work.
Note: This patch depends on these two commits:
- drm/radeon: fix cut and paste issue for hawaii.
- drm/radeon: use packet2 for nop on hawaii with old firmware
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Cc: stable@vger.kernel.org
Older firmware didn't support the new nop packet.
v2 (Andreas Boll):
- Drop usage of packet3 for new firmware
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Cc: stable@vger.kernel.org
That should allow us to allocate bigger BOs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the decision what to use into the common VM code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This closes a small window where the GPU might have accessed freed up memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To be consistent with radeon_bo_unref, needed in the following patch.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's causing lockdep warnings and why should
we access the memory that is freed up?
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: rebase on vm_size scale change. Adjust vm_size default to 8,
Better handle the default and smaller values.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Let's try to fix bugs related to this instead of just disabling it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Scales much better than scanning the address range linearly.
v2: store pfn instead of address
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Won't work anyway, instead WARN_ON if the VA list isn't
empty when we free the BO.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't wait for the BO to be used again, just
update the PT on the next VM use.
v2: remove stray semicolon.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The HDP cache only applies to CPU access to VRAM.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some hawaii boards use a different method for fetching the
voltage information from the vbios.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This ensures the GPU sees all previous CPU writes to VRAM, which makes it
safe:
* For userspace to stream data from CPU to GPU via VRAM instead of GTT
* For IBs to be stored in VRAM instead of GTT
* For ring buffers to be stored in VRAM instead of GTT, if the HPD flush
is performed via MMIO
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
And clean up the function comment a little.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Both on their own are complex enough.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to unblank the display when resuming the MC. No
functional change as this code path is not currently
hit. We always disable the displays entirely rather
than just blanking them.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Seems to make VM flushes more stable on SI and CIK.
v2: only use the PFP on the GFX ring on CIK
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For symmetry with other *_set_wptr hooks.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PCI GART doesn't support unsnooped access. AGP GART already uses
write-combined CPU mappings.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That didn't worked correctly any more and opened up a security problem.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Unused and unimplemented. Also fix specifying the
kernel flag incorrectly at one occasion.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now that fallback to gtt is fixed for cpu access, we can
remove this limit.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=78717
v2: use new gart_pin_size to accurately track available gtt.
v3: fix comment
v4: clarify comment
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Gives more accurate count and prevents failures when we can't
allocate memory for the tests.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Gives a more accurate limit than the previous code.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
So we know how large an allocation we can allow.
v2: incorporate Michel's comments
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
v2: fix rebase onto drm-fixes
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Doesn't seem necessary, the GART table memory should be persistent.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
These clutter up dmesg during piglit runs. Userspace generally deals
gracefully with this failure.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We keep a cached version of the edid in radeon_connector which
we use for determining connectedness and when to enable certain
features like hdmi audio, etc. When the user uses the firmware
interface to override the driver with some other edid the driver's
copy is never updated. The fetch function will check if there
is a user supplied edid and update the driver's copy if there
is.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=80691
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Split radeon_ddc_get_modes() and move it into
radeon_connectors.c since that is the only place
that uses it.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to continue with the loops once we've matched
the appropriate connector.
See commit 8a992ee145
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Valid values are 1 to 251 for 0 to 500 ms latency, 0 for unknown
and 255 for audio/video unsupported by sink, according to HDMI 1.3 spec.
Also matches Radeon HDA verb 0xf7b documentation.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This was originally un-inlined by Andi Kleen in 2011 citing size concerns.
Indeed, a first attempt at inlining it grew radeon.ko by 7%.
However, 2% of cpu is spent in this function. Simply inlining it gave 1% more fps
in Urban Terror.
v2: We know the minimum MMIO size. Adding it to the if allows the compiler to
optimize the branch out, improving both performance and size.
The v2 patch decreases radeon.ko size by 2%. I didn't re-benchmark, but common sense
says perf is now more than 1% better.
v3: Also change _wreg, make the threshold a define.
Inlining _wreg increased the size a bit compared to v2, so now radeon.ko
is only 1% smaller.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This enables the display scaler on all connectors for r5xx
and newer asics. Previously we only enabled the scaler for
fixed mode displays (eDP or LVDS) since they have to use the
scaler to support non-native modes. Most other displays
are multi-sync or have a built in scaler to support non-native
modes. The default scaling mode for non-fixed displays is
none which will use the scaler in the monitor. Note that
we do not populate any fake modes like we do for fixed
displays so it will only use the modes in the edid. For
other modes, you'll need to populate them manually.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=80868
v2: properly handle scaling with no modes defined
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>