Sometimes it may be the case when we idle the gpu or wait on something
we don't actually want to process the retiring list. This patch allows
callers to choose the behavior.
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.
Signed-off-by: Akshay Joshi <me@akshayjoshi.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
As pointed out by Dan Carpenter, it was seemingly possible to hit an error
whilst mapping the buffer for the regs (except the only likely error
returns should not happen during init) and so leak a pin count on the
bo. To handle this we would need to reacquire the struct mutex, so for
simplicity rearrange for the lock to be held for the entire function.
For extra pedagogy, test that we only call init once.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
When auditing the locking in i915_gem.c (for a prospective change which
I then abandoned), I noticed two places where struct_mutex is not held
across GEM object manipulations that would usually require it.
Since one is in initial setup and the other in driver unload, I'm
guessing the mutex is not required for either; but post a patch in case
it is.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When auditing the locking in i915_gem.c (for a prospective change which I
then abandoned), I noticed two places where struct_mutex is not held
across GEM object manipulations that would usually require it. Since one
is in initial setup and the other in driver unload, I'm guessing the mutex
is not required for either; but post a patch in case it is.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
We need to perform a few operations in order to move the object into the
display plane (where it can be accessed coherently by the display
engine) that are important for future safety to forbid whilst pinned. As a
result, we want to need to perform some of the operations before pinning,
but some are required once we have been bound into the GTT. So combine
the pinning performed by all the callers with set_to_display_plane(), so
this complication is contained within the single function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The code paths for modesetting are growing in complexity as we may need
to move the buffers around in order to fit the scanout in the aperture.
Therefore we face a choice as to whether to thread the interruptible status
through the entire pinning and unbinding code paths or to add a flag to
the device when we may not be interrupted by a signal. This does the
latter and so fixes a few instances of modesetting failures under stress.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Dave Airlie spotted that we had a potential bug should we ever rearrange
the drm_i915_gem_object so not the base drm_gem_object was not its first
member. He noticed that we often convert the return of
drm_gem_object_lookup() immediately into drm_i915_gem_object and then
check the result for nullity. This is only valid when the base object is
the first member and so the superobject has the same address. Play safe
instead and use the compiler to convert back to the original return
address for sanity testing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We had some conversions over to the _PIPE macros, but didn't get
everything. So hide the per-pipe regs with an _ (still used in a few
places for legacy) and add a few _PIPE based macros, then make sure
everyone uses them.
[update: remove usage of non-existent no-op macro]
[update 2: keep modesetting suspend/resume code, update to new reg names]
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[ickle: stylistic cleanups for checkpatch and taste]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The bulk of the change is to convert the growing list of rings into an
array so that the relationship between the rings and the semaphore sync
registers can be easily computed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
With this change, every batchbuffer can use all available fences (save
pinned and scanout, of course) without ever stalling the gpu!
In theory. Currently the actual pipelined update of the register is
disabled due to some stability issues. However, just the deferred update
is a significant win.
Based on a series of patches by Daniel Vetter.
The premise is that before every access to a buffer through the GTT we
have to declare whether we need a register or not. If the access is by
the GPU, a pipelined update to the register is made via the ringbuffer,
and we track the last seqno of the batches that access it. If by the
CPU we wait for the last GPU access and update the register (either
to clear or to set it for the current buffer).
One advantage of being able to pipeline changes is that we can defer the
actual updating of the fence register until we first need to access the
object through the GTT, i.e. we can eliminate the stall on set_tiling.
This is important as the userspace bo cache does not track the tiling
status of active buffers which generate frequent stalls on gen3 when
enabling tiling for an already bound buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
a00b10c360 "Only enforce fence limits inside the GTT" also
added a fenceable/mappable disdinction when binding/pinning buffers.
This only complicates the code with no pratical gain:
- In execbuffer this matters on for g33/pineview, as this is the only
chip that needs fences and has an unmappable gtt area. But fences
are only possible in the mappable part of the gtt, so need_fence
implies need_mappable. And need_mappable is only set independantly
with relocations which implies (for sane userspace) that the buffer
is untiled.
- The overlay code is only really used on i8xx, which doesn't have
unmappable gtt. And it doesn't support tiled buffers, currently.
- For all other buffers it's a bug to pass in a tiled bo.
In short, this disdinction doesn't have any practical gain.
I've also reverted mapping the overlay and context pages as possibly
unmappable. It's not worth being overtly clever here, all the big
gains from unmappable are for execbuf bos.
Also add a comment for a clever optimization that confused me
while reading the original patch by Chris Wilson.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
So long as we adhere to the fence registers rules for alignment and no
overlaps (including with unfenced accesses to linear memory) and account
for the tiled access in our size allocation, we do not have to allocate
the full fenced region for the object. This allows us to fight the bloat
tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside
the GTT we still suffer the additional alignment constraints, so it doesn't
magic allow us to render larger scenes without stalls -- we need the
expanded GTT and fence pipelining to overcome those...]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Like before add a parameter mappable (also to gem_object_pin) and
set it depending upon the context. Only bos that are brought into
the gtt due to an execbuffer call can be put into the unmappable
part of the gtt, everything else (especially pinned objects) need
to be put into the mappable part of the gtt.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Preparing the ringbuffer for adding new commands can fail (a timeout
whilst waiting for the GPU to catch up and free some space). So check
for any potential error before overwriting HEAD with new commands, and
propagate that error back to the user where possible.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
"depth" should be signed in case packed_depth_bytes() returns -EINVAL.
This probably doesn't make a difference at runtime. In the original
code we would return -EINVAL later if (rec->offset_Y % 4294967274) is
non-zero.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (476 commits)
vmwgfx: Implement a proper GMR eviction mechanism
drm/radeon/kms: fix r6xx/7xx 1D tiling CS checker v2
drm/radeon/kms: properly compute group_size on 6xx/7xx
drm/radeon/kms: fix 2D tile height alignment in the r600 CS checker
drm/radeon/kms/evergreen: set the clear state to the blit state
drm/radeon/kms: don't poll dac load detect.
gpu: Add Intel GMA500(Poulsbo) Stub Driver
drm/radeon/kms: MC vram map needs to be >= pci aperture size
drm/radeon/kms: implement display watermark support for evergreen
drm/radeon/kms/evergreen: add some additional safe regs v2
drm/radeon/r600: fix tiling issues in CS checker.
drm/i915: Move gpu_write_list to per-ring
drm/i915: Invalidate the to-ring, flush the old-ring when updating domains
drm/i915/ringbuffer: Write the value passed in to the tail register
agp/intel: Restore valid PTE bit for Sandybridge after bdd3072
drm/i915: Fix flushing regression from 9af90d19f
drm/i915/sdvo: Remove unused encoding member
i915: enable AVI infoframe for intel_hdmi.c [v4]
drm/i915: Fix current fb blocking for page flip
drm/i915: IS_IRONLAKE is synonymous with gen == 5
...
Fix up conflicts in
- drivers/gpu/drm/i915/{i915_gem.c, i915/intel_overlay.c}: due to the
new simplified stack-based kmap_atomic() interface
- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: added .llseek entry due to BKL
removal cleanups.
Keep the current interface but ignore the KM_type and use a stack based
approach.
The advantage is that we get rid of crappy code like:
#define __KM_PTE \
(in_nmi() ? KM_NMI_PTE : \
in_irq() ? KM_IRQ_PTE : \
KM_PTE0)
and in general can stop worrying about what context we're in and what kmap
slots might be appropriate for that.
The downside is that FRV kmap_atomic() gets more expensive.
For now we use a CPP trick suggested by Andrew:
#define kmap_atomic(page, args...) __kmap_atomic(page)
to avoid having to touch all kmap_atomic() users in a single patch.
[ not compiled on:
- mn10300: the arch doesn't actually build with highmem to begin with ]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When separating out the prepare/commit into its own separate functions
we overlooked that the intel_crtc->dpms_mode was being used elsewhere to
check on the actual status of the pipe.
Track that bit of logic separately from the actual dpms mode, so there
is no confusion should we be able to handle multiple dpms modes, nor
any semantic conflict between prepare/commit and dpms.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Commit 77d07fd9d7 introduced a regression
where by not waiting for the panel to be turned off, left the panel and
PLL registers locked across the modeset. Thus the panel remaining blank.
As pointed out by Daniel Vetter, when testing LVDS it helps to open the
laptop and look at the actual panel you are purporting to test.
A second issue with the patch was that in order to modify the panel
fitter before gen5, the pipe and the panel must have be completely
powered down. So we wait.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The purpose is to make the code much easier to read and therefore reduce
the possibility for bugs.
A side effect is that it also makes it much easier for the compiler,
reducing the object size by 4k -- from just a few functions!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Alexander reported that the compilation of intel_overlay.c was failing
due to an inclusion that was only valid with CONFIG_DEBUG_FS. As the
whole error reporting is only useful with debugfs enabled, remove all
the redundant error state collection code when compiling without
CONFIG_DEBUG_FS.
Reported-by: Alexander Lam <lambchop468@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Slightly easier to follow than the state machine and now possible as the
control structure is opaque and hw_wedged is no longer interferred with.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
During DPMS we currently do not want the overlay code to be
interruptible, so pass that information down and only take the
uninterrruptible paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On i830, there exists a bug where an overlay on pipe B requires the mode
clock on pipe A in order to activate. So workaround this by activating
pipe A when trying to enable the overlay on pipe B.
References:
[Bug 29007] GPU hang on video playback with overlay
https://bugs.freedesktop.org/show_bug.cgi?id=29007
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
By allocating the request prior to writing to the ringbuffer, we can
abort the operation without leaving the GPU in an inconsistent state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Inline the call to wait_flip() and simplify the resulting code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We can program the h/w to first wait on the flip and then switch off
without relying on s/w intervention. This removes the need for a double
step switch off, bringing much rejoicing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The scoping of the validity of the mapping is thus clarified.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The only time where an atomic mapping is required is during
error-capture and there we cannot use the default slot, but need to
specifically use one of the IRQ slots. So separate out the two
conditions and use the atomic mapping only when appropriate.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just makes sure that writes are not being aliased by the CPU cache and
do make it out to main memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=24977
Cc: stable@kernel.org
... instead of threading flush_domains through the execbuf code to
i915_add_request.
With this change 2 small cleanups are possible (likewise the majority
of the patch):
- The flush_domains parameter of i915_add_request is always 0. Drop it
and the corresponding logic.
- Ditto for the seqno param of i915_gem_process_flushing_list.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
drivers/gpu/drm/i915/intel_overlay.c: In function 'intel_overlay_print_error_state':
drivers/gpu/drm/i915/intel_overlay.c:1467: error: implicit declaration of function 'seq_printf'
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16811
Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Eric Anholt <eric@anholt.net>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Andre Muller <andremuellerster@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
i830 requires 32bpp cursors to be aligned to 16KB, so we have to expose
the alignment parameter to i915_gem_attach_phys_object().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
v2: Add the interrupt status and address.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
This is required should we ever attempt to use an io-mapping where
KM_USER0 is verboten, such as inside an IRQ context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Apparently i830 and i845 cannot handle any stride that is not a multiple
of 256, unlike their brethren which do support 64 byte aligned strides.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Signed-off-by: Eric Anholt <eric@anholt.net>
The active list and request list move into the ringbuffer structure,
so each can track its active objects in the order they are in that
ring. The flushing list does not, as it doesn't matter which ring
caused data to end up in the render cache. Objects gain a pointer to
the ring they are active on (if any).
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Introduces a more complete intel_ring_buffer structure with callbacks
for setup and management of a particular ringbuffer, and converts the
render ring buffer consumers to use it.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com>
[anholt: Fixed up whitespace fail and rebased against prep patches]
Signed-off-by: Eric Anholt <eric@anholt.net>
Luckily the change is quite a little bit less invasive than I've
feared.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Just preparation, no functional change.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is a purely cosmetic change to make changes in this area easier.
And hey, it's not only clearer and typechecked, but actually shorter,
too!
[anholt: To clarify, this is a change to let us later make
drm_i915_gem_object subclass drm_gem_object, instead of having
drm_gem_object have a pointer to i915's private data]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
We should free "params" before returning.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@kernel.org (for .33)
Signed-off-by: Eric Anholt <eric@anholt.net>
* anholt/drm-intel-next:
drm/i915: Record batch buffer following GPU error
drm/i915: give up on 8xx lid status
drm/i915: reduce some of the duplication of tiling checking
drm/i915: blow away userspace mappings before fence change
drm/i915: move a gtt flush to the correct place
agp/intel: official names for Pineview and Ironlake
drm/i915: overlay: drop superflous gpu flushes
drm/i915: overlay: nuke readback to flush wc caches
drm/i915: provide self-refresh status in debugfs
drm/i915: provide FBC status in debugfs
drm/i915: fix drps disable so unload & re-load works
drm/i915: Fix OGLC performance regression on 945
drm/i915: Deobfuscate the render p-state obfuscation
drm/i915: add dynamic performance control support for Ironlake
drm/i915: enable memory self refresh on 9xx
drm/i915: Don't reserve compatibility fence regs in KMS mode.
drm/i915: Keep MCHBAR always enabled
drm/i915: Replace open-coded eviction in i915_gem_idle()
Cache-coherency is maintained by gem. Drop these leftover MI_FLUSH
commands from the userspace code.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
I retested this and whatever this papered over, the problem doesn't seem
to exist anymore.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
[anholt: fixed up compile warning]
Signed-off-by: Eric Anholt <eric@anholt.net>
Mostly obvious simplifications.
The i915 pread/pwrite ioctls, intel_overlay_put_image and
nouveau_gem_new were incorrectly using the locked versions
without locking: this is also fixed in this patch.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
IGD* isn't a useful name. Replace with the codenames, as sourced from
pci.ids.
Signed-off-by: Adam Jackson <ajax@redhat.com>
[anholt: Fixed up for merge with pineview/ironlake changes]
Signed-off-by: Eric Anholt <eric@anholt.net>
When switching to interruptible sleeps in the overlay code, I've
forgotten to recover from interruptions at one site. This
resulted in the overlay still running when it should have been
switched off. This in turn caused a hang on resume because it
tried to disable the (not-running) overlay in preparation for the
resume modeset.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=24980
Tested-by: maximlevitsky@gmail.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
I've suspected some bug there wrt to suspend, but that was not
the case. Clean up the code anyway.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
I've simply overlooked one case in the conversion to interruptible
sleeps. Rectify this.
Also delete a leftover debug printk.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
At least for the common case of userspace ioctls. When doing a
modeset operation, the wait is still uninterruptible. But considering
that failing to turn off the overlay when switching off the crtc it's
running on hangs the chip, it doesn't complicate matters _very_
much. There's just an unkillable X in addition to a black screen.
BUG() about it and explain in the code.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
As long as the gpu can keep up, neither the cpu (waiting for gpu)
nore the gpu (waiting for vblank to do an overlay flip) stalls.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
Now that the cache flushing of the memory based overlay regs works,
we can safely switch off the overlay. Beforehand it was only disabled
(like in userspace).
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
This implements intel overlay support for kms via a device-specific
ioctl. Thomas Hellstrom brought up the idea of a general ioctl (on
dri-devel). We've reached the conclusion that such an infrastructure
only makes sense when multiple kms overlay implementations exists,
which atm don't (and it doesn't look like this is gonna change).
Open issues:
- Runs in sync with the gpu, i.e. unnecessary waiting. I've decided
to wait on this because the hw tends to hang when changing something
in this area. I left some dummy functions as infrastructure.
- polyphase filtering uses a static table.
- uses uninterruptible sleeps. Unfortunately the alternatives may
unnecessarily wedged the hw if/when we timeout too early (and
userspace only overloaded the batch buffers with stuff worth a few
secs of gpu time).
Changes since v1:
- fix off-by-one misconception on my side. This fixes fullscreen
playback.
Changes since v2:
- add underrun detection as spec'ed for i965.
- flush caches properly, fixing visual corruptions.
Changes since v4:
- fix up cache flushing of overlay memory regs.
- killed require_pipe_a logic - it hangs the chip.
Tested-By: diego.abelenda@gmail.com (on a 865G)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[anholt: Resolved against the MADVISE ioctl going in before this one]
Signed-off-by: Eric Anholt <eric@anholt.net>