Owain Ainsworth noticed that the reset code failed to clear the flushing
list leaving the driver in an inconsistent state following a hung GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When flushing the GPU domains,we emit a flush on *both* rings, even
though they share a unified cache. Only emit the flush on the currently
active ring.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Change the semantics to retire any buffer older than the current seqno
rather than repeatedly calling calling the function to retire the
buffer at the head of the list matching the request seqno.
Whilst this should have no semantic impact on the implementation, Daniel
was wondering if there was a bug where we might miss a retirement and so
end up with a continually growing active list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
With 5 places to update when adding handling for fence registers, it is
easy to overlook one or two. Correct that oversight, but fence
management should be improved before a new set of registers is added.
Bugzilla: https://bugs.freedesktop.org/show_bug?id=30199
Original patch by: Yuanhan Liu <yuanhan.liu@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
As we currently may need to acquire a fence register during a modeset,
we need to be able to do so in an uninterruptible manner. So expose that
parameter to the callers of the fence management code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This ensures that we do wait upon the flushes to complete if necessary
and avoid the visual tears, whilst enabling pipelined page-flips.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I pulled the wrong version of the patch from Daniel Vetter which was
missing the read barriers -- and the one that was causing all the trouble
was from i915_gem_object_put_fence_reg(), leading to GPU hangs on gen3.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
By reducing the hangcheck frequency we check less often, conserving
resources, and still detect a lock up quickly. On a fast machine with a
slow GPU (like a Core2 paired with a 945G) it is easy for the hangcheck to
misfire as we check too fast.
Also once hung and if we fail to completely reset the chip, we have a
nasty habit of proclaming a hang many times a second and generating a
strobe-like display.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This spinlock only served debugging purposes in a time when we could not
be sure of the mutex ever being released upon a GPU hang. As we now
should be able rely on hangcheck to do the job for us (and that error
reporting should not itself require the struct mutex) we can kill the
incomplete attempt at protection.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
By allocating the request prior to writing to the ringbuffer, we can
abort the operation without leaving the GPU in an inconsistent state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... take advantage of the new implicit request issuing of
i915_wait_request.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
One caller (for the pageflip support) wants a purely pipelined flush.
Distinguish this case by a new parameter. This will also be useful
later on for pipelined fencing.
v2: Simplify the code by depending upon the implicit request emitting
of i915_wait_request.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[ickle: And drop the non-interruptible support in the process.]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
By moving one i915_add_request we can solely depend on the new
auto-seqno-numbering behaviour.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
i915_gem_object_move_to_active can handle zero seqno for us now.
And not emitting a request is not fatal here - we'll try to emit
a new one if we have to wait for some rendering to complete.
In case this assumption ever gets accidentally broken, there's already
a BUG_ON to catch it in i915_do_wait_request.
So just silently ignore ENOMEM here instead of screwing up the whole
drm.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
... instead of threading flush_domains through the execbuf code to
i915_add_request.
With this change 2 small cleanups are possible (likewise the majority
of the patch):
- The flush_domains parameter of i915_add_request is always 0. Drop it
and the corresponding logic.
- Ditto for the seqno param of i915_gem_process_flushing_list.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previously I thought that one interrupt per batchbuffer should be
enough. Now tedious benchmarking showed this to be wrong.
Therefore track whether any commands have been isssued with a future
seqno (like pipelined fencing changes or flushes). If this is the case
emit a request before issueing the batchbuffer.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that we can move objects to the active list without already having
emitted a request, move the flushing list handling into i915_gem_flush.
This makes more sense and allows to drop a few i915_add_request calls
that are not strictly necessary.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Sometimes (like when flushing in preparation of batchbuffer execution)
we know that we'll emit a request but haven't yet done so. Allow this
case by simply taking the next seqno by default. Ensure that a request
is eventually emitted before waiting for an request by issuing it
in i915_wait_request iff this is not yet done.
Also replace one open-coded version of i915_gem_object_wait_rendering,
to prevent future code-diversion.
Chris Wilson asked me to explain and clarify what this patch does and why.
Here it goes:
Old way of moving objects onto the active list and associating them with a
reques:
1. i915_add_request + store the returned seqno somewhere
2. i915_gem_object_move_to_active (with the stored seqno as parameter)
For the current users, this is all fine. But I'd like to associate objects
(and fence regs) with the batchbuffer request deep down in the execbuf
call-chain. I thought about three ways of implementing this.
a) Don't care, just emit request when we need a new seqno. When heavily
pipelining fence reg changes, this would have caused tons of superflous
request (and corresponding irqs).
b) Thread all changed fences, objects, whatever through the execbuf-maze,
so that when we emit a request, we can store the new seqno at all the right
places.
c) Kill that seqno-threading-around business by simply storing the next
seqno, i.e. allow 2. to be done before 1. in the above sequence.
I've decided to implement c) (in this patch). The following patches are
just fall-out that resulted from this small conceptual change.
* We can handle the flushing list processing where we actually emit a flush
(i915_gem_flush and i915_retire_commands) instead of in i915_add_request.
The code makes IMHO more sense this way (and i915_add_request looses the
flush_domains parameter, obviously).
* We can avoid emitting unnecessary requests. IMHO there's no point in
emitting more than one request per batchbuffer (with or without an
corresponding irq).
* By enforcing 2. before 1. ordering in the above sequence the seqno
argument of i915_gem_object_move_to_active is redundant and can be
dropped.
v2: Now i915_wait_request issues request if it is not yet emitted.
Also introduce i915_gem_next_request_seqno(dev) just in case we ever
need to do some prep work before using a new seqno.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[ickle: Keep i915_gem_object_set_to_display_plane() uninterruptible.]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
ums-gem code correctly cancels the retire work (at lastclose time),
kms does not do so. Fix this by canceling the work right after ideling
the gpu.
While staring at the code I noticed that the work function is not
static. Fix this, too.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This is the first patch to clean up module unload races due to
outstanding timers/work. Preparatory step: Thou shalt not destroy
the workqueue when new work might still get enqued.
Now error_work gets queued by the hangcheck timer and only (atomically)
reads the chip wedged status. So cancel it right after the hangcheck
timer is killed. But the hangcheck is armed by interrupts, so move
everything after irqs are disabled.
Also change a del_timer to a del_timer_sync in the ums gem code, the
hangcheck timer is self-rearming.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Sandybridge GTT has new cache control bits in PTE, which controls
graphics page cache in LLC or LLC/MLC, so we need to extend the mask
function to respect the new bits.
And set cache control to always LLC only by default on Gen6.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
copy_to_user() returns the number of bytes remaining to be copied and
I'm pretty sure we want to return a negative error code here.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
This reverts commit 86f100b136.
The kref API requires the handlecount to be initialised to one on object
creation (so that kref_get() doesn't complain upon first use) so the
dalliance in the drivers is required in order to sink the initial
floating reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (58 commits)
drm/i915,intel_agp: Add support for Sandybridge D0
drm/i915: fix render pipe control notify on sandybridge
agp/intel: set 40-bit dma mask on Sandybridge
drm/i915: Remove the conflicting BUG_ON()
drm/i915/suspend: s/IS_IRONLAKE/HAS_PCH_SPLIT/
drm/i915/suspend: Flush register writes before busy-waiting.
i915: disable DAC on Ironlake also when doing CRT load detection.
drm/i915: wait for actual vblank, not just 20ms
drm/i915: make sure eDP PLL is enabled at the right time
drm/i915: fix VGA plane disable for Ironlake+
drm/i915: eDP mode set sequence corrections
drm/i915: add panel reset workaround
drm/i915: Enable RC6 on Ironlake.
drm/i915/sdvo: Only set is_lvds if we have a valid fixed mode.
drm/i915: Set up a render context on Ironlake
drm/i915 invalidate indirect state pointers at end of ring exec
drm/i915: Wake-up wait_request() from elapsed hang-check (v2)
drm/i915: Apply i830 errata for cursor alignment
drm/i915: Only update i845/i865 CURBASE when disabled (v2)
drm/i915: FBC is updated within set_base() so remove second call in mode_set()
...
We now attempt to free "active" objects following a GPU hang as either
the GPU will be reset or the hang is permenant. In either case, the GPU
writes will not be flushed to main memory and it should be safe to
return that memory back to the system.
The BUG_ON(active) is thus overkill and can erroneously fire after a
EIO.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
This is consistent with trying to access a filename that not exist
within a directory which is a good analogy here. The main reason for the
change is that it is easy to confuse the error code of EBADF as an
performing an ioctl on an invalid file descriptor (rather than an
unknown object).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
i830 requires 32bpp cursors to be aligned to 16KB, so we have to expose
the alignment parameter to i915_gem_attach_phys_object().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
shmfs doesn't actually implement i_ops->truncate() so we were not
immedatiately releasing the backing pages when shrinking the gfx cache
under OOM. Instead use a combination of truncate_inode_pages() and
i_ops->truncate_range() as is used by shmem_delete_inode().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
In order to reduce the penalty of fallbacks under memory pressure and to
avoid a potential immediate ping-pong of evicting a mmaped buffer, we
move the object to the tail of the inactive list when a page is freshly
faulted or the object is moved into the CPU domain.
We choose not to protect the CPU objects from casual eviction,
preferring to keep the GPU active for as long as possible.
v2: Daniel Vetter found a bug where I forgot that pinned objects are
kept off the inactive list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
The eviction code is the gnarly underbelly of memory management, and is
clearer if kept separated from the normal domain management in GEM.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
This will be used by the eviction logic to maintain fairness between the
rings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
This does two little changes:
- Add an alignment parameter for evict_something. It's not really great to
whack a carefully sized hole into the gtt with the wrong alignment.
Especially since the fallback path is a full evict.
- With the inactive scan stuff we need to evict more that one object, so
move the unbind call into the helper function that scans for the object
to be evicted, too. And adjust its name.
No functional changes in this patch, just preparation.
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
In order to properly track bound objects, they need to exist on one of
the inactive/active lists or be pinned. As this is a requirement, do the
work inside i915_gem_bind_to_gtt() rather than dotted around the
callsites.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
This debugging trace was useful for finding the fbcon regression on
i965, and it may prove useful again in future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Incorporates a similar patch by Daniel Vetter, the alteration being to
report the current busy state after retiring.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
This avoids the excess flush and requests on idle rings (and spamming
the debug log ;-)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
This is required should we ever attempt to use an io-mapping where
KM_USER0 is verboten, such as inside an IRQ context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When creating an object, we create the handle by which it is known to
the process and which own the reference to the object. That reference to
the new handle is what we want to transfer to the process, not the lost
reference to the object; so free the local object reference *not* the
process's handle reference.
This brings i915_gem_object_create_ioctl() into line with
drm_gem_open_ioctl()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
If we fail to flush outstanding GPU writes but return the memory to the
system, we risk corrupting memory should the GPU recovery and complete
those writes. On the other hand, if we bail early and free the object
then we have a definite use-after-free and real memory corruption.
Choose the lesser of two evils, since in order to recover from the hung
GPU we need to completely reset it, those pending writes should
never happen.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
If during the freeing of an object the unbind is interrupted by a system
call, which is quite possible if we have outstanding GPU writes that
must be flushed, the unbind is silently aborted. This still leaves the
AGP region and backing pages allocated, and perhaps more importantly,
the object remains upon the various lists exposing us to memory
corruption.
I think this is the cause behind the use-after-free, such as
Bug 15664 - Graphics hang and kernel backtrace when starting Azureus
with Compiz enabled
https://bugzilla.kernel.org/show_bug.cgi?id=15664
v2: Daniel Vetter reminded me that kernel space programming is never easy.
We cannot simply spin to clear the pending signal and so must deferred
the freeing of the object until later.
v3: Run from the top level retire requests.
v4: Tested with P(return -ERESTARTSYS)=.5 from i915_gem_do_wait_request()
v5: Rebase against Eric's for-linus tree.
v6: Refactor, split and add a comment about avoiding unbounded recursion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
Combine the iteration over active render rings into a common function.
This is in preparation for reusing the idle function to also retire
deferred free requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Simple fix for error propagation along the old UMS path.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>