So when reviewing Michel's patch I've noticed a few things and cleaned
them up:
- The early checks in ppgtt_release are now redundant: The inactive
list should always be empty now, so we can ditch these checks. Even
for the aliasing ppgtt (though that's a different confusion) since
we tear that down after all the objects are gone.
- The ppgtt handling functions are splattered all over. Consolidate
them in i915_gem_gtt.c, give them OCD prefixes and add wrappers for
get/put.
- There was a bit a confusion in ppgtt_release about whether it cares
about the active or inactive list. It should care about them both,
so augment the WARNINGs to check for both.
There's still create_vm_for_ctx left to do, put that is blocked on the
removal of ppgtt->ctx. Once that's done we can rename it to
i915_ppgtt_create and move it to its siblings for handling ppgtts.
v2: Move the ppgtt checks into the inline get/put functions as
suggested by Chris.
v3: Inline the now redundant ppgtt local variable.
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
VMAs should take a reference of the address space they use.
Now, when the fd is closed, it will release the ref that the context was
holding, but it will still be referenced by any vmas that are still
active.
ppgtt_release() should then only be called when the last thing referencing
it releases the ref, and it can just call the base cleanup and free the
ppgtt.
Note that with this we will extend the lifetime of ppgtts which
contain shared objects. But all the non-shared objects will get
removed as soon as they drop of the active list and for the shared
ones the shrinker can eventually reap them. Since we currently can't
evict ppgtt pagetables either I don't think that temporary leak is
important.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Add note about potential ppgtt leak with this approach.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Allocate and populate the default LRC for every ring, call
gen-specific init/cleanup, init/fini the command parser and
set the status page (now inside the LRC object). These are
things all engines/rings have in common.
Stopping the ring before cleanup and initializing the seqnos
is left as a TODO task (we need more infrastructure in place
before we can achieve this).
v2: Check the ringbuffer backing obj for ring_is_initialized,
instead of the context backing obj (similar, but not exactly
the same).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The backing objects and ringbuffers for contexts created via open
fd are actually empty until the user starts sending execbuffers to
them. At that point, we allocate & populate them. We do this because,
at create time, we really don't know which engine is going to be used
with the context later on (and we don't want to waste memory on
objects that we might never use).
v2: As contexts created via ioctl can only be used with the render
ring, we have enough information to allocate & populate them right
away.
v3: Defer the creation always, even with ioctl-created contexts, as
requested by Daniel Vetter.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we have the ability to allocate our own context backing objects
and we have multiplexed one of them per engine inside the context structs,
we can finally allocate and free them correctly.
Regarding the context size, reading the register to calculate the sizes
can work, I think, however the docs are very clear about the actual
context sizes on GEN8, so just hardcode that and use it.
v2: Rebased on top of the Full PPGTT series. It is important to notice
that at this point we have one global default context per engine, all
of them using the aliasing PPGTT (as opposed to the single global
default context we have with legacy HW contexts).
v3:
- Go back to one single global default context, this time with multiple
backing objects inside.
- Use different context sizes for non-render engines, as suggested by
Damien (still hardcoded, since the information about the context size
registers in the BSpec is, well, *lacking*).
- Render ctx size is 20 (or 19) pages, but not 21 (caught by Damien).
- Move default context backing object creation to intel_init_ring (so
that we don't waste memory in rings that might not get initialized).
v4:
- Reuse the HW legacy context init/fini.
- Create a separate free function.
- Rename the functions with an intel_ preffix.
v5: Several rebases to account for the changes in the previous patches.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For the moment this is just a placeholder, but it shows one of the
main differences between the good ol' HW contexts and the shiny
new Logical Ring Contexts: LR contexts allocate and free their
own backing objects. Another difference is that the allocation is
deferred (as the create function name suggests), but that does not
happen in this patch yet, because for the moment we are only dealing
with the default context.
Early in the series we had our own gen8_gem_context_init/fini
functions, but the truth is they now look almost the same as the
legacy hw context init/fini functions. We can always split them
later if this ceases to be the case.
Also, we do not fall back to legacy ringbuffers when logical ring
context initialization fails (not very likely to happen and, even
if it does, hw contexts would probably fail as well).
v2: Daniel says "explain, do not showcase".
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: s/BUG_ON/WARN_ON/.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The comment [which was mine] is wrong. The context object can never be
bound in a PPGTT because it is only capable of living in the Global GTT.
So, remove the comment, and reorder the unref. What's nice about the
latter is it keeps the context object alive past the PPGTT. This makes
the destroy ordering symmetric with the creation ordering.
Create:
1. Create context
2. Create PPGTT
Destroy:
1. Destroy PPGTT
2. Destroy context
As far as I know, this does not fix a bug. The code previously kept the
context data structure, only the object was gone. As the code was,
nothing tried to use the object after this point.
NOTE: If in the future we have cases where the PPGTT can/should outlive
the context (which doesn't occur today, but the code permits it), this
ordering does not matter. Even if this occurs, as it stands now, we do
not expect that to be the normal case, and having this order makes
debugging a bit easier if we're tracking object lifetimes for the
context vs ppgtt
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Resolve conflict with Oscar's execlist prep patches.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is an Execlists preparatory patch, since they make context ID become an
overloaded term:
- In the software, it was used to distinguish which context userspace was
trying to use.
- In the BSpec, the term is used to describe the 20-bits long field the
hardware uses to it to discriminate the contexts that are submitted to
the ELSP and inform the driver about their current status (via Context
Switch Interrupts and Context Status Buffers).
Initially, I tried to make the different meanings converge, but it proved
impossible:
- The software ctx->id is per-filp, while the hardware one needs to be
globally unique.
- Also, we multiplex several backing states objects per intel_context,
and all of them need unique HW IDs.
- I tried adding a per-filp ID and then composing the HW context ID as:
ctx->id + file_priv->id + ring->id, but the fact that the hardware only
uses 20-bits means we have to artificially limit the number of filps or
contexts the userspace can create.
The ctx->user_handle renaming bits are done with this Cocci patch (plus
manual frobbing of the struct declaration):
@@
struct intel_context c;
@@
- (c).id
+ c.user_handle
@@
struct intel_context *c;
@@
- (c)->id
+ c->user_handle
Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and
change the type to unsigned 32 bits.
v2: s/handle/user_handle and change the type to uint32_t as suggested by
Chris Wilson.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have already advanced that Logical Ring Contexts have their own kind
of backing objects, but everything will be better explained in the Execlists
series. For now, suffice it to say that the current backing object is only
ever used with the render ring, so we're making this fact more explicit
(which is a good reason on its own).
As for the is_initialized flag, we only use to signify that the render state
has been initialized (a.k.a. golden context, a.k.a. null context). It doesn't
mean anything for the other engines, so make that distinction obvious.
Done with the following Coccinelle patch (plus manual frobbing of the struct):
@@
struct intel_context c;
@@
- (c).obj
+ c.legacy_hw_ctx.rcs_state
@@
struct intel_context *c;
@@
- (c)->obj
+ c->legacy_hw_ctx.rcs_state
@@
struct intel_context c;
@@
- (c).is_initialized
+ c.legacy_hw_ctx.initialized
@@
struct intel_context *c;
@@
- (c)->is_initialized
+ c->legacy_hw_ctx.initialized
This Execlists prep-work patch has been suggested by Chris Wilson and Daniel
Vetter separately.
Initially, it was two separate patches:
drm/i915: Rename ctx->obj to ctx->rcs_state
drm/i915: Make it obvious that ctx->id is merely a user handle
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/id/is_initialized/ to fix the subject and resolve a
conflict in i915_gem_context_reset. Also introduce a new lctx local
variable to avoid overtly long lines.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is preparatory work for Execlists: we plan to use it later to
allocate our own context objects (since Logical Ring Contexts do
not have the same kind of backing objects).
No functional changes.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We're forgetting to unpin the last_context from the ggtt at GPU reset
time. This leads to the vma pin_count leaking at every reset if the
last context wasn't the ring default context. Further use of the same
context will trigger the pin_count check in i915_gem_object_pin() and
userspace will be faced with EBUSY as a result.
This plaques kms_flip rather badly since it performs lots of resets,
and every fd has its own default context these days.
Fix the problem by properly unpinning the last context at reset.
This regression seems to back to
commit acce9ffa48
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:03 2013 -0800
drm/i915: Better reset handling for contexts
Testcase: igt/gem_ctx_exec/reset-pin-leak
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTuaWZAAoJEHm+PkMAQRiGfkIH/2Hhwrg51GWazUYIXVxz5zLU
kPMlaws3vankbhka9HCg02eS3tkzr6shO3F/qlBba+5GUkUDKCcCisIsvk4hgZZg
7YqepTvcaupNxIp4TmTGm1FYVK1GpaWFdJVgg2PDdGFahw3HSlfZoTkBzirNCwga
p/jfeRzathbUixpz9OAC1AEn2gP1AxNRpSt1wShL5rexBb1YRXCPuCEt9B0UsVoR
mzKf5xEsuaZnpCuvWK4S60fjfVhTe8UJ/xGPPfdLyIXU0rvhaKzfeVQO6F5nIQBy
Xvrar1f7oOPZaJRdlmPvAimS7iS8lq/YctuHu7ia1NdJSihtA5sRPf7cWAw2d7s=
=4PrL
-----END PGP SIGNATURE-----
Merge tag 'v3.16-rc4' into drm-intel-next-queued
Due to Dave's vacation drm-next hasn't opened yet for 3.17 so I
couldn't move my drm-intel-next queue forward yet like I usually do.
Just pull in the latest upstream -rc to unblock patch merging - I
don't want to needlessly rebase my current patch pile really and void
all the testing we've done already.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fallout from
commit 46470fc932
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Wed May 21 19:01:06 2014 +0300
drm/i915: Add null state batch to active list
undid the earlier fix of only marking the ctx as initialised after it is
saved by the hardware during a SET_CONTEXT operation:
commit ad1d219974
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Sat Dec 28 13:31:49 2013 -0800
drm/i915: set ctx->initialized only after RCS
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[Jani: add reference to the earlier fix in the commit messsage.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The original comment that introduced it said:
commit 0009e46cd5
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:02 2013 -0800
drm/i915: Track which ring a context ran on
Previously we dropped the association of a context to a ring. It is
however very important to know which ring a context ran on (we could
have reused the other member, but I was nitpicky).
This is very important when we switch address spaces, which unlike
context objects, do change per ring.
As an example, if we have:
RCS BCS
ctx A
ctx A
ctx B
ctx B
Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.
But this is not really true, because we are already checking to != from (with
"from" being = ring->last_context) and that should be enough to make sure we
switch to the right address space.
We would have a problem if we switched the context object for every ring (since
then we would fail to do it in some situations) but we only switch it for the
render ring, so we don't care.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's barely alive now anyway, so give it the "coup de grâce".
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Up until now, contexts had one (and only one) backing object that was
used by the hardware to save/restore render ring contexts (via the
MI_SET_CONTEXT command). Other rings did not have or need this, so
our i915_hw_context struct had a 1:1 relationship with a a real HW
context.
With Logical Ring Contexts and Execlists, this is not possible anymore:
all rings need a backing object, and it cannot be reused. To prepare
for that, rename our contexts to the more generic term intel_context.
No functional changes.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.
No functional changes.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
for proper refcounting to take place as we use
i915_add_request() for it.
i915_add_request() also takes the context for the request
from ring->last_context so move the null state batch
submission after the ring context has been set.
v2: we need to check for correct ring now (Ville Syrjälä)
v3: no need to expose i915_gem_move_object_to_active (Chris Wilson)
v4: cargoculted vma/active/inactive error handling removed (Chris Wilson)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We implement the following workarounds:
* WaDisableAsyncFlipPerfMode:chv
* WaProgramMiArbOnOffAroundMiSetContext:chv
v2: Drop WaDisableSemaphoreAndSyncFlipWait note
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since commit 691e6415c8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Apr 9 09:07:36 2014 +0100
drm/i915: Always use kref tracking for all contexts.
we have contexts everywhere, and so we must be careful to distinguish
fake contexts, which do not have an associated bo, and real ones, which
do. In particular, we now need to be careful not to dereference NULL
pointers.
This is one such example, as the commit highlighted above failed to move
the unpinning of the default ctx object into the real-context-only
branch.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78792
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HW guys say that it is not a cool idea to let device
go into rc6 without proper 3d pipeline state.
For each new uninitialized context, generate a
valid null render state to be run on context
creation.
This patch introduces a skeleton with empty states.
v2: - No need to vmap (Chris Wilson)
- use .c files for state (Daniel Vetter)
- no need to flush as i915_add_request does it
- remove parameter for batch alloc size
- don't wait for the init (Ben Widawsky)
v3: - move to cpu/gpu (Chris Wilson)
Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1)
Tested-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm-intel-next-2014-04-16:
- vlv infoframe fixes from Jesse
- dsi/mipi fixes from Shobhit
- gen8 pageflip fixes for LRI/SRM from Damien
- cmd parser fixes from Brad Volkin
- some prep patches for CHV, DRRS, ...
- and tons of little things all over
drm-intel-next-2014-04-04:
- cmd parser for gen7 but only in enforcing and not yet granting mode - the
batch copying stuff is still missing. Also performance is a bit ... rough
(Brad Volkin + OACONTROL fix from Ken).
- deprecate UMS harder (i.e. CONFIG_BROKEN)
- interrupt rework from Paulo Zanoni
- runtime PM support for bdw and snb, again from Paulo
- a pile of refactorings from various people all over the place to prep for new
stuff (irq reworks, power domain polish, ...)
drm-intel-next-2014-04-04:
- cmd parser for gen7 but only in enforcing and not yet granting mode - the
batch copying stuff is still missing. Also performance is a bit ... rough
(Brad Volkin + OACONTROL fix from Ken).
- deprecate UMS harder (i.e. CONFIG_BROKEN)
- interrupt rework from Paulo Zanoni
- runtime PM support for bdw and snb, again from Paulo
- a pile of refactorings from various people all over the place to prep for new
stuff (irq reworks, power domain polish, ...)
Conflicts:
drivers/gpu/drm/i915/i915_gem_context.c
If we always initialize kref for the context, even if we are using fake
contexts for hangstats when there is no hw support, we can forgo the
dance to dereference the ctx->obj and inspect whether we are permitted
to use kref inside i915_gem_context_reference() and _unreference().
My ulterior motive here is to improve the debugging of a use-after-free
of ctx->obj. This patch avoids the dereference here and instead forces
the assertion checks associated with kref.
v2: Refactor the fake contexts to being even more like the real
contexts, so that there is much less duplicated and special case code.
v3: Tweaks.
v4: Tweaks, minor.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: lu hua <huax.lu@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[Jani: tiny change to backport to drm-intel-fixes.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We don't do CPU access to GPU contexts so making the GPU access snoop
the CPU caches seems silly, and potentially expensive.
v2: Use !IS_VALLEYVIEW instead of HAS_LLC as this is really
about what the PTEs can represent.
Add a comment clarifying the situation.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have been setting the bit which was originally BIOS dependent since:
commit f05bb0c7b6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Jan 20 16:33:32 2013 +0000
drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits
Therefore, we do not need to try to figure it out dynamically and we can
just always invalidate the TLBs.
It's a partial revert of:
commit 12b0286f49
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Mon Jun 4 14:42:50 2012 -0700
drm/i915: possibly invalidate TLB before context switch
The original commit attempted to only invalidate when necessary
(very much a relic from the old days). Now, we can just always invalidate.
I guess the old TODO still exists. Since we seem to have abandoned ILK
contexts however, there isn't much point in even remembering.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
BSpec seems to tell us we need the MI_ARB_ON_OFF w/a around
MI_SET_CONTEXT on gen8.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The idea of printing objects used by each process is to judge how each
process is using them. This means that we need to evaluate whether the
object is bound for that particular process, rather than just whether it
is bound into the global GTT.
v2: Restore the non-full-ppgtt path for simplicity as we may not even
create vma with older hardware.
v3: Tweak handling of global entries and default context entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We used to have per file descriptor hang stats for the
i915_get_reset_stats_ioctl() and for default context banning.
commit 0eea67eb26
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:19 2013 -0800
drm/i915: Create a per file_priv default context
made having separate hangstats in file_private redundant
as i915_hw_context already contained hangstats. So
commit c482972a08
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Fri Dec 6 14:11:20 2013 -0800
drm/i915: Piggy back hangstats off of contexts
consolidated the hangstats and enabled further improvements.
commit 44e2c0705a
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Thu Jan 30 16:01:15 2014 +0200
drm/i915: Use i915_hw_context to set reset stats
tried to reap full benefits of consolidation but fell short
as we never 'switch' to the fake private context on gens
that don't have hw_contexts, so request->ctx remained NULL
on those.
Fix this by 'switching' to fake context so that when
request is submitted to ring, proper context gets assigned
to it.
Testcase: igt/drv_hangman
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76055
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
While reading some code, out of boredom, stumbled on a tiny tiny fix.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload. ie. I don't think this
needs to go to stable.
v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
At one time it was expected to be called in multiple places by kref_put.
At the current time however, it is all contained within
i915_gem_context.c.
This patch makes an upcoming required addition a bit nicer since it too
doesn't need to be defined in a header file.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The i915 driver sets DRIVER_GEM unconditionally, so testing for the
feature will always fail.
Signed-off-by: Thierry Reding <treding@nvidia.com>
[danvet: Fix up conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.
Split out from Chris' vma-binding rework patch.
v2: Undo the behaviour change in object_pin that Chris spotted.
v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.
v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.
v5: Reorder the PIN_ flags for more natural patch splitup.
v6: Pull out the PIN_GLOBAL split-up again.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we have stopped rings then we know that test is running
so no need for spam. In addition, only spam when default
context gets banned.
v2: - make sure default context ban gets shown (Chris)
- use helper for checking for default context, everywhere (Chris)
v3: - dont be quiet when debug is set (Ben, Daniel)
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73652
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There are cases where we want to know if there is a full, or aliased
PPGTT. Currently, in fact the only distinction we ever need to make is
when we're using full PPGTT.
This patch is simply to promote readability and clarify for the
confusing existing usage where "aliasing" meant aliasing and full.
v2: Remove USES_ALIASING_PPGTT since there are currently no cases where
we need to check if we're using aliasing, but not full PPGTT. (Daniel)
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
WaMiSetContext_Hang tells us that a MI_NOOP must follow MI_SET_CONTEXT.
The other thing WaMiSetContext_Hang seems to say is that URB_FENCE isn't
allowed to straddle two cachelines. But we don't issue those from the
kernel so we don't care.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The initialized flag is used to specify a context has been initialized
and it's context is safe to load, ie. the 3d state is setup properly.
With full PPGTT, we emit the address space loads during context switch
and this currently marks a context as initialized. With full PPGTT
patches, if a client first emits a batch to !RCS, then later, RCS, the
code will mistake the context as initialized and try to reload an
uninitialized context.
1. context 1 blit // context marked as initialized, but isn't
2. context 1 render // loads random state from step 2
It is really easy to hit this with a planned upcoming patch which makes
default context reuse possible.
NOTE: This should only effect full PPGTT branches, ie. current
drm-intel-nightly.
Thanks to Chris for helping me track this down.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Simplify the failure scenario in the commit message according
to Chris' review a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was an accidental "ABI" change introduced during PPGTT:
commit 0eea67eb26
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Dec 6 14:11:19 2013 -0800
drm/i915: Create a per file_priv default context
The failure test application actually tests the return type. The other
option is to simply change the test.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Without this fix the ioctls silently succeeded (but actually did
nothing).
It makes all the code which calls into this function way too confusing.
v2: Fix destroy IOCTL as well
v3: Clarify the other two callers of i915_gem_context_get() to never
check for NULL. (Mika)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72903
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Testcase: igt/gem_ctx_exec/basic
[danvet: Fix up the commit message and actually bother to mention the
testcase this fixes.]
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).
For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.
One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).
v2: Disable BDW on rebase
NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.
Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I need the tricky do_switch fix before I can merge the final piece of
the ppgtt enabling puzzle. Otherwise the conflict will be a real pain
to resolve since the do_switch hunk from -fixes must be placed at the
exact right place within a hunk in the next patch.
Conflicts:
drivers/gpu/drm/i915/i915_gem_context.c
drivers/gpu/drm/i915/i915_gem_execbuffer.c
drivers/gpu/drm/i915/intel_display.c
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to have the address space when reserving space for the objects.
Since the address space and context are tied together, and reserve
occurs before context switch (for good reason), we must lookup our
context earlier in the process.
This leaves some room for optimizations where we no longer need to use
ctx_id in certain places. This will be addressed in a subsequent patch.
Important tricky bit:
Because slow relocations during execbuffer drop struct_mutex
Perhaps it would be best to acquire the reference when we get the
context, but I'll save that for another day (note I have written the
patch before, and I found the changes required to be uglier than this).
Note that since we currently access everything via context id, and not
the data structure this is fine, though not desirable. The next change
attempts to get the context only once via the context ID idr lookup, and
as such, the following can happen:
CTX-A is created, refcount = 1
CTX-A execbuf, mutex dropped
close IOCTL called on CTX-A, refcount = 0
CTX-A resumes in execbuf.
v2: Rebased on top of
commit b6359918b8
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Wed Oct 30 15:44:16 2013 +0200
drm/i915: add i915_get_reset_stats_ioctl
v3: Rebased on top of
commit 25b3dfc87b
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date: Tue Nov 12 11:57:30 2013 +0200
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Nov 26 16:14:33 2013 +0200
drm/i915: check context reset stats before relocations
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To simplify the codepaths somewhat, we can simply always create a
context. Contexts already keep hangstat information. This prevents us
from having to differentiate at other parts in the code.
There is allocation overhead, but it should not be measurable.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Every file will get it's own context, and we use this context instead of
the default context. The default context still exists for future
shrinker usage as well as reset handling.
v2: Updated to address Mika's recent context guilty changes
Some more changes around this come up in later patches as well.
v3: Use a fake context to avoid allocation for the !HAS_HW_CONTEXT case.
I've tried the alternatives. This looks the best to me.
Removed hangstat stuff from v2 - for a separate patch
Demote failed PPGTT set to DRM_DEBUG_DRIVER since it can now be invoked
easily from userspace.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.
The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.
To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.
Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The plan to to make every file descriptor have a default context. To
accommodate this, generalize out default context setup function so it
can be used at file open time.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We **need** to do this for exactly 1 reason, because we want to embed a
PPGTT into the context, but we don't want to special case the default
context.
To achieve that, we must be able to initialize contexts after the GTT is
setup (so we can allocate and pin the default context's BO), but before
the PPGTT and rings are initialized. This is because, currently, context
initialization requires ring usage. We don't have rings until after the
GTT is setup. If we split the enabling part of context initialization,
the part requiring the ringbuffer, we can untangle this, and then later
embed the PPGTT
Incidentally this allows us to also adhere to the original design of
context init/fini in future patches: they were only ever meant to be
called at driver load and unload.
v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris)
v3: BUG_ON after checking for disabled contexts. Or else it blows up pre
gen6 (Ben)
v4: Forward port
Modified enable for each ring, since that patch is earlier in the series
Dropped ring arg from create_default_context so it can be used by others
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>