linux/drivers/gpu/drm/i915/intel_lrc.h

115 lines
4.3 KiB
C
Raw Normal View History

/*
* Copyright © 2014 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#ifndef _INTEL_LRC_H_
#define _INTEL_LRC_H_
#include "intel_ringbuffer.h"
#include "i915_gem_context.h"
#define GEN8_LR_CONTEXT_ALIGN I915_GTT_MIN_ALIGNMENT
drm/i915/bdw: Pin the context backing objects to GGTT on-demand Up until now, we have pinned every logical ring context backing object during creation, and left it pinned until destruction. This made my life easier, but it's a harmful thing to do, because we cause fragmentation of the GGTT (and, eventually, we would run out of space). This patch makes the pinning on-demand: the backing objects of the two contexts that are written to the ELSP are pinned right before submission and unpinned once the hardware is done with them. The only context that is still pinned regardless is the global default one, so that the HWS can still be accessed in the same way (ring->status_page). v2: In the early version of this patch, we were pinning the context as we put it into the ELSP: on the one hand, this is very efficient because only a maximum two contexts are pinned at any given time, but on the other hand, we cannot really pin in interrupt time :( v3: Use a mutex rather than atomic_t to protect pin count to avoid races. Do not unpin default context in free_request. v4: Break out pin and unpin into functions. Fix style problems reported by checkpatch v5: Remove unpin_lock as all pinning and unpinning is done with the struct mutex already locked. Add WARN_ONs to make sure this is the case in future. Issue: VIZ-4277 Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Reviewed-by: Akash Goel <akash.goels@gmail.com> Reviewed-by: Deepak S<deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-13 10:28:10 +00:00
/* Execlists regs */
#define RING_ELSP(engine) _MMIO((engine)->mmio_base + 0x230)
#define RING_EXECLIST_STATUS_LO(engine) _MMIO((engine)->mmio_base + 0x234)
#define RING_EXECLIST_STATUS_HI(engine) _MMIO((engine)->mmio_base + 0x234 + 4)
#define RING_CONTEXT_CONTROL(engine) _MMIO((engine)->mmio_base + 0x244)
#define CTX_CTRL_INHIBIT_SYN_CTX_SWITCH (1 << 3)
#define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
#define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
#define RING_CONTEXT_STATUS_BUF_BASE(engine) _MMIO((engine)->mmio_base + 0x370)
#define RING_CONTEXT_STATUS_BUF_LO(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
#define RING_CONTEXT_STATUS_BUF_HI(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
#define RING_CONTEXT_STATUS_PTR(engine) _MMIO((engine)->mmio_base + 0x3a0)
/* The docs specify that the write pointer wraps around after 5h, "After status
* is written out to the last available status QW at offset 5h, this pointer
* wraps to 0."
*
* Therefore, one must infer than even though there are 3 bits available, 6 and
* 7 appear to be * reserved.
*/
#define GEN8_CSB_ENTRIES 6
#define GEN8_CSB_PTR_MASK 0x7
#define GEN8_CSB_READ_PTR_MASK (GEN8_CSB_PTR_MASK << 8)
#define GEN8_CSB_WRITE_PTR_MASK (GEN8_CSB_PTR_MASK << 0)
#define GEN8_CSB_WRITE_PTR(csb_status) \
(((csb_status) & GEN8_CSB_WRITE_PTR_MASK) >> 0)
#define GEN8_CSB_READ_PTR(csb_status) \
(((csb_status) & GEN8_CSB_READ_PTR_MASK) >> 8)
drm/i915: Introduce execlist context status change notification This patch introduces an approach to track the execlist context status change. GVT-g uses GVT context as the "shadow context". The content inside GVT context will be copied back to guest after the context is idle. And GVT-g has to know the status of the execlist context. This function is configurable when creating a new GEM context. Currently, Only GVT-g will create the "status-change-notification" enabled GEM context. v10: - Fix the identation. (Joonas) v8: - Remove the boolean flag in struct i915_gem_context. (Joonas) v7: - Remove per-engine ctx status notifiers. Use one status notifier for all engines. (Joonas) - Add prefix "INTEL_" for related definitions. (Joonas) - Refine the comments in execlists_context_status_change(). (Joonas) v6: - When !CONFIG_DRM_I915_GVT, make GVT code as dead code then compiler could automatically eliminate them for us. (Chris) - Always initialize the notifier header, so it could be switched on/off at runtime. (Chris) v5: - Only compile this feature when CONFIG_DRM_I915_GVT is enabled.(Tvrtko) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v8) Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-8-git-send-email-zhi.a.wang@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-06-16 12:07:03 +00:00
enum {
INTEL_CONTEXT_SCHEDULE_IN = 0,
INTEL_CONTEXT_SCHEDULE_OUT,
INTEL_CONTEXT_SCHEDULE_PREEMPTED,
drm/i915: Introduce execlist context status change notification This patch introduces an approach to track the execlist context status change. GVT-g uses GVT context as the "shadow context". The content inside GVT context will be copied back to guest after the context is idle. And GVT-g has to know the status of the execlist context. This function is configurable when creating a new GEM context. Currently, Only GVT-g will create the "status-change-notification" enabled GEM context. v10: - Fix the identation. (Joonas) v8: - Remove the boolean flag in struct i915_gem_context. (Joonas) v7: - Remove per-engine ctx status notifiers. Use one status notifier for all engines. (Joonas) - Add prefix "INTEL_" for related definitions. (Joonas) - Refine the comments in execlists_context_status_change(). (Joonas) v6: - When !CONFIG_DRM_I915_GVT, make GVT code as dead code then compiler could automatically eliminate them for us. (Chris) - Always initialize the notifier header, so it could be switched on/off at runtime. (Chris) v5: - Only compile this feature when CONFIG_DRM_I915_GVT is enabled.(Tvrtko) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v8) Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-8-git-send-email-zhi.a.wang@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-06-16 12:07:03 +00:00
};
/* Logical Rings */
void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
int logical_render_ring_init(struct intel_engine_cs *engine);
int logical_xcs_ring_init(struct intel_engine_cs *engine);
/* Logical Ring Contexts */
drm/i915: Integrate GuC-based command submission GuC-based submission is mostly the same as execlist mode, up to intel_logical_ring_advance_and_submit(), where the context being dispatched would be added to the execlist queue; at this point we submit the context to the GuC backend instead. There are, however, a few other changes also required, notably: 1. Contexts must be pinned at GGTT addresses accessible by the GuC i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. 2. The GuC's TLB must be invalidated after a context is pinned at a new GGTT address. 3. GuC firmware uses the one page before Ring Context as shared data. Therefore, whenever driver wants to get base address of LRC, we will offset one page for it. LRC_PPHWSP_PN is defined as the page number of LRCA. 4. In the work queue used to pass requests to the GuC, the GuC firmware requires the ring-tail-offset to be represented as an 11-bit value, expressed in QWords. Therefore, the ringbuffer size must be reduced to the representable range (4 pages). v2: Defer adding #defines until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v4: Squashed kerneldoc patch into here [Daniel Vetter] v5: Update request->tail in code common to both GuC and execlist modes. Add a private version of lr_context_update(), as sharing the execlist version leads to race conditions when the CPU and the GuC both update TAIL in the context image. Conversion of error-captured HWS page to string must account for offset from start of object to actual HWS (LRC_PPHWSP_PN). Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-08-12 14:43:43 +00:00
drm/i915/lrc: Clarify the format of the context image Not only the context image consist of two parts (the PPHWSP, and the logical context state), but we also allocate a header at the start of for sharing data with GuC. Thus every lrc looks like this: | [guc] | [hwsp] [logical state] | |<- our header ->|<- context image ->| So far, we have oversimplified whenever we use each of these parts of the context, just because the GuC header happens to be in page 0, and the (PP)HWSP is in page 1. But this had led to using the same define for more than one meaning (as a page index in the lrc and as 1 page). This patch adds defines for the GuC shared page, the PPHWSP page and the start of the logical state. It also updated the places where the old define was being used. Since we are not changing the size (or format) of the context, there are no functional changes. v2: Use PPHWSP index for hws again. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: intel-gvt-dev@lists.freedesktop.org Link: http://patchwork.freedesktop.org/patch/msgid/20170712193032.27080-1-michel.thierry@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-1-chris@chris-wilson.co.uk
2017-09-13 08:56:00 +00:00
/*
* We allocate a header at the start of the context image for our own
* use, therefore the actual location of the logical state is offset
* from the start of the VMA. The layout is
*
* | [guc] | [hwsp] [logical state] |
* |<- our header ->|<- context image ->|
*
*/
/* The first page is used for sharing data with the GuC */
drm/i915: Integrate GuC-based command submission GuC-based submission is mostly the same as execlist mode, up to intel_logical_ring_advance_and_submit(), where the context being dispatched would be added to the execlist queue; at this point we submit the context to the GuC backend instead. There are, however, a few other changes also required, notably: 1. Contexts must be pinned at GGTT addresses accessible by the GuC i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. 2. The GuC's TLB must be invalidated after a context is pinned at a new GGTT address. 3. GuC firmware uses the one page before Ring Context as shared data. Therefore, whenever driver wants to get base address of LRC, we will offset one page for it. LRC_PPHWSP_PN is defined as the page number of LRCA. 4. In the work queue used to pass requests to the GuC, the GuC firmware requires the ring-tail-offset to be represented as an 11-bit value, expressed in QWords. Therefore, the ringbuffer size must be reduced to the representable range (4 pages). v2: Defer adding #defines until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v4: Squashed kerneldoc patch into here [Daniel Vetter] v5: Update request->tail in code common to both GuC and execlist modes. Add a private version of lr_context_update(), as sharing the execlist version leads to race conditions when the CPU and the GuC both update TAIL in the context image. Conversion of error-captured HWS page to string must account for offset from start of object to actual HWS (LRC_PPHWSP_PN). Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-08-12 14:43:43 +00:00
#define LRC_GUCSHR_PN (0)
drm/i915/lrc: Clarify the format of the context image Not only the context image consist of two parts (the PPHWSP, and the logical context state), but we also allocate a header at the start of for sharing data with GuC. Thus every lrc looks like this: | [guc] | [hwsp] [logical state] | |<- our header ->|<- context image ->| So far, we have oversimplified whenever we use each of these parts of the context, just because the GuC header happens to be in page 0, and the (PP)HWSP is in page 1. But this had led to using the same define for more than one meaning (as a page index in the lrc and as 1 page). This patch adds defines for the GuC shared page, the PPHWSP page and the start of the logical state. It also updated the places where the old define was being used. Since we are not changing the size (or format) of the context, there are no functional changes. v2: Use PPHWSP index for hws again. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: intel-gvt-dev@lists.freedesktop.org Link: http://patchwork.freedesktop.org/patch/msgid/20170712193032.27080-1-michel.thierry@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-1-chris@chris-wilson.co.uk
2017-09-13 08:56:00 +00:00
#define LRC_GUCSHR_SZ (1)
/* At the start of the context image is its per-process HWS page */
#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + LRC_GUCSHR_SZ)
#define LRC_PPHWSP_SZ (1)
/* Finally we have the logical state for the context */
#define LRC_STATE_PN (LRC_PPHWSP_PN + LRC_PPHWSP_SZ)
/*
* Currently we include the PPHWSP in __intel_engine_context_size() so
* the size of the header is synonymous with the start of the PPHWSP.
*/
#define LRC_HEADER_PAGES LRC_PPHWSP_PN
drm/i915: Integrate GuC-based command submission GuC-based submission is mostly the same as execlist mode, up to intel_logical_ring_advance_and_submit(), where the context being dispatched would be added to the execlist queue; at this point we submit the context to the GuC backend instead. There are, however, a few other changes also required, notably: 1. Contexts must be pinned at GGTT addresses accessible by the GuC i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. 2. The GuC's TLB must be invalidated after a context is pinned at a new GGTT address. 3. GuC firmware uses the one page before Ring Context as shared data. Therefore, whenever driver wants to get base address of LRC, we will offset one page for it. LRC_PPHWSP_PN is defined as the page number of LRCA. 4. In the work queue used to pass requests to the GuC, the GuC firmware requires the ring-tail-offset to be represented as an 11-bit value, expressed in QWords. Therefore, the ringbuffer size must be reduced to the representable range (4 pages). v2: Defer adding #defines until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v4: Squashed kerneldoc patch into here [Daniel Vetter] v5: Update request->tail in code common to both GuC and execlist modes. Add a private version of lr_context_update(), as sharing the execlist version leads to race conditions when the CPU and the GuC both update TAIL in the context image. Conversion of error-captured HWS page to string must account for offset from start of object to actual HWS (LRC_PPHWSP_PN). Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-08-12 14:43:43 +00:00
drm/i915: Unify active context tracking between legacy/execlists/guc The requests conversion introduced a nasty bug where we could generate a new request in the middle of constructing a request if we needed to idle the system in order to evict space for a context. The request to idle would be executed (and waited upon) before the current one, creating a minor havoc in the seqno accounting, as we will consider the current request to already be completed (prior to deferred seqno assignment) but ring->last_retired_head would have been updated and still could allow us to overwrite the current request before execution. We also employed two different mechanisms to track the active context until it was switched out. The legacy method allowed for waiting upon an active context (it could forcibly evict any vma, including context's), but the execlists method took a step backwards by pinning the vma for the entire active lifespan of the context (the only way to evict was to idle the entire GPU, not individual contexts). However, to circumvent the tricky issue of locking (i.e. we cannot take struct_mutex at the time of i915_gem_request_submit(), where we would want to move the previous context onto the active tracker and unpin it), we take the execlists approach and keep the contexts pinned until retirement. The benefit of the execlists approach, more important for execlists than legacy, was the reduction in work in pinning the context for each request - as the context was kept pinned until idle, it could short circuit the pinning for all active contexts. We introduce new engine vfuncs to pin and unpin the context respectively. The context is pinned at the start of the request, and only unpinned when the following request is retired (this ensures that the context is idle and coherent in main memory before we unpin it). We move the engine->last_context tracking into the retirement itself (rather than during request submission) in order to allow the submission to be reordered or unwound without undue difficultly. And finally an ulterior motive for unifying context handling was to prepare for mock requests. v2: Rename to last_retired_context, split out legacy_context tracking for MI_SET_CONTEXT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk
2016-12-18 15:37:20 +00:00
struct drm_i915_private;
struct i915_gem_context;
drm/i915: Update reset path to fix incomplete requests Update reset path in preparation for engine reset which requires identification of incomplete requests and associated context and fixing their state so that engine can resume correctly after reset. The request that caused the hang will be skipped and head is reset to the start of breadcrumb. This allows us to resume from where we left-off. Since this request didn't complete normally we also need to cleanup elsp queue manually. This is vital if we employ nonblocking request submission where we may have a web of dependencies upon the hung request and so advancing the seqno manually is no longer trivial. ABI: gem_reset_stats / DRM_IOCTL_I915_GET_RESET_STATS We change the way we count pending batches. Only the active context involved in the reset is marked as either innocent or guilty, and not mark the entire world as pending. By inspection this only affects igt/gem_reset_stats (which assumes implementation details) and not piglit. ARB_robustness gives this guide on how we expect the user of this interface to behave: * Provide a mechanism for an OpenGL application to learn about graphics resets that affect the context. When a graphics reset occurs, the OpenGL context becomes unusable and the application must create a new context to continue operation. Detecting a graphics reset happens through an inexpensive query. And with regards to the actual meaning of the reset values: Certain events can result in a reset of the GL context. Such a reset causes all context state to be lost. Recovery from such events requires recreation of all objects in the affected context. The current status of the graphics reset state is returned by enum GetGraphicsResetStatusARB(); The symbolic constant returned indicates if the GL context has been in a reset state at any point since the last call to GetGraphicsResetStatusARB. NO_ERROR indicates that the GL context has not been in a reset state since the last call. GUILTY_CONTEXT_RESET_ARB indicates that a reset has been detected that is attributable to the current GL context. INNOCENT_CONTEXT_RESET_ARB indicates a reset has been detected that is not attributable to the current GL context. UNKNOWN_CONTEXT_RESET_ARB indicates a detected graphics reset whose cause is unknown. The language here is explicit in that we must mark up the guilty batch, but is loose enough for us to relax the innocent (i.e. pending) accounting as only the active batches are involved with the reset. In the future, we are looking towards single engine resetting (with minimal locking), where it seems inappropriate to mark the entire world as innocent since the reset occurred on a different engine. Reducing the information available means we only have to encounter the pain once, and also reduces the information leaking from one context to another. v2: Legacy ringbuffer submission required a reset following hibernation, or else we restore stale values to the RING_HEAD and walked over stolen garbage. v3: GuC requires replaying the requests after a reset. v4: Restore engine IRQ after reset (so waiters will be woken!) Rearm hangcheck if resetting with a waiter. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-13-chris@chris-wilson.co.uk
2016-09-09 13:11:53 +00:00
void intel_lr_context_resume(struct drm_i915_private *dev_priv);
static inline uint64_t
intel_lr_context_descriptor(struct i915_gem_context *ctx,
struct intel_engine_cs *engine)
{
return ctx->engine[engine->id].lrc_desc;
}
/* Execlists */
int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
int enable_execlists);
#endif /* _INTEL_LRC_H_ */