Stop using the old for_each_intel_plane_in_state() type iteration
macro and replace it with for_each_new_intel_plane_in_state().
And similarly replace drm_atomic_get_existing_crtc_state() with
intel_atomic_get_new_crtc_state(). Switch over to intel_ types
as well to make the code less cluttered.
v2: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-8-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Replace the 0 and 1 with PLANE_A and PLANE_B in the pre-g4x wm code.
v2: s/old_plane_id/i9xx_plane_id/ (Daniel)
v3: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-5-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Rename enum plane to enum i9xx_plane_id to make it clear that it only
applies to pre-SKL platforms.
enum i9xx_plane_id is a global identifier, whereas enum plane_id is
per-pipe. We need the old global identifier to index the primary plane
(and the pre-g4x sprite C if we ever expose it) registers on pre-SKL
platforms.
v2: Reorder patches
v3: s/old_plane_id/i9xx_plane_id/ (Daniel)
Pimp the commit message a bit
Note that i9xx_plane_id doesn't apply to SKL+
v4: Rebase due to power domain handling in plane readout
v5: Rebase due to crtc->dspaddr_offset removal
v6: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-4-ville.syrjala@linux.intel.com
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Having disabled the broken semaphores on Sandybridge, there is no need
for a modparam any more, so remove it in favour of a simple
HAS_LEGACY_SEMAPHORES() guard.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk
Since removing the module parameter to force selection of ringbuffer
emission for gen8, the code is defunct. Remove it.
To put the difference into perspective, a couple of microbenchmarks
(bdw i7-5557u, 20170324):
ring execlists
exec continuous nops on all rings: 1.491us 2.223us
exec sequential nops on each ring: 12.508us 53.682us
single nop + sync: 9.272us 30.291us
vblank_mode=0 glxgears: ~11000fps ~9000fps
Since the earlier submission, gen8 ringbuffer submission has fallen
further and further behind in features. So while ringbuffer may hold the
throughput crown, in terms of interactive latency, execlists is much
better. Alas, we have no convenient metrics for such, other than
demonstrating things we can do with execlists but can not using
legacy ringbuffer submission.
We have made a few improvements to lowlevel execlists throughput,
and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026):
ring execlists
exec continuous nops on all rings: n/a 1.921us
exec sequential nops on each ring: n/a 44.621us
single nop + sync: n/a 21.953us
vblank_mode=0 glxgears: n/a ~18500fps
References: https://bugs.freedesktop.org/show_bug.cgi?id=87725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
Execlists and legacy ringbuffer submission are no longer feature
comparable (execlists now offer greater functionality that should
overcome their performance hit) and obsoletes the unsafe module
parameter, i.e. comparing the two modes of execution is no longer
useful, so remove the debug tool.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
Now that we have this stored in the device info, we can drop it from perf
part of the driver.
Note that this requires to init perf after we've computed the frequency,
hence why we move i915_perf_init() from i915_driver_init_early() to after
intel_device_info_runtime_init().
v2: Use div_u64 (Chris)
v3: Drop u64 divs by switching to kHz (Chris/Ville)
Move i915_perf_fini to i915_driver_cleanup_hw (Matthew)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com
ERROR: "__udivdi3" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "__divdi3" [drivers/gpu/drm/i915/i915.ko] undefined!
Store the frequency in kHz and drop 64bit divisions.
v2: Use div64_u64 (Matthew)
v3: store frequency in kHz to avoid 64bit divs (Chris/Ville)
Fixes: dab9178333 ("drm/i915: expose command stream timestamp frequency to userspace")
Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113233455.12085-3-lionel.g.landwerlin@intel.com
Reviewed-by: Ewelina Musial <ewelina.musial@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
We use to have this fixed per generation, but starting with CNL userspace
cannot tell just off the PCI ID. Let's make this information available. This
is particularly useful for performance monitoring where much of the
normalization work is done using those timestamps (this include pipeline
statistics in both GL & Vulkan as well as OA reports).
v2: Use variables for 24MHz/19.2MHz values (Ewelina)
Renamed function & coding style (Sagar)
v3: Fix frequency read on Broadwell (Sagar)
Fix missing divide by 4 on <= gen4 (Sagar)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110190845.32574-7-lionel.g.landwerlin@intel.com
Now that we always execute a context switch upon module load, there is
no need to queue a delayed task for doing so. The purpose of the delayed
task is to enable GT powersaving, for which we need the HW state to be
valid (i.e. having loaded a context and initialised basic state). We
used to defer this operation as historically it was slow (due to slow
register polling, fixed with commit 1758b90e38 ("drm/i915: Use a hybrid
scheme for fast register waits")) but now we have a requirement to save
the default HW state.
v2: Load the kernel context (to provide the power context) upon resume.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171112112738.1463-3-chris@chris-wilson.co.uk
As we now record the default HW state and so only emit the "golden"
renderstate once to prepare the HW, there is no advantage in keeping the
renderstate batch around as it will never be used again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-8-chris@chris-wilson.co.uk
intel_modeset_gem_init() now only sets up the legacy overlay, so let's
remove the function and call the setup directly during driver load. This
should help us find a better point in the initialisation sequence for it
later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-5-chris@chris-wilson.co.uk
Rather than digging through encoder->crtc and crtc->config in the
DPIO PHY functions, pass down the correct crtc state from the caller.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031205123.13123-6-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
We keep details of GuC and HuC in separate error state struct.
Make GuC log part of it to group all related data together.
Since we are printing uC details at the end, with this change
GuC log will be moved there too.
v2: comment on new placement of the log (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026173657.49648-2-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Include GuC and HuC firmware details in captured error state
to provide additional debug information. To reuse existing
uc firmware pretty printer, introduce new drm-printer variant
that works with our i915_error_state_buf output. Also update
uc firmware pretty printer to accept const input.
v2: don't rely on current caps (Chris)
dump correct fw info (Michal)
v3: simplify capture of custom paths (Chris)
v4: improve 'why' comment (Joonas)
trim output if no fw path (Michal)
group code around uc error state (Michal)
v5: use error in cleanup_uc (Michal)
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026173657.49648-1-michal.wajdeczko@intel.com
[ickle: allow printing uc_fw after allocation failure]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This patch adds per engine reset and recovery (TDR) support when GuC is
used to submit workloads to GPU.
In the case of i915 directly submission to ELSP, driver manages hang
detection, recovery and resubmission. With GuC submission these tasks
are shared between driver and GuC. i915 is still responsible for detecting
a hang, and when it does it only requests GuC to reset that Engine. GuC
internally manages acquiring forcewake and idling the engine before
resetting it.
Once the reset is successful, i915 takes over again and handles the
resubmission. The scheduler in i915 knows which requests are pending so
after resetting a engine, pending workloads/requests are resubmitted
again.
v2: s/i915_guc_request_engine_reset/i915_guc_reset_engine/ to match the
non-guc function names.
v3: Removed debug message about engine restarting from which request,
since the new baseline do it regardless of submission mode. (Chris)
v4: Rebase.
v5: Do not pass unnecessary reporting flags to the fw (Jeff);
tasklet_schedule(&execlists->irq_tasklet) handles the resubmit; rebase.
v6: Rename the existing reset engine function and share a similar
interface between guc and non-guc paths (Chris).
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031225309.10888-1-michel.thierry@intel.com
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
intel_guc_reset sounds more like the microcontroller is the one performing
a reset, while in this case is the opposite. intel_reset_guc not only
makes it clearer, it follows the other intel_reset functions available.
v2: Print error message in English (Tvrtko).
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171030185616.32836-2-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Explicitly pass the crtc and connector states into the audio
code enable/disable hooks, and plumb them all the way down.
This gets rid of almost all crtc->config and encoder->crtc
uses. The one place where we still use them is
i915_audio_component_sync_audio_rate() since that gets called from
the audio driver and we don't have explicit states around then.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171030184654.17429-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Starting from version 204 VBT can specify the max TMDS clock we are
allowed to use with HDMI ports. Parse that information and take it
into account when filtering modes and computing a crtc state.
Also take the opportunity to sort the platform check if ladder
from new to old.
v2: Add defines for the values into intel_vbt_defs.h (Jani)
Don't fall back to 0 silently for unknown values (Jani)
Skip the debug print for the 0 case (Jani)
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171030145702.23662-1-ville.syrjala@linux.intel.com
Call the DDI .pre_pll_enable() hook from the MST code so that BXT gets
the correct lane latency optimal setting applied. And we obviously need
to compute the correct value, and read it out to keep the state checker
happy.
While at it drop the useless 'encoder' parameter to
bxt_ddi_phy_calc_lane_lat_optim_mask()
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171027134348.31190-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
We would also like to make use of execlist_cancel_port_requests and
unwind_incomplete_requests in GuC preemption backend.
Let's rename the functions to use the correct prefixes, so that we can
simply add the declarations in the following patch.
Similar thing for applies for can_preempt, except we're introducing
HAS_LOGICAL_RING_PREEMPTION macro instad, converting other users that
were previously touching device info directly.
v2: s/intel_engine/execlists and pass execlists to unwind (Chris)
v3: use locked version for exporting, drop const qual (Chris)
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-11-michal.winiarski@intel.com
On CNL we may need to bump up the system agent voltage not only due
to CDCLK but also when driving DDI port with a sufficiently high clock.
To that end start tracking the minimum acceptable voltage for each crtc.
We do the tracking via crtcs because we don't have any kind of encoder
state. Also there's no downside to doing it this way, and it matches how
we track cdclk requirements on account of pixel rate.
v2: Allow disabled crtcs to use the min voltage
Add IS_CNL check to intel_ddi_compute_min_voltage() since
we're using CNL specific values there
s/intel_compute_min_voltage/cnl_compute_min_voltage/ since
the function makes hw specific assumptions about the voltage
values
v3: Drop the test hack leftovers from skl_modeset_calc_cdclk()
v4: s/voltage/voltage_level/ (Rodrigo)
Replace DPLL DVFS FIXMEs with an explanation why we don't
do anything there (Rodrigo)
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024095216.1638-9-ville.syrjala@linux.intel.com
For CNL we'll need to start considering the port clocks when we select
the voltage level for the system agent. To that end start tracking the
voltage in the cdclk state (since that already has to adjust it).
v2: s/voltage/voltage_level/ (Rodrigo)
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024095216.1638-3-ville.syrjala@linux.intel.com
This patch parse DSI backlight/cabc ports info from
VBT and save them inside local structure. This saved info
can be directly used while initializing DSI for different
platforms instead of parsing for each platform.
V2: Changes:
- Typo fix in commit message.
- Move up newly added port variables (Jani N)
- Remove redundant initialization (Jani N)
- Don't parse CABC ports if not supported (Jani N)
V3: Patch restructure (Suggested by Jani N)
Credits-to: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Madhav Chauhan <madhav.chauhan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507898700-20016-1-git-send-email-madhav.chauhan@intel.com
They're unused and unsupported. Leave the reduced_clock pointers in
place still, should they prove useful later on.
v2: go from nuking DDI lowfreq_avail to nuking it entirely (Ville)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171017140234.20677-1-jani.nikula@intel.com
Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware,
[commit 32ced39 ("drm/i915: Transform whitelisting WAs into a simple reg
write")] there is no need to save space for them in the list of context
workarounds.
v2: Refer to previous commit in commit message (Michel)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1508272071-15125-1-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Inspired by Tvrtko's critique of the reaping of the stale contexts
before allocating a new one, also limit the freed object reaping to the
oldest stale object before allocating a fresh object. Unlike contexts,
objects may have radically different sizes of backing storage, but
similar to contexts, while we want to prevent starvation due to
excessive freed lists, we also do not want to delay fresh allocations
for too long. Only freeing the oldest on the freed object list before
each allocation is a reasonable compromise.
v2: Only a single consumer of llist_del_first() is allowed (although
multiple llist_add are still allowed in parallel). Unlike
i915_gem_context, i915_gem_flush_free_objects() is itself not serialized
and so we need to add our own spinlock. Otherwise KASAN eventually spots
a use-after-free for the race on *first->next.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-8-chris@chris-wilson.co.uk
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).
v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
Since we occasionally stuff an error pointer into obj->mm.pages for a
semi-permanent or even permanent failure, we have to be more careful and
not just test against NULL when deciding if the object has a complete
set of its concurrent pages.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-1-chris@chris-wilson.co.uk
Apparently RC6 residency is lower than expected
with EI mode for most of the cases on CNL A0, B0 and C0.
This Wa doesn't solve our lower residency, but I
believe it is better to have it since EI is not
expected to work by HW engineers anyways.
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822235828.18322-1-rodrigo.vivi@intel.com
Defined new struct intel_rc6 to hold RC6 specific state and
intel_ring_pstate to hold ring specific state.
v2: s/intel_ring_pstate/intel_llc_pstate. Removed checks from
autoenable_* functions. (Chris)
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-13-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-12-chris@chris-wilson.co.uk
Prepared substructure rps for RPS related state. autoenable_work is
used for RC6 too hence it is defined outside rps structure. As we do
this lot many functions are refactored to use intel_rps *rps to access
rps related members. Hence renamed intel_rps_client pointer variables
to rps_client in various functions.
v2: Rebase.
v3: s/pm/gt_pm (Chris)
Refactored access to rps structure by declaring struct intel_rps * in
many functions.
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> #1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-9-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-8-chris@chris-wilson.co.uk
In order to separate GT PM related functionality into new structure
we are updating rps structure. hw_lock in it is used for display
related PCU communication too hence move it to dev_priv.
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-8-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-7-chris@chris-wilson.co.uk
It's a little unclear what the sg_mask actually is, so prefer the more
meaningful name of sg_page_sizes.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In preparation for supporting huge gtt pages for the ppgtt, we introduce
page size members for gem objects. We fill in the page sizes by
scanning the sg table.
v2: pass the sg_mask to set_pages
v3: calculate the sg_mask inline with populating the sg_table where
possible, and pass to set_pages along with the pages.
v4: bunch of improvements from Joonas
v5: fix num_pages blunder
introduce i915_sg_page_sizes helper
v6: prefer GEM_BUG_ON(sizes == 0)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-6-chris@chris-wilson.co.uk
In preparation for huge gtt pages expose page_sizes as part of the
device info, to indicate the page sizes supported by the HW. Currently
only 4K is supported.
v2: s/page_size_mask/page_sizes/
v3: introduce I915_GTT_MAX_PAGE_SIZE
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-4-chris@chris-wilson.co.uk
Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so
moves us away from the shmemfs shm_mnt, and gives us the much needed
flexibility to do things like set our own mount options, namely huge=
which should allow us to enable the use of transparent-huge-pages for
our shmem backed objects.
v2: various improvements suggested by Joonas
v3: move gemfs instance to i915.mm and simplify now that we have
file_setup_with_mnt
v4: fallback to tmpfs shm_mnt upon failure to setup gemfs
v5: make tmpfs fallback kinder
v5: better gemfs failure message
flags variable
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-3-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-2-chris@chris-wilson.co.uk
With preemption, we will want to "unsubmit" a request, taking it back
from the hw and returning it to the priority sorted execution list. In
order to know where to insert it into that list, we need to remember
its adjust priority (which may change even as it was being executed).
This also affects reset for execlists as we are now unsubmitting the
requests following the reset (rather than directly writing the ELSP for
the inflight contexts). This turns reset into an accidental preemption
point, as after the reset we may choose a different pair of contexts to
submit to hw.
GuC is not updated as this series doesn't add preemption to the GuC
submission, and so it can keep benefiting from the early pruning of the
DFS inside execlists_schedule() for a little longer. We also need to
find a way of reducing the cost of that DFS...
v2: Include priority in error-state
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-6-chris@chris-wilson.co.uk
Add another perma-pinned context for using for preemption at any time.
We cannot just reuse the existing kernel context, as first and foremost
we need to ensure that we can preempt the kernel context itself, so
require a distinct context id. Similar to the kernel context, we may
want to interrupt execution and switch to the preempt context at any
time, and so it needs to be permanently pinned and available.
To compensate for yet another permanent allocation, we shrink the
existing context and the new context by reducing their ringbuffer to the
minimum.
v2: Assert that we never allocate a request from the preemption context.
v3: Limit perma-pin to engines that may preempt.
v4: Onion cleanup for early driver death
v5: Onion ordering in main driver cleanup as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-4-chris@chris-wilson.co.uk