We can implement limited RC6 counter wrap-around protection under the
assumption that clients will be reading this value more frequently than
the wrap period on a given platform.
With the typical wrap-around period being ~90 minutes, even with the
exception of Baytrail which wraps every 13 seconds, this sounds like a
reasonable assumption.
Implementation works by storing a 64-bit software copy of a hardware RC6
counter, along with the previous HW counter snapshot. This enables it to
detect wrap is polled frequently enough and keep the software copy
monotonically incrementing.
v2:
* Missed GEN6_GT_GFX_RC6_LOCKED when considering slot sizing and
indexing.
* Fixed off-by-one in wrap-around handling. (Chris Wilson)
v3:
* Simplify index checking by using unsigned int. (Chris Wilson)
* Expand the comment to explain why indexing works.
v4:
* Use __int128 if supported.
v5:
* Use mul_u64_u32_div. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94852
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v3
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180208160036.29919-1-tvrtko.ursulin@linux.intel.com
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Most of our ioctl functions have an _ioctl suffix in the name. I like
that idea since it makes it easy to figure out how the function is
going to get called. Rename the handful of exceptions to follow the
same pattern.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207164841.19431-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than having the high level ioctl interface guess the underlying
implementation details, having the implementation declare what
capabilities it exports. We define an intel_driver_caps, similar to the
intel_device_info, which instead of trying to describe the HW gives
details on what the driver itself supports. This is then populated by
the engine backend for the new scheduler capability field for use
elsewhere.
v2: Use caps.scheduler for validating CONTEXT_PARAM_SET_PRIORITY (Mika)
One less assumption of engine[RCS] \o/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-2-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Since unbannable contexts are special and supposed not to be causing GPU
hangs in the first place, make it clear when they are implicated in said
hang. In practice, most unbannable contexts are those created by igt
for the express purpose of throwing untold thousands of hangs at the GPU
and wish to keep doing so to finish the test. Normally they are cleaned
up, but it's when they or the other unbannable kernel contexts stay
stuck in an erroneous state that we need to worry and so need
highlighting.
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205094139.10671-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
We're using i915_inject_load_failure() to inject dummy
faults during driver load, but since this is debug utility
we shouldn't expose it in default config as it consumes
both code and data.
add/remove: 0/1 grow/shrink: 0/2 up/down: 0/-302 (-302)
Function old new delta
__i915_inject_load_failure 61 - -61
i915_gem_init 1331 1268 -63
i915_driver_load 5923 5745 -178
Total: Before=1177454, After=1177152, chg -0.03%
add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-4 (-4)
Data old new delta
i915_load_fail_count 4 - -4
Total: Before=56762, After=56758, chg -0.01%
add/remove: 4/8 grow/shrink: 0/1 up/down: 245/-591 (-346)
RO Data old new delta
__param_str_inject_load_failure 20 - -20
__UNIQUE_ID_inject_load_failuretype200 34 - -34
__param_inject_load_failure 40 - -40
__func__ 4998 4896 -102
__UNIQUE_ID_inject_load_failure201 150 - -150
Total: Before=119095, After=118749, chg -0.29%
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180201173248.3912-1-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We have the max DP link rate info available in VBT since BDB version
216, included in child device config since commit c4fb60b9ab
("drm/i915/bios: add DP max link rate to VBT child device
struct"). Parse it and use it.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a8b1364d1f2394fba3062b6ad11b474744ea4366.1517482774.git.jani.nikula@intel.com
There is no requirement for doing the PCODE request polling atomically,
so do that only for a short time switching to sleeping poll afterwards.
The specification requires a 150usec timeout for the change notification,
so let's use that for the atomic poll. Do the extra 2ms poll - needed as
a workaround on BXT/GLK - in sleeping mode.
v2:
- rebase on v2 of patchset dropping the sandybridge_pcode_read/write
refactoring (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130142939.17983-2-imre.deak@intel.com
Currently we see sporadic timeouts during CDCLK changing both on BXT and
GLK as reported by the Bugzilla: ticket. It's easy to reproduce this by
changing the frequency in a tight loop after blanking the display. The
upper bound for the completion time is 800us based on my tests, so
increase it from the current 500us to 2ms; with that I couldn't trigger
the problem either on BXT or GLK.
Note that timeouts happened during both the change notification and the
voltage level setting PCODE request. (For the latter one BSpec doesn't
require us to wait for completion before further HW programming.)
This issue is similar to
commit 2c7d0602c8 ("drm/i915/gen9: Fix PCODE polling during CDCLK
change notification")
but there the PCODE request does complete (as shown by the mbox
busy flag), only the reply we get from PCODE indicates a failure.
So there we keep resending the request until a success reply, here we
just have to increase the timeout for the one PCODE request we send.
v2:
- s/snb_pcode_request/sandybridge_pcode_write_timeout/ (Ville)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.4+
Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103326
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130142939.17983-1-imre.deak@intel.com
GEN9/10 had fixed DBuf block size of 512. Dbuf block size is not a
fixed number anymore in GEN11, it varies according to bits per pixel
and tiling. If 8bpp & Yf-tile surface, block size = 256 else block
size = 512
This patch addresses the same.
v2 (from Paulo):
- Make it compile.
- Fix a few coding style issues.
v3:
- Rebase on top of upstream patches
v4 (from Paulo):
- Bikeshed if statements (James).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130134918.32283-3-paulo.r.zanoni@intel.com
On CNP boards that are using DDI F,
bit 25 (SDE_PORTE_HOTPLUG_SPT) is representing
the Digital Port F hotplug line when the Digital
Port F hotplug detect input is enabled.
v2: Reuse all existent structure instead of adding a
new HPD_PORT_F pointing to pin of port E.
v3: Use IS_CNL_WITH_PORT_F so we can start upstreaming
this right now. If that SKU ever get a proper name
we come back and update it.
v4: Rebase on top of digital connected port using encoder
instead of port.
v5: Moved IS_CNL_WITH_PORT_F definition to the PCI IDs patch.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180129232223.766-8-rodrigo.vivi@intel.com
On some Cannonlake SKUs we have a dedicated Aux for port F,
that is only the full split between port A and port E.
There is still no Aux E for Port E, as in previous platforms,
because port_E still means shared lanes with port A.
v2: Rebase.
v3: Add couple missed PORT_F cases on intel_dp.
v4: Rebase and fix commit message.
v5: Squash Imre's "drm/i915: Add missing AUX_F power well string"
v6: Rebase on top of display headers rework.
v7: s/IS_CANNONLAKE/IS_CNL_WITH_PORT_F (DK)
v8: Fix Aux bits for Port F (DK)
v9: Fix VBT definition of Port F (DK).
v10: Squash power well addition to this patch to avoid
warns as pointed by DK.
v11: Clean up squashed commit message. (David)
v12: Remove unnecessary handling for older platforms (DK)
Adding AUX_F to PG2 following other existent ones. (DK)
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180129232223.766-2-rodrigo.vivi@intel.com
The only difference is that this SKUs has the full
Port A/E split named as Port F.
But since SKUs differences don't matter on the platform
definition group and ids, let's merge all off them together.
v2: Really include the PCI IDs to the picidlist[];
v3: Add the PCI Id for another SKU (Anusha).
v4: Update IDs, really include to pciidlists again.
v5: Unify all GT2 IDs.
v6: Unify in a way that we don't break early-quirks.c
v7: Remove GT reference since it doesn't matter here (Paulo)
Also move IS_CNL_WITH_PORT_F macro to this patch to
make it easier for review this part and also to get
used sooner.
v8: Rebased on top of commit 5db47e37b3 ("Revert "drm/i915:
mark all device info struct with __initconst"")
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180129232223.766-1-rodrigo.vivi@intel.com
Replace the ad-hoc plane indexing scheme used by the frontbuffer
tracking with enum plane_id.
The old video overlay not being part of the plane_id namespace
will just be given the high bit.
v2: Drop the unintended whitespace change (Chris)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180123183343.9181-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
By counting the number of times we have woken up, we have a very simple
means of defining an epoch, which will come in handy if we want to
perform deferred tasks at the end of an epoch (i.e. while we are going
to sleep) without imposing on the next activity cycle.
v2: No reason to specify precise number of bits here.
v3: Take Tvrtko's advice and reserve 0 as an invalid epoch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180124113608.14909-1-chris@chris-wilson.co.uk
Add the enum additions to ICP PCH.
v2 (from Paulo): don't set any platforms to it yet since ICP support is
incomplete.
v3 (from Rodrigo): Fix ICP name.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180111180010.24357-4-paulo.r.zanoni@intel.com
Icelake is an Intel® Processor containing an Intel® Graphics
Controller.
This is just an initial Icelake definition. PCI IDs, Icelake support
and new features coming in following patches.
v2: Add .ddb_size and .has_guc (Michal Wajdeczko).
v3: Add the ICL_FEATURES macro (Kelvin Gardiner).
v4 (from Paulo): Add missing __initconst (Paulo) and say "graphics
controller" instead of something that looks like an official marketing
name but isn't (Chris).
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180111180010.24357-3-paulo.r.zanoni@intel.com
The CDCLK bypass frequency can vary on upcoming platforms, so prepare
for that now by tracking its value in the CDCLK state.
Currently on BDW+ the bypass frequency is always the reference clock and
I didn't bother with earlier platforms since it's not all that clear
what's the bypass clock on those.
I also didn't bother adding support for changing this frequency, since
atm I don't see any need for it.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117172508.15993-1-imre.deak@intel.com
struct timeval is deprecated because it cannot represent times
past 2038. In this driver, the only use of this structure is
to capture debug information. This is easily changed to ktime_t,
which we then format as needed when printing it later.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117154916.219273-1-arnd@arndb.de
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This flag has become redundant since
commit 4d90f2d507 ("drm/i915: Start tracking PSR state in crtc state")
It is set at the same place as psr.enabled, which is also exposed via
debugfs.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180103213824.1405-1-dhinakaran.pandiyan@intel.com
Once the Aksv is available in the PCH, we need to get it on the wire to
the receiver via DDC. The hardware doesn't allow us to read the value
directly, so we need to tell GMBUS to source the Aksv internally and
send it to the right offset on the receiver.
The way we do this is to initiate an indexed write where the index is
the Aksv register offset. We write dummy values to GMBUS3 as if we were
sending the key, and the hardware slips in the "real" values when it
goes out.
Changes in v2:
- None
Changes in v3:
- Uses new index write feature (Ville)
Changes in v4:
- None
Changes in v5:
- checkpatch whitespace fix
Changes in v6:
- None
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180108195545.218615-8-seanpaul@chromium.org
We have plenty of global registers and whatnot programmed without
any further locking by the modeset code. Currently non-bocking
modesets are allowed to execute in parallel which could corrupt
said registers.
To avoid the problem let's run all non-blocking modesets on an
ordered workqueue. We still put page flips etc. to system_unbound_wq
allowing page flips on one pipe to execute in parallel with page flips
or a modeset on a another pipe (assuming no known state is shared
between them, at which point they would have been added to the same
atomic commit and serialized that way).
Blocking modesets are already serialized with each other by
connection_mutex, and thus are safe. To serialize them with
non-blocking modesets we just flush the workqueue before executing
blocking modesets.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 94f050246b ("drm/i915: nonblocking commit")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113133622.8593-1-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
We already have dedicated file for opregion related code, dedicated
header will make our life easier.
v2: reorder includes (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171221185334.17396-4-michal.wajdeczko@intel.com
[ickle: quieten checkpatch]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171221215735.30314-3-chris@chris-wilson.co.uk
Convert intel_device_info_dump into pretty printer to be
consistent with the rest of the driver code.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171219114346.26308-2-michal.wajdeczko@intel.com
We dump device flags in few places (init_early, debugfs, gpu_error)
using different functions. Lets add reusable function to avoid
code duplication.
add/remove: 1/0 grow/shrink: 0/3 up/down: 1296/-3572 (-2276)
Function old new delta
intel_device_info_dump_flags - 1296 +1296
i915_capabilities 2435 1353 -1082
i915_error_state_to_str 6642 5507 -1135
intel_device_info_dump 1507 152 -1355
Total: Before=1287992, After=1285716, chg -0.18%
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171219114346.26308-1-michal.wajdeczko@intel.com
Extract the timeout we use in i915_gem_idle_work_handler() and reuse it
for wait_for_engines() in i915_gem_wait_for_idle(). It too has the same
problem in sometimes having to wait for an extended period before the HW
settles, so make use of the same timeout.
References: 5427f20785 ("drm/i915: Bump wait-times for the final CS interrupt before parking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211194135.27095-1-chris@chris-wilson.co.uk
Keeps things consistent now that we make use of struct resource. This
should keep us covered in case we ever get huge amounts of stolen
memory.
v2: bunch of missing conversions (Chris)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-10-matthew.auld@intel.com
Kick it out of i915_ggtt and keep it grouped with dsm and dsm_reserved,
where it makes the most sense.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-9-matthew.auld@intel.com
Now that we are using struct resource to track the stolen region, it is
more convenient if we track the reserved portion of that region in a
resource as well.
v2: s/<= end + 1/< end/ (Chris)
v3: prefer DEFINE_RES_MEM
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-7-matthew.auld@intel.com
Now that we are using struct resource to track the stolen region, it is
more convenient if we track dsm in a resource as well.
v2: check range_overflow when writing to 32b registers (Chris)
pepper in some comments (Chris)
v3: refit i915_stolen_to_dma()
v4: kill ggtt->stolen_size
v5: some more polish
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-6-matthew.auld@intel.com
It seems that the DMC likes to transition between the DC states a lot when
there are no connected displays (no active power domains) during command
submission.
This activity on DC states has a negative impact on the performance of the
chip with huge latencies observed in the interrupt handlers and elsewhere.
Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
eight.
Work around it by introducing a new power domain named,
POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
held for the duration of command submission activity.
CNL has the same problem which will be addressed as a follow-up. Doing
that requires a fix for a DC6 context corruption problem in the CNL DMC
firmware which is yet to be released.
v2:
* Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
* Protect macro body with braces. (Jani Nikula)
v3:
* Add dedicated power domain for clarity. (Chris, Imre)
* Commit message and comment text updates.
* Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
firmware release.
v4:
* Power domain should be inner to device runtime pm. (Chris)
* Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
* Handle async DMC loading by moving the GT_IRQ power domain logic into
intel_runtime_pm. (Daniel, Chris)
* Include small core GEN9 as well. (Imre)
v5
* Special handling for async DMC load is not needed since on failure the
power domain reference is kept permanently taken. (Imre)
v6:
* Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
deployed. (Imre, Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[Imre: Add note about applying the WA on CNL as a follow-up]
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171205132854.26380-1-tvrtko.ursulin@linux.intel.com
As writes through the GTT and GGTT PTE updates do not share the same
path, they are not strictly ordered and so we must explicitly flush the
indirect writes prior to modifying the PTE. We do track outstanding GGTT
writes on the object itself, but since the object may have multiple GGTT
vma, that is overly coarse as we can track and flush individual vma as
required.
Whilst here, update the GGTT flushing behaviour for Cannonlake.
v2: Hard-code ring offset to allow use during unload (after RCS may have
been freed, or never existed!)
References: https://bugs.freedesktop.org/show_bug.cgi?id=104002
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-2-chris@chris-wilson.co.uk
We currently have two module parameters that control GuC:
"enable_guc_loading" and "enable_guc_submission". Whenever
we need submission=1, we also need loading=1. We also need
loading=1 when we want to want to load and verify the HuC.
Lets combine above module parameters into one "enable_guc"
modparam. New supported bit values are:
0=disable GuC (no GuC submission, no HuC)
1=enable GuC submission
2=enable HuC load
Special value "-1" can be used to let driver decide what
option should be enabled for given platform based on
hardware/firmware availability or preference.
Explicit enabling any of the GuC features makes GuC load
a required step, fallback to non-GuC mode will not be
supported.
v2: Don't use -EIO
v3: define modparam bits (Chris)
v4: rely on implicit cast (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-6-michal.wajdeczko@intel.com
In the upcoming patch we will change the way how to recognize
when GuC is in use. Using helper macros will minimize scope
of that changes. While here, update dev_info message.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-3-michal.wajdeczko@intel.com
Doing HuC firmware path selection from sanitize_options function
is not perfect, while there is no problem with doing so during
early init stage as we already have all needed data.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171206135316.32556-1-michal.wajdeczko@intel.com
It has been many years since the last confirmed sighting (and fix) of an
RC6 related bug (usually a system hang). Remove the parameter to stop
users from setting dangerous values, as they often set it during triage
and end up disabling the entire runtime pm instead (the option is not a
fine scalpel!).
Furthermore, it allows users to set known dangerous values which were
intended for testing and not for production use. For testing, we can
always patch in the required setting without having to expose ourselves
to random abuse.
v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
lack of ilk support better.
v3: Clear intel_info->rc6p if we don't support rc6 itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171201113030.18360-1-chris@chris-wilson.co.uk
Since the shrinker is registered and unregistered during
i915_driver_register and i915_driver_unregister, respectively, rename
the init/cleanup functions to match.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171123115338.10270-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
It may be of interest to both compare the active HW context against the
default (aka NULL) context, to see what has been changed and if either are
corrupt.
v2: Rename the fake vma as fake.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171126220901.14735-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Chris Wilson <chris@chris-wilson.co.uk>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
The first goal is to be able to measure GPU (and invidual ring) busyness
without having to poll registers from userspace. (Which not only incurs
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine.) As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.
Functionality we are exporting to userspace is via the existing perf PMU
API and can be exercised via the existing tools. For example:
perf stat -a -e i915/rcs0-busy/ -I 1000
Will print the render engine busynnes once per second. All the performance
counters can be enumerated (perf list) and have their unit of measure
correctly reported in sysfs.
v1-v2 (Chris Wilson):
v2: Use a common timer for the ring sampling.
v3: (Tvrtko Ursulin)
* Decouple uAPI from i915 engine ids.
* Complete uAPI defines.
* Refactor some code to helpers for clarity.
* Skip sampling disabled engines.
* Expose counters in sysfs.
* Pass in fake regs to avoid null ptr deref in perf core.
* Convert to class/instance uAPI.
* Use shared driver code for rc6 residency, power and frequency.
v4: (Dmitry Rogozhkin)
* Register PMU with .task_ctx_nr=perf_invalid_context
* Expose cpumask for the PMU with the single CPU in the mask
* Properly support pmu->stop(): it should call pmu->read()
* Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
* Introduce refcounting of event subscriptions.
* Make pmu.busy_stats a refcounter to avoid busy stats going away
with some deleted event.
* Expose cpumask for i915 PMU to avoid multiple events creation of
the same type followed by counter aggregation by perf-stat.
* Track CPUs getting online/offline to migrate perf context. If (likely)
cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
needed to see effect of CPU status tracking.
* End result is that only global events are supported and perf stat
works correctly.
* Deny perf driver level sampling - it is prohibited for uncore PMU.
v5: (Tvrtko Ursulin)
* Don't hardcode number of engine samplers.
* Rewrite event ref-counting for correctness and simplicity.
* Store initial counter value when starting already enabled events
to correctly report values to all listeners.
* Fix RC6 residency readout.
* Comments, GPL header.
v6:
* Add missing entry to v4 changelog.
* Fix accounting in CPU hotplug case by copying the approach from
arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)
v7:
* Log failure message only on failure.
* Remove CPU hotplug notification state on unregister.
v8:
* Fix error unwind on failed registration.
* Checkpatch cleanup.
v9:
* Drop the energy metric, it is available via intel_rapl_perf.
(Ville Syrjälä)
* Use HAS_RC6(p). (Chris Wilson)
* Handle unsupported non-engine events. (Dmitry Rogozhkin)
* Rebase for intel_rc6_residency_ns needing caller managed
runtime pm.
* Drop HAS_RC6 checks from the read callback since creating those
events will be rejected at init time already.
* Add counter units to sysfs so perf stat output is nicer.
* Cleanup the attribute tables for brevity and readability.
v10:
* Fixed queued accounting.
v11:
* Move intel_engine_lookup_user to intel_engine_cs.c
* Commit update. (Joonas Lahtinen)
v12:
* More accurate sampling. (Chris Wilson)
* Store and report frequency in MHz for better usability from
perf stat.
* Removed metrics: queued, interrupts, rc6 counters.
* Sample engine busyness based on seqno difference only
for less MMIO (and forcewake) on all platforms. (Chris Wilson)
v13:
* Comment spelling, use mul_u32_u32 to work around potential GCC
issue and somne code alignment changes. (Chris Wilson)
v14:
* Rebase.
v15:
* Rebase for RPS refactoring.
v16:
* Use the dynamic slot in the CPU hotplug state machine so that we are
free to setup our state as multi-instance. Previously we were re-using
the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
multi-instance, nor owned by our driver to start with.
* Register the CPU hotplug handlers after the PMU, otherwise the callback
will get called before the PMU is initialized which can end up in
perf_pmu_migrate_context with an un-initialized base.
* Added workaround for a probable bug in cpuhp core.
v17:
* Remove workaround for the cpuhp bug.
v18:
* Rebase for drm_i915_gem_engine_class getting upstream before us.
v19:
* Rebase. (trivial)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
Stop using the old for_each_intel_plane_in_state() type iteration
macro and replace it with for_each_new_intel_plane_in_state().
And similarly replace drm_atomic_get_existing_crtc_state() with
intel_atomic_get_new_crtc_state(). Switch over to intel_ types
as well to make the code less cluttered.
v2: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-8-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Replace the 0 and 1 with PLANE_A and PLANE_B in the pre-g4x wm code.
v2: s/old_plane_id/i9xx_plane_id/ (Daniel)
v3: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-5-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Rename enum plane to enum i9xx_plane_id to make it clear that it only
applies to pre-SKL platforms.
enum i9xx_plane_id is a global identifier, whereas enum plane_id is
per-pipe. We need the old global identifier to index the primary plane
(and the pre-g4x sprite C if we ever expose it) registers on pre-SKL
platforms.
v2: Reorder patches
v3: s/old_plane_id/i9xx_plane_id/ (Daniel)
Pimp the commit message a bit
Note that i9xx_plane_id doesn't apply to SKL+
v4: Rebase due to power domain handling in plane readout
v5: Rebase due to crtc->dspaddr_offset removal
v6: s/plane/i9xx_plane/ etc. (James)
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117191917.11506-4-ville.syrjala@linux.intel.com
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Having disabled the broken semaphores on Sandybridge, there is no need
for a modparam any more, so remove it in favour of a simple
HAS_LEGACY_SEMAPHORES() guard.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk
Since removing the module parameter to force selection of ringbuffer
emission for gen8, the code is defunct. Remove it.
To put the difference into perspective, a couple of microbenchmarks
(bdw i7-5557u, 20170324):
ring execlists
exec continuous nops on all rings: 1.491us 2.223us
exec sequential nops on each ring: 12.508us 53.682us
single nop + sync: 9.272us 30.291us
vblank_mode=0 glxgears: ~11000fps ~9000fps
Since the earlier submission, gen8 ringbuffer submission has fallen
further and further behind in features. So while ringbuffer may hold the
throughput crown, in terms of interactive latency, execlists is much
better. Alas, we have no convenient metrics for such, other than
demonstrating things we can do with execlists but can not using
legacy ringbuffer submission.
We have made a few improvements to lowlevel execlists throughput,
and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026):
ring execlists
exec continuous nops on all rings: n/a 1.921us
exec sequential nops on each ring: n/a 44.621us
single nop + sync: n/a 21.953us
vblank_mode=0 glxgears: n/a ~18500fps
References: https://bugs.freedesktop.org/show_bug.cgi?id=87725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
Execlists and legacy ringbuffer submission are no longer feature
comparable (execlists now offer greater functionality that should
overcome their performance hit) and obsoletes the unsafe module
parameter, i.e. comparing the two modes of execution is no longer
useful, so remove the debug tool.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
Now that we have this stored in the device info, we can drop it from perf
part of the driver.
Note that this requires to init perf after we've computed the frequency,
hence why we move i915_perf_init() from i915_driver_init_early() to after
intel_device_info_runtime_init().
v2: Use div_u64 (Chris)
v3: Drop u64 divs by switching to kHz (Chris/Ville)
Move i915_perf_fini to i915_driver_cleanup_hw (Matthew)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113181902.12411-2-lionel.g.landwerlin@intel.com
ERROR: "__udivdi3" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "__divdi3" [drivers/gpu/drm/i915/i915.ko] undefined!
Store the frequency in kHz and drop 64bit divisions.
v2: Use div64_u64 (Matthew)
v3: store frequency in kHz to avoid 64bit divs (Chris/Ville)
Fixes: dab9178333 ("drm/i915: expose command stream timestamp frequency to userspace")
Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113233455.12085-3-lionel.g.landwerlin@intel.com
Reviewed-by: Ewelina Musial <ewelina.musial@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
We use to have this fixed per generation, but starting with CNL userspace
cannot tell just off the PCI ID. Let's make this information available. This
is particularly useful for performance monitoring where much of the
normalization work is done using those timestamps (this include pipeline
statistics in both GL & Vulkan as well as OA reports).
v2: Use variables for 24MHz/19.2MHz values (Ewelina)
Renamed function & coding style (Sagar)
v3: Fix frequency read on Broadwell (Sagar)
Fix missing divide by 4 on <= gen4 (Sagar)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110190845.32574-7-lionel.g.landwerlin@intel.com
Now that we always execute a context switch upon module load, there is
no need to queue a delayed task for doing so. The purpose of the delayed
task is to enable GT powersaving, for which we need the HW state to be
valid (i.e. having loaded a context and initialised basic state). We
used to defer this operation as historically it was slow (due to slow
register polling, fixed with commit 1758b90e38 ("drm/i915: Use a hybrid
scheme for fast register waits")) but now we have a requirement to save
the default HW state.
v2: Load the kernel context (to provide the power context) upon resume.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171112112738.1463-3-chris@chris-wilson.co.uk
As we now record the default HW state and so only emit the "golden"
renderstate once to prepare the HW, there is no advantage in keeping the
renderstate batch around as it will never be used again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-8-chris@chris-wilson.co.uk
intel_modeset_gem_init() now only sets up the legacy overlay, so let's
remove the function and call the setup directly during driver load. This
should help us find a better point in the initialisation sequence for it
later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-5-chris@chris-wilson.co.uk
Rather than digging through encoder->crtc and crtc->config in the
DPIO PHY functions, pass down the correct crtc state from the caller.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031205123.13123-6-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
We keep details of GuC and HuC in separate error state struct.
Make GuC log part of it to group all related data together.
Since we are printing uC details at the end, with this change
GuC log will be moved there too.
v2: comment on new placement of the log (Chris)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026173657.49648-2-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Include GuC and HuC firmware details in captured error state
to provide additional debug information. To reuse existing
uc firmware pretty printer, introduce new drm-printer variant
that works with our i915_error_state_buf output. Also update
uc firmware pretty printer to accept const input.
v2: don't rely on current caps (Chris)
dump correct fw info (Michal)
v3: simplify capture of custom paths (Chris)
v4: improve 'why' comment (Joonas)
trim output if no fw path (Michal)
group code around uc error state (Michal)
v5: use error in cleanup_uc (Michal)
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026173657.49648-1-michal.wajdeczko@intel.com
[ickle: allow printing uc_fw after allocation failure]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This patch adds per engine reset and recovery (TDR) support when GuC is
used to submit workloads to GPU.
In the case of i915 directly submission to ELSP, driver manages hang
detection, recovery and resubmission. With GuC submission these tasks
are shared between driver and GuC. i915 is still responsible for detecting
a hang, and when it does it only requests GuC to reset that Engine. GuC
internally manages acquiring forcewake and idling the engine before
resetting it.
Once the reset is successful, i915 takes over again and handles the
resubmission. The scheduler in i915 knows which requests are pending so
after resetting a engine, pending workloads/requests are resubmitted
again.
v2: s/i915_guc_request_engine_reset/i915_guc_reset_engine/ to match the
non-guc function names.
v3: Removed debug message about engine restarting from which request,
since the new baseline do it regardless of submission mode. (Chris)
v4: Rebase.
v5: Do not pass unnecessary reporting flags to the fw (Jeff);
tasklet_schedule(&execlists->irq_tasklet) handles the resubmit; rebase.
v6: Rename the existing reset engine function and share a similar
interface between guc and non-guc paths (Chris).
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031225309.10888-1-michel.thierry@intel.com
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
intel_guc_reset sounds more like the microcontroller is the one performing
a reset, while in this case is the opposite. intel_reset_guc not only
makes it clearer, it follows the other intel_reset functions available.
v2: Print error message in English (Tvrtko).
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171030185616.32836-2-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Explicitly pass the crtc and connector states into the audio
code enable/disable hooks, and plumb them all the way down.
This gets rid of almost all crtc->config and encoder->crtc
uses. The one place where we still use them is
i915_audio_component_sync_audio_rate() since that gets called from
the audio driver and we don't have explicit states around then.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171030184654.17429-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Starting from version 204 VBT can specify the max TMDS clock we are
allowed to use with HDMI ports. Parse that information and take it
into account when filtering modes and computing a crtc state.
Also take the opportunity to sort the platform check if ladder
from new to old.
v2: Add defines for the values into intel_vbt_defs.h (Jani)
Don't fall back to 0 silently for unknown values (Jani)
Skip the debug print for the 0 case (Jani)
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171030145702.23662-1-ville.syrjala@linux.intel.com
Call the DDI .pre_pll_enable() hook from the MST code so that BXT gets
the correct lane latency optimal setting applied. And we obviously need
to compute the correct value, and read it out to keep the state checker
happy.
While at it drop the useless 'encoder' parameter to
bxt_ddi_phy_calc_lane_lat_optim_mask()
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171027134348.31190-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
We would also like to make use of execlist_cancel_port_requests and
unwind_incomplete_requests in GuC preemption backend.
Let's rename the functions to use the correct prefixes, so that we can
simply add the declarations in the following patch.
Similar thing for applies for can_preempt, except we're introducing
HAS_LOGICAL_RING_PREEMPTION macro instad, converting other users that
were previously touching device info directly.
v2: s/intel_engine/execlists and pass execlists to unwind (Chris)
v3: use locked version for exporting, drop const qual (Chris)
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-11-michal.winiarski@intel.com
On CNL we may need to bump up the system agent voltage not only due
to CDCLK but also when driving DDI port with a sufficiently high clock.
To that end start tracking the minimum acceptable voltage for each crtc.
We do the tracking via crtcs because we don't have any kind of encoder
state. Also there's no downside to doing it this way, and it matches how
we track cdclk requirements on account of pixel rate.
v2: Allow disabled crtcs to use the min voltage
Add IS_CNL check to intel_ddi_compute_min_voltage() since
we're using CNL specific values there
s/intel_compute_min_voltage/cnl_compute_min_voltage/ since
the function makes hw specific assumptions about the voltage
values
v3: Drop the test hack leftovers from skl_modeset_calc_cdclk()
v4: s/voltage/voltage_level/ (Rodrigo)
Replace DPLL DVFS FIXMEs with an explanation why we don't
do anything there (Rodrigo)
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024095216.1638-9-ville.syrjala@linux.intel.com
For CNL we'll need to start considering the port clocks when we select
the voltage level for the system agent. To that end start tracking the
voltage in the cdclk state (since that already has to adjust it).
v2: s/voltage/voltage_level/ (Rodrigo)
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024095216.1638-3-ville.syrjala@linux.intel.com
This patch parse DSI backlight/cabc ports info from
VBT and save them inside local structure. This saved info
can be directly used while initializing DSI for different
platforms instead of parsing for each platform.
V2: Changes:
- Typo fix in commit message.
- Move up newly added port variables (Jani N)
- Remove redundant initialization (Jani N)
- Don't parse CABC ports if not supported (Jani N)
V3: Patch restructure (Suggested by Jani N)
Credits-to: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Madhav Chauhan <madhav.chauhan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507898700-20016-1-git-send-email-madhav.chauhan@intel.com
They're unused and unsupported. Leave the reduced_clock pointers in
place still, should they prove useful later on.
v2: go from nuking DDI lowfreq_avail to nuking it entirely (Ville)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171017140234.20677-1-jani.nikula@intel.com
Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware,
[commit 32ced39 ("drm/i915: Transform whitelisting WAs into a simple reg
write")] there is no need to save space for them in the list of context
workarounds.
v2: Refer to previous commit in commit message (Michel)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1508272071-15125-1-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Inspired by Tvrtko's critique of the reaping of the stale contexts
before allocating a new one, also limit the freed object reaping to the
oldest stale object before allocating a fresh object. Unlike contexts,
objects may have radically different sizes of backing storage, but
similar to contexts, while we want to prevent starvation due to
excessive freed lists, we also do not want to delay fresh allocations
for too long. Only freeing the oldest on the freed object list before
each allocation is a reasonable compromise.
v2: Only a single consumer of llist_del_first() is allowed (although
multiple llist_add are still allowed in parallel). Unlike
i915_gem_context, i915_gem_flush_free_objects() is itself not serialized
and so we need to add our own spinlock. Otherwise KASAN eventually spots
a use-after-free for the race on *first->next.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-8-chris@chris-wilson.co.uk
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).
v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
Since we occasionally stuff an error pointer into obj->mm.pages for a
semi-permanent or even permanent failure, we have to be more careful and
not just test against NULL when deciding if the object has a complete
set of its concurrent pages.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-1-chris@chris-wilson.co.uk
Apparently RC6 residency is lower than expected
with EI mode for most of the cases on CNL A0, B0 and C0.
This Wa doesn't solve our lower residency, but I
believe it is better to have it since EI is not
expected to work by HW engineers anyways.
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170822235828.18322-1-rodrigo.vivi@intel.com
Defined new struct intel_rc6 to hold RC6 specific state and
intel_ring_pstate to hold ring specific state.
v2: s/intel_ring_pstate/intel_llc_pstate. Removed checks from
autoenable_* functions. (Chris)
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-13-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-12-chris@chris-wilson.co.uk
Prepared substructure rps for RPS related state. autoenable_work is
used for RC6 too hence it is defined outside rps structure. As we do
this lot many functions are refactored to use intel_rps *rps to access
rps related members. Hence renamed intel_rps_client pointer variables
to rps_client in various functions.
v2: Rebase.
v3: s/pm/gt_pm (Chris)
Refactored access to rps structure by declaring struct intel_rps * in
many functions.
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> #1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-9-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-8-chris@chris-wilson.co.uk
In order to separate GT PM related functionality into new structure
we are updating rps structure. hw_lock in it is used for display
related PCU communication too hence move it to dev_priv.
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-8-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-7-chris@chris-wilson.co.uk
It's a little unclear what the sg_mask actually is, so prefer the more
meaningful name of sg_page_sizes.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In preparation for supporting huge gtt pages for the ppgtt, we introduce
page size members for gem objects. We fill in the page sizes by
scanning the sg table.
v2: pass the sg_mask to set_pages
v3: calculate the sg_mask inline with populating the sg_table where
possible, and pass to set_pages along with the pages.
v4: bunch of improvements from Joonas
v5: fix num_pages blunder
introduce i915_sg_page_sizes helper
v6: prefer GEM_BUG_ON(sizes == 0)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-6-chris@chris-wilson.co.uk
In preparation for huge gtt pages expose page_sizes as part of the
device info, to indicate the page sizes supported by the HW. Currently
only 4K is supported.
v2: s/page_size_mask/page_sizes/
v3: introduce I915_GTT_MAX_PAGE_SIZE
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-4-chris@chris-wilson.co.uk