The code used by the DP and HDMI paths was very similar, so make them
share it. Note that this removes the write to signal level registers
from the HDMI pre pll enable path, but that's OK since those are set
in vlv_hdmi_pre_enable() function.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-9-git-send-email-ander.conselvan.de.oliveira@intel.com
The logic for setting signal levels is used for both HDMI and DP with
small variations. But it is similar enough to put behind a function
called from the encoders.
v2: Remove unrelated MST changes due to rebase fumble. (Jim Bride)
Fix typo in the commit message. (Jim Bride)
v3: Really fix the typo. (Jim)
Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-8-git-send-email-ander.conselvan.de.oliveira@intel.com
The exact same code was used by HDMI and DP encoders, so move it to
intel_dpio_phy.c.
v2: Fix typo in the commit message. (Jim Bride)
v3: Call the new function chv_phy_post_pll_disable() instead of
chv_phy_post_disable(), as it should be called after the pll
is disabled. (Ville)
Cc: Jim Bride <jim.bride@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-7-git-send-email-ander.conselvan.de.oliveira@intel.com
The only difference between the DP and HDMI versions was the lane count.
Since lane_count is now set appropriately for HDMI too, get rid of the
duplication and move this to intel_dpio_phy.c
v2: Don't move comments about 2nd common lane staying alive. (Ville)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-6-git-send-email-ander.conselvan.de.oliveira@intel.com
Set the lane count for HDMI to 4. This will make it easier to
unduplicate CHV phy code.
This also fixes the the soft reset programming for HDMI with CHV. After
commit a8f327fb84 ("drm/i915: Clean up CHV lane soft reset
programming"), it wouldn't set the right bits for PCS23 since it relied
on a lane count that was never set.
v2: Set lane_count in *_get_config() to please state checker. (0day)
v3: Set lane_count for DDI in DVI mode too. (CI)
v4: Add note about CHV soft lane reset. (Ander)
Fixes: a8f327fb84 ("drm/i915: Clean up CHV lane soft reset programming")
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-2-git-send-email-ander.conselvan.de.oliveira@intel.com
The comment about GMBUSFREQ is confused. The spec actually explains
the 4MHz thing perfectly by noting that the 4MHz divider values is
actually just bits [9:2] not [9:0], hence the divide by 1000 correct.
Replace the confused note with a quote from the spec, and eliminate
the duplicated comment that snuck in.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Update CDCLK_FREQ on BDW after changing the cdclk frequency. Not sure
if this is a late addition to the spec, or if I simply overlooked this
step when writing the original code.
This is what Bspec has to say about CDCLK_FREQ:
"Program this field to the CD clock frequency minus one. This is used to
generate a divided down clock for miscellaneous timers in display."
And the "Broadwell Sequences for Changing CD Clock Frequency" section
clarifies this further:
"For CD clock 337.5 MHz, program 337 decimal.
For CD clock 450 MHz, program 449 decimal.
For CD clock 540 MHz, program 539 decimal.
For CD clock 675 MHz, program 674 decimal."
Cc: stable@vger.kernel.org
Cc: Mika Kahola <mika.kahola@intel.com>
Fixes: b432e5cfd5 ("drm/i915: BDW clock change support")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
In BXT DSI there is no regs programmed with few horizontal timings
in Pixels but txbyteclkhs.. So retrieval process adds some
ROUND_UP ERRORS in the process of PIXELS<==>txbyteclkhs.
Actually here for the given adjusted_mode, we are calculating the
value programmed to the port and then back to the horizontal timing
param in pixels. This is the expected value at the end of get_config,
including roundup errors. And if that is same as retrieved value
from port, then retrieved (HW state) adjusted_mode's horizontal
timings are corrected to match with SW state to nullify the errors.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461053894-5058-2-git-send-email-ramalingam.c@intel.com
Both execlists and legacy need to reset the context (and mode) of the
GPU before we lose control of the system. By resetting the GPU, we
revert back to default settings. This simplifies the life of any
subsequent driver (in particular for virtualized setups) as it does not
then have to try and recover from an unknown condition. As both paths
need to reset for the same reason, move the reset to a common point.
This unifies the resets added in a647828afc (drm/i915: Also perform gpu
reset under execlist mode) and 8e96d9c4d9 (drm/i915: reset the GPU on
context fini).
v2: Restrict the reset to "modern" gen (where we enable HW contexts) to
try and avoid leaving the machine in an unusable state with a risky
reset on older GPU. This should keep the status quo as to who performs
resets (i.e. currently only GPUs with HW contexts perform a reset on
shutdown).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: "Niu, Bing" <bing.niu@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-25-git-send-email-chris@chris-wilson.co.uk
With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of execlist retired queue entirely.
This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
request on the execlists retired queue.
The latter was a source of significant scalability issues in
the driver causing performance hits on some tests. Most
dramatical of which was igt/gem_close_race which had run time
in tens of minutes which is now reduced to tens of seconds.
Signed-off-by: Tvrtko Ursulin <tvrtko@ursulin.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-24-git-send-email-chris@chris-wilson.co.uk
As the contexts are accessed by the hardware until the switch is completed
to a new context, the hardware may still be writing to the context object
after the breadcrumb is visible. We must not unpin/unbind/prune that
object whilst still active and so we keep the previous context pinned until
the following request. We can generalise the tracking we already do via
the engine->last_context and move it to the request so that it works
equally for execlists and GuC.
v2: Drop the execlists double pin as that exposes a race inside the lrc
irq handler as it tries to access the context after it may be retired.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-22-git-send-email-chris@chris-wilson.co.uk
If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM request need to be under the struct_mutex.
The careful reader may notice that one or two impossible NULL pointer
tests are dropped for readability. These pointers cannot be NULL since
they are assigned during request construction and never unset.
v2,v3: Rebalance execlists by moving the context unpinning.
v4: Rebase onto -nightly
v5: Avoid trying to rebalance execlist/GuC context pinning, leave that
to the next step
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-21-git-send-email-chris@chris-wilson.co.uk
Refactor pinning and unpinning of contexts, such that the default
context for an engine is pinned during initialisation and unpinned
during teardown (pinning of the context handles the reference counting).
Thus we can eliminate the special case handling of the default context
that was required to mask that it was not being pinned normally.
v2: Rebalance context_queue after rebasing.
v3: Rebase to -nightly (not 40 patches in)
v4: Rebase onto request_alloc unwinding
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-19-git-send-email-chris@chris-wilson.co.uk
The hardware tracks contexts and expects all live contexts (those active
on the hardware) to have a unique identifier. This is used by the
hardware to assign pagefaults and the like to a particular context.
v2: Reorder to make sure ctx->link is not left dangling if the
assignment of a hw_id fails (Mika).
v3: We have 21bits of context space, not 20.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-17-git-send-email-chris@chris-wilson.co.uk
Rather than being interrupted when we run out of space halfway through
the request, and having to restart from the beginning (and returning to
userspace), flush a little more free space when we prepare the request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-15-git-send-email-chris@chris-wilson.co.uk
In the next patches, we want to move the work out of freeing the request
and into its retirement (so that we can free the request without
requiring the struct_mutex). This means that we cannot rely on
unreferencing the request to completely teardown the request any more
and so we need to manually unwind the failed allocation. In doing so, we
reorder the allocation in order to make the unwind simple (and ensure
that we don't try to unwind a partial request that may have modified
global state) and so we end up pushing the initial preallocation down
into the engine request initialisation functions where we have the
requisite control over the state of the request.
Moving the initial preallocation into the engine is less than ideal: it
moves logic to handle a specific problem with request handling out of
the common code. On the other hand, it does allow those backends
significantly more flexibility in performing its allocations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-14-git-send-email-chris@chris-wilson.co.uk
Now that we share intel_ring_begin(), reserving space for the tail of
the request is identical between legacy/execlists and so the tautology
can be removed. In the process, we move the reserved space tracking
from the ringbuffer on to the request. This is to enable us to reorder
the reserved space allocation in the next patch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-13-git-send-email-chris@chris-wilson.co.uk
Combine the near identical implementations of intel_logical_ring_begin()
and intel_ring_begin() - the only difference is that the logical wait
has to check for a matching ring (which is assumed by legacy).
In the process some debug messages are culled as there were following a
WARN if we hit an actual error.
v2: Updated commentary
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-12-git-send-email-chris@chris-wilson.co.uk
The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk
Since we do the l3-remap on context switch, we can remove the redundant
early call to set the mapping prior to performing the first context
switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-10-git-send-email-chris@chris-wilson.co.uk
We can use a single MI_LOAD_REGISTER_IMM command packet to write all the
L3 remapping registers, shrinking the number of bytes required to emit
the context switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-9-git-send-email-chris@chris-wilson.co.uk
In order to force a reload of the context image upon resume, we first
need to mark its absence on suspend. Currently we are failing to restore
the golden context state and any context w/a to the default context
after resume.
One oversight corrected, is that we had forgotten to reapply the L3
remapping when restoring the lost default context.
v2: Remove deprecated WARN.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-7-git-send-email-chris@chris-wilson.co.uk
Similarly to i915_gem_object_pin_map on LLC platforms, we can
use the new VMA based io mapping on !LLC to amoritize the cost
of ringbuffer pinning and unpinning.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-6-git-send-email-chris@chris-wilson.co.uk
By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).
Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.
v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk
In a couple of places, we have an i915_address_space that we know is
really an i915_ggtt that we want to use. Create an inline helper to
convert from the i915_address_space subclass into its container.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-4-git-send-email-chris@chris-wilson.co.uk
The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-3-git-send-email-chris@chris-wilson.co.uk
When setting up the overlay page, we pin it into the GGTT (when using
virtual addresses) and store the offset as overlay->flip_addr. Rather
than doing a lookup of the GGTT address everytime, we can use the known
address instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-2-git-send-email-chris@chris-wilson.co.uk
Propagate the real error from drm_gem_object_init(). Note this also
fixes some confusion in the error return from i915_gem_alloc_object...
v2:
(Matthew Auld)
- updated new users of gem_alloc_object from latest drm-nightly
- replaced occurrences of IS_ERR_OR_NULL() with IS_ERR()
v3:
(Joonas Lahtinen)
- fix double "From:" in commit message
- add goto teardown path
v4:
(Matthew Auld)
- rebase with i915_gem_alloc_object name change
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461587533-8841-1-git-send-email-matthew.auld@intel.com
[Joonas: Removed spurious " = NULL" from _init() function]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Faced with sporadic machine hangs on gen7, that mimic the issue of
concurrent writes to the same cacheline and seem to start with
commit 9b9ed30936 (drm/i915: Remove forcewake dance from seqno/irq
barrier on legacy gen6+), let us restore the spinlock around the mmio
read.
Fixes: 9b9ed30936 (drm/i915: Remove forcewake dance from seqno/irq...)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461744121-27051-1-git-send-email-chris@chris-wilson.co.uk
Tested-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
I just noticed that VLV/CHV have a RAWCLK_FREQ register just like PCH
platforms. It lives in the display power well, so we should update it
when enabling the power well.
Interestingly the BIOS seems to leave it at the reset value (125) which
doesn't match the rawclk frequency on VLV/CHV (200 MHz). As always with
these register, the spec is extremely vague what the register does. All
it says is: "This is used to generate a divided down clock for
miscellaneous timers in display." Based on a quick test, at least AUX
and PWM appear to be unaffected by this.
But since the register is there, let's configure it in accordance with
the spec.
Note that we have to move intel_update_rawclk() to occur before we
touch the power wells, so that the dev_priv->rawclk_freq is already
populated when the disp2 enable hook gets called for the first time.
I think this should be safe to do on other platforms as well.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461768202-17544-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Check for VLV/CHV instead if !BXT when re-enabling DPOunit clock gating
after DSI disable. That's what we checked when disabling the clock
gating when enabling DSI.
Also use the same temporary variable name in both cases, and toss in a
bit of dev vs. dev_priv cleanup while at it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460996305-30453-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
amdgpu gained dev->struct_mutex usage, and that's because it's walking
the dev->filelist list. Protect that list with it's own lock to take
one more step towards getting rid of struct_mutex usage in drivers
once and for all.
While doing the conversion I noticed that 2 debugfs files in i915
completely lacked appropriate locking. Fix that up too.
v2: don't forget to switch to drm_gem_object_unreference_unlocked.
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461691808-12414-9-git-send-email-daniel.vetter@ffwll.ch
Somehow my SNB GT1 (Dell XPS 8300) gets very unhappy around
GPU hangs if the RPS EI/thresholds aren't suitably aligned.
It seems like scheduling/timer interupts stop working somehow
and things get stuck eg. in usleep_range().
I bisected the problem down to
commit 8a5864377b ("drm/i915/skl: Restructured the gen6_set_rps_thresholds function")
I observed that before all the values were at least multiples of 25,
but afterwards they are not. And rounding things up to the next multiple
of 25 does seem to help, so lets' do that. I also tried roundup(..., 5)
but that wasn't sufficient. Also I have no idea if we might need this sort of
thing on gen9+ as well.
These are the original EI/thresholds:
LOW_POWER
GEN6_RP_UP_EI 12500
GEN6_RP_UP_THRESHOLD 11800
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 21250
BETWEEN
GEN6_RP_UP_EI 10250
GEN6_RP_UP_THRESHOLD 9225
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 18750
HIGH_POWER
GEN6_RP_UP_EI 8000
GEN6_RP_UP_THRESHOLD 6800
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 15000
These are after 8a5864377b:
LOW_POWER
GEN6_RP_UP_EI 12500
GEN6_RP_UP_THRESHOLD 11875
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 21250
BETWEEN
GEN6_RP_UP_EI 10156
GEN6_RP_UP_THRESHOLD 9140
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 18750
HIGH_POWER
GEN6_RP_UP_EI 7812
GEN6_RP_UP_THRESHOLD 6640
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 15000
And these are what we have after this patch:
LOW_POWER
GEN6_RP_UP_EI 12500
GEN6_RP_UP_THRESHOLD 11875
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 21250
BETWEEN
GEN6_RP_UP_EI 10175
GEN6_RP_UP_THRESHOLD 9150
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 18750
HIGH_POWER
GEN6_RP_UP_EI 7825
GEN6_RP_UP_THRESHOLD 6650
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 15000
Cc: stable@vger.kernel.org
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/kms_pipe_crc_basic/hang-read-crc-pipe-B
Fixes: 8a5864377b ("drm/i915/skl: Restructured the gen6_set_rps_thresholds function")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461159836-9108-1-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
(cherry picked from commit 8a292d016d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This patch does the following:
- Fakes live status of HDMI as connected (even if that's not).
While testing certain (monitor + cable) combinations with
various intel platforms, it seems that live status register
doesn't work reliably on some older devices. So limit the
live_status check for HDMI detection, only for platforms
from gen7 onwards.
V2: restrict faking live_status to certain platforms
V3: (Ville)
- keep the debug message for !live_status case
- fix indentation of comment
- remove "warning" from the debug message
(Jani)
- Change format of fix details in the commit message
Fixes: 237ed86c69 ("drm/i915: Check live status before reading edid")
Cc: stable@vger.kernel.org # v4.4
Suggested-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461237606-16491-1-git-send-email-shashank.sharma@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 4f4a818501)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
It was noticed on bug #94087 that module parameter
i915.edp_vswing=2 that should override the VBT setting
to use default voltage swing (400 mV) was not applied
for Broadwell.
This patch provides a fix for this by checking if default
i.e. higher voltage swing is requested to be used and
applies the DDI translations table for DP instead of eDP
(low vswing) table.
v2: Combine two if statements into one (Jani)
v3: Change dev_priv->edp_low_vswing to use dev_priv->vbt.edp.low_vswing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94087
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461155942-7749-1-git-send-email-mika.kahola@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 0098351921)
[Jani: s/dev_priv->vbt.edp.low_vswing/dev_priv->edp_low_vswing/ to backport]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The driver's VDD on/off logic assumes that whenever the VDD is on we
also hold an AUX power domain reference. Since BIOS can leave the VDD on
during booting and resuming and on DDI platforms we won't take a
corresponding power reference, the above assumption won't hold on those
platforms and an eventual delayed VDD off work will do an extraneous AUX
power domain put resulting in a refcount underflow. Fix this the same
way we did this for non-DDI DP encoders:
commit 6d93c0c417 ("drm/i915: fix VDD state tracking after system
resume")
At the same time call the DP encoder suspend handler the same way as the
non-DDI DP encoders do to flush any pending VDD off work. Leaving the
work running may cause a HW access where we don't expect this (at a point
where power domains are suspended already).
While at it remove an unnecessary function call indirection.
This fixed for me AUX refcount underflow problems on BXT during
suspend/resume.
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460963062-13211-4-git-send-email-imre.deak@intel.com
(cherry picked from commit bf93ba67e9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.
This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.
v2:
- Add a code comment about the relation of this fix to the freeze/thaw
vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
and device enable calls. (Chris)
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460979954-14503-1-git-send-email-imre.deak@intel.com
(cherry picked from commit 44410cd0bf)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The legacy cursor ioctl expects to be asynchronous with respect to other
screen updates, in particular page flips. As X updates the cursor from a
signal context, if the cursor blocks then it will stall both the input
and output chains causing bad stuttering and horrible UX.
Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94980
Fixes: 5008e874ed ("drm/i915: Make wait_for_flips interruptible.")
Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1460922166-20292-1-git-send-email-chris@chris-wilson.co.uk
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit acf4e84d61)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Add support for CHV gpio programming in DSI gpio elements.
v2: Overhaul macros according to Ville's review.
v3: Address Ville's review:
- swap E and SE gpio ranges
- add a note about max SE index
- use GPO, not HIZ
- swap cfg0 and cfg1
v4: fix port for dsi sequence versions 1 and 2
[Rewritten by Jani, based on earlier work by Yogesh and Deepak.]
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohan.marimuthu@intel.com>
Signed-off-by: Deepak M <m.deepak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/bdaaf9915a5005305b31bb26cf619a5a82472f2a.1461666263.git.jani.nikula@intel.com
For dual link panel scenarios there are new fields added in the
VBT which indicate on which port the PWM cntrl and CABC ON/OFF
commands needs to be sent.
v2: Moving the comment to intel_dsi.h(Jani)
v3: Renaming the field names (Jani)
v4 by Jani: make this patch only about VBT
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Yetunde Adebisi <yetundex.adebisi@intel.com>
Signed-off-by: Deepak M <m.deepak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459346623-30752-2-git-send-email-jani.nikula@intel.com
This patch adds support for eDP backlight control using DPCD registers to
backlight hooks in intel_panel.
It checks for backlight control over AUX channel capability and sets up
function pointers to get and set the backlight brightness level if
supported.
v2: Moved backlight functions from intel_dp.c into a new file
intel_dp_aux_backlight.c. Also moved reading of eDP display control
registers to intel_dp_get_dpcd
v3: Correct some formatting mistakes
v4: Updated to use AUX backlight control if PWM control is not possible
(Jani)
v5: Moved call to initialize backlight registers to dp_aux_setup_backlight
v6: Check DP_EDP_BACKLIGHT_PIN_ENABLE_CAP is disabled before setting up AUX
backlight control. To fix BLM_PWM_ENABLE igt test warnings on bdw_ultra
v7: Add enable_dpcd_backlight module parameter.
v8: Rebase onto latest drm-intel-nightly branch
v9: Remove changes to intel_dp_dpcd_read_wake
Split addition edp_dpcd variable into a separate patch
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Yetunde Adebisi <yetundex.adebisi@intel.com>
[Jani: whitepace changes to appease checkpatch]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459865452-9138-4-git-send-email-yetundex.adebisi@intel.com
Can use the new vma->is_ggtt to simplify the check and
remove the local variables.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Can use vma->is_ggtt to simplify the check and also switch the
BUG_ON to GEM_BUG_ON which is more appropriate for this.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Only caller is i915_gem_obj_ggtt_size which only cares about
GGTT so simplify it and implement under that name.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Because having both i915_gem_object_alloc() and i915_gem_alloc_object()
(with different return conventions) is just too confusing!
(i915_gem_object_alloc() is the low-level memory allocator, and remains
unchanged, whereas i915_gem_alloc_object() is a constructor that ALSO
initialises the newly-allocated object.)
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461348872-4702-1-git-send-email-david.s.gordon@intel.com
Move the better constructs/comments from i915_gem_stolen.c to
early-quirks.c and increase readability in preparation of only
having one set of functions.
- intel_stolen_base -> gen3_stolen_base
- use phys_addr_t instead of u32 for address for future proofing
v2:
- Print the invalid register values (Chris)
(Omitting the register prefix as it's visible from backtrace.)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This patch applies a performance enhancement workaround
based on analysis of DX and OCL S-Curve workloads. We
increase the General Priority Credits for L3SQ from the
hardware default of 56 to the max value 62, and decrease
the High Priority credits from 8 to 2.
v2: Only apply to B0 onwards
v3: Move w/a to per engine init, ie bxt_init_workarounds
Signed-off-by: Tim Gore <tim.gore@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461314761-36854-1-git-send-email-tim.gore@intel.com
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
As a part of WaGsvDisableTurbo, Driver makes an early exit from the
Gen9 Turbo enabling function, so doesn't program the Turbo Control register.
But BIOS could leave the Hw Turbo as enabled, so need to explicitly clear
out the Control register just to avoid inconsitency with debugfs
interface, which will show Turbo as enabled only and that is not expected
after adding the WaGsvDisableTurbo. Apart from this there is no problem
even if the Turbo is left enabled in the Control register, as the Up/Down
interrupts would remain masked.
v2: Add explicit clearing of Turbo Control register to *_disable_rps()
also for the similar consistency (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461350146-23454-2-git-send-email-akash.goel@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
There are certain registers, which captures the time elapsed in the
in current Up/Down EI, for how long GT has been Idle/Busy/Avg in the
current Up/Down EI and also in the previous Up/Down EI.
These register values are reported by the i915_frequency_info debugfs
interface. The Driver prints the 'us' suffix after the values, albeit
they are actually in raw form & not in microsecond units.
This patch removes the 'us' suffix so that its clear to User that values
are indeed in raw form.
v2: Present the values in microseconds unit also, after platform
specific conversion (Chris)
v3: Add a space between raw & microsecond value (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461350146-23454-3-git-send-email-akash.goel@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Added a new GT_PM_INTERVAL_TO_US macro to perform the platform
specific conversion of PM time interval values to microseconds unit.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461350146-23454-1-git-send-email-akash.goel@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Somehow my SNB GT1 (Dell XPS 8300) gets very unhappy around
GPU hangs if the RPS EI/thresholds aren't suitably aligned.
It seems like scheduling/timer interupts stop working somehow
and things get stuck eg. in usleep_range().
I bisected the problem down to
commit 8a5864377b ("drm/i915/skl: Restructured the gen6_set_rps_thresholds function")
I observed that before all the values were at least multiples of 25,
but afterwards they are not. And rounding things up to the next multiple
of 25 does seem to help, so lets' do that. I also tried roundup(..., 5)
but that wasn't sufficient. Also I have no idea if we might need this sort of
thing on gen9+ as well.
These are the original EI/thresholds:
LOW_POWER
GEN6_RP_UP_EI 12500
GEN6_RP_UP_THRESHOLD 11800
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 21250
BETWEEN
GEN6_RP_UP_EI 10250
GEN6_RP_UP_THRESHOLD 9225
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 18750
HIGH_POWER
GEN6_RP_UP_EI 8000
GEN6_RP_UP_THRESHOLD 6800
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 15000
These are after 8a5864377b:
LOW_POWER
GEN6_RP_UP_EI 12500
GEN6_RP_UP_THRESHOLD 11875
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 21250
BETWEEN
GEN6_RP_UP_EI 10156
GEN6_RP_UP_THRESHOLD 9140
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 18750
HIGH_POWER
GEN6_RP_UP_EI 7812
GEN6_RP_UP_THRESHOLD 6640
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 15000
And these are what we have after this patch:
LOW_POWER
GEN6_RP_UP_EI 12500
GEN6_RP_UP_THRESHOLD 11875
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 21250
BETWEEN
GEN6_RP_UP_EI 10175
GEN6_RP_UP_THRESHOLD 9150
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 18750
HIGH_POWER
GEN6_RP_UP_EI 7825
GEN6_RP_UP_THRESHOLD 6650
GEN6_RP_DOWN_EI 25000
GEN6_RP_DOWN_THRESHOLD 15000
Cc: stable@vger.kernel.org
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/kms_pipe_crc_basic/hang-read-crc-pipe-B
Fixes: 8a5864377b ("drm/i915/skl: Restructured the gen6_set_rps_thresholds function")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461159836-9108-1-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
This patch does the following:
- Fakes live status of HDMI as connected (even if that's not).
While testing certain (monitor + cable) combinations with
various intel platforms, it seems that live status register
doesn't work reliably on some older devices. So limit the
live_status check for HDMI detection, only for platforms
from gen7 onwards.
V2: restrict faking live_status to certain platforms
V3: (Ville)
- keep the debug message for !live_status case
- fix indentation of comment
- remove "warning" from the debug message
(Jani)
- Change format of fix details in the commit message
Fixes: 237ed86c69 ("drm/i915: Check live status before reading edid")
Cc: stable@vger.kernel.org # v4.4
Suggested-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461237606-16491-1-git-send-email-shashank.sharma@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Since we've fixed up drm_dp_dpcd_read() to allow for retries when things
timeout, there's no use for having this function anymore. Good riddens.
Signed-off-by: Lyude <cpaul@redhat.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460559513-32280-5-git-send-email-cpaul@redhat.com
It's possible that BIOS enables PHY0, but it programmes only the first
channel on it. Since we program the PHYs only during driver loading this
is an incorrect configuration from the driver's point of view, since we
may use both channels eventually. Detect this scenario and force
reprogramming the PHY in this case.
The actual scenario for me was that the lane optimization for the second
channel in PHY0 was not setup by BIOS and so a state verification
warning was triggered. Everything else was setup properly.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461174366-16758-4-git-send-email-imre.deak@intel.com
If we skipped PHY0 initialization because it was already enabled by
BIOS, we still have to wait for the PHY1 GRC calibration as that is
done as part of the PHY0 init.
v2:
- Use the actual PHY index in the debug message in
broxton_phy_wait_grc_done() (Ville)
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461255561-1644-1-git-send-email-imre.deak@intel.com
It's possible that BIOS enables PHY1 only to read out the GRC value from
it to be used in PHY0 and then disables PHY1. In this case we can't use
the PHY1 GRC value for state verification, so use instead the one in PHY0
always.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461174366-16758-2-git-send-email-imre.deak@intel.com
Remove dev local and use to_i915() in gen8_ppgtt_notify_vgt.
v2: use dev_priv directly for QUESTION_MACROS (Joonas Lahtinen)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461323365-21256-1-git-send-email-matthew.auld@intel.com
Right after runtime resume we know that we can re-enable DC5, since we
just disabled DC9 and power well 2 is disabled. So enable DC5 explicitly
instead of delaying this until the next time we disable power well 2.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461173277-16090-5-git-send-email-imre.deak@intel.com
After suspend-to-ram or -disk we don't know what power state the display
HW will be, DC0 or DC9 are both possible states, so reset the software
DC state tracking in these cases. This gets rid of 'DC state mismatch'
error messages during resuming from ram or disk where we expected to be
in DC9 (as set by the suspend handler) but we are in DC0.
v2:
- Remove extra WS in gen9_sanitize_dc_state() (Bob)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461173277-16090-4-git-send-email-imre.deak@intel.com
Initially we thought that the platform specific suspend/resume sequences
can be shared between the runtime and system suspend/resume handlers.
This turned out to be not true, we have quite a few differences on most
of the platforms. This was realized already earlier by Paulo who
inlined the platform specific resume_prepare handlers. We have the
same problem with the corresponding suspend_complete handlers, there are
platform differences that make it unfeasible to share the code between
the runtime and system suspend paths. Also now we call functions that
need to be paired like hsw_enable_pc8()/hsw_disable_pc8() from different
levels of the call stack, which is confusing. Fix this by inlining the
suspend_complete handlers too.
This is also needed by the next patch that removes a redundant
uninit/init call during system suspend/resume on BXT.
No functional change.
CC: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
[s/uninline/inline in the commit message]
Link: http://patchwork.freedesktop.org/patch/msgid/1461173277-16090-2-git-send-email-imre.deak@intel.com
Avoids drivers knowing where the kref is stored.
[airlied: add kerneldoc]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- make modeset hw state checker atomic aware (Maarten)
- close races in gpu stuck detection/seqno reading (Chris)
- tons&tons of small improvements from Chris Wilson all over the gem code
- more dsi/bxt work from Ramalingam&Jani
- macro polish from Joonas
- guc fw loading fixes (Arun&Dave)
- vmap notifier (acked by Andrew) + i915 support by Chris Wilson
- create bottom half for execlist irq processing (Chris Wilson)
- vlv/chv pll cleanup (Ville)
- rework DP detection, especially sink detection (Shubhangi Shrivastava)
- make color manager support fully atomic (Maarten)
- avoid livelock on chv in execlist irq handler (Chris)
* tag 'drm-intel-next-2016-04-11' of git://anongit.freedesktop.org/drm-intel: (82 commits)
drm/i915: Update DRIVER_DATE to 20160411
drm/i915: Avoid allocating a vmap arena for a single page
drm,i915: Introduce drm_malloc_gfp()
drm/i915/shrinker: Restrict vmap purge to objects with vmaps
drm/i915: Refactor duplicate object vmap functions
drm/i915: Consolidate common error handling in intel_pin_and_map_ringbuffer_obj
drm/i915/dmabuf: Tighten struct_mutex for unmap_dma_buf
drm/i915: implement WaClearTdlStateAckDirtyBits
drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit
drm/i915: Rename hw state checker to hw state verifier.
drm/i915: Move modeset state verifier calls.
drm/i915: Make modeset state verifier take crtc as argument.
drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor
drm/i915: Use simplest form for flushing the single cacheline in the HWS
drm/i915: Harden detection of missed interrupts
drm/i915: Separate out the seqno-barrier from engine->get_seqno
drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
drm/i915: Fixup the free space logic in ring_prepare
drm/i915: Simplify check for idleness in hangcheck
drm/i915: Apply a mb between emitting the request and hangcheck
...
misc pull req all over. Biggest thing is the
drm_connector_(un)register_all cleanup from Alexey for drivers without the
load/unload midlayer hooks. I.e. all the new ones, and a bunch of the
pending new atomic drivers depend upon this. Or at least I asked them to
rebase ;-)
* tag 'topic/drm-misc-2016-04-21' of git://anongit.freedesktop.org/drm-intel:
drm: Make drm.debug parameter description more helpful
drm: Remove warning from drm_connector_unregister_all()
drm: probe_helper: Hide ugly ifdef
drm: rcar-du: Use generic drm_connector_register_all() helper
drm: atmel_hldc: Use generic drm_connector_register_all() helper
drm: Introduce drm_connector_register_all() helper
drm: fix lut value extraction function
drm/atomic-helper: Print an error if vblank wait times out
drm/dp/mst: Restore primary hub guid on resume
drm: Release driver references to handle before making it available again
drm/i915/dp/mst: Add source port info to debugfs output
drm/dp/mst: Enhance DP MST debugfs output
drm/edid: Add drm_edid_get_monitor_name()
include/drm: Reword debug categories comment.
drm/crtc_helper: Reset empty plane state in drm_helper_crtc_mode_set_base()
drm/virtio: Drop dummy gamma table support
drm/bochs: Drop fake gamma support
drm/core: Fix ordering in drm_mode_config_cleanup.
In commit 5f304c8736 ("drm/i915/kbl: Reset secondary power well requests
left on by DMC/KVMR") I forgot about the fact that SKL==KBL most of the
time and that a secondary MISC IO power well request left on by the DMC is
"expected". Tune down the corresponding WARN to be a debug message. This
was caught by CI suspend tests.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461060036-19043-1-git-send-email-imre.deak@intel.com
It was noticed on bug #94087 that module parameter
i915.edp_vswing=2 that should override the VBT setting
to use default voltage swing (400 mV) was not applied
for Broadwell.
This patch provides a fix for this by checking if default
i.e. higher voltage swing is requested to be used and
applies the DDI translations table for DP instead of eDP
(low vswing) table.
v2: Combine two if statements into one (Jani)
v3: Change dev_priv->edp_low_vswing to use dev_priv->vbt.edp.low_vswing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94087
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461155942-7749-1-git-send-email-mika.kahola@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The newly-introduced function i915_gem_object_pin_map() returns an
ERR_PTR (not NULL) if the pin-and-map opertaion fails, so that's what we
must check for. And it's nicer not to assign such a pointer-or-error to
a structure being filled in until after it's been validated, so we
should keep it local and avoid exporting a bogus pointer. Also, for
clarity and symmetry, we should clear 'virtual_start' along with 'vma'
when unmapping a ringbuffer.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Now that we keep the GuC client process descriptor permanently mapped,
we don't really need to keep a local copy of the GuC's work-queue-head.
So we can simplify the code a little by not doing this.
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Don't use kmap_atomic() for doorbell & process descriptor access.
This patch fixes the BUG shown below, where the thread could sleep
while holding a kmap_atomic mapping. In order not to need to call
kmap_atomic() in this code path, we now set up a permanent kernel
mapping of the shared doorbell and process-descriptor page, and
use that in all doorbell and process-descriptor related code.
BUG: scheduling while atomic: gem_close_race/1941/0x00000002
Modules linked in: hid_generic usbhid i915 asix usbnet libphy mii
i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt
sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid
hid video pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1
e1000e ptp psmouse pps_core ahci libahci
CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G U 4.4.0-160121+ #123
Hardware name: Intel Corporation Skylake Client platform/Skylake AIO
DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
0000000000013e40 ffff880166c27a78 ffffffff81280d02 ffff880172c13e40
ffff880166c27a88 ffffffff810c203a ffff880166c27ac8 ffffffff814ec808
ffff88016b7c6000 ffff880166c28000 00000000000f4240 0000000000000001
Call Trace:
[<ffffffff81280d02>] dump_stack+0x4b/0x79
[<ffffffff810c203a>] __schedule_bug+0x41/0x4f
[<ffffffff814ec808>] __schedule+0x5a8/0x690
[<ffffffff814ec927>] schedule+0x37/0x80
[<ffffffff814ef3fd>] schedule_hrtimeout_range_clock+0xad/0x130
[<ffffffff81090be0>] ? hrtimer_init+0x10/0x10
[<ffffffff814ef3f1>] ? schedule_hrtimeout_range_clock+0xa1/0x130
[<ffffffff814ef48e>] schedule_hrtimeout_range+0xe/0x10
[<ffffffff814eef9b>] usleep_range+0x3b/0x40
[<ffffffffa01ec109>] i915_guc_wq_check_space+0x119/0x210 [i915]
[<ffffffffa01da47c>] intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[<ffffffffa01cdbf1>] i915_gem_request_alloc+0x91/0x170 [i915]
[<ffffffffa01c1c07>] i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[<ffffffffa01cb785>] ? i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[<ffffffffa01d1fb6>] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[<ffffffffa01c2e68>] i915_gem_execbuffer2+0xa8/0x250 [i915]
[<ffffffffa00f65d8>] drm_ioctl+0x258/0x4f0 [drm]
[<ffffffffa01c2dc0>] ? i915_gem_execbuffer+0x340/0x340 [i915]
[<ffffffff8111590d>] do_vfs_ioctl+0x2cd/0x4a0
[<ffffffff8111eac2>] ? __fget+0x72/0xb0
[<ffffffff81115b1c>] SyS_ioctl+0x3c/0x70
[<ffffffff814effd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
------------[ cut here ]------------
v4:
Only tear down doorbell & kunmap() client object if we actually
succeeded in allocating a client object (Tvrtko Ursulin)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93847
Original-version-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvtrko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Since we can only swap out shmemfs objects, those are the only ones that
can influence the ability of the shrinker to free pages. Currently, all
non-shmemfs objects have a raised pages_pin_count to protect them from
the shrinker, so this just makes the logic for can_release_pages()
clearer (and safer in future so that we don't over estimate our ability
to free up pages from future non-swappable objects).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Inside the shrinker we call can_release_pages() to indicate whether or
not we can make forward progress in freeing up memory by unbinding that
object. When adding our report to oom, we should be using the same
logic.
Whilst here, change the reporting from bytes to pages so that it looks
smaller to the user!, is consistent with the neighbouring oom report
itself which displays counts in pages, and makes the unsigned long
overflow less likely.
v2: Split oversized format string into two lines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When iterating over the bound list, we expect all objects there to have
their pages pinned (by the bound VMA). So only report those objects with
additional pin count on their pages as "pinned". These should be those
objects used for display and hardware access.
Reported-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-1-git-send-email-chris@chris-wilson.co.uk
It's racy, creating mmap offsets is a slowpath, so better to remove it
to avoid drivers doing broken things.
The only user is i915, and it's ok there because everything (well
almost) is protected by dev->struct_mutex in i915-gem.
While at it add a note in the create_mmap_offset kerneldoc that
drivers must release it again. And then I also noticed that
drm_gem_object_release entirely lacks kerneldoc.
Cc: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459330852-27668-14-git-send-email-daniel.vetter@ffwll.ch
Looks like DPF was not implemented for gen8+ but the IER and IMR
are still enabled on initialization.
Since there is no code to handle this interrupt, gate the irq
enablement behind HAS_L3_DPF in case the feature gets enabled
in the future.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Since commit 30d9aa4265 ("drm/i915: Read sink_count dpcd always"),
the status of a DP connector depends on its sink count value.
However, some eDP panels don't set that value appropriately,
causing them to be reported as disconnected.
Fix this by ignoring sink count for eDP.
v2: Rephrased commit message. (Ander)
In case of eDP, returning status as connected if DPCD
read succeeds to avoid any further operations.
Fixes: 30d9aa4265 ("drm/i915: Read sink_count dpcd always")
Cc: Ander Conselvan De Oliveira <conselvan2@gmail.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460444034-22320-1-git-send-email-shubhangi.shrivastava@intel.com
In commit 7d23e3c37b ("drm/i915: Cleaning up intel_dp_hpd_pulse") some
much needed clean-up was done, but unfortunately part of the change
broke DP MST. The real issue was setting the connector state to
disconnected in the MST case, which is good, but the code then (after
a goto) checks if the connector state is not connected and shuts down
MST if this is the case, which is bad. With this change both SST and
MST seem to be happy.
v2: Add removed check further up in the function to be sure that MST
is shut down when we lose the link. (Ander)
Fixes: commit 7d23e3c37b ("drm/i915: Cleaning up intel_dp_hpd_pulse")
cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
cc: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
cc: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460394684-7036-1-git-send-email-jim.bride@linux.intel.com
Do not use magic numbers, do not prefix stuff with "PCI_", do not
declare registers in implementation files. Also move the PCI
registers under correct comment in i915_reg.h.
v2:
- Consistently use BSM (not BDSM or other variants from PRM) (Chris)
- Also include register address to help identify the register (Chris)
v3:
- Refer to register value as *_val instead of *_reg (Chris)
v4:
- Make style checker happy
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We need to kunmap pt_vaddr and not pt itself, otherwise we end up
mapping a bunch of pages without ever unmapping them.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: d1c54acd67 ("drm/i915/gtt: Introduce kmap|kunmap for dma page")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460476663-24890-4-git-send-email-matthew.auld@intel.com
The power cycle delay starts _after_ turning off the panel power. Do the
msleep after frobbing the pmic panel power gpio.
Also toss in a FIXME about optimizing away needless waits.
Cc: Shobhit Kumar <shobhit.kumar@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: fc45e82199 ("drm/i915: Use the CRC gpio for panel enable/disable")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460996271-29795-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Currently we're trying to define HSW/BDW power wells by what's not
included. Let's do it the other way around, so that you can actually
tell when the power well would get enabled. This will also allow us to
add new power domains without accidentally adding it to the HSW/BDW
display power domains.
The current set of domains looks rather buggy even:
- POWER_DOMAIN_MODESET is included in the display power well needlessly
- DDI-B to DDI-E were not part of the display power well when they
should be
So let's fix that up while at it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460977348-32260-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Currently we're using POWER_DOMAIN_MASK as the power domains for the
display power well on VLV/CHV. That includes all power domains even
though the disp2d/pipe-a power well is not needed for a lot of things.
Let's reduce these to what we actually need.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460977348-32260-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
While we disable runtime PM and with that display power well support if
the DMC firmware isn't loaded, we still want to disable power wells
during system suspend and driver unload. So drop/reacquire the
corresponding power refcount during suspend/resume and driver unloading.
This also means we have to check if DMC is not loaded and skip enabling
DC states in the power well code.
v2:
- Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
opencoding the former. (Chris)
- Add docbook comment to the public resume and suspend functions.
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460980101-14713-1-git-send-email-imre.deak@intel.com
The driver's VDD on/off logic assumes that whenever the VDD is on we
also hold an AUX power domain reference. Since BIOS can leave the VDD on
during booting and resuming and on DDI platforms we won't take a
corresponding power reference, the above assumption won't hold on those
platforms and an eventual delayed VDD off work will do an extraneous AUX
power domain put resulting in a refcount underflow. Fix this the same
way we did this for non-DDI DP encoders:
commit 6d93c0c417 ("drm/i915: fix VDD state tracking after system
resume")
At the same time call the DP encoder suspend handler the same way as the
non-DDI DP encoders do to flush any pending VDD off work. Leaving the
work running may cause a HW access where we don't expect this (at a point
where power domains are suspended already).
While at it remove an unnecessary function call indirection.
This fixed for me AUX refcount underflow problems on BXT during
suspend/resume.
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460963062-13211-4-git-send-email-imre.deak@intel.com
During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.
This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.
v2:
- Add a code comment about the relation of this fix to the freeze/thaw
vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
and device enable calls. (Chris)
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460979954-14503-1-git-send-email-imre.deak@intel.com
The workaround added in
commit c6782b76d3 ("drm/i915/gen9: Reset secondary power well
requests left on by DMC/KVMR")
needs to be applied on Kabylake too as shown by the corresponding
timeout errors about power well 1 and MISC IO power well disabling in
the latest CI run.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460748778-4484-1-git-send-email-imre.deak@intel.com
When a vblank wait times out in intel_atomic_wait_for_vblanks() we just
get a cryptic 'WARN_ON(!ret)' backtrace in dmesg. Repace it with
something that tells you what actually happened.
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460978973-24945-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The legacy cursor ioctl expects to be asynchronous with respect to other
screen updates, in particular page flips. As X updates the cursor from a
signal context, if the cursor blocks then it will stall both the input
and output chains causing bad stuttering and horrible UX.
Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94980
Fixes: 5008e874ed ("drm/i915: Make wait_for_flips interruptible.")
Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1460922166-20292-1-git-send-email-chris@chris-wilson.co.uk
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
For reasons unknown Sandybridge GT1 (at least) will eventually hang when
it encounters a ring wraparound at offset 0. The test case that
reproduces the bug reliably forces a large number of interrupted context
switches, thereby causing very frequent ring wraparounds, but there are
similar bug reports in the wild with the same symptoms, seqno writes
stop just before the wrap and the ringbuffer at address 0. It is also
timing crucial, but adding various delays hasn't helped pinpoint where
the window lies.
Whether the fault is restricted to the ringbuffer itself or the GTT
addressing is unclear, but moving the ringbuffer fixes all the hangs I
have been able to reproduce.
References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-12-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit a687a43a48)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We started to use PIPE_CONTROL to write render ring seqno in order to
combat seqno write vs interrupt generation problems. This was introduced
by commit 7c17d37737 ("drm/i915: Use ordered seqno write interrupt
generation on gen8+ execlists").
On gen8+ size of PIPE_CONTROL with Post Sync Operation should be
6 dwords. When we're using older 5-dword variant it's possible to
observe inconsistent values written by PIPE_CONTROL with Post
Sync Operation from user batches, resulting in rendering corruptions.
v2: Fix BAT failures
v3: Comments on alignment and thrashing high dword of seqno (Chris)
v4: Updated commit msg (Mika)
Testcase: igt/gem_pipe_control_store_loop/*-qword-write
Issue: VIZ-7393
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460469115-26002-1-git-send-email-michal.winiarski@intel.com
(cherry picked from commit ce81a65c79)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Experiments with heaven 4.0 benchmark and skylake gt3e (rev 0xa)
suggest that WaForceContextSaveRestoreNonCoherent is needed for all
revs. Extending this to all revs cures a gpu hang with rev 0xa when
running heaven4.0 gpu benchmark.
We have been here before, with problems enabling gt4e and extending
up to revision F0 instead of false claims of bspec of E0 only. See
commit <e238659ddd88> ("drm/i915/skl: Default to noncoherent access
up to F0"). In retrospect we should have covered this with this big
blanket back then already, as E0 vs F0 discrepancy was suspicious
enough.
Previously the WaForceEnableNonCoherent has been tied to
context non-coherence, atleast in relevant hsds. So keep this tie
and extended this alongside.
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: stable@vger.kernel.org
Reported-by: Mike Lothian <mike@fireburn.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=93491
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-2-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 97ea6be161)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
For all gt3 and gt4 skylake variants, extend the usage of
WaRsDisableCoarsePowerGating for all revisions. Without this
gt3 and gt4 skylakes up to atleast rev 0xa can gpu hang or
system hang.
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: <stable@vger.kernel.org>
Reported-by: Mikael Djurfeldt <mikael@djurfeldt.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94161
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 185c66e57c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Holding a reference to the containing task_struct is not sufficient to
prevent the mm_struct from being reaped under memory pressure. If this
happens whilst we are calling get_user_pages(), explosions erupt -
sometimes an immediate GPF, sometimes page flag corruption. To prevent
the target mm from being reaped as we are reading from it, acquire a
reference before we begin.
Testcase: igt/gem_shrink/*userptr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-2-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit 40313f0cd0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Currently for the case where there is enough space at the end of Ring
buffer for accommodating only the base request, the wrapround is done
immediately and as a result the base request gets added at the start
of Ring buffer. But there may not be enough free space at the beginning
to accommodate the base request, as before the wraparound, the wait was
effectively done for the reserved_size free space from the start of
Ring buffer. In such a case there is a potential of Ring buffer overflow,
the instructions at the head of Ring (ACTHD) can get overwritten.
Since the base request can fit in the remaining space, there is no need
to wraparound immediately. The wraparound will anyway happen later when
the reserved part starts getting used.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1457688402-10411-1-git-send-email-akash.goel@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
(cherry picked from commit 782f6bc0ab)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Due to "some hardware limitation" the DPI enable bit in port C control
register does not get set on VLV. As a workaround we check the status in
pipe B conf register instead. The workaround was added in
commit c0beefd29f
Author: Gaurav K Singh <gaurav.k.singh@intel.com>
Date: Tue Dec 9 10:59:20 2014 +0530
drm/i915: Software workaround for getting the HW status of DSI Port C on BYT
Empirical evidence (on Surface 3 with DSI on port C per VBT) shows that
this is the case also on CHV, so extend the workaround to CHV. We still
have the device ready register check in place, so this should not get
confused with e.g. HDMI on pipe B.
This fixes a number of state checker warnings on CHV DSI port C.
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460724451-13810-1-git-send-email-jani.nikula@intel.com
Show a total and purgeable number of pin mapped objects
and their total and purgeable size.
Example output (new stat prefixed with a star):
# cat i915_gem_objects
19920 objects, 289243136 bytes
19920 [18466] objects, 288714752 [267911168] bytes in gtt
0 [0] active objects, 0 [0] bytes
19917 [18466] inactive objects, 288714752 [267911168] bytes
0 unbound objects, 0 bytes
0 purgeable objects, 0 bytes
1 pinned mappable objects, 3145728 bytes
0 fault mappable objects, 0 bytes
* 19914 [0] pin mapped objects, 285560832 [0] bytes [purgeable]
4294967296 [268435456] gtt total
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460716493-27826-1-git-send-email-tvrtko.ursulin@linux.intel.com
Reflect the status of obj->mapping as added with the
i915_gem_object_pin_map API.
'M' was chosen to designate the pin mapped status.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
We don't have a LVDS_BORDER_ENABLE type of bit for either eDP or DSI,
and just trying to frob the display timings to include borders results
in a corrupted picture. So reject the 'Center' scaling mode on GMCH
platforms for eDP and DSI.
TODO: Should really filter out the unsupported modes from the prop,
but that would be fairly invasive since the prop is now created and
stored by drm core. So leave it for a rainy day.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-6-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Jani Nikula <jani.nikula@intel.com>
Add the scaling mode property to DSI connectors, handle changes in the
property value, and compute the panel fitter state during
.compute_config().
v2: Handle BXT as well
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-5-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Jani Nikula <jani.nikula@intel.com>
Compute the DSI PLL parameters during .compute_config() rather than
.pre_pll_enable() so that we can fail gracefully if we can't find
suitable parameters.
In order to do that we need to store the DSI PLL parameters in
pipe_config.
v2: Handle BXT too
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-3-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Jani Nikula <jani.nikula@intel.com>
Set up DPLL and DPLL_MD even when driving DSI output on VLV/CHV. While
the DPLL isn't used to provide the clock we still need the refclock, and
it appears that the pixel repeat factor also has an effect on DSI
output. So set up eveyrhing in DPLL and DPLL_MD as we would do for
DP/HDMI/VGA, but don't actually enable the DPLL or configure the
dividers via DPIO.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Jani Nikula <jani.nikula@intel.com>
This patch is to correct one thing in this commit:
commit 25a5670533
Author: Dongwon Kim <dongwon.kim@intel.com>
Date: Wed Mar 16 18:06:13 2016 -0700
drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit
This reversed bit polarity is actually common
for all BXT and APL SoCs. Therefore, revision checking
in the original commit should be removed to make
the bit set regardless of revision ID of GFX block.
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460673463-14453-1-git-send-email-dongwon.kim@intel.com
Modify the debugfs output for i915_dp_mst_info to list the source port for
the DP MST topology in question.
v2: rebase
v3: rebase
v4: rebase
cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460654317-31288-3-git-send-email-jim.bride@linux.intel.com
I caught a few errors in our current PHY/CDCLK programming by sanity
checking the actual programmed state, so I thought it would be also
useful for the future. In addition to verifying the state after
programming it also verify it after exiting DC5, to make sure DMC
restored/kept intact everything related.
v2:
- Inlining __phy_reg_verify_state() doesn't make sense and also
incorrect, so don't do it (PW/CI gcc)
v3:
- Rebase on latest -nightly
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459780030-15781-1-git-send-email-imre.deak@intel.com
If BIOS has already programmed and enabled a PHY, don't reprogram it as
that may interfere with the currently active outputs. A follow-up patch
will add state verification, so we can catch any misconfiguration on
BIOS's behalf.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-14-git-send-email-imre.deak@intel.com
When determining whether CDCLK is enabled by BIOS and so we should skip
reprogramming it, we didn't check the related DBUF power request and
state. In theory BIOS could enable one without the other so check for
this case and reprogram things if something is amiss.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-13-git-send-email-imre.deak@intel.com
Power well 1 is managed by the DMC firmware so don't toggle it on-demand
from the driver. This means we need to follow the BSpec display
initialization sequence during driver loading and resuming (both system
and runtime) and enable power well 1 only once there. Afterwards DMC
will toggle power well 1 whenever entering/exiting DC5.
For this to work we also need to do away getting the PLL power domain,
since that just kept runtime PM disabled for good.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-12-git-send-email-imre.deak@intel.com
The power-down step logically belongs to the individual PHY uninit
sequence so move it there. The only functional change is that we will
power down now PHY 1 separately before PHY 0 and preserve the other bits
in the register which are defined as reserved.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-11-git-send-email-imre.deak@intel.com
For internal APIs passing dev_priv is preferred to reduce indirections,
so convert over a few DDI PHY, CDCLK helpers.
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-10-git-send-email-imre.deak@intel.com
On Broxton we need to enable/disable power well 1 during the init/unit
display sequence similarly to Skylake/Kabylake. The code for this will
be added in a follow-up patch, but to prepare for that unexport
skl_pw1_misc_io_init(). It's a simple function called only from a single
place and having it inlined in the Skylake display core init/unit
functions will make it easier to compare it with its Broxton
counterpart.
This also flips the order of Misc IO and power well 1 disabling which
matches the enabling order. The specification doesn't prescribe the
disabling order, so this should be fine.
v2:
- Fix incorrect enable vs. disable power well call in
skl_display_core_uninit() (Patrik)
- Add commit comment about chaning the order of PW1 and Misc IO power
well disabling (Patrik)
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459773777-10701-1-git-send-email-imre.deak@intel.com
On SKL/KBL suspend-to-idle (aka freeze/s0ix) is performed with DMC
firmware assistance where the target display power state is DC6. On
Broxton on the other hand we don't use the firmware for this, but rely
instead on a manual DC9 flow. For this we have to uninitialize the
display following the BSpec display uninit sequence, just as during
S3/S4, so make sure we follow this sequence.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-8-git-send-email-imre.deak@intel.com
The display power well support and DC state management doesn't depend on
runtime PM support, so remove the incorrect asserts about this.
Also Broxton does support DC5, so the related assert in
assert_can_enable_dc5() is incorrect. There is a more generic and
correct assert for this already in gen9_set_dc_state(), so we can remove
all the other ones.
At the same time convert WARNs to WARN_ONCE for consistency with the
other DC state asserts.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-7-git-send-email-imre.deak@intel.com
So far we only power well enabling was synchronous not disabling. Since
we don't exactly know how the firmware (both DMC and PCU) synchronizes
against the actual power well state during DC transitions, make the
disabling also synchronous.
CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-6-git-send-email-imre.deak@intel.com
DMC forces on power well 1 and the misc IO power well by setting the
corresponding request bits both in the BIOS and the DEBUG power well
request registers. This is somewhat unexpected since the firmware should
really just save and restore state but not alter it. We also depend on
being able to disable power well 1, and the misc IO power well before
entering S3/S4 on BXT and SKL or entering DC9 on BXT. To fix this make
sure these request bits are cleared whenever we want to disable the
given power wells.
On SKL there is another twist where the firmware also clears the power
well 1 request bit in HSW_POWER_WELL_DRIVER (but not that of the misc IO
power well). This happens to not cause a problem due to the forced-on
request bits in the other request registers.
I've filed a bug about all this, but fixing that may take a while and
having this sanity check in place makes sense even for future firmware
versions.
At the same time also check the KVMR request bits. I haven't seen this
being altered, but we don't expect any request bits in here either, so
sanitize this register as well.
v2:
- Apply the workaround on SKL as well. I noticed the related failure
from the CI report, later Patrik also reported seeing it on his
machine.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459851965-6137-1-git-send-email-imre.deak@intel.com
This register is read-only, so we have never actually set
OCL2_LDOFUSE_PWR_DIS in it as specified by the specification. Add a code
comment about this. I filed a specification update request to clarify
this there.
CC: Arthur J Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-4-git-send-email-imre.deak@intel.com
This has been corrected in BSpec quite some time ago, but we missed it
somehow. The wrong field definitions resulted in configuring PHY0 with
an incorrect GRC value.
v2:
- Remove the FIXME comment, we left in the code exactly about this
issue. (Ville)
CC: Arthur J Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-3-git-send-email-imre.deak@intel.com
DMC version 1.06 has a known bug, where the firmware polls forever for a
port PLL to lock, if the PLL was disabled when entering DC5, which locks
up the machine. Version 1.07 fixes this, so make that the minimum
required version on BXT.
CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-2-git-send-email-imre.deak@intel.com
Split the VLV/CHV hoplug irq handling to ack and handler phases. This
way we can move the actual irq handling outside the section where
we have disabled the interrupt sources.
For now, we leave things as is for pre-VLV GMCH platforms, but
eventually they could get the same treatment.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On VLV/CHV the master interrupt enable bit only affects GT/PM
interrupts. Display interrupts are not affected by the master
irq control.
Also it seems that the CPU interrupt will only be generated when
the combined result of all GT/PM/display interrupts has a 0->1
edge. We already use the master interrupt enable bit to make sure
GT/PM interrupt can generate such an edge if we don't end up clearing
all IIR bits. We must do the same for display interrupts, and for
that we can simply clear out VLV_IER, and restore after we've acked
all the interrupts we are about to process.
So with both master interrupt enable and VLV_IER cleared out, we will
guarantee that there will be a 0->1 edge if any IIR bits are still set
at the end, and thus another CPU interrupt will be generated.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 579de73b04 ("drm/i915: Exit cherryview_irq_handler() after one pass")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On VLV/CHV VLV_IIR is not double double buffered, and it doesn't detect
edges from PIPESTAT & co. like it does on gen4. Instead it just
directly latches the level from PIPESTAT & co. That means we must clear
VLV_IIR after PIPESTAT & co. or else we'll get a spurious bit in VLV_IIR
every single time.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Use GEN8_MASTER_IRQ_CONTROL instead of DE_MASTER_IRQ_CONTROL or
MASTER_INTERRUPT_ENABLE with the GEN8_MASTER_IRQ register. They're
all bit 31 so there's no actual bug here, but let's be consistent
which name we use for the bit.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On CHV GTFIFODBG has some read-only bits to indicate the number
of free FIFO entries. Ignore these when checking to see if any
of the sticky error bits are set.
This gets rid of these during device resume:
[drm:cherryview_enable_rps] GT fifo had a previous error 1080000
While at it, move the assignments out of the if().
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460570970-14073-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Allow for the MOCS to be programmed for all engines.
Currently we program the MOCS when the first render batch
goes through. This works on most platforms but fails on
platforms that do not run a render batch early,
i.e. headless servers. The patch now programs all initialised engines
on init and the RCS is programmed again within the initial batch. This
is done for predictable consistency with regards to the hardware
context.
Hardware context loading sets the values of the MOCS for RCS
and L3CC. Programming them from within the batch makes sure that
the render context is valid, no matter what the previous state of
the saved-context was.
v2: posted correct version to the mailing list.
v3: moved programming to within engine->init_hw() (Chris Wilson)
v4: code formatting and white-space changes. (Chris Wilson)
Testcase: igt/gem_mocs_settings
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460556205-6644-1-git-send-email-chris@chris-wilson.co.uk
Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.
The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.
All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.
A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-16-git-send-email-chris@chris-wilson.co.uk
After mi_set_context() succeeds, we need to update the state of the
engine's last_context. This ensures that we hold a pin on the context
whilst the hardware may write to it. However, since we didn't complete
the post-switch setup of the context, we need to force the subsequent
use of the same context to complete the setup (which means updating
should_skip_switch()).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-15-git-send-email-chris@chris-wilson.co.uk
Having the !RCS legacy context switch threaded through the RCS switching
code makes it much harder to follow and understand. In the next patch, I
want to fix a bug handling the incomplete switch, this is made much
simpler if we segregate the two paths now.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-14-git-send-email-chris@chris-wilson.co.uk
As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses from userspace
and guarantees serialisation of the code against concurrent access just
by calling i915_gem_release_mmap().
v2: Reduce the mb() into a wmb() following the revoke.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-13-git-send-email-chris@chris-wilson.co.uk
For reasons unknown Sandybridge GT1 (at least) will eventually hang when
it encounters a ring wraparound at offset 0. The test case that
reproduces the bug reliably forces a large number of interrupted context
switches, thereby causing very frequent ring wraparounds, but there are
similar bug reports in the wild with the same symptoms, seqno writes
stop just before the wrap and the ringbuffer at address 0. It is also
timing crucial, but adding various delays hasn't helped pinpoint where
the window lies.
Whether the fault is restricted to the ringbuffer itself or the GTT
addressing is unclear, but moving the ringbuffer fixes all the hangs I
have been able to reproduce.
References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-12-git-send-email-chris@chris-wilson.co.uk
Two concurrent writes into the same register cacheline has the chance of
killing the machine on Ivybridge and other gen7. This includes LRI
emitted from the command parser. The MI_SET_CONTEXT itself serves as
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault. However, if a second switch-context
immediately occurs then we may have two adjacent blocks of LRI to the
same registers which may then trigger the hang. To counteract this we
need to insert a delay after the second register write using SRM.
This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. No reports from the wild though, so it must be of low
enough frequency that no one has correlated the random machine freezes
with i915.ko
The issue was introduced with
commit 2c55018347 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 16 10:02:27 2014 +0000
drm/i915: Disable PSMI sleep messages on all rings around context switches
Testcase: igt/gem_ctx_switch/render-interruptible #ivb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-11-git-send-email-chris@chris-wilson.co.uk
If we do not have lowlevel support for reseting the GPU, or if the user
has explicitly disabled reseting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expected failure as a DEBUG message.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-10-git-send-email-chris@chris-wilson.co.uk
Reporting -EIO from i915_wait_request() has proven very troublematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.
If the we reset the GPU or have detected a hang and wish to reset the
GPU, the request is forcibly complete and the wait broken. Currently, we
report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller, which in all but one
instance it is safe - completely eliminating the source of all spurious
-EIO.
Of note, we only have two API entry points where we expect that
userspace can observe an EIO. First is when submitting an execbuf, if
the GPU is terminally wedged, then the operation cannot succeed and an
-EIO is reported. Secondly, existing userspace uses the throttle ioctl
to detect an already wedged GPU before starting using HW acceleration
(or to confirm that the GPU is wedged after an error condition). So if
the GPU is wedged when the user calls throttle, also report -EIO.
v2: Split more carefully the change to i915_wait_request() and assorted
ABI from the reset handling.
v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code
so that we don't start to leak EIO there in future (and break our hang
resistant modesetting).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
Now that the reset_counter is stored on the request, we can rearrange
the code to handle reading the counter versus waiting during the atomic
modesetting for readibility (by deleting the hairiest of codes).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-8-git-send-email-chris@chris-wilson.co.uk
As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to transfer those waits to a central dispatcher for all
waiters and all requests.
PS: With per-engine resets, we obviously cannot assume a global reset
epoch for the requests - a per-engine epoch makes the most sense. The
challenge then is how to handle checking in the waiter for when to break
the wait, as the fine-grained reset may also want to requeue the
request (i.e. the assumption that just because the epoch changes the
request is completed may be broken - or we just avoid breaking that
assumption with the fine-grained resets).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-7-git-send-email-chris@chris-wilson.co.uk
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by incrementing the reset_counter thereby incrementing the gobal
reset epoch). By clearing that flag when the recovery task holds the
struct_mutex, we can forgo a second flag that simply tells GEM to ignore
the "reset-in-progress" flag.
The second flag we store in the reset_counter is whether the
reset failed and we consider the GPU terminally wedged. Whilst this flag
is set, all access to the GPU (at least through GEM rather than direct mmio
access) is verboten.
PS: Fun is in store, as in the future we want to move from a global
reset epoch to a per-engine reset engine with request recovery.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-6-git-send-email-chris@chris-wilson.co.uk
If we, when we store the reset_counter for the operation, we ensure that
it is not in a wedged or in the middle of a reset, we can then assert that
if any reset occurs the reset_counter must change. Later we can just
compare the operation's reset epoch against the current counter to see
if we need to abort the operation (to handle the hang).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-5-git-send-email-chris@chris-wilson.co.uk
This is principally a little bit of syntatic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do multiple tests rather than risk the value changing between tests.
v2: Be more strict on converting existing i915_reset_in_progress() over to
the more verbose i915_reset_in_progress_or_wedged().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-4-git-send-email-chris@chris-wilson.co.uk
Currently there is a #define to enable extra BUG_ON for debugging
requests and associated activities. I want to expand its use to cover
all of GEM internals (so that we can saturate the code with asserts).
We can add a Kconfig option to make it easier to enable - with the usual
caveats of not enabling unless explicitly requested.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-3-git-send-email-chris@chris-wilson.co.uk
Separate out the layers of includes (linux, drm, intel, i915) so that it
is a little easier to order our definitions between our multiple
reentrant headers. A couple of headers needed fixes to make them more
standalone (forgotten includes, forward declarations etc).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-2-git-send-email-chris@chris-wilson.co.uk
Our driver compiles clean (nowadays thanks to 0day) but for me, at least,
it would be beneficial if the compiler threw an error rather than a
warning when it found a piece of suspect code. (I use this to
compile-check patch series and want to break on the first compiler error
in order to fix the patch.)
v2: Kick off a new "Debugging" submenu for i915.ko
At this point, we applied it to the kernel and promptly kicked it out
again as it broke buildbots (due to a compiler warning on 32bits):
commit 908d759b21
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue May 26 07:46:21 2015 +0200
Revert "drm/i915: Force clean compilation with -Werror"
v3: Avoid enabling -Werror for allyesconfig/allmodconfig builds, using
COMPILE_TEST as a suitable proxy suggested by Andrew Morton. (Damien)
Only make the option available for EXPERT to reinforce that the option
should not be casually enabled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
With gen9+ the edram capabilities are defined so
that we can calculate the edram (ellc) size accordingly.
Note that there are undefined combinations for some subset of
edram capability bits. Return the closest size for undefined indexes.
Even if we get it wrong with beginning of future gen enabling, the size
information is currently only used for boot message and in debugfs entry.
v2: Use function instead of hard to read macro (Daniel)
v3: s/INTEL_INFO/INTEL_GEN (Matthew)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460557604-7126-2-git-send-email-mika.kuoppala@intel.com
Store the edram capabilities instead of only the size of
edram. This is preparatory patch to allow edram size calculation
based on edram capability bits for gen9+. With gen9 the
edram is behind llc and is a separate entity. With hsw/bdw
it was more of a victim cache for LLC so the name 'eLLC' might
be warranted. Regardless, rename all mentions of eLLC to EDRAM to
clear the confusion.
v2: return bytes for edram size (Chris)
s/eLLC/eDRAM in output if we are gen > 8
v3: rebase, INTEL_GEN (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
For gen9 onwards, eDRAM is a true memory side cache. So
there is no need to program idi hash mask as it is for eLLC
only.
v2: INTEL_GEN (Chris), s/has/hash (Matthew)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
We started to use PIPE_CONTROL to write render ring seqno in order to
combat seqno write vs interrupt generation problems. This was introduced
by commit 7c17d37737 ("drm/i915: Use ordered seqno write interrupt
generation on gen8+ execlists").
On gen8+ size of PIPE_CONTROL with Post Sync Operation should be
6 dwords. When we're using older 5-dword variant it's possible to
observe inconsistent values written by PIPE_CONTROL with Post
Sync Operation from user batches, resulting in rendering corruptions.
v2: Fix BAT failures
v3: Comments on alignment and thrashing high dword of seqno (Chris)
v4: Updated commit msg (Mika)
Testcase: igt/gem_pipe_control_store_loop/*-qword-write
Issue: VIZ-7393
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460469115-26002-1-git-send-email-michal.winiarski@intel.com
Experiments with heaven 4.0 benchmark and skylake gt3e (rev 0xa)
suggest that WaForceContextSaveRestoreNonCoherent is needed for all
revs. Extending this to all revs cures a gpu hang with rev 0xa when
running heaven4.0 gpu benchmark.
We have been here before, with problems enabling gt4e and extending
up to revision F0 instead of false claims of bspec of E0 only. See
commit <e238659ddd88> ("drm/i915/skl: Default to noncoherent access
up to F0"). In retrospect we should have covered this with this big
blanket back then already, as E0 vs F0 discrepancy was suspicious
enough.
Previously the WaForceEnableNonCoherent has been tied to
context non-coherence, atleast in relevant hsds. So keep this tie
and extended this alongside.
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: stable@vger.kernel.org
Reported-by: Mike Lothian <mike@fireburn.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=93491
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-2-git-send-email-mika.kuoppala@intel.com
For all gt3 and gt4 skylake variants, extend the usage of
WaRsDisableCoarsePowerGating for all revisions. Without this
gt3 and gt4 skylakes up to atleast rev 0xa can gpu hang or
system hang.
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: <stable@vger.kernel.org>
Reported-by: Mikael Djurfeldt <mikael@djurfeldt.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94161
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-1-git-send-email-mika.kuoppala@intel.com
We can use the new pin/lazy unpin API for simplicity
and more performance in the execlist submission paths.
v2:
* Fix error handling and convert more users.
* Compact some names for readability.
v3:
* intel_lr_context_free was not unpinning.
* Special case for GPU reset which otherwise unbalances
the HWS object pages pin count by running the engine
initialization only (not destructors).
v4:
* Rebased on top of hws setup/init split.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460472042-1998-1-git-send-email-tvrtko.ursulin@linux.intel.com
[tursulin: renames: s/hwd/hws/, s/obj_addr/vaddr/]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Split the hardware status page into setup and initialisation,
where setup means setting up the driver state to support the
engine, and initialization means programming the hardware
with the before set up state.
This way the design matches the design of the engine setup/init
code which is split in the same fashion and it enables the
stages to be used in a balanced fashion (engine setup - hws
setup, engine init - hws init).
This will enable the upcoming improvements to slot in without
any kludges on the GPU reset path.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Check whether the DPLL is even enabled before readoing out the dividers
and trying to derive port_clock on CHV. We already did this on VLV.
Also remove the comment "MIPI" comment from the VLV code since we call
this function whenever the pipe is enabled.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
On VLV at least, the BIOS may leave the DSI PLL enabled in some wonky
state where it just refuses to lock. Simply disabling the PLL before
reconfiguring it is not enough to fix it, but power gating the PLL
prior to reconfiguring does work.
This happens on BYT FFRD8 when booting with HDMI connected so the DSI
display will not be lit up by the BIOS.
Also we can remove the code for BXT that disables the PLL before
enabling it again.
v2: s/vlv/intel/ since BXT made thing generic
v3: Remove the BXT disable PLL before enable trick
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-11-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
The registers frobbed by vlv_init_display_clock_gating() libve inside
the disp2d power well, so frobbing them while the power well is down
results in unclaimed register access warning (and of course the values
won't stick). Let's do this setup after we know the power well is
enabled.
It's also worth noting that DSPCLK_GATE_D and CBR1_VLV lose their state
when the power well goes down, but fortunately the values we've been
writing are actually the reset defaults.
MI_ARB_VLV actually retains its value even if the power well was turned
off, we just can't access it while the power well is down.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
For a bit of extra paranoia make sure the display irqs are all cleared
before we enabled them when turning on the power well. This should
really be the case already since the power well was off which resets
everything.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
During runtime PM we'll be reinitializing interrupt support from the
ground up. However since the display power well will be off at that
time, well end up with a ton of unclaimed register accesses from the
display irq setup. Since we turned off the power well already before
runtime suspend, we've flagged display irqs as disabled during runtime
PM transitions. So we can just check that flag to see if we should do
skip display irqs during irq setup.
During driver load display irqs will be flagged as enabled since we've
turned on the power well already, however the power well code will have
skipped the display irq setup since irq support as a whole wasn't yet
enabled when the power well was enabled. So we'll want to do the display
irq setup in that case.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The vlv/chv display irq setup was a bit of mess after I ran out of steam
when working on it last. Fix it up so that we just have a _reset() and
_postinstall() hooks for the display irqs, and use those consistently.
v2: Clear out pipestat_irq_mask[] and PIPE_FIFO_UNDERRUN_STATUS in
vlv_display_irq_reset() (Imre)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476574-1921-1-git-send-email-ville.syrjala@linux.intel.com
The underruns we were seeing when enabling eDP port A on ILK seem to
have been caused by prematurely clearing the LP1+ watermark values when
disabling LP1+ watermarks. Now that the watermarks are handled
properly, we can rip out the underrun suppression around the port A
enable.
We still need to worry about the underruns on FDI when enabling
the eDP PLL. But as Bspec tells us, we can avoid that by a vblank
wait on the pipe driving FDI just prior to enabling the eDP PLL.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Once again ILK is unhappy if we clear out the LP1+ watermark levels
outright, and instead we must disable the levels we don't want while
still leaving the actual programmed watermark levels intact.
Fixes underruns on the already enabled pipe when programming watermarks
while enabling the second pipe.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matt Roper <matthew.d.roper@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93787
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Take a bigger hammer to the underrun suppression on ILK. Instead of
trying to suppress them at specific points in the modeset sequence just
silence them across the entire sequence. This gets rid of some underruns
at least on my ILK. Note that this changes SNB and IVB to follow the
same approach just to keep the code less convoluted. The difference is
that on those platforms we won't suppress CPU underruns for port A since
it doesn't seem to be necessary.
My ILK has port A eDP and two PCH HDMI ports, so I can't be sure this is
as effective on other PCH port types. Perhaps we still need some of
Daniel's extra vblank waits [2]?
I've still been able to trigger an underrun on the other pipe, but
fixing that perhaps needs the LP1+ disable trick I implemented here [1]
which never got merged.
A few details which hamper stress testing on my ILK are that sometimes
the PCH transcoder gets messed up and refuses to shut down, and sometimes
even the panel power sequencer apparently gets stuck on the always on
position.
[1] https://lists.freedesktop.org/archives/intel-gfx/2014-March/041317.html
[2] https://lists.freedesktop.org/archives/intel-gfx/2016-January/086397.html
v2: Add a note that we also get underruns when enabling PCH ports
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Rather than blindly waking up all forcewake domains on command
submission, we can teach each engine what is (or are) the correct
one to take.
On platforms with multiple forcewake domains like VLV, CHV, SKL
and BXT, this has the potential of lowering the GPU and CPU
power use and submission latency.
To implement it we add a function named
intel_uncore_forcewake_for_reg whose purpose is to query which
forcewake domains need to be taken to read or write a specific
register with raw mmio accessors.
These enables the execlists engine setup to query which
forcewake domains are relevant per engine on the currently
running platform.
v2:
* Kerneldoc.
* Split from intel_uncore.c macro extraction, WARN_ON,
no warns on old platforms. (Chris Wilson)
v3:
* Single domain per engine, mention all registers,
bi-directional function and a new name, fix handling
of gen6 and gen7 writes. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com
Chris Wilson points out that we can remove them from the array
since they are always written to with raw accessors.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Knowledge of which register per platform belonds in which
forcewake domain was embedded in the MMIO accessors themselves.
Extract it into standalone macros so they can be used from
new code in the following patches.
This causes GCC to compile some of the MMIO accessors slightly
differently and grows the code a tiny amount. But none of the
growth is on the fast-path so it does not matter hugely.
Affected sizes before:
00000000000026f0 00000000000001a5 t gen6_read16
0000000000002390 00000000000001a5 t gen6_read32
00000000000028a0 00000000000001a5 t gen6_read64
00000000000061d0 000000000000019e t gen8_write16
0000000000006510 000000000000019d t gen8_write32
0000000000006370 000000000000019d t gen8_write64
00000000000021f0 000000000000019d t gen8_write8
Affected sizes after:
0000000000002840 00000000000001aa t gen6_read16
00000000000024e0 00000000000001a9 t gen6_read32
00000000000029f0 00000000000001a9 t gen6_read64
0000000000004f20 00000000000001b5 t gen8_write16
0000000000004ba0 00000000000001b4 t gen8_write32
00000000000050e0 00000000000001b4 t gen8_write64
0000000000004d60 00000000000001b4 t gen8_write8
Other MMIO accessors are not affected in size.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
As the vast majority of users do not use the domain id variable,
we can eliminate it from the iterator and also change the latter
using the same principle as was recently done for for_each_engine.
For a couple of callers which do need the domain mask, store it
in the domain array (which already has the domain id), then both
can be retrieved thence.
Result is clearer code and smaller generated binary, especially
in the tight fw get/put loops. Also, relationship between domain
id and mask is no longer assumed in the macro.
v2: Improve grammar in the commit message and rename the
iterator to for_each_fw_domain_masked for consistency.
(Dave Gordon)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Because it is based on jiffies, current implementation releases the
forcewake at any time between straight away and between 1ms and 10ms,
depending on the kernel configuration (CONFIG_HZ).
This is probably not what has been desired, since the dynamics of keeping
parts of the GPU awake should not be correlated with this kernel
configuration parameter.
Change the auto-release mechanism to use hrtimers and set the timeout to
1ms with a 1ms of slack. This should make the GPU power consistent
across kernel configs, and timer slack should enable some timer coalescing
where multiple force-wake domains exist, or with unrelated timers.
For GlBench/T-Rex this decreases the number of forcewake releases from
~480 to ~300 per second, and for a heavy combined OGL/OCL test from
~670 to ~360 (HZ=1000 kernel).
Even though this reduction can be attributed to the average release period
extending from 0-1ms to 1-2ms, as discussed above, it will make the
forcewake timeout consistent for different CONFIG_HZ values.
Real life measurements with the above workload has shown that, with this
patch, both manage to auto-release the forcewake between 2-4 times per
10ms, even though the number of forcewake gets is dramatically different.
T-Rex requests between 5-10 explicit gets and 5-10 implict gets in each
10ms period, while the OGL/OCL test requests 250 and 380 times in the same
period.
The two data points together suggest that the nature of the forwake
accesses is bursty and that further changes and potential timeout
extensions, or moving the start of timeout from the first to the last
automatic forcewake grab, should be carefully measured for power and
performance effects.
v2:
* Commit spelling. (Dave Gordon)
* More discussion on numbers in the commit. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
We've had problems on several occasions with using the panel type
from the VBT block 40. Usually it seems to be 2, which often
doesn't give us the correct timings for the panel. After some
more digging I found a way to get a panel type via the OpRegion
SWSCI GBDA "Get Panel Details" method. Let's try to use it.
The spec has this to say about the output:
"Bits [15:8] - Panel Type
Bits contain the panel type user setting from CMOS
00h = Not Valid, use default Panel Type & Timings from VBT
01h - 0Fh = Panel Number"
Another version of the spec lists the valid range as 1-16, which makes
more sense since VBT supports 16 panels. Based on actual results
from Rob's G45, 1-16 is what we need to accept.
The other bits in the output don't look relevant for the problem at
hand.
The input is specified as:
"Bits [31:4] - Reserved
Reserved (must be zero)
Bits [3:0] - Panel Number
These bits contain the sequential index of Panel, starting at 0 and
counting upwards from the first integrated Internal Flat-Panel Display
Encoder present, and then from the first external Display Encoder
(e.g., S/DVO-B then S/DVO-C) which supports Internal Flat-Panels.
0h - 0Fh = Panel number"
For now I've just hardcoded the input panel number as 0. That would seem
like a decent choise for LVDS. Not so sure about eDP when port != A.
v2: Accept values 1-16
Filter out bogus results in opregion code (Jani)
Add debug logging for all the different branches (Jani)
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94825
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460359431-11003-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Rob Kramer <rob@solution-space.com>
Store the extracted panel_type under dev_priv.vbt instead of keeping
around a static variable for it.
Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
VBT can only contain 16 panel entries, indexed with the panel_type.
To play it safe we should reject panel_type > 0xf, so that we don't
read past the valid data.
v2: Add debug logging (Jani)
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460359329-10817-1-git-send-email-ville.syrjala@linux.intel.com
When the GMBUS based i2c transfer times out, we try to fall back to
bit-banging and retry the operation that way. However if the bit-banging
attempt also fails, we should probably go back to the GMBUS method for
the next attempt. Maybe there simply wasn't anyone one the bus at this
time.
There's also a bit of a mess going on with the force_bit handling.
It's supposed to be a ref count actually, and it is as far as
intel_gmbus_force_bit() is concerned. But it's treated as just a
flag by the timeout based bit-banging fallback. I suppose that's
fine since we should never end up in the timeout fallback case
if force_bit was already non-zero. However now that we want to restore
things back to where they were after the bit-banging attempt failed,
we're going to have to do things a bit differently to avoid clobbering
the force_bit count as set up by intel_gmbus_force_bit(). So let's
dedicate the high bit as a flag for the low level timeout based fallback
and treat the rest of the bits as a ref count just as before.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Extend the protection of gmbus_mutex around the force_bit
RMW in intel_gmbus_force_bit(), in case someone gets the
idea of calling it from a separate thread while there's
other stuff happening on the same bus.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Since we only ever use the drm_i915_private from the stored
i915_mm_struct->dev, save some electrons by storing the right
backpointer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-3-git-send-email-chris@chris-wilson.co.uk
Holding a reference to the containing task_struct is not sufficient to
prevent the mm_struct from being reaped under memory pressure. If this
happens whilst we are calling get_user_pages(), explosions erupt -
sometimes an immediate GPF, sometimes page flag corruption. To prevent
the target mm from being reaped as we are reading from it, acquire a
reference before we begin.
Testcase: igt/gem_shrink/*userptr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-2-git-send-email-chris@chris-wilson.co.uk
In order to ensure that all invalidations are completed before the
operation returns to userspace (i.e. before the munmap() syscall returns)
we need to wait upon the outstanding operations.
We are allowed to block inside the invalidate_range_start callback, and
as struct_mutex is the inner lock with mmap_sem we can wait upon the
struct_mutex without provoking lockdep into warning about a deadlock.
However, we don't actually want to wait upon outstanding rendering
whilst holding the struct_mutex if we can help it otherwise we also
block other processes from submitting work to the GPU. So first we do a
wait without the lock and then when we reacquire the lock, we double
check that everything is ready for removing the invalidated pages.
Finally to wait upon the outstanding unpinning tasks, we create a
private workqueue as a means to conveniently wait upon all at once. The
drawback is that this workqueue is per-mm, so any threads concurrently
invalidating objects will wait upon each other. The advantage of using
the workqueue is that we can wait in parallel for completion of
rendering and unpinning of several objects (of particular importance if
the process terminates with a whole mm full of objects).
v2: Apply a cup of tea to the changelog.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94699
Testcase: igt/gem_userptr_blits/sync-unmap-cycles
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa
6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V
VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR
fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+
atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz
LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY=
=bri6
-----END PGP SIGNATURE-----
Merge tag 'v4.6-rc3' into drm-intel-next-queued
Linux 4.6-rc3
Backmerge requested by Chris Wilson to make his patches apply cleanly.
Tiny conflict in vmalloc.c with the (properly acked and all) patch in
drm-intel-next:
commit 4da56b99d9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Apr 4 14:46:42 2016 +0100
mm/vmap: Add a notifier for when we run out of vmap address space
and Linus' tree.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
If we want a contiguous mapping of a single page sized object, we can
forgo using vmap() and just use a regular kmap(). Note that this is only
suitable if the desired pgprot_t is compatible.
v2: Use is_vmalloc_addr()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().
So refactor my usage into drm_malloc_gfp().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk
When called because we have run out of vmap address space, we only need
to recover objects that have vmappings and not all.
v2: Start using is_vmalloc_addr()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
We now have two implementations for vmapping a whole object, one for
dma-buf and one for the ringbuffer. If we couple the mapping into the
obj->pages lifetime, then we can reuse an obj->mapping for both and at
the same time couple it into the shrinker. There is a third vmapping
routine in the cmdparser that maps only a range within the object, for
the time being that is left alone, but will eventually use these routines
in order to cache the mapping between invocations.
v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
v3: Call unpin_vmap from the right dmabuf unmapper
v4: Rename vmap to map as we don't wish to imply the type of mapping
involved, just that it contiguously maps the object into kernel space.
Add kerneldoc and lockdep annotations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
After we pin the ringbuffer into the GGTT, all error paths need to unpin
it again. Move this common step into one block, and make the unable to
iomap error code consistent (i.e. treat it as out of memory to avoid
confusing it with a invalid argument).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-3-git-send-email-chris@chris-wilson.co.uk
We only need the struct_mutex to manipulate the pages_pin_count on the
object, we do not need to hold our BKL when freeing the exported
scatterlist.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-2-git-send-email-chris@chris-wilson.co.uk
This is to fix a GPU hang seen with mid thread pre-emption
and pooled EUs.
v2. Use IS_BXT_REVID instead of IS_BROXTON and INTEL_REVID
v3. And use correct type for register addresses
Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458571049-854-1-git-send-email-tim.gore@intel.com
For BXT, description of polarities of PORT_PLL_REF_SEL
has been reversed for newer Gen9LP steppings according to the
recent update in Bspec. This bit now should be set for
"Non-SSC" mode for all Gen9LP starting from B0 stepping.
v2: Only B0 and newer stepping should be affected by this
change.
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94866
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458176773-26925-1-git-send-email-dongwon.kim@intel.com
Check functions are used by atomic to see if the new state will
be allowed. There's also a hw state checker which checks afterwards
that the committed state is correct. Rename it to hw state verifier
to reduce some confusion.
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56FB8785.8020506@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
The modeset state verifier no longer has full access to the hardware,
instead it should only verify affected crtc's.
Looking for disabled stuff can be verified immediately after all crtc
disables have completed, while each enabled crtc can be verified right
after being enabled.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-3-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: check -> verify]
This will make it easier to keep the crtc checker when atomic
commit is reworked for asynchronous commits. This prevents checking
crtc's that were not part of the state. It's safe to verify disabled
encoders, connectors and dpll's that are not part of the state,
because during modeset connection_mutex is held.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: Extend commit message and rename check to verify.]
misc i915 fixes.
* tag 'drm-intel-fixes-2016-04-07' of git://anongit.freedesktop.org/drm-intel:
drm/i915: fix deadlock on lid open
drm/i915: Exit cherryview_irq_handler() after one pass
drm/i915: Call intel_dp_mst_resume() before resuming displays
drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-5-git-send-email-chris@chris-wilson.co.uk
Rather than call a function to compute the matching cachelines and
clflush them, just call the clflush *instruction* directly. We also know
that we can use the unpatched plain clflush rather than the clflushopt
alternative.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-4-git-send-email-chris@chris-wilson.co.uk
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.
v2: Clear the stuck interrupt marker between successful batches
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-3-git-send-email-chris@chris-wilson.co.uk
In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.
v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).
Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]
v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk
In order to ensure seqno/irq coherency, we currently read a ring register.
The mmio transaction following the interrupt delays the inspection of
the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU
cache. However, it is only the memory timing that is important for the
purposes of the delay, we do not need nor desire the extra forcewake.
v3: Update commentary
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-1-git-send-email-chris@chris-wilson.co.uk
Currently for the case where there is enough space at the end of Ring
buffer for accommodating only the base request, the wrapround is done
immediately and as a result the base request gets added at the start
of Ring buffer. But there may not be enough free space at the beginning
to accommodate the base request, as before the wraparound, the wait was
effectively done for the reserved_size free space from the start of
Ring buffer. In such a case there is a potential of Ring buffer overflow,
the instructions at the head of Ring (ACTHD) can get overwritten.
Since the base request can fit in the remaining space, there is no need
to wraparound immediately. The wraparound will anyway happen later when
the reserved part starts getting used.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1457688402-10411-1-git-send-email-akash.goel@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Having fixed the tracking of the engine's last_submitted_seqno, we can
now rely on it for detecting when the engine is idle (and not have to
touch the requests pointer).
Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-9-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Seal the request and mark it as pending execution before we submit it to
hardware. We assume that the actual submission cannot fail (that
guarantee is provided by preallocating space in the request for the
submission). As we may inspect this state without holding any locks
during hangcheck we should apply a barrier to ensure that we do
not see a more recent value in the HWS than we are tracking.
Based on a patch by Mika Kuoppala.
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-8-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
When we change the current seqno, we also need to remember to reset the
last_submitted_seqno for the engine.
Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-7-git-send-email-chris@chris-wilson.co.uk
An oversight is that when we wrap the seqno, we need to reset the hw
semaphore counters to 0. We did this for gen6 and gen7 and forgot to do
so for the new implementation required for gen8 (legacy).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-6-git-send-email-chris@chris-wilson.co.uk
Since we are setting engine local values that are tied to the hardware,
move it out of i915_gem_init_seqno() into the intel_ring_init_seqno()
backend, next to where the other hw semaphore registers are written.
v2: Make the explanatory comment about always resetting the semaphores to
0 irrespective of the value of the reset seqno.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-4-git-send-email-chris@chris-wilson.co.uk
We only use drm_i915_private within the function, so delete the unneeded
drm_device local.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-3-git-send-email-chris@chris-wilson.co.uk
After the GPU reset and we discard all of the incomplete requests, mark
the GPU as having advanced to the last_submitted_seqno (as having
completed the requests and ready for fresh work). The impact of this is
negligible, as all the requests will be considered completed by this
point, it just brings the HWS into line with expectations for external
viewers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-2-git-send-email-chris@chris-wilson.co.uk
It's useful to look at the last seqno submitted on a particular engine
and compare it against the HWS value to check for irregularities.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-1-git-send-email-chris@chris-wilson.co.uk
At BXT DSI, PIPE registers are inactive. So we can't get the
PIPE's mode parameters from them. The possible option is
retriving them from the PORT registers.
The required changes are added for BXT in intel_dsi_get_config
(encoder->get_config).
v2: Addressed the Jani's comments
-removed the redundant call to encoder->get_config
-read bpp from port register
-removed retrival of src_size from encoder->get_config
v3: pipe_config->pipe_bpp is fixed
Jani's review comments addressed:
Few horizontal timing parameters dropped from the patch to make
progress, as there seems to be some disagreement on
best/feasible/possible options.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Previously Reviewed at: https://lists.freedesktop.org/archives/intel-gfx/2016-April/091737.html
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460019967-26501-2-git-send-email-ramalingam.c@intel.com
Define and store the pad base offset in the array, and reference the
pconf0 and padval registers through macros. Add VLV prefixes to
macros. Use spec nomenclature for pconf0 and padval.
v2: Address Ville's review comments, squash another patch here.
v3: Use the names Ville dug up in the specs.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/34932140b78a3de7f825c78380a08c930694651b.1459884518.git.jani.nikula@intel.com
dev_priv is what the macro works hard to extract, pass it directly.
> sed 's/\([A-Z].*(dev_priv\)->dev)/\1)/g'
v2:
- Include all wrapper macros too (Chris)
v3:
- Include sed cmdline (Chris)
v4:
- Break long line
- Rebase
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460016485-8089-1-git-send-email-joonas.lahtinen@linux.intel.com
According to Chris, use of i915_vm_to_ppgtt is visible in benchmark
unless WARN_ON is removed, so lets get rid of it.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Looks much better without container_of everywhere.
v2:
- In i915_gem_restore_gtt_mappings too (Chris)
v3:
- Do not cause WARN by calling on non PPGTT object (Chris)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
intel_update_max_cdclk() doesn't have a switch case for Broxton, so
dev_priv->max_cdclk_freq gets set to whatever clock frequency we're
currently running at (e.g., 144 MHz) rather than the true maximum. This
causes our max dotclock to also be set too low and in turn leads mode
verification to reject perfectly valid modes while loading EDID firmware
blobs.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459892239-14041-1-git-send-email-matthew.d.roper@intel.com
This patch sets the invert bit for hpd detection for each port
based on VBT configuration. Since each AOB can be designed to
depend on invert bit or not, it is expected if an AOB requires
invert bit, the user will set respective bit in VBT.
v2: Separated VBT parsing from the rest of the logic. (Jani)
v3: Moved setting invert bit logic to bxt_hpd_irq_setup()
and changed its logic to avoid looping twice. (Ville)
v4: Changed the logic to mask out the bits first and then
set them to remove need of temporary variable. (Ville)
v5: Moved defines to existing set of defines for the register
and added required breaks. (Ville)
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[Jani: fixed some checkpatch noise, added kernel-doc.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459420907-11383-2-git-send-email-shubhangi.shrivastava@intel.com
This patch adds new fields that are not yet added in drm code
in child devices struct
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459420907-11383-1-git-send-email-shubhangi.shrivastava@intel.com
- VBT code refactor for a clean split between parsing&using of firmware
information (Jani)
- untangle the pll computation code, and splitting up the monster
i9xx_crtc_compute_clocks (Ander)
- dsi support for bxt (Jani, Shashank Sharma and others)
- color manager (i.e. de-gamma, color conversion matrix & gamma support) from
Lionel Landwerlin
- Vulkan hsw support in the command parser (Jordan Justen)
- large-scale renaming of intel_engine_cs variables/parameters to avoid the epic
ring vs. engine confusion introduced in gen8 (Tvrtko Ursulin)
- few atomic patches from Maarten&Matt, big one is two-stage wm programming on ilk-bdw
- refactor driver load and add infrastructure to inject load failures for
testing, from Imre
- various small things all over
* tag 'drm-intel-next-2016-03-30' of git://anongit.freedesktop.org/drm-intel: (179 commits)
drm/i915: Update DRIVER_DATE to 20160330
drm/i915: Call intel_dp_mst_resume() before resuming displays
drm/i915: Fix races on fbdev
drm/i915: remove unused dev_priv->render_reclock_avail
drm/i915: move sdvo mappings to vbt data
drm/i915: move edp low vswing config to vbt data
drm/i915: use a substruct in vbt data for edp
drm/i915: replace for_each_engine()
drm/i915: introduce for_each_engine_id()
drm/i915/bxt: Fix DSI HW state readout
drm/i915: Remove vblank wait from hsw_enable_ips, v2.
drm/i915: Tidy aliasing_gtt_bind_vma()
drm/i915: Split PNV version of crtc_compute_clock()
drm/i915: Split g4x_crtc_compute_clock()
drm/i915: Split i8xx_crtc_compute_clock()
drm/i915: Split CHV and VLV specific crtc_compute_clock() hooks
drm/i915: Merge ironlake_compute_clocks() and ironlake_crtc_compute_clock()
drm/i915: Move fp divisor calculation into ironlake_compute_dpll()
drm/i915: Pass crtc_state->dpll directly to ->find_dpll()
drm/i915: Simplify ironlake_crtc_compute_clock() CPU eDP case
...
Currently we set the initial GPU frequency to min_freq_softlimit
on gen9, and to efficient_freq on VLV/CHV. On all the other platforms
we set it to idle_freq. Let's use idle_freq across the board to make
sure we don't waste power. This is especially relevant for VLV since
Vnn won't drop to minimum unless the GPU is at the minimum frequency.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457120584-26080-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Extract the GPLL reference frequency from CCK and use it in the
GPU freq<->opcode conversions on VLV/CHV. This eliminates all the
assumptions we have about which divider is used for which czclk
frequency.
Note that unlike most clocks from CCK, the GPLL ref clock is a divided
down version of the CZ clock rather than the HPLL clock. CZ clock itself
is a divided down version of the HPLL clock though, so in effect it just
gets divided down twice.
While at it, throw in a few comments explaining the remaining constants
for anyone who later wants to compare this to the spreadsheets.
v2: Add slow/fast notes for CHV clocks (Imre)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457120584-26080-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
After a suspend-resume cycle, the resumed kernel has no idea what the
booted kernel may have done to the GuC before replacing itself with the
resumed image. In particular, it may have already loaded the GuC with
firmware, which will then cause this kernel's attempt to (re)load the
firmware to fail (GuC program memory is write-once!). The symptoms
(GuC firmware reload fails after hibernation) are further described
in the Bugzilla reference below.
So let's *always* reset the GuC just before (re)loading the firmware;
the hardware should then be in a well-known state, and we may even
avoid some of the issues arising from unpredictable timing.
Also added some more fields & values to the definition of the GUC_STATUS
register, which is the key diagnostic indicator if the GuC load fails.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Alex Dai <yu.dai@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94390
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Due to timing issues in the HW, some of the status bits required for GuC
authentication occasionally don't get set; when that happens, the GuC
cannot be initialized and we will be left with a wedged GPU. The W/A
suggested is to perform a soft reset of the GuC and attempt to reload
the F/W again for few times before giving up.
As the failure is dependent on timing, tests performed by triggering
manual full gpu reset (i915_wedged) showed that we could sometimes hit
this after several thousand iterations, but sometimes tests ran even
longer without any issues. Reset and reload mechanism proved helpful
when we indeed hit f/w load failure, so it is better to include this
to improve driver stability.
This change implements the following WAs,
WaEnableuKernelHeaderValidFix:skl,bxt
WaEnableGuCBootHashCheckNotSet:skl,bxt
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Both the oom and vmap notifier callbacks have a loop to acquire the
struct_mutex and set the device as uninterruptible, within a certain
time. Refactor the common code into a pair of functions.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459848145-24042-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
If the core runs out of vmap address space, it will call a notifier in
case any driver can reap some of its vmaps. As i915.ko is possibily
holding onto vmap address space that could be recovered, hook into the
notifier chain and try and reap objects holding onto vmaps.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Roman Pen <r.peniaev@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we only attempt to purge an object if can_release_pages() report
true, we should also only add it to the count of potential recoverable
pages when can_release_pages() is true.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Akash Goel <akash.goel@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This effectively reverts
commit 8e5fd599eb
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Apr 9 13:28:50 2014 +0300
drm/i915/chv: Make CHV irq handler loop until all interrupts are consumed
as under continuous execlists load we can saturate the IRQ handler,
destablising the tsc clock and triggering the NMI watchdog to declare a hung
CPU.
[ 552.756051] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 552.756080] clocksource: 'refined-jiffies' wd_now: 10003b480 wd_last: 10003b28c mask: ffffffff
[ 552.756091] clocksource: 'tsc' cs_now: d55d31aa50 cs_last: d17446166c mask: ffffffffffffffff
[ 552.756210] clocksource: Switched to clocksource refined-jiffies
[ 575.217870] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
[ 575.217893] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rc7+ #18
[ 575.217905] Hardware name: /NUC5CPYB, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[ 575.217915] 0000000000000000 ffff88027fd05bc0 ffffffff81288c6d 0000000000000000
[ 575.217935] 0000000000000001 ffff88027fd05be0 ffffffff810e72d1 0000000000000000
[ 575.217951] ffff88027fd05c80 ffff88027fd05c20 ffffffff81114b60 0000000181015f1e
[ 575.217967] Call Trace:
[ 575.217973] <NMI> [<ffffffff81288c6d>] dump_stack+0x4f/0x72
[ 575.217994] [<ffffffff810e72d1>] watchdog_overflow_callback+0x151/0x160
[ 575.218003] [<ffffffff81114b60>] __perf_event_overflow+0xa0/0x1e0
[ 575.218016] [<ffffffff811154c4>] perf_event_overflow+0x14/0x20
[ 575.218028] [<ffffffff8101d2ca>] intel_pmu_handle_irq+0x1da/0x460
[ 575.218042] [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[ 575.218052] [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[ 575.218064] [<ffffffff81014ae8>] perf_event_nmi_handler+0x28/0x50
[ 575.218075] [<ffffffff81007540>] nmi_handle+0x60/0x130
[ 575.218086] [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[ 575.218096] [<ffffffff810079c0>] do_nmi+0x140/0x470
[ 575.218108] [<ffffffff81559ec7>] end_repeat_nmi+0x1a/0x1e
[ 575.218119] [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[ 575.218129] [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[ 575.218139] [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[ 575.218148] <<EOE>> [<ffffffff814a8353>] cpuidle_enter_state+0xf3/0x2f0
[ 575.218164] [<ffffffff814a8587>] cpuidle_enter+0x17/0x20
[ 575.218175] [<ffffffff810aaa3a>] call_cpuidle+0x2a/0x40
[ 575.218185] [<ffffffff810aade3>] cpu_startup_entry+0x273/0x330
[ 575.218196] [<ffffffff81033a1e>] start_secondary+0x10e/0x130
However, not servicing all available IIR within the handler does hurt the
throughput of pathological nop execbuf by about 20%, with a similar effect
upon the dispatch latency of a series of execbuf.
v2: use do {} while(0) for a smaller patch, and easier to revert again
I have reasonable confidence that we do not miss GT interrupts (as
execlists provides a stress case with a failure mechanism easily
detected by igt), however I have less confidence about all the other
sources of interrupts and worry that may lose a display hotplug
interrupt, for example.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93467
Testcase: igt/gem_exec_nop/basic # requires NMI watchdog
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Antti Koskipää <antti.koskipaa@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457946117-6714-1-git-send-email-chris@chris-wilson.co.uk
(cherry picked from commit 579de73b04)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Since we need MST devices ready before we try to resume displays,
calling this after intel_display_resume() can result in some issues with
various laptop docks where the monitor won't turn back on after
suspending the system.
This order was originally changed in
commit e7d6f7d708 ("drm/i915: resume MST after reading back hw state")
In order to fix some unclaimed register errors, however the actual cause
of those has since been fixed.
CC: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
[danvet: Resolve conflicts with locking changes.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit a16b7658f4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
After unplugging a DP MST display from the system, we have to go through
and destroy all of the DRM connectors associated with it since none of
them are valid anymore. Unfortunately, intel_dp_destroy_mst_connector()
doesn't do a good enough job of ensuring that throughout the destruction
process that no modesettings can be done with the connectors. As it is
right now, intel_dp_destroy_mst_connector() works like this:
* Take all modeset locks
* Clear the configuration of the crtc on the connector, if there is one
* Drop all modeset locks, this is required because of circular
dependency issues that arise with trying to remove the connector from
sysfs with modeset locks held
* Unregister the connector
* Take all modeset locks, again
* Do the rest of the required cleaning for destroying the connector
* Finally drop all modeset locks for good
This only works sometimes. During the destruction process, it's very
possible that a userspace application will attempt to do a modesetting
using the connector. When we drop the modeset locks, an ioctl handler
such as drm_mode_setcrtc has the oppurtunity to take all of the modeset
locks from us. When this happens, one thing leads to another and
eventually we end up committing a mode with the non-existent connector:
[drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* failed to enable link training
[drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x7cf0001f
[drm:intel_dp_start_link_train [i915]] *ERROR* failed to start channel equalization
[drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x7cf0001f
[drm:intel_mst_pre_enable_dp [i915]] *ERROR* failed to allocate vcpi
And in some cases, such as with the T460s using an MST dock, this
results in breaking modesetting and/or panicking the system.
To work around this, we now unregister the connector at the very
beginning of intel_dp_destroy_mst_connector(), grab all the modesetting
locks, and then hold them until we finish the rest of the function.
CC: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Rob Clark <rclark@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1458155884-13877-1-git-send-email-cpaul@redhat.com
(cherry picked from commit 1f7717552e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Doing a lot of work in the interrupt handler introduces huge
latencies to the system as a whole.
Most dramatic effect can be seen by running an all engine
stress test like igt/gem_exec_nop/all where, when the kernel
config is lean enough, the whole system can be brought into
multi-second periods of complete non-interactivty. That can
look for example like this:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u8:3:143]
Modules linked in: [redacted for brevity]
CPU: 0 PID: 143 Comm: kworker/u8:3 Tainted: G U L 4.5.0-160321+ #183
Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1
Workqueue: i915 gen6_pm_rps_work [i915]
task: ffff8800aae88000 ti: ffff8800aae90000 task.ti: ffff8800aae90000
RIP: 0010:[<ffffffff8104a3c2>] [<ffffffff8104a3c2>] __do_softirq+0x72/0x1d0
RSP: 0000:ffff88014f403f38 EFLAGS: 00000206
RAX: ffff8800aae94000 RBX: 0000000000000000 RCX: 00000000000006e0
RDX: 0000000000000020 RSI: 0000000004208060 RDI: 0000000000215d80
RBP: ffff88014f403f80 R08: 0000000b1b42c180 R09: 0000000000000022
R10: 0000000000000004 R11: 00000000ffffffff R12: 000000000000a030
R13: 0000000000000082 R14: ffff8800aa4d0080 R15: 0000000000000082
FS: 0000000000000000(0000) GS:ffff88014f400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa53b90c000 CR3: 0000000001a0a000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
042080601b33869f ffff8800aae94000 00000000fffc2678 ffff88010000000a
0000000000000000 000000000000a030 0000000000005302 ffff8800aa4d0080
0000000000000206 ffff88014f403f90 ffffffff8104a716 ffff88014f403fa8
Call Trace:
<IRQ>
[<ffffffff8104a716>] irq_exit+0x86/0x90
[<ffffffff81031e7d>] smp_apic_timer_interrupt+0x3d/0x50
[<ffffffff814f3eac>] apic_timer_interrupt+0x7c/0x90
<EOI>
[<ffffffffa01c5b40>] ? gen8_write64+0x1a0/0x1a0 [i915]
[<ffffffff814f2b39>] ? _raw_spin_unlock_irqrestore+0x9/0x20
[<ffffffffa01c5c44>] gen8_write32+0x104/0x1a0 [i915]
[<ffffffff8132c6a2>] ? n_tty_receive_buf_common+0x372/0xae0
[<ffffffffa017cc9e>] gen6_set_rps_thresholds+0x1be/0x330 [i915]
[<ffffffffa017eaf0>] gen6_set_rps+0x70/0x200 [i915]
[<ffffffffa0185375>] intel_set_rps+0x25/0x30 [i915]
[<ffffffffa01768fd>] gen6_pm_rps_work+0x10d/0x2e0 [i915]
[<ffffffff81063852>] ? finish_task_switch+0x72/0x1c0
[<ffffffff8105ab29>] process_one_work+0x139/0x350
[<ffffffff8105b186>] worker_thread+0x126/0x490
[<ffffffff8105b060>] ? rescuer_thread+0x320/0x320
[<ffffffff8105fa64>] kthread+0xc4/0xe0
[<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
[<ffffffff814f351f>] ret_from_fork+0x3f/0x70
[<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
I could not explain, or find a code path, which would explain
a +20 second lockup, but from some instrumentation it was
apparent the interrupts off proportion of time was between
10-25% under heavy load which is quite bad.
When a interrupt "cliff" is reached, which was >~320k irq/s on
my machine, the whole system goes into a terrible state of the
above described multi-second lockups.
By moving the GT interrupt handling to a tasklet in a most
simple way, the problem above disappears completely.
Testing the effect on sytem-wide latencies using
igt/gem_syslatency shows the following before this patch:
gem_syslatency: cycles=1532739, latency mean=416531.829us max=2499237us
gem_syslatency: cycles=1839434, latency mean=1458099.157us max=4998944us
gem_syslatency: cycles=1432570, latency mean=2688.451us max=1201185us
gem_syslatency: cycles=1533543, latency mean=416520.499us max=2498886us
This shows that the unrelated process is experiencing huge
delays in its wake-up latency. After the patch the results
look like this:
gem_syslatency: cycles=808907, latency mean=53.133us max=1640us
gem_syslatency: cycles=862154, latency mean=62.778us max=2117us
gem_syslatency: cycles=856039, latency mean=58.079us max=2123us
gem_syslatency: cycles=841683, latency mean=56.914us max=1667us
Showing a huge improvement in the unrelated process wake-up
latency. It also shows an approximate halving in the number
of total empty batches submitted during the test. This may
not be worrying since the test puts the driver under
a very unrealistic load with ncpu threads doing empty batch
submission to all GPU engines each.
Another benefit compared to the hard-irq handling is that now
work on all engines can be dispatched in parallel since we can
have up to number of CPUs active tasklets. (While previously
a single hard-irq would serially dispatch on one engine after
another.)
More interesting scenario with regards to throughput is
"gem_latency -n 100" which shows 25% better throughput and
CPU usage, and 14% better dispatch latencies.
I did not find any gains or regressions with Synmark2 or
GLbench under light testing. More benchmarking is certainly
required.
v2:
* execlists_lock should be taken as spin_lock_bh when
queuing work from userspace now. (Chris Wilson)
* uncore.lock must be taken with spin_lock_irq when
submitting requests since that now runs from either
softirq or process context.
v3:
* Expanded commit message with more testing data;
* converted missed locking sites to _bh;
* added execlist_lock comment. (Chris Wilson)
v4:
* Mention dispatch parallelism in commit. (Chris Wilson)
* Do not hold uncore.lock over MMIO reads since the block
is already serialised per-engine via the tasklet itself.
(Chris Wilson)
* intel_lrc_irq_handler should be static. (Chris Wilson)
* Cancel/sync the tasklet on GPU reset. (Chris Wilson)
* Document and WARN that tasklet cannot be active/pending
on engine cleanup. (Chris Wilson/Imre Deak)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Testcase: igt/gem_exec_nop/all
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94350
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1459768316-6670-1-git-send-email-tvrtko.ursulin@linux.intel.com
Deal with errors from drm_universal_plane_init() in primary and cursor
plane init paths (sprites were already covered). Also make the code
neater by using goto for error handling.
v2: Rebased due to drm_universal_plane_init() 'name' parameter
v3: Another rebase due to s/""/NULL/
v4: Rebased on drm-nightly (Matthew Auld)
v5: Fix email address (Matthew Auld)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458571402-32749-1-git-send-email-matthew.auld@intel.com
Supposedly the power sequencer still locks out the DPLL registers on
CHV, so let's issue a warning if it's still locked when enabling the
DPLL.
Also drop the redundant IS_MOBILE() check for VLV when we check the same
thing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
DPLL_MD(PIPE_C) is AWOL on CHV. Instead of fixing it someone added
chicken bits to propagate the pixel multiplier from DPLL_MD(PIPE_B)
to either pipe B or C. So do that to make pixel repeat work on pipes
B and C. Pipe A is fine without any tricks.
Fortunately the pixel repeat propagation appears to be a oneshot
operation, so once the value has been written we can clear the
chicken bits. So it is still possible to drive pipe B and C with
different pixel multipliers simultaneosly.
Looks like DPLL_VGA_MODE_DIS must also be set in DPLL(PIPE_B)
for this to work. But since we keep that bit always set in all
DPLLs there's no problem.
This of course means we can't reliably read out the pixel multiplier
for pipes B and C. That would make the state checker unhappy, so I
added shadow copies of those registers in to dev_priv. The other
option would have been to skip pixel multiplier, dpll_md an dotclock
checks entirely on CHV, but that feels like a serious loss of cross
checking, so just pretending that we have working DPLL MD registers
seemed better. Obviously with the shadow copies we can't detect if
the pixel multiplier was properly configured, nor can we take over
its state from the BIOS, but hopefully people won't have displays
that would be limitd to such crappy modes.
There is one strange flicker still remaining. It's visible on
pipe C/HDMID when HDMIB is enabled while driven by pipe B.
It doesn't occur if pipe A drives HDMIB, nor is there any glitch
on pipe B/HDMIB when port C/HDMID starts up. I don't have a board
with HDMIC so not sure if it happens there too. So I'm not sure
if it's somehow tied in with this strange linkage between pipe B
and C. Sadly I was unable to find an enable sequence that would
avoid the glitch, but at least it's not fatal ie. the output
recovers afterwards.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
The VLV and CHV DPLL disable and update are almost identical in
how the DPLL/DPLL_MD registers need to be set up. But the code
looks more different than it really is. Try to bring them into
line.
Note that we now leave the refclock always enabled for both
DPLLs in the dual channel PHY. But that's perfectly fine since
it's the same clock, and we anyway already do that when turning
the disp2d power well on.
v2: s/chv_update_pll/chv_compute_dpll/
v3: Add a note that we leave refclocks enabled for both DPLLs (Jani)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Bspec is confused w.r.t. the HSW/BDW FDI disable sequence. It lists
FDI RX disable both as step 13 and step 18 in the sequence. But I dug
up an old BUN mail from Art that moved the FDI RX disable to happen
before DDI_BUF_CTL disable. That BUN did not renumber the steps and just
added a note:
"Workaround: Disable PCH FDI Receiver before disabling DDI_BUF_CTL."
The BUN described the symptoms of the fixed issue as:
"PCH display underflow and a black screen on the analog CRT port that
happened after a FDI re-train"
I suppose later someone tried to renumber the steps to match, but forgot
to remove the FDI RX disable from its old position in the sequence.
They also forgot to update the note describing what should be done in
case of an FDI training failure. Currently it says:
"To retry FDI training, follow the Disable Sequence steps to Disable FDI,
but skip the steps related to clocks and PLLs (16, 19, and 20), ..."
It should really say "17, 20, and 21" with the current sequence because
those are the steps that deal with PLLs and whatnot, after step 13 became
FDI RX disable. And had the step 18 FDI RX disable been removed, as I
suspect it should have, the note should actually say "17, 19, and 20".
So, let's move the FDI RX disable to happen before DDI_BUF_CTL disable,
as that would appear to be the correct order based on the BUN.
Note that Art has since unconfused the spec, and so this patch should
now match the steps listed in the spec.
v2: Add a note that the spec is now correct
Cc: Paulo Zanoni <przanoni@gmail.com>
Cc: Art Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456841783-4779-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
This reverts commit a7442b93cf.
With the patch applied SNB, IVB and ILK are experiencing hard machine
hangs. Original patch was to fix "just" kernel panics so it's not a good
trade-off.
Proper fix for the panic is on the way, lets revert until then.
Fixes: a7442b93cf ("drm/i915: Fix races on fbdev")
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: stable@vger.kernel.org
Acked-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459510861-29035-1-git-send-email-joonas.lahtinen@linux.intel.com
According to the BSpec update, bit 7 of PORT_CL1CM_DW0 register needs to be
checked to ensure that the register is in accessible state.
Also, based on a BSpec update, changing the timeout value to
check iphypwrgood, from 10ms to wait for up to 100us.
v2: [Ville] use wait_for_us instead of the atomic call.
v3: [Jani/Imre] read register only once
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Reported-by: Philippe Lecluse <Philippe.Lecluse@intel.com>
Cc: Deak, Imre <imre.deak@intel.com>
Cc: Nikula, Jani <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459446354-19012-1-git-send-email-vandana.kannan@intel.com
This patch checks for changes in sink count between short pulse
hpds and forces full detect when there is a change.
This will allow both detection of hotplug and unplug of panels
through dongles that give only short pulse for such events.
v2: changed variable type from u8 to bool (Jani)
return immediately if perform_full_detect is set(Siva)
v3: changed method of determining full detection from using
pointer to return code (Siva)
v4: changed comments to indicate meaning of return value of
intel_dp_short_pulse and explain the use of return value
from intel_dp_get_dpcd in intel_dp_short_pulse (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-5-git-send-email-shubhangi.shrivastava@intel.com
Sink count can change between short pulse hpd hence this patch
adds a member variable to intel_dp so we can track any changes
between short pulse interrupts.
This patch reads sink_count dpcd always and removes its
read operation based on values in downstream port dpcd.
SINK_COUNT dpcd is not dependent on DOWNSTREAM_PORT_PRESENT dpcd.
SINK_COUNT denotes if a display is attached, while
DOWNSTREAM_PORT_PRESET indicates how many ports are available
in the dongle where display can be attached. so it is possible
for sink count to change irrespective of value in downstream
port dpcd.
Here is a table of possible values and scenarios
sink_count downstream_port
present
0 0 no display is attached
0 1 dongle is connected without display
1 0 display connected directly
1 1 display connected through dongle
v2: Storing value of intel_dp->sink_count that is ready
for consumption. (Ander)
Squashing two commits into one. (Ander)
v3: Added comment to explain the need of early return when
sink count is 0. (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-4-git-send-email-shubhangi.shrivastava@intel.com
When created originally intel_dp_check_link_status()
was supposed to handle only link training for short
pulse but has grown into handler for short pulse itself.
This patch cleans up this function by splitting it into
two halves. First intel_dp_short_pulse() is called,
which will be entry point and handle all logic for
short pulse handling while intel_dp_check_link_status()
will retain its original purpose of only doing link
status related work.
intel_dp_short_pulse: All existing code other than
link status read and link training upon error status.
intel_dp_check_link_status:
The link status should be read on short pulse
irrespective of panel being enabled or not so
intel_dp_get_link_status() performs dpcd read first
then based on crtc active / enabled it will
perform the link training.
This is because short pulse is a generic interrupt
which should always be handled, because it may mean:
1. Hotplug/unplug of MST panel
2. Hotplug/unplug of dongle
3. Link status change for other DP panels
v2: Added WARN_ON to intel_dp_check_link_status()
Removed a call to intel_dp_get_link_status() (Ander)
v3: Changed commit message to explain need of link status
being read before performing encoder checks (Daniel)
v4: Changed commit message to explain need of reading
link status on short pulse (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
[anderco: fix parenthesis alignment]
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-3-git-send-email-shubhangi.shrivastava@intel.com
Current DP detection has DPCD operations split across
intel_dp_hpd_pulse and intel_dp_detect which contains
duplicates as well. Also intel_dp_detect is called
during modes enumeration as well which will result
in multiple dpcd operations. So this patch tries
to solve both these by bringing all DPCD operations
in one single function and make intel_dp_detect
use existing values instead of repeating same steps.
v2: Pulled in a hunk from last patch of the series to
this patch. (Ander)
v3: Added MST hotplug handling. (Ander)
v4: Added a flag to check if detect is performed to
prevent multiple detects on hotplug. (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
[anderco: fix parenthesis aligment]
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-2-git-send-email-shubhangi.shrivastava@intel.com
intel_dp_detect() is called for not just detection but
during modes enumeration as well. Repeating the whole
sequence during each of these calls is wasteful and
time consuming.
This patch moves probing for panel, DPCD read etc done in
intel_dp_detect() to a new function intel_dp_long_pulse().
Note that the behavior of intel_dp_detect() is changed to
report connected or disconnected depending on whether the
EDID is available or not.
This change will be required by further patches in the series
to avoid performing duplicated DPCD operations on hotplug.
v2: Moved a hunk to next patch of the series.
Moved intel_dp_unset_edid to out. (Ander)
v3: Rephrased commit message and intel_dp_unset_dp() is called
within intel_dp_set_dp() to free the previous EDID. (Ander)
v4: Added overriding of status to disconnected for MST. (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
[anderco: fix parenthesis alignment]
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-1-git-send-email-shubhangi.shrivastava@intel.com