Our interrupt handler (in hardirq context) could race with the timer
(in softirq context), hence we need to hold the spinlock around the
call to ->hdp_irq_setup in intel_hpd_irq_handler, too.
But as an optimization (and more so to clarify things) we don't need
to do the irqsave/restore dance in the hardirq context.
Note also that on ilk+ the race isn't just against the hotplug
reenable timer, but also against the fifo underrun reporting. That one
also modifies the SDEIMR register (again protected by the same
dev_priv->irq_lock).
To lock things down again sprinkle a assert_spin_locked. But exclude
the functions touching SDEIMR for now, I want to extract them all into
a new helper function (like we do already for pipestate, display
interrupts and all the various gt interrupts).
v2: Add the missing 't' Egbert spotted in a comment.
v3: Actually fix the right misspelled comment (Paulo).
Cc: Egbert Eich <eich@suse.de>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The usual pattern for our sub-function irq_handlers is that they check
for the no-irq case themselves. This results in more streamlined code
in the upper irq handlers.
v2: Rebase on top of the i965g/gm sdvo hpd fix.
Cc: Egbert Eich <eich@suse.de>
Reviewed-by: Egbert Eich <eich@suse.de>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Everywhere the same.
Note that this patch leaves unnecessary braces behind, but the next
patch will kill those all anyway (including the if itself) so I've
figured I can keep the diff a bit smaller.
v2: Rebase on top of the i965g/gm sdvo hpd fix.
Cc: Egbert Eich <eich@suse.de>
Reviewed-by: Egbert Eich <eich@suse.de>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already have a vfunc for this (and other parts of the hpd storm
handling code already use it).
v2: Rebase on top of the i965g/gm sdvo hpd fix.
Cc: Egbert Eich <eich@suse.de>
Reviewed-by: Egbert Eich <eich@suse.de>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The combination of Paulo's fifo underrun detection code and Egbert's
hpd storm handling code unfortunately made the hpd storm handling code
racy.
To avoid duplicating tricky interrupt locking code over all platforms
start with a bit of refactoring. This patch is the very first step
since in the end the irq storm handling code will handle all hotplug
logic (and so also encapsulate the locking nicely).
v2: Rebase on top of the i965g/gm sdvo hpd fix.
Cc: Egbert Eich <eich@suse.de>
Reviewed-by: Egbert Eich <eich@suse.de>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By the time we write DEIER in the postinstall hook the interrupt
handler could run any time. And it does modify DEIER to handle
interrupts.
Hence the DEIER read-modify-write cycle for enabling the PCU event
source is racy. Close this races the same way we handle vblank
interrupts: Unconditionally enable the interrupt in the IER register,
but conditionally mask it in IMR. The later poses no such race since
the interrupt handler does not touch DEIMR.
Also update the comment, the clearing has already happened
unconditionally above.
v2: Actually shove the updated comment into the right train^W commit,
as spotted by Paulo.
Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The haswell unclaimed register handling code forgot to take the
spinlock. Since this is in the context of the non-rentrant interupt
handler and we only have one interrupt handler it is sufficient to
just grab the spinlock - we do not need to exclude any other
interrupts from running on the same cpu.
To prevent such gaffles in the future sprinkle assert_spin_locked over
these functions. Unfornately this requires us to hold the spinlock in
the ironlake postinstall hook where it is not strictly required:
Currently that is run in single-threaded context and with userspace
exlcuded from running concurrent ioctls. Add a comment explaining
this.
v2: ivb_can_enable_err_int also needs to be protected by the spinlock.
To ensure this won't happen in the future again also sprinkle a
spinlock assert in there.
v3: Kill the 2nd call to ivb_can_enable_err_int I've accidentally left
behind, spotted by Paulo.
Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If the current GPU frquency is below RPe, and we're asked to increase
it, just go directly to RPe. This should provide better performance
faster than letting the frequency trickle up in response to the up
threshold interrupts.
For now just do it for VLV, since that matches quite closely how VLV
used to operate when the rps delayed timer kept things at RPe always.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Eliminate the weird inverted logic from the rps new_delay comparison.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bspec seems to be full of lies, at least it disagress with reality:
Two systems corrobated that SDVO hpd bits are the same as on gen3.
v2: Update comment a bit.
Cc: Arthur Ranyan <arthur.j.runyan@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Alex Fiestas <afiestas@kde.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58405
Cc: stable@vger.kernel.org
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The ring names already have "ring" in it.
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For guilty batchbuffer analysis later on when rings are reset,
store what state the ring was on when hang was declared.
This helps to weed out the waiting rings from the active ones.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is of no value to the developer reading the report, let alone the
bamboozled user.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we detect a ring is in a valid wait for another, just let it be.
Eventually it will either begin to progress again, or the entire system
will come grinding to a halt and then hangcheck will fire as soon as the
deadlock is detected.
This error was foretold by Ben in
commit 05407ff889
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Thu May 30 09:04:29 2013 +0300
drm/i915: detect hang using per ring hangcheck_score
"If ring B is waiting on ring A via semaphore, and ring A is making
progress, albeit slowly - the hangcheck will fire. The check will
determine that A is moving, however ring B will appear hung because
the ACTHD doesn't move. I honestly can't say if that's actually a
realistic problem to hit it probably implies the timeout value is too
low."
v2: Make sure we don't even incur the KICK cost whilst waiting.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65394
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After kicking a ring, it should be free to make progress again and so
should not be accused of being stuck until hangcheck fires once more. In
order to catch a denial-of-service within a batch or across multiple
batches, we still do increment the hangcheck score - just not as
severely so that it takes multiple kicks to fail.
This should address part of Ben's justified criticism of
commit 05407ff889
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Thu May 30 09:04:29 2013 +0300
drm/i915: detect hang using per ring hangcheck_score
"There's also another corner case on the kick. If the seqno = 2
(though not stuck), and on the 3rd hangcheck, the ring is stuck, and
we try to kick it... we don't actually try to find out if the kick
helped."
v2: Make sure we catch DoS attempts with batches full of invalid WAITs.
v3: Preserve the ability to detect loops by always charging the ring
if it is busy on the same request.
v4: Make sure we queue another check if on a new batch
References: https://bugs.freedesktop.org/show_bug.cgi?id=65394
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So we can remove some duplicate code. All the PCHs are very similar
and right now the code is the same. I plan to add more code, so we
would have more duplicated code.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rework of per ring hangcheck made this obsolete.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Keep track of ring seqno progress and if there are no
progress detected, declare hang. Use actual head (acthd)
to distinguish between ring stuck and batchbuffer looping
situation. Stuck ring will be kicked to trigger progress.
This commit adds a hard limit for batchbuffer completion time.
If batchbuffer completion time is more than 4.5 seconds,
the gpu will be declared hung.
Review comment from Ben which nicely clarifies the semantic change:
"Maybe I'm just stating the functional changes of the patch, but in case
they were unintended here is what I see as potential issues:
1. "If ring B is waiting on ring A via semaphore, and ring A is making
progress, albeit slowly - the hangcheck will fire. The check will
determine that A is moving, however ring B will appear hung because
the ACTHD doesn't move. I honestly can't say if that's actually a
realistic problem to hit it probably implies the timeout value is too
low.
2. "There's also another corner case on the kick. If the seqno = 2
(though not stuck), and on the 3rd hangcheck, the ring is stuck, and
we try to kick it... we don't actually try to find out if the kick
helped"
v2: use atchd to detect stuck ring from loop (Ben Widawsky)
v3: Use acthd to check when ring needs kicking.
Declare hang on third time in order to give time for
kick_ring to take effect.
v4: Update commit msg
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Paste in Ben's review comment.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.
Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Similar to a patch originally written by:
v2: Reversed the meanings of masked and enabled (Haihao)
Made non-destructive writes in case enable/disabler rps runs first
(Haihao)
v3: Reword error message (Damien)
Modify postinstall to do the right thing based on previous fixup. (Ben)
CC: Xiang, Haihao <haihao.xiang@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The motivation here is we're going to add some new interrupt definitions
and handling outside of the GT interrupts which is all we've managed so
far (with some RPS exceptions). By consolidating the names in the future
we can make thing a bit cleaner as we don't need to define register
names twice, and we can leverage pretty decent overlap in HW registers
since ILK.
To explain briefly what is in the comments: there are two sets of
interrupt masking/enabling registers. At least so far, the definitions
of the two sets overlap. The old code setup distinct names for
interrupts in each set, ie. one for global, and one for ring. This made
things confusing when using the wrong defines in the wrong places.
rebase: Modified VLV bits
v2: Renamed GT_RENDER_MASTER to GT_RENDER_CS_MASTER (Damien)
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
PM interrupts have an expanded role on HSW. It helps route the EBOX
interrupts. This patch is necessary to make the existing code which
touches the mask, and enable registers more friendly to other code paths
that also will need these registers.
To be more explicit:
At preinstall all interrupts are masked and disabled. This implies that
preinstall should always happen before any enabling/disabling of RPS or
other interrupts.
The PMIMR is touched by the workqueue, so enable/disable touch IER and
IIR. Similarly, the code currently expects IMR has no use outside of the
RPS related interrupts so they unconditionally set 0, or ~0. We could
use IER in the workqueue, and IMR elsewhere, but since the workqueue
use-case is more transient the existing usage makes sense.
Disable RPS events:
IER := IER & ~GEN6_PM_RPS_EVENTS // Disable RPS related interrupts
IIR := GEN6_PM_RPS_EVENTS // Disable any outstanding interrupts
Enable RPS events:
IER := IER | GEN6_PM_RPS_EVENTS // Enable the RPS related interrupts
IIR := GEN6_PM_RPS_EVENTS // Make sure there were no leftover events
(really shouldn't happen)
v2: Shouldn't destroy PMIIR or PMIMR VEBOX interrupt state in
enable/disable rps functions (Haihao)
v3: Bug found by Chris where we were clearing the wrong bits at rps
disable.
expanded commit message
v4: v3 was based off the wrong branch
v5: Added the setting of PMIMR because of previous patch update
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
At the moment, these values are wiped out anyway by the rps
enable/disable. That will be changed in the next patch though.
v2: Add post install setup to address issue found by Damien in the next
patch.
replaced
WARN_ON(dev_priv->rps.pm_iir != 0);
with rps.pm_iir = 0;
With the v2 of this patch and the deferred pm enabling (which changed
since the original patches) we're now able to get PM interrupts before
we've brought up enabled rps. At this point in boot, we don't want to do
anything about it, so we simply ignore it. Since writing the original
assertion, the code has changed quite a bit, and I believe removing this
assertion is perfectly safe.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: I don't agree with the justification to drop the WARN and
added a FIXME to that effect.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HSW has some special requirements for the VEBOX. Splitting out the
interrupt handler will make the code a bit nicer and less error prone
when we begin to handle those.
The slight functional change in this patch (queueing work while holding
the spinlock) is intentional as it makes a subsequent patch a bit nicer.
The change should also only effect HSW platforms.
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was accidentally broken in the south error interrupt handling
work:
commit 8664281b64
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Apr 12 17:57:57 2013 -0300
drm/i915: report Gen5+ CPU and PCH FIFO underruns
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Well, as well as we can without completely revamping the drm vblank
code. The issue are that
- The vblank code needs to work on both ums and kms.
- It deals always deals with pipes.
- It doesn't take any of the kms locks.
The last part is not really fixable without revamping the drm vblank
code, since the drm core <-> driver interactions is a veritable pile
of spaghettis. But the other pieces can be fixed by switching on the
MODESET driver flag and either checking the hw state directly (ums
case) or just querying our sw tracking (with broken locking, but
that's not worse than what we've had).
Note that this essentially reverts
commit 702e7a56af
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Tue Oct 23 18:29:59 2012 -0200
drm/i915: convert PIPECONF to use transcoder instead of pipe
for the ums case, which will fix a NULL deref (since we really don't
have any crtcs set up).
But the real reason to do this is to drop our reliance on the
cpu_transcoder: By only checking intel_crtc->active we don't need to
make sure that the pipe_config (or at least the cpu_transcoder)
contain safe values even when the pipe is off.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In preparation to track per ring progress in hangcheck,
add i915_hangcheck_ring_hung.
v2: omit dev parameter (Ben Widawsky)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of relying in acthd, track ring seqno progression
to detect if ring has hung.
v2: put hangcheck stuff inside struct (Chris Wilson)
v3: initialize hangcheck.seqno (Ben Widawsky)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In preparation for next commit, pass seqno as a parameter
to i915_hangcheck_ring_idle as it will be used inside
i915_hangcheck_elapsed.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
commit 142e239849
Author: Egbert Eich <eich@suse.de>
Date: Thu Apr 11 15:57:57 2013 +0200
drm/i915: Add bit field to record which pins have received HPD events (v3)
added a bit field for hotplug event tracking. There ended up being three
different v3 of the patch: [1], [2], and [3]. Apparently [1] was the
correct one, but some frankenstein combination of the three got
committed, which reversed the logic for setting the hotplug bits and
misplaced a continue statement, skipping the hotplug irq storm handling
altogether.
This lead to broken hotplug detection, bisected to
commit 321a1b3026
Author: Egbert Eich <eich@suse.de>
Date: Thu Apr 11 16:00:26 2013 +0200
drm/i915: Only reprobe display on encoder which has received an HPD event (v2)
which uses the incorrectly set hotplug event bits.
Fix the mess.
[1] http://mid.gmane.org/1366112220-7638-6-git-send-email-eich@suse.de
[2] http://mid.gmane.org/1365688677-13682-1-git-send-email-eich@suse.de
[3] http://mid.gmane.org/1365688996-13874-1-git-send-email-eich@suse.de
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This fixes "unclaimed register" messages when the power well is
disabled and there's a GPU hang.
v2: Use the new intel_display_power_enabled().
v3: Use the new domains for intel_display_power_enabled().
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Both intel_opregion_enable_asle() and intel_enable_asle() have shrunk
considerably. Merge them together into a static function in i915_irq.c,
and rename to better reflect the purpose and the related platforms.
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Realize that intel_enable_asle() is never called on PCH-split platforms
or on VLV. Rip out the GSE irq enable for PCH-split platforms, which
also happens to be incorrect for IVB+.
This should not cause any functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the previous work asle and gse interrupt handlers should now be
functionally the same. Drop the duplicated code.
v2: Drop intel_opregion_gse_intr() also in the !CONFIG_ACPI path. (Damien)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On VLV, the Punit doesn't automatically drop the GPU to it's minimum
voltage level when entering RC6, so we arm a timer to do it for us from
the RPS interrupt handler. It'll generally only fire when we go idle
(or if for some reason there's a long delay between RPS interrupts), but
won't be re-armed again until the next RPS event, so shouldn't affect
power consumption after we go idle and it triggers.
v2: use delayed work instead of timer + work queue combo (Ville)
v3: fix up delayed work cancel (must be outside lock) (Daniel)
fix up delayed work handling func for delayed work (Jesse)
v4: cancel delayed work before RPS shutdown (Jani)
pass delay not absolute time to mod_delayed_work (Jani)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of calling into the DRM helper layer to poll all connectors for
changes in connected displays probe only those connectors which have
received a hotplug event.
v2: Resolved conflicts with changes in previous commits.
Renamed function and and added a WARN_ON() to warn of
intel_hpd_irq_event() from being called without
mode_config.mutex held - suggested by Jani Nikula.
Signed-off-by: Egbert Eich <eich@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This way it is possible to limit 're'-detect() of displays to connectors
which have received an HPD event.
v2: Reordered drm_i915_private: Move hpd_event_bits to hpd state tracking.
v3: Fixed merge conflicts with previous patches.
Signed-off-by: Egbert Eich <eich@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is bad news and shouldn't be happening.
V2: Rebase.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In this commit we enable both CPU and PCH FIFO underrun reporting and
start reporting them. We follow a few rules:
- after we receive one of these errors, we mask the interrupt, so
we won't get an "interrupt storm" and we also won't flood dmesg;
- at each mode set we enable the interrupts again, so we'll see each
message at most once per mode set;
- in the specific places where we need to ignore the errors, we
completely mask the interrupts.
The downside of this patch is that since we're completely disabling
(masking) the interrupts instead of just not printing error messages,
we will mask more than just what we want on IVB/HSW CPU interrupts
(due to GEN7_ERR_INT) and on CPT/PPT/LPT PCHs (due to SERR_INT). So
when we decide to mask PCH FIFO underruns for pipe A on CPT, we'll
also be masking PCH FIFO underruns for pipe B, because both are
reported by SERR_INT, which has to be either completely enabled or
completely disabled (in othe words, there's no way to disable/enable
specific bits of GEN7_ERR_INT and SERR_INT).
V2: Rename some functions and variables, downgrade messages to
DRM_DEBUG_DRIVER and rebase.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Uses slightly different interfaces than other platforms.
v2: track actual set freq, not requested (Rohit)
fix debug prints in init code (Jesse)
v3: don't write sleep reg (Jesse)
re-add RC6 wake limit write (Ben)
fixup thresholds to match other platforms (Ben)
clean up mem freq calculation (Ben)
clean up debug prints (Ben)
v4: move defines from punit patch (Ville)
v5: remove writes to nonexistent regs (Jesse)
put RP and RC regs together (Jesse)
fix RC6 enable (Jesse)
v6: use correct fuse reads from NC (Jesse)
split out min/max funcs for use in sysfs (Jesse)
add debugfs & sysfs freq controls (Jesse)
v7: update with Ben's hw_max changes (Jesse)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v6)
[danvet: Follow checkpatch sugggestion to use min_t to avoid casting
fun.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We disable hoptplug detection when we encounter a hotplug event
storm. Still hotplug detection is required on some outputs (like
Display Port). The interrupt storm may be only temporary (on certain
Dell Laptops for instance it happens at certain charging states of
the system). Thus we enable it after a certain grace period (2 minutes).
Should the interrupt storm persist it will be detected immediately
and it will be disabled again.
v2: Reordered drm_i915_private: moved hotplug_reenable_timer to hpd state tracker.
v3: Clarified loop start value,
Removed superfluous test for Ivybridge and Haswell,
Restructured loop to avoid deep nesting (all suggested by Ville Syrjälä)
v4: Fixed two bugs pointed out by Jani Nikula.
Signed-off-by: Egbert Eich <eich@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch disables hotplug interrupts if an 'interrupt storm'
has been detected.
Noise on the interrupt line renders the hotplug interrupt useless:
each hotplug event causes the devices to be rescanned which will
will only increase the system load.
Thus disable the hotplug interrupts and fall back to periodic
device polling.
v2: Fixed cleanup typo.
v3: Fixed format issues, clarified a variable name,
changed pr_warn() to DRM_INFO() as suggested by
Jani Nikula <jani.nikula@linux.intel.com>.
Signed-off-by: Egbert Eich <eich@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To disable previously enabled HPD IRQs we need to reset them and
set the enabled ones individually.
Signed-off-by: Egbert Eich <eich@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When an encoder is shared on several connectors there is only
one hotplug line, thus this line needs to be shared among these
connectors.
If HPD detect only works reliably on a subset of those connectors,
we want to poll the others. Thus we need to make sure that storm
detection doesn't mess up the settings for those connectors.
Therefore we store the settings in the intel_connector struct and
restore them from there.
If nothing is set but the encoder has a hpd_pin set we assume this
connector is hotplug capable.
On init/reset we make sure the polled state of the connectors
is (re)set to the default value, the HPD interrupts are marked
enabled.
Signed-off-by: Egbert Eich <eich@suse.de>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add a hotplug IRQ storm detection (triggered when a hotplug interrupt
fires more than 5 times / sec).
Rationale:
Despite of the many attempts to fix the problem with noisy hotplug
interrupt lines we are still seeing systems which have issues:
Once cause of noise seems to be bad routing of the hotplug line
on the board: cross talk from other signals seems to cause erronous
hotplug interrupts. This has been documented as an erratum for the
the i945GM chipset and thus hotplug support was disabled for this
chipset model but others seem to have this problem, too.
We have seen this issue on a G35 motherboard for example:
Even different motherboards of the same model seem to behave
differently: while some only see only around 10-100 interrupts/s
others seem to see 5k or more.
We've also observed a dependency on the selected video mode.
Also on certain laptops interrupt noise seems to occur duing
battery charging when the battery is at a certain charge levels.
Thus we add a simple algorithm here that detects an 'interrupt storm'
condition.
v2: Fixed comment.
v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work stuff.
v4: Followed by Jesse Barnes to use a time_..() macro.
v5: Fixed coding style as suggested by Jani Nikula.
Signed-off-by: Egbert Eich <eich@suse.de>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Increase the number of fence registers to 32 on IVB/HSW. VLV however
only has 16 fence registers according to the docs.
Increasing the number of fences was attempted before [1], but there was
some uncertainty about the maximum CPU fence number for FBC. Since then
BSpec has been updated to state that there are in fact 32 fence registers,
and the CPU fence number field in the SNB_DPFC_CTL_SA register is 5 bits,
and the CPU fence number field in the ILK_DPFC_CONTROL register must be
zero. So now it all makes sense.
[1] http://lists.freedesktop.org/archives/intel-gfx/2011-October/012865.html
v2: Include some background information based on the previous attempt
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>