Instead of waiting for device-registration, we now allocate minor-objects
during device allocation. The minors are not registered or assigned an ID.
This is still postponed to device-registration.
While at it, remove the superfluous output-parameter in drm_get_minor().
The reason for this early allocation is to make
dev->primary/control/render available atomically. So once the device is
alive, all of them are already set and we never have the situation where
one of them is set after another (they're either NULL or set, but never
changed). This will eventually allow us to reduce minor-ID allocation to
one base-ID instead of a single ID for each.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of accessing drm_minors_idr directly, this adds a small helper to
hide the internals. This will help us later to remove the drm_global_mutex
requirement for minor-lookup.
Furthermore, this also makes sure that minor->dev is always valid and
takes a reference-count to the device as long as the minor is used in an
open-file. This way, "struct file*"->private_data->dev is guaranteed to be
valid (which it has to, as we cannot reset it).
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Lets not trick ourselves into thinking "drm_device" objects are not
ref-counted. That's just utterly stupid. We manage "drm_minor" objects on
each drm-device and each minor can have an unlimited number of open
handles. Each of these handles has the drm_minor (and thus the drm_device)
as private-data in the file-handle. Therefore, we may not destroy
"drm_device" until all these handles are closed.
It is *not* possible to reset all these pointers atomically and restrict
access to them, and this is *not* how this is done! Instead, we use
ref-counts to make sure the object is valid and not freed.
Note that we currently use "dev->open_count" for that, which is *exactly*
the same as a reference-count, just open coded. So this patch doesn't
change any semantics on DRM devices (well, this patch just introduces the
ref-count, anyway. Follow-up patches will replace open_count by it).
Also note that generic VFS revoke support could allow us to drop this
ref-count again. We could then just synchronously disable any fops->xy()
calls. However, this is not the case, yet, and no such patches are
in sight (and I seriously question the idea of dropping the ref-cnt
again).
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
The drm_open_helper() function is only used internally for drm_open() so
we can safely pass in the minor-object directly instead of the minor-id.
This way, we avoid the additional minor IDR lookup, which we already do
twice in drm_stub_open() and drm_open().
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With dev->anon_inode we have a global address_space ready for operation
right from the beginning. Therefore, there is no need to do a delayed
setup with TTM. Instead, set dev_mapping during initialization in
ttm_bo_device_init() and remove any "if (dev_mapping)" conditions.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Alex Deucher <alexdeucher@gmail.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
DRM drivers share a common address_space across all character-devices of a
single DRM device. This allows simple buffer eviction and mapping-control.
However, DRM core currently waits for the first ->open() on any char-dev
to mark the underlying inode as backing inode of the device. This delayed
initialization causes ugly conditions all over the place:
if (dev->dev_mapping)
do_sth();
To avoid delayed initialization and to stop reusing the inode of the
char-dev, we allocate an anonymous inode for each DRM device and reset
filp->f_mapping to it on ->open().
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Our current DRM design uses a single address_space for all users of the
same DRM device. However, there is no way to create an anonymous
address_space without an underlying inode. Therefore, we wait for the
first ->open() callback on a registered char-dev and take-over the inode
of the char-dev. This worked well so far, but has several drawbacks:
- We screw with FS internals and rely on some non-obvious invariants like
inode->i_mapping being the same as inode->i_data for char-devs.
- We don't have any address_space prior to the first ->open() from
user-space. This leads to ugly fallback code and we cannot allocate
global objects early.
As pointed out by Al-Viro, fs/anon_inode.c is *not* supposed to be used by
drivers for anonymous inode-allocation. Therefore, this patch follows the
proposed alternative solution and adds a pseudo filesystem mount-point to
DRM. We can then allocate private inodes including a private address_space
for each DRM device at initialization time.
Note that we could use:
sysfs_get_inode(sysfs_mnt->mnt_sb, drm_device->dev->kobj.sd);
to get access to the underlying sysfs-inode of a "struct device" object.
However, most of this information is currently hidden and it's not clear
whether this address_space is suitable for driver access. Thus, unless
linux allows anonymous address_space objects or driver-core provides a
public inode per device, we're left with our own private internal mount
point.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
There is no need to initialize this variable, so drop it. Otherwise, the
compiler won't warn if we use it unintialized.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Lets make sure some basic expressions are always true:
bpp != NULL
width != NULL
height != NULL
stride = bpp * width < 2^32
size = stride * height < 2^32
PAGE_ALIGN(size) < 2^32
At least the udl driver doesn't check for multiplication-overflows, so
lets just make sure it will never happen. These checks allow drivers to do
any 32bit math without having to test for mult-overflows themselves.
The two divisions might hurt performance a bit, but dumb_create() is only
used for scanout-buffers, so that should be fine. We could use 64bit math
to avoid the divisions, but that may be slow on 32bit machines.. Or maybe
there should just be a "safe_mult32()" helper, which currently doesn't
exist (I think?).
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
All drivers currently need to clean up the vma-node manually. There is no
fancy logic involved so lets just clean it up unconditionally. The
vma-manager correctly catches multiple calls so we are fine.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Probably a typo.. we obviously need "(bpp + 7) / 8" instead of
"(bpp + 1) / 8". Unlikely to be hit in any sane code, but lets be safe.
Use DIV_ROUND_UP() to avoid the problem entirely and make the core more
readable.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
We need to call dma_buf_end_cpu_access() in case a damage-request.
Unlikely, but might happen during device unplug.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
We have two paths that try to reset the forcewake registers back to
known good values, with slightly different semantics and levels of
paranoia. Combine the two by passing a parameter to either restore the
forcewake status or to clear our bookkeeping, and raise the paranoia
level to max.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's been in there since forever, and no one cared. Doesn't put a too
good light onto our bug handling and QA efforts really ...
References: https://bugs.freedesktop.org/attachment.cgi?id=90970
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- Standardized on "Returns:" Block.
- Sprinkle missing kerneldoc over all exported functions and all
ioctls.
- Add a stern warning that driver's really shouldn't use
drm_mode_group_init_legacy_group.
- Usual attempt at more consistency.
- Add warnings that drm_mode_object_get/put don't do refcounting,
despite what the names might lead to believe.
- Try to clarify the framebuffer setup/cleanup functions wrt driver
private framebuffers - I've fallen recently over this when reviewing
i915 fbdev patches.
- Align function parameters where the kerneldoc has been updated.
- Most of the drm_get_*_name functions aren't thread safe. Add stern
warnings where this is the case.
Since a lot of the functions in drm_crtc.c are boilerplate to handle
properties and create default sets of them it might be useful to
extract all that code into a new file drm_property.c. Especially since
properties will be used a lot more in the future.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Most of this is newly added kerneldoc for the hotplug and output
polling code. But I've also thrown in a bit lesser polish, most of it
is tuning down the shouting RETURN: headers.
Overview documentation for the output probing and mode setting support
code will be added in later patches.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
No driver cares, and it should generally work. Add a big comment
when drivers can't use this for recompense.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- It yells.
- WARNing about incorrect locking is harder to ignore, so better
than kerneldoc.
- Since those have been written per-crtc locks were added ...
So remove them and replace them by appropriate WARNs.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rightfully no driver ever checked this - it can't fail.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- Tune down yelling RETURNS.
- OCD align all the parameters the same.
- Add missing kerneldoc, which also means that we need to include the
kerneldoc from the drm_modes.h header now.
- Add missing Returns: sections.
- General polish and clarification - especially the kerneldoc for the
mode creation helpers seems to have been some good specimen of
copypasta gone wrong.
All actual code changes have all been extracted into prep patches
since there was simply too much to polish.
v2: More polish for the command line modeline functions.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Totally unused and actually redundant with maxX for display mode
validation. The fb helper otoh needs to check pitch limits,
but that is delegated into drivers instead.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It never fails and no one ever checked anyway.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There's a neat FIXME asking whether this is really need. I'd
say really no.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I want to also include kerneldoc from the header (for static inline
functions and structs), but fishing the right pieces out of a giant
header is a real pain. So split things out.
Note that it's not a really clean header with sane include orders, but
given's drm historical knack for giant headers detangling this is a
major task.
v2: Also extract struct drm_cmdline_mode.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Makes more sense and gives better grouping in the DocBook function
reference sections. To make this possible we need to expose two
functions from drm_crtc.c though. To avoid further namespace pollution
in the system wide headers create a new internal header for such drm
internal symbols.
I expect that longer-term we'll add tons more, but since my goal here
is to polish the kerneldoc that's for another day.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There's not really any value in stating that no locking is needed. And
even if the comment is useful, a check for the right mutex at the
beginning of the function is better since that can't be ingored as
easily as a bit of documentation.
Note that drm_mode_probed_add in drm_crtc.c is also changed, the next
patch will move this into drm_modes.c
v2: Don't add locking WARN_ONs where it is not strictly required (i.e.
the two functions to validate/prune mode lists).
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And clean it up so that there's no kerneldoc warnings. There's still a
lot to do with this one here.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's only used by imx, and that one gets it wrong - there's no need
to deteach the encoder before removing it.
And really, neither current drm modesetting code nor all the userspace
we have can handle dynamic changes in the set of possible encoders for
a given connector. So let's just remove this before someone starts
doing something really nasty with it.
As a plus, one less kerneldoc comment to write.
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
While at it do a tiny bit of interface cleanup and convert boolean
return values to bool. With this patch all exported functions and inline
helpers which are part of the drm_mm public interface are documented.
Also drop superflous extern function modifiers since most of drm_mm.h
doesn't use them - more consistent that way.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
kerneldoc polish will follow in the next patch.
Hopefully documenting the lru scan support a bit better spurs someone
to give this a shot in the ttm eviction code. At least in i915 it
helped quite a lot with memory thrashing on platforms where eviction
was (we've fixed that too meanwhile) fairly expensive.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was missed in
commit c700c67bae
Author: David Herrmann <dh.herrmann@gmail.com>
Date: Sat Jul 27 13:39:28 2013 +0200
drm/mm: remove unused API
Cc: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For giant hilarity the DocBook reference overview is only generated
when in a level 2 section, not in a level 3 section. So we need to
move this up a bit as a side-by-side section to the main PRIME
documentation.
Whatever.
To have a complete set of references add the missing kerneldoc for all
functions exported to modules with the exception of the file private
init/destroy functions - drivers have no business calling those, so
let's just drop the EXPORT_SYMBOL instead.
Also reflow the function parameters to align correctly and break at 80
chars - my OCD couldn't stand them while writing the kerneldoc ;-)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Stumbled over while reviewing all occurences in the DRM doc talking
about suspend/resume.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
v2: Also do s/RETURNS/Returns/, less yelling in docs is always good.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we regularly defer the forcewake dance to a timer func, it is
likely to fire after we disable the device during suspend. This
generates an oops as we detect inconsistency in the hardware state. So
before suspend, we want to complete the outstanding dance and generally
sanitize the registers before handing back to the BIOS.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we change the cache_level for an object we need to make sure
we don't put differing types of snoopable memory too close to each
other on non-LLC machines.
Currently i915_gem_object_set_cache_level() will stop looking when
it finds just one vma that has such a conflict. Drop the bogus break
statement to make sure it will unbind all vmas which need to be moved
around to avoid the conflict.
I suppose this is a theoretical issue as currently we don't enable
ppgtt on non-LLC machines, so each object can only have one vma.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We will call ppgtt_bind_vma() with flags != 0, so the WARN_ON(flags)
is bogus. Kill it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Second pull request of 2014-03-12. The first one was requested to be canceled.
Rob's fix for oops on invalidate_caches() and a fix for a
performance regression.
* tag 'ttm-fixes-3.14-2014-03-12' of git://people.freedesktop.org/~thomash/linux:
drm/ttm: don't oops if no invalidate_caches()
drm/ttm: Work around performance regression with VM_PFNMAP
A few more radeon fixes.
* 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon/cik: properly set compute ring status on disable
drm/radeon/cik: stop the sdma engines in the enable() function
drm/radeon/cik: properly set sdma ring status on disable
drm/radeon: fix runpm disabling on non-PX harder
If running on a gb-object capable device with a non-gb capable surface
exporter (X server) and a gb capable surface referencing client (GL driver),
the referencing client expects to find a shareable backing buffer attached to
the surface at reference time. This may not be the case if the surface has
not yet been validated. This would cause the surface reference IOCTL to
return an error.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
While wandering in the spec, I noticed that BDW removes those 2 bits
from INSTPM. I couldn't find any direct way to invalidate the TLB (ie
without the ring working already). Maybe someone will be more lucky.
At least, we now know we may be a problem.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we disable the rings, set the status properly. If
not other code pathes may try and use the rings which are
not functional at this point.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
We always stop the rings when disabling the engines so just
call the stop functions directly from the sdma enable function.
This way the rings' status is set correctly on suspend so
there are no problems on resume. Fixes resume failures that
result in acceleration getting disabled.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
When we disable the rings, set the status properly. If
not other code pathes may try and use the rings which are
not functional at this point.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Make sure runtime pm is disabled on non-PX hardware.
Should fix powerdown problems without displays attached.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
We need to enable interrupt processing before all the modeset
state is set up. But that means we can fall over when we get a pipe
underrun. This shouldn't happen as long as the bios works correctly
but as usual this turns out to be wishful thinking.
So disable error interrupts at irq install time and rely on the
re-enabling code in the modeset functions to take care of this.
Note that due to the SDE interrupt handling race we must
uncondtionally enable all interrupt sources in SDEIER, hence no need
to enable the SERR bit specifically.
On gmch platforms we don't have an explicit enable/mask bit for fifo
underruns. Fixing this up would require a bit of software tracking,
hence is material for a separate patch. To make this possible we need
to switch all gmch platforms to the new pipestat interrupt handling
scheme Imre implemented for vlv, and then also add a safe form of sw
state checking to __cpu_fifo_underrun_reporting_enabled a bit.
v2: Also handle the ilk/snb cpu fifo underrun bits accordingly.
Spotted by Ville.
v3: Also handle the south interrupt underrun bits on ibx. Again
spotted by Ville.
Reported-by: Rob Clark <robdclark@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
I have the occasional absent cursor on i845 and I want to know why.
This should help by revealing the last known cursor state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The display interrupts changed on BDW, so the current ILK-HSW specific
code in ilk_pipe_in_vblank_locked() doesn't work there. Add the required
bits for BDW, and while at it, change the existing code to use nicer
looking vblank status bit macros.
Also remove the now stale __raw_i915_read16() definition which was
left over from the failed gen2 ISR experiment.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73962
Tested-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We don't need to hold struct_mutex all through intel_pipe_set_base(),
just need to hold it while pinning/unpinning the buffers.
So reduce the struct_mutext usage in intel_pipe_set_base() just like we
did for the sprite code in:
commit 82284b6bec
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Tue Oct 1 18:02:12 2013 +0300
drm/i915: Reduce the time we hold struct mutex in sprite update_plane code
The FBC and PSR locking is still entirely fubar. That stuff was
previouly done while holding struct_mutex, so leave it there for now.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On HSW the scanline counter seems to behave differently depending on
the output type. eDP on port A does what you would expect an the normal
+1 fixup is sufficient to cover it. But on HDMI outputs we seem to need
a +2 fixup. Just assume we always need the +2 fixup and accept the
slight inaccuracy on eDP.
This fixes a regression introduced in:
commit 8072bfa604
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Mon Oct 28 21:22:52 2013 +0200
drm/radeon: Move the early vblank IRQ fixup to radeon_get_crtc_scanoutpos()
That commit removed the heuristic that tried to fix up the timestamps
for vblank interrupts that fire a bit too early. Since then the vblank
timestamp code would treat some vblank interrupts as spurious since the
scanline counter would indicate that vblank_start wasn't reached yet.
That in turn lead to incorrect vblank event sequence numbers being
reported to userspace, which lead to unsteady framerate in applications
such as XBMC which uses them for timing purposes.
v2: Remember to call ilk_pipe_in_vblank_locked() on HSW too (Mika)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75725
Tested-by: bugzilla1@gmx.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Based on Bspec the command parser must be stopped prior to
issuing sync flush. This should be done by the caller of
intel_ring_setup_status_page. Patch adds a warning if it is
not done.
v2: rebased based on new patch (wait for ring to become idle)
Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
make sure we wait for rings to become idle once they are
disabled. In case of timeout print an error message
Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
[danvet: Frob patch as suggested by Chris.]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rings should be idle before issuing sync_flush
(in intel_ring_setup_status_page). This patch moves the ring
disabling before doing the HW status page setup.
Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A performance regression was introduced in TTM in linux 3.13 when we started using
VM_PFNMAP for shared mappings. In theory this should've been faster due to
less page book-keeping but it appears like VM_PFNMAP + x86 PAT + write-combine
is a particularly cpu-hungry combination, as seen by largely increased
cpu-usage on r200 GL video playback.
Until we've sorted out why, revert to always use VM_MIXEDMAP.
Reference: freedesktop.org bugzilla bug #75719
Reported-and-tested-by: <smoki00790@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
Our code allows have a PPGTT that is smaller than the maximum size for
GEN6-GEN7. Though I don't think this actually ever occurs, the code may
as well work properly and more importantly look correct by using the
variable size instead of the HW max.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I'm not clear if the hardware is still subject to the same prefetching
issues that made us use a scratch page in the first place. In either
case, we're using garbage with the current code (we will end up using
offset 0).
This may be the cause of our current gem_cpu_reloc regression with
PPGTT. I cannot test it at the moment.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Introduced in
commit e0e33f8ff6f0b6d286afc314802be4993341bd47
Author: Imre Deak <imre.deak@intel.com>
Date: Tue Mar 4 19:23:07 2014 +0200
The impact was luckily minimal, due to the extra check we do against a
software pipestat IRQ mask.
Caught by Fengguang's 0-day tester.
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
BSpec is a bit unclear whether HDMI+HDMI cloning should work on g4x.
Tests on real hardware say that it does. Since g4x can't send
infoframes to more than one HDMI port anyway, we don't lose anything
by allow it.
For PCH platforms BSpec explicitly forbids HDMI+HDMI cloning.
Whether HDMI+HDMI cloning might also work on VLV is a bit unclear, but
since we'd at least lose the capability of sending infoframes to more
than one cloned HDMI port, it doesn't seem like a good idea to allow it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73850
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HDMI+VGA cloning should be supported on all platforms. The only real
obstacle is the 1.5x clock adjustment for 12bpc HDMI, but that is now
taken care of, so we can allow HDMI+VGA cloning.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73850
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When cloning HDMI with other output types, we can't use 12bpc since the
clocks for the other encoder types would be off. So have
intel_hdmi_compute_config() check if there are other encoders besides
HDMI being fed from the same pipe, and if so, pick 8bpc insted if 12bpc.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTHSaRAAoJEHm+PkMAQRiG7G8IAJHElwFDNSQE7Y9MmbicrAMG
kfjhBtBpTaVrJKQXegCNUwDaLLyC4oLIxDheW84oPXbrEGDLqPtBov/hrcFkHVr4
lh/ZYk02nYtcfpN0JnL/Yj2oKHVmBWs0vFlM7StSFsJCj10DoCVQQdmAJ8XODTPo
CXMapk+UikTX1TlIO8+B5toyl3R1OqPmW211UV1vQVLKy66hu+MKVN/V+/EyopL0
1jO81EDpaRaeIJh1/okcyUoIq9pqLkAWNpeQ7uyXZ+Sfivt9RXwLYKmAB3lP20Hc
ZMIIoHSCyYRFjxLlQvt02bA9nY4wTY7YN5kZ2kk65y7TFfhcGsCw1Sc69iyCoKs=
=CJcA
-----END PGP SIGNATURE-----
Merge tag 'v3.14-rc6' into drm-intel-next-queued
Linux 3.14-rc6
I need the hdmi/dvi-dual link fixes in 3.14 to avoid ugly conflicts
when merging Ville's new hdmi cloning support into my -next tree
Conflicts:
drivers/gpu/drm/i915/Makefile
drivers/gpu/drm/i915/intel_dp.c
Makefile cleanup conflicts with an acpi build fix, intel_dp.c is
trivial.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently we allow encoders to indicate whether they can be part of a
cloned set with just one flag. That's not flexible enough to describe
the actual hardware capabilities. Instead make it a bitmask of encoder
types with which the current encoder can be cloned.
For now we set the bitmask to allow DVO+DVO and DVO+VGA, which should
match what the old boolean flag allowed. We will add some more cloning
options in the future.
Note that this patch also removes the encoder.possible_clones setting
from encoder setup code - we compute this dynamically.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Add Ville's explanation why removing the encoder
possible_clones is save.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When adding new gunk, _always_ think of a good place. Start/end
usually just means that this didn't happen, and on top of that results
in needless conflicts with other patches doing the same.
Introduced in
commit 62d5d69b49
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Feb 25 17:11:28 2014 +0200
drm/i915: Add suspend count to error state
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The stolen allocator objects loudly if the caller requests a zero-sized
object. This is a useful verbose check as in most cases the request
should have been pruned much early. Here we just want to silently return
before attempting the allocation.
Regression from
commit 484b41dd70
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri Mar 7 08:57:55 2014 -0800
drm/i915: remove early fb allocation dependency on CONFIG_FB v2
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75963
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
During KMS takeover, we try to capture the current configuration and
preserve it across our initialisation. For a variety of reasons, we may
fail this, for example if the current mode was using the legacy VGA
plane. Under such circumstances, we discard the fb in the plane config
and tried to find a matching fb on another CRTC. This obviously also
failed, leaving the plane config fb dangling, pointing to the freed block.
Regression from
commit 484b41dd70
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri Mar 7 08:57:55 2014 -0800
drm/i915: remove early fb allocation dependency on CONFIG_FB v2
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75963
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By stuffing the fb allocation into the crtc, we get mode set lifetime
refcounting for free, but have to handle the initial pin & fence
slightly differently. It also means we can move the shared fb handling
into the core rather than leaving it out in the fbdev code.
v2: null out crtc->fb on error (Daniel)
take fbdev fb ref and remove unused error path (Daniel)
Requested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Retrieve current framebuffer config info from the regs and create an fb
object for the buffer the BIOS or boot loader left us. This should
allow for smooth transitions to userspace apps once we finish the
initial configuration construction.
v2: check for non-native modes and adjust (Jesse)
fixup aperture and cmap frees (Imre)
use unlocked unref if init_bios fails (Jesse)
fix curly brace around DSPADDR check (Imre)
comment failure path for pin_and_fence (Imre)
v3: fixup fixup of aperture frees (Chris)
v4: update to current bits (locking & pin_and_fence hack) (Jesse)
v5: move fb config fetch to display code (Jesse)
re-order hw state readout on initial load to suit fb inherit (Jesse)
re-add pin_and_fence in fbdev code to make sure we refcount properly (Je
v6: rename to plane_config (Daniel)
check for valid object when initializing BIOS fb (Jesse)
split from plane_config readout and other display changes (Jesse)
drop use_bios_fb option (Chris)
update comments (Jesse)
rework fbdev_init_bios for clarity (Jesse)
drop fb obj ref under lock (Chris)
v7: use fb object from plane_config instead (Ville)
take ref on fb object (Jesse)
v8: put under i915_fastboot option (Jesse)
fix fb ptr checking (Jesse)
inform drm_fb_helper if we fail to enable a connector (Jesse)
drop unnecessary enabled[] modifications in failure cases (Chris)
split from BIOS connector config readout (Daniel)
don't memset the fb buffer if preallocated (Chris)
alloc ifbdev up front and pass to init_bios (Chris)
check for bad ifbdev in restore_mode too (Chris)
v9: fix up !fastboot bpp setting (Jesse)
fix up !fastboot helper alloc (Jesse)
make sure BIOS fb is sufficient for biggest active pipe (Jesse)
v10:fix up size calculation for proposed fbs (Chris)
go back to two pass pipe fb assignment (Chris)
add warning for active pipes w/o fbs (Chris)
clean up num_pipes checks in fbdev_init and fbdev_restore_mode (Chris)
move i915.fastboot into fbdev_init (Chris)
v11:make BIOS connector config usage unconditional (Daniel)
v12:fix up fb vs pipe size checking (Chris)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This should allow BIOS fb inheritance to work on ILK+ machines too.
v2: handle tiled BIOS fbs (Kristian)
split out common bits (Jesse)
v3: alloc fb obj out in _init
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Read out the current plane configuration at init time into a new
plane_config structure. This allows us to track any existing
framebuffers attached to the plane and potentially re-use them in our
fbdev code for a smooth handoff.
v2: update for new pitch_for_width function (Jesse)
comment how get_plane_config works with shared fbs (Jesse)
v3: s/ARGB/XRGB (Ville)
use pipesrc width/height (Ville)
fix fourcc comment (Bob)
use drm_format_plane_cpp (Ville)
v4: use fb for tracking fb data object (Ville)
v5: fix up gen2 pitch limits (Ville)
v6: read out stride as well (Daniel)
v7: split out init ordering changes (Daniel)
don't fetch config if !CONFIG_FB
v8: use proper height in get_plane_config (Chris)
v9: fix CONFIG_FB check for modular configs (Jani)
v10: add comment about stolen allocation stomping
v11: drop hw state readout hunk (Daniel)
v12: handle tiled BIOS fbs (Kristian)
pull out common bits (Jesse)
v13: move fb obj alloc out to _init
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Early at init time, we can try to read out the plane config structure
and try to preserve it if possible.
v2: alloc fb obj at init time after fetching plane config
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't always want to write into main memory with pwrite. The shmem
fast path in particular is used for memory that is cacheable - under
such circumstances forcing the cache eviction is undesirable. As we will
always flush the cache when targeting incoherent buffers, we can rely on
that second pass to apply the cache coherency rules and so benefit from
in-cache copies otherwise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We used to lock individual pages inside the buffer object and so needed
to update the page flags every time. However, we now pin the pages into
the object for the duration of the pwrite/pread (and hopefully much
longer) and so we can forgo the flag updates until we release all the
pages.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris suggested to split things up a bit into the different parts of
the driver and also sort it all correctly, with the hope that we're
trying to organize things a bit better eventually. It should also
help newcomers to orient themselves a bit better.
v2:
- Move intel_pm.c to the core - to make things perfect we should split
out the modeset related pm features (psr/fbc) into a separate file.
Maybe something Rodrigo can do once the PSR patches have settled.
- Split the modesetting sections into core and encoders/outputs.
intel_ddi.c is a bit funky since it has core hsw+ support and ddi
output support. Whatever.
v3: Failed to git add ...
v4: Really go ocd, i.e. spelling fix in a comment from Jani.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The command parser scans batch buffers submitted via execbuffer ioctls before
the driver submits them to hardware. At a high level, it looks for several
things:
1) Commands which are explicitly defined as privileged or which should only be
used by the kernel driver. The parser generally rejects such commands, with
the provision that it may allow some from the drm master process.
2) Commands which access registers. To support correct/enhanced userspace
functionality, particularly certain OpenGL extensions, the parser provides a
whitelist of registers which userspace may safely access (for both normal and
drm master processes).
3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The
parser always rejects such commands.
See the overview comment in the source for more details.
This patch only implements the logic. Subsequent patches will build the tables
that drive the parser.
v2: Don't set the secure bit if the parser succeeds
Fail harder during init
Makefile cleanup
Kerneldoc cleanup
Clarify module param description
Convert ints to bools in a few places
Move client/subclient defs to i915_reg.h
Remove the bits_count field
OTC-Tracker: AXIA-4631
Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The command parser is going to need the same synchronization and
setup logic, so factor it out for reuse.
v2: Add a check that the object is backed by shmem
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Make sure the line_time_us isn't zero in the gmch watermarks code as
that would cause a div by zero. This can be triggered by specifying
a very fast pixel clock for the mode.
At some point we should probably just switch over to using the same
math we use on PCH platforms which avoids such intermediate rounded
results.
Also we should verify the user provided mode much more rigorously.
At the moment we accept pretty much anything.
Note that "very fast mode" here means above 74.25 GHz.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add Ville's clarification of what "very fast" means.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Based on an early draft from Jesse.
Add support for powering on/off the dynamic power wells on VLV by
registering its display and dpio dynamic power wells with the power
domain framework.
For now power on all PHY TX lanes regardless of the actual lane
configuration. Later this can be optimized when the PHY side setup
enables only the required lanes. Atm, it enables all lanes in all
cases.
v2:
- undef function local COND macro after its last use (Ville)
- Take dev_priv->irq_lock around the whole sequence of
intel_set_cpu_fifo_underrun_reporting_nolock() and
valleyview_disable_display_irqs(). They are short and releasing
the lock in between only makes proving correctness more difficult.
- sanitize local var names in vlv_power_well_enabled()
v3:
- rebase on latest -nightly
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict due to my changes in the previous patch.
Also throw in an assert_spin_locked for safety. And finally appease
checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Needed by the next patch, wanting to set the underrun reporting as part
of a bigger dev_priv->irq_lock'ed sequence.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Use more customary __ prefix instead of _nolock postfix.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We'll need to disable/re-enable the display-side IRQs when turning
off/on the VLV display power well. Factor out the helper functions
for this. For now keep the display IRQs enabled by default, so the
functionality doesn't change. This will be changed to enable/disable
the IRQs on-demand when adding support for VLV power wells in an
upcoming patch.
v2:
- take the irq spin lock for the whole enable/disable sequence as
these can be called with interrupts enabled
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Suggested by Daniel.
v2:
- sanitize the state checking condition, the original was rather
confusing (partly due to the unfortunate naming of
i915.disable_power_well) (Ville)
- simpler message+backtrace generation by using WARN instead of WARN_ON
(Ville)
- check if always-on power wells are truly on all the time
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to do the same for other platforms in upcoming patches.
v2:
- s/p/pipe (Ville)
- Call the new helper with the vbl_lock already held. The part it
protects is short, so releasing it between pipes only makes proving
correctness more difficult.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict with Damien's s/p/pipe/ change.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the upcoming patches we'll need to access the rest of the fields in
the punit power gating register, so prepare for that.
v2:
- add doc reference for the power well subsystem IDs (Jesse)
- remove IDs for non-existant DPIO_RX[23] subsystems (Jesse)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is a left-over from
commit b7e634cc8d
Author: Imre Deak <imre.deak@intel.com>
Date: Tue Feb 4 21:35:45 2014 +0200
drm/i915: vlv: don't unmask IIR[DISPLAY_PIPE_A/B_VBLANK] interrupt
where we stopped unmasking the vblank IRQs, but left them enabled in the
IER register. Disable them in IER too.
v2:
- remove comment becoming stale after this change (Ville)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We can read out the pipe HW state only if the required power domain is
on. If not we consider the pipe to be off.
v2:
- no change
v3:
- push down the power domain checks into the specific crtc
get_pipe_config handlers (Daniel)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since the encoder is tied to its port, we need to make sure the power
domain for that port is on before reading out the encoder HW state.
Note that this also covers also all connector get_hw_state handlers,
since all those just call the corresponding encoder get_hw_state
handler, which checks - after this change - for all power domains
the connector needs.
v2:
- no change
v3:
- push down the power domain checks into the specific encoder
get_hw_state handlers (Daniel)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The connector detect and get_mode handlers need to access the port
specific HW blocks to read the EDID etc. Get/put the port power domains
around these handlers.
v2:
- get port power domain for HDMI too (Ville)
- get port power domain for the DP,HDMI audio detect handlers (Jesse)
- Leave the intel_runtime_pm_get/put in the DP detect function in place.
Instead of just removing them, these should be moved to the appropriate
power_well enable/disable handlers. We can do this after Paulo's
'Merge PC8 with runtime PM, v2' patchset.
v3:
- rebased on latest -nightly
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Parts that poke port specific HW blocks like the encoder HW state
readout or connector hotplug detect code need a way to check whether
required power domains are on or enable/disable these. For this purpose
add a set of power domains that refer to the port HW blocks. Get the
proper port power domains during modeset.
For now when requesting the power domain for a DDI port get it for a 4
lane configuration. This can be optimized later to request only the 2
lane power domain, when proper support is added on the VLV PHY side for
this. Atm, the PHY setup code assumes a 4 lane config in all cases.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reading code free of special cases wins over the small overhead of
calling a noop handler. Suggested by Jesse.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Split the 'set' power well handler into an 'enable', 'disable' and
'sync_hw' handler. This maps more conveniently to higher level
operations, for example it allows us to push the hsw package c8 handling
into the corresponding hsw/bdw enable/disable handlers and the hsw BIOS
hand-over setting into the hsw/bdw sync_hw handler.
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Appease checkpatch's whitespace complaints.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Whenever we request a power domain it has to guarantee that all HW
resources are enabled that are needed to access a HW register associated
with that power domain. In case a register is on an always-on power well
this won't result in turning on a power well, but it may require
enabling some other HW resource. One such resource is the HSW/BDW device
D0 state that is required for all register accesses and thus for all
power wells/power domains.
So far the init power domain (guaranteeing access to all HW registers)
was part of the default i9xx always-on power well, but not the HSW/BDW
always-on power wells. Add the domain to the latter power wells too.
Atm, all the always-on power wells have noop handlers, so this doesn't
change the functionality.
v2:
- clarify semantics of always-on power wells (Paulo)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These macros are used only locally, so move them to the .c file.
No functional change.
v2:
- add init power domain to always-on power wells in the following
- separate - patch (Paulo)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There are too many oustanding issues:
- Fence handling in the current code is broken. There's a patch series
from me, but it's blocked on and extended review (which includes
writing the testcases).
- IOMMU mapping handling is broken, we need to properly refcount it -
currently it gets destroyed when the first vma is unbound, so way
too early.
- There's a pending reset issue on snb. Since Mika's reset work and
full ppgtt have been pulled in in separate branches and ended up
intermittingly breaking each another it's unclear who's the exact
culprit here.
- We still have persistent evidince of crazy recursion bugs through
vma_unbind and ppgtt_relase, e.g.
https://bugs.freedesktop.org/show_bug.cgi?id=73383
This issue (and a few others meanwhile resolved) have blocked our
performance measuring/tuning group since 3 months.
- Secure batch dispatching is broken. This is blocking Brad Volkin's
command checker work since 3 months.
All these issues are confirmed to only happen when full ppgtt is
enabled, falling back to aliasing ppgtt resolves them. But even
aliasing ppgtt itself still has a regression:
- We currently unconditionally bind objects into the aliasing ppgtt,
which means all priviledged objects like ringbuffers are visible to
unpriviledged access again. On top of that this also breaks the
command checker for aliasing ppgtt, since it can't hide the
validated batch any more.
Furthermore topic/full-ppgtt has never been reviewed:
- Lifetime rules around vma unbinding/release are unclear, resulting
into this awesome hack called ppgtt_release. Which seems to take the
blame for most of the recursion fallout.
- Context/ring init works different on gpu reset than anywhere else.
Such differeneces have in the past always lead to really hard to
track down bugs.
- Aliasing ppgtt is treated in a bunch of places as a real address
space, but it isn't - the real address space is always the global
gtt in that case. This results in a bit a mess between contexts and
ppgtt object, further complication the context/ppgtt/vma lifetime
rules.
- We don't have any docs describing the overall concepts introduced
with full ppgtt. A short, concise overview describing vmas and some
of the strange bits around them (like the unbound vmas used by
execbuf, or the new binding rules) really is needed.
Note that a lot of the post topic/full-ppgtt merge fallout has already
been addressed, this entire list here of 10 issues really only contains
the still outstanding issues.
Finally the 3.15 merge window is approaching and I think we need to
use the remaining time to ensure that our fallback option of using
aliasing ppgtt is in solid shape. Hence I think it's time to throw the
switch. While at it demote the helper from static inline status
because really.
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These functions will be needed by the valleyview specific power well
update functionality added in an upcoming patch, so move them earlier.
No functional change.
v2:
- no change
v3:
- rebase on latest -nightly
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These functions are used only by a single call site and are simple
enough to just fold them in.
Note that in later patches the parts folded in here are further
simplified as we'll remove hsw_{disable,enable}_package_c8 and the NULL
check of the power well enable/disable handlers. All this means that at
the end intel_display_power_get/put() becomes more understandable as we
don't need to jump between two functions when reading the code.
No functional change.
v2:
- clarify the rational for the change (Chris)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
one more radeon fix.
* 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon/atom: select the proper number of lanes in transmitter setup
Building radeon_ttm.o on 32 bit x86 triggers a warning:
In file included from include/asm-generic/bug.h:13:0,
from [...]/arch/x86/include/asm/bug.h:38,
from include/linux/bug.h:4,
from include/drm/drm_mm.h:39,
from include/drm/drm_vma_manager.h:26,
from include/drm/ttm/ttm_bo_api.h:35,
from drivers/gpu/drm/radeon/radeon_ttm.c:32:
drivers/gpu/drm/radeon/radeon_ttm.c: In function 'radeon_ttm_gtt_read':
include/linux/kernel.h:712:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
(void) (&_min1 == &_min2); \
^
drivers/gpu/drm/radeon/radeon_ttm.c:938:22: note: in expansion of macro 'min'
ssize_t cur_size = min(size, PAGE_SIZE - off);
^
Silence this warning by using min_t(). Since cur_size will never be
negative and its upper bound is PAGE_SIZE, we can change its type to
size_t and use min_t(size_t, [...]) here.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Moving the pm resume up in the init order to fix
dpm seems to have regressed somes cases with the old
pm code. Move it back to late resume.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Without this, a bo may get created in the cpu-inaccessible vram.
Before the CP engines get setup, all copies are done via cpu memcpy.
This means that the cpu tries to read from inaccessible memory, fails,
and the radeon module proceeds to disable acceleration.
Doing this has no downsides, as the real VRAM size gets set as soon as the
CP engines get init.
This is a candidate for 3.14 fixes.
v2: Add comment on why the function is used
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
| has a higher precedence than ?. Therefore, the calculation doesn't do
at all what you would expect. Thanks to Ken for convincing me that this
was indeed the issue. Send me back to C programmer school, please.
I'm sort of surprised PSR was continuing to work for people. It should
be broken IMO (and it was broken for me, but I had assumed it never
worked).
Regression from:
commit ed8546ac1f
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Mon Nov 4 22:45:05 2013 -0800
drm/i915/bdw: Support eDP PSR
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Cc: Kenneth Graunke <kenneth.w.graunke@intel.com>
Cc: Art Runyan <arthur.j.runyan@intel.com>
Reported-by: "Kumar, Kiran S" <kiran.s.kumar@intel.com>
Cc: stable@vger.kernel.org [v3.13+]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We have two names for the same register CHICKEN_PIPESL_1 and
HSW_PIPE_SLICE_CHICKEN_1. Unify it to just one.
Also rename the FBCQ disable bit to resemble the name we've
given to a similar bit on earlier platforms.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
gen7_enable_fbc() may write to some registers which we've already
touched, so use RMW so that we don't undo any previous updates.
Also note that we implemnt WaFbcAsynchFlipDisableFbcQueue:bdw.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Misplaced parens cause us to totally clobber the CHICKEN_PIPESL_1
registers with 0xffffffff. Move the parens to the correct place
to avoid this.
In particular this caused bit 30 of said registers to be set, which
caused the sprite CSC to produce incorrect results.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72220
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... it's this time of the year again. Originally we've frobbed this to
fix up some regressions, but maybe our DP code improved sufficiently
now that we can dare to do again what the spec recommends.
This reverts
commit 2514bc510d
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Thu Jun 21 15:13:50 2012 -0700
drm/i915: prefer wide & slow to fast & narrow in DP configs
I'm pretty sure I'll regret this patch, but otoh I expect we won't
make progress here without poking the devil occasionally.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73694
Cc: peter@colberg.org
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Tested-by: Itai BEN YAACOV <candeb@free.fr>
Tested-by: David En <d.engraf@arcor.de>
Reported-and-Tested-by: Marcus Bergner <marcusbergner@gmail.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we now have intel_uncore_forcewake_reset() no need
to do explicit put after reset.
v2: rebase
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
While reading some code, out of boredom, stumbled on a tiny tiny fix.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
That macro was only ever used to convert ring->private into a gem object
(hence the forceful cast). ring->private doesn't even exist anymore as
it was transmogrified by Chris in:
commit 0d1aacac36
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Aug 26 20:58:11 2013 +0100
drm/i915: Embed the ring->private within the struct intel_ring_buffer
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Its last usage outside of i915_gem.c was removed in:
commit 1f70999f90
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jan 27 22:43:07 2014 +0000
drm/i915: Prevent recursion by retiring requests when the ring is full
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch fixes the blank screen bug introduced in 3.14-rc1 on the
MacBook Air 6,2. The comments state that we need to force edp vdd so
lets put it back.
The regression was introduced by the following commit:
commit dff392dbd2
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Dec 6 17:32:41 2013 -0200
drm/i915: don't touch the VDD when disabling the panel
v2: Wrap intel_disable_dp() with _vdd_on and _vdd_off
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74628
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the future, we need to be able to specify per-pipe number of
planes/sprites. Let's start today!
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This macro is similar to for_each_pipe() we already have. Convert the
two call sites we have at the same time.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Consistency throughout the code base is good and remove some room for
mistakes (as explained in the "drm/i915: Use a pipe variable to cycle
through the pipes" commit)
So, let's replace the for_each_pipe(i) occurences by for_each_pipe(pipe)
when it's reasonable and practical to do so (eg. when there isn't another
pipe variable already).
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
'i' is already defined in the function scope and used elsewhere. Let's
use it instead.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I recently fumbled a patch because I wrote twice num_sprites[i], and it
was the right thing to do in only 50% of the cases.
This patch ensures I need to write num_sprites[pipe], ie it should be
self-documented that it's per-pipe number of sprites without having to
look at what is 'i' this time around.
It's all a lame excuse, but it does make it harder to redo the same
mistake.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
According to BSpec we need to always set this magic bit in ring buffer
mode.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we need precisely N lanes to satisfy the FDI bandwidth requirement,
the code would still claim that we need N+1 lanes. Use DIV_ROUND_UP()
to get a more accurate answer.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On DDI there's no PLL as such to generate the pixel clock for VGA.
Instead we derive the pixel clock from the FDI link frequency. So
to make .compute_config match what .get_config does, we need to
set the port_clock based on the FDI link frequency.
Note that we don't even check the port_clock when selecting the
PLL for VGA output. We just assume SPLL at 1.35GHz is what we want,
and that does match with the asumption of FDI frequency of 2.7Ghz
we have in intel_fdi_link_freq().
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74955
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
as they don't exists.
v2: rename gen6_*_mt_* to gen7_*_mt_* as they never get called
with gen6 (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we get control from BIOS there might be mt forcewake
bits already set. This causes us to do double mt get
without proper clear/ack sequence.
Fix this by clearing mt forcewake register on init,
like we do with older gens.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
BDW is no longer flagged as preliminary hw, but without
i915.preliminary_hw_support module param set the logs are filled with
WARNs about it.
Just make semaphores off the BDW per-chip default for now.
CC: Ben Widawsky <ben@bwidawsk.net>
Reported-by: Sebastien Dufour <sebastien.dufour@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben and I believe this will be necessary on production hardware.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: Shuffle lines to group all ROW_CHICKEN writes and add a
cautious comment that this might not be needed on production hw.]
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I believe this will be necessary on production hardware.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Fix whitespace fail spotted by checkpatch. Also add missing
:bdw w/a tag that Ville spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For example if we get bug reports with similar error states and
suspend count is always 1, that might lead the Sherlocks to
right general direction.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By default we keep only the error state from first hang. However
some sneaky user might have cleared the first error state and we
assume mistakenly that it is from first hang. As sometimes this
matters, it is better to explicitly store the reset count.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We capture error state not only when the GPU hangs but also on
other situations as in interrupt errors and in situations where
we can kick things forward without GPU reset. There will be log
entry on most of these cases. But as error state capture might be
only thing we have, if dmesg was not captured. Or as in GEN4 case,
interrupt error can trigger error state capture without log entry,
the exact reason why capture was made is hard to decipher.
v2: Split out the the error code stuff to separate patch (Ben)
References: https://bugs.freedesktop.org/show_bug.cgi?id=74193
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
commit 011cf577b2
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Tue Feb 4 12:18:55 2014 +0000
drm/i915: Generate a hang error code
added error code debug into dmesg. Store this also
with error state to make matching dmesg logs and error
states easier.
As we need to have full ring state for error code generation,
do full capture always, print hang message into log and then
decide if we need to keep the error state.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After finding the guilty batch and request, we can use it to find the
process that submitted the batch and then add the culprit into the error
state.
This is a slightly different approach from Ben's in that instead of
adding the extra information into the struct i915_hw_context, we use the
information already captured in struct drm_file which is then referenced
from the request.
v2: Also capture the workaround buffer for gen2, so that we can compare
its contents against the intended batch for the active request.
v3: Rebase (Mika)
v4: Check for null context (Chris)
checkpatch warnings fixed
Link: http://lists.freedesktop.org/archives/intel-gfx/2013-August/032280.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v4)
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the past, it was possible to have multiple batches per request due to
a stray signal or ENOMEM. As a result we had to scan each active object
(filtered by those having the COMMAND domain) for the one that contained
the ACTHD pointer. This was then made more complicated by the
introduction of ppgtt, whereby ACTHD then pointed into the address space
of the context and so also needed to be taken into account.
This is a fairly robust approach (though the implementation is a little
fragile and depends upon the per-generation setup, registers and
parameters). However, due to the requirements for hangstats, we needed a
robust method for associating batches with a particular request and
having that we can rely upon it for finding the associated batch object
for error capture.
If the batch buffer tracking is not robust enough, that should become
apparent quite quickly through an erroneous error capture. That should
also help to make sure that the runtime reporting to userspace is
robust. It also means that we then report the oldest incomplete batch on
each ring, which can be useful for determining the state of userspace at
the time of a hang.
v2: Use i915_gem_find_active_request (Mika)
v3: remove check for ring->get_seqno, split long lines (Ben)
v4: check that context is available (Chris)
checkpatch warnings fixed
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v3)
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In place of true activity counting, we walk the list of vma associated
with an object managing each on the vm's active/inactive list everytime
we call move-to-inactive. This depends upon the vma->mm_list being
cleared after unbinding, or else we run into difficulty when tracking
the object in multiple vm's - we see a use-after free and corruption of
the mm_list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It occured to me that when we're trying to wake up both render
and media wells on VLV, we might end up calling the low level
force_wake_get/put two times even though one call would be
enough. Make that happen by figuring out which wells really
need to be woken up based on the forcewake counts.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by:Deepak S <deepak.s@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
VLV is the only platform where we increment/decrement the forcewake
count around register access. Drop the inc/dec on VLV to make the
forcewake code a bit more unified.
The inc/dec are not necessary since we hold the uncore lock around
the whole operation.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use the render/media specific forcewake counts to properly restore the
forcewake status after a GPU reset on VLV.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After a hang and failed reset, we cannot use the GPU to execute the page
flip instructions. Instead we can force a synchronous mmio flip. (Later,
we can reduce the synchronicity of the mmio flip by moving some of the
delays off to a worker, like the current page flip code; see vblank
tasks.)
References: https://bugs.freedesktop.org/show_bug.cgi?id=72631
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I could swear this was already happening in the current code...
Also, put the reads and writes in a generic place, so we don't forget
it again when we add runtime PM support to new platforms.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just to be sure...
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Because we shouldn't be runtime suspended when forcewake is supposed
to be enabled.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Update commit message - no WARN expected since the bugfix for
issues hit with this assert is already in. And resolve conflicts with
the change from worker to timer for the delayed fw release.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since the addition of dev_priv->mm.busy, there's no more need for
dev_priv->pc8.gpu_idle, so kill it.
Notice that when you remove gpu_idle, hsw_package_c8_gpu_idle and
hsw_package_c8_gpu_busy become identical to hsw_enable_package_c8 and
hsw_disable_package_c8, so just use them.
Also, when we boot the machine, dev_priv->mm.busy initially considers
the machine as idle. This is opposed to dev_priv->pc8.gpu_idle, which
considered it busy. So dev_priv->pc8.disable_count has to be
initalized to 1 now.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These are places where we read (not write) registers while we're
runtime suspended.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise we'll read registers that return 0xffffffff, trigger some
WARNs, think CRT is actually connected (because certain bits are 1),
and fail the drm-resources-equal testcase!
Tested on a SNB machine with runtime PM support (which is not upstream
yet, but is already on my public tree at freedesktop.org, and will
hopefully eventually become upstream).
Testcase: igt/pm_pc8/drm-resources-equal
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we call gen6_gt_force_wake_put we don't actually put force_wake,
we just schedule gen6_force_wake_work through mod_delayed_work, and
that will eventually release force_wake.
The problem is that we call intel_runtime_pm_put directly at
gen6_gt_force_wake_put, so most of the times we put our runtime PM
reference before the delayed work happens, so we may runtime suspend
while force_wake is still supposed to be enabled if the graphics
autosuspend_delay_ms is too small.
Now the nice thing about the current code is that after it triggers
the delayed work function it gets a refcount, and it only triggers the
delayed work function if refcount is zero. This guarantees that when
we schedule the funciton, it will run before we try to schedule it
again, which simplifies the problem and allows for the current
solution to work properly (hopefully!).
v2: - Keep the VLV refcounts balanced (Jesse)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Because intel_mark_idle still touches some registers: it needs the
machine to be awake. If you set both the autosuspend and PC8 delays to
zero, you can get a "Device suspended" WARN when gen6_rps_idle touches
registers.
This is not easy to reproduce, but happens once in a while when
running pm_pc8.
Testcase: igt/pm_pc8
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>