Allows reading/writing via SOC15 macros with offset for
various register banks.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Same as other asics. If enabled, exposes a user selectable
number of virtual displays.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This got lost when the code was revamped. Copy/paste bug from
gfx8.
Reported-by: Evan Quan <evan.quan@amd.com>
Fixes: 78c168342 (drm/amdgpu: allow split of queues with kfd at queue granularity v4)
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Swap space for underscore in ring name.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A couple of simple tidy ups to register programming.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(v2): Avoid using 'data' uninitialized
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Under VF environment, the ucode would be settled to the visible VRAM,
As it would be pinned to the visible VRAM, it's better to add
contiguous flag,otherwise it need to move gpu address during the pin
process. This movement is not necessary.
Signed-off-by: horchen <horace.chen@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
gpu_info firmware is released after data is used. But when system enters into
suspend, upper class driver will cache all firmware names. At that time,
gpu_info will be failing to load. It seems an upper class issue, that we should
not release gpu_info firmware until device finished.
[ 903.236589] cache_firmware: amdgpu/vega10_sdma1.bin
[ 903.236590] fw_set_page_data: fw-amdgpu/vega10_sdma1.bin buf=ffff88041eee10c0 data=ffffc90002561000 size=17408
[ 903.236591] cache_firmware: amdgpu/vega10_sdma1.bin ret=0
[ 903.464160] __allocate_fw_buf: fw-amdgpu/vega10_gpu_info.bin buf=ffff88041eee2c00
[ 903.471815] (NULL device *): loading /lib/firmware/updates/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2
[ 903.482870] (NULL device *): loading /lib/firmware/updates/amdgpu/vega10_gpu_info.bin failed with error -2
[ 903.492716] (NULL device *): loading /lib/firmware/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2
[ 903.503156] (NULL device *): direct-loading amdgpu/vega10_gpu_info.bin
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As Christian and David's suggestion, submit the test ib ring debug interfaces.
It's useful for debugging with the command submission without VM case.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to clear it. The values are set explicitly.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Maarten and Ville noticed that we are enabling backlight via DP aux very
early in the modeset_init path via the intel_dp_aux_setup_backlight()
function, since commit e7156c8339 ("drm/i915: Add Backlight Control using
DPCD for eDP connectors (v9)"). Looks like all we need to do during
_setup_backlight() is read the current brightness state instead of
modifying it.
v2: Rewrote commit message.
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Yetunde Adebisi <yetundex.adebisi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Puthikorn Voravootivat <puthik@chromium.org>
Fixes: e7156c8339 ("drm/i915: Add Backlight Control using DPCD for eDP connectors (v9)")
Link: http://patchwork.freedesktop.org/patch/msgid/1497384239-2965-1-git-send-email-dhinakaran.pandiyan@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The function cnl_ddi_dp_set_dpll_hw_state does not need to be in global
scope, so make it static.
Cleans up sparse warning:
"symbol 'cnl_ddi_dp_set_dpll_hw_state' was not declared. Should it
be static?"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170613134751.29196-1-colin.king@canonical.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
With 830 the only thing needing pipe quirks, we can just drop the quirk
defines and replace the checks with IS_I830() checks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-8-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The pipe A force quirk shouldn't needed except on 830. So let's nuke it
for the IBM Thinkpad T60 945 machines. This quirk pre-dates
KMS so it's usefulness is doubtful at best now.
The original bug report [1] describes the symptoms as "system hang on
closing T60 panel lid", and we already dropped a similar quirk for
another 945 machine in
commit 736a69ca8c ("drm/i915: Drop PIPE-A quirk for 945GSE HP Mini")
so I'm hopeful we can drop this one as well.
The quirk was added into xf86-video-intel in
commit 08903abe4dc0 ("Add pipe a force enable quirk for Lenovo T60")
[1] https://bugs.freedesktop.org/show_bug.cgi?id=16494
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-7-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The pipe A force quirk shouldn't needed except on 830. So let's nuke it
for the Toshiba Protege R-205/S-209 945 machines. This quirk pre-dates
KMS so it's usefulness is doubtful at best now.
Unfortunately the original bug report [1] isn't very helpful since it
doesn't describe the symptoms. And the commit message in xf86-video-intel
commit ecdb5963ef68 ("Add pipe A force enable quirk for Toshiba Portege R205-S209")
is not much help either.
However, if we assume the problem was the typical "closing the lid
hangs the box" type of thing, we already nuked the quirk for another
945 machine in
commit 736a69ca8c ("drm/i915: Drop PIPE-A quirk for 945GSE HP Mini")
and so I hope we can drop this one as well.
[1] https://bugs.freedesktop.org/show_bug.cgi?id=14944
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-6-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
830 more or less requires both pipes and DPLLs to remain on as long
as either pipe is needed. However, when neither pipe is actually needed,
we can save a bit of power by turning everything off. To do that we add
a new "power well" that turns both pipes and DPLLs on and off in the
right order. Seems to save ~50mW on my Fujitsu-Siemens Lifebook S6010.
This also avoids having to abuse the load detection to force pipe A on
at init time. That was never very robust, and it only worked for one
pipe, whereas 830 really needs both pipes enabled. As a bonus the 830
pipe quirk is now a bit more isolated from the rest of the mode setting
infrastructure, which should mean that it's much less likely someone
will accidentally break it in the future. The extra cost is of course
slight code duplication, but that seems like a worthwile tradeoff here.
v2; s/BIT/BIT_ULL/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-5-ville.syrjala@linux.intel.com
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The magic "enable the DPLL three times" sequence feels like it
deserves a loop.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The blocking gather copy allocation is a major performance downside of the
Host1x firewall, it may take hundreds milliseconds which is unacceptable
for the real-time graphics operations. Let's try a non-blocking allocation
first as a least invasive solution, it makes opentegra (Xorg driver)
performance indistinguishable with/without the firewall.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:
- The previous code could deadlock due to an interaction
between the 'reflock' mutex and CDMA timeout handling.
This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
of channel ownership handling into channel.c
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
There is no host1x_cdma_stop() in the code, let's remove its definition
from the header file.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The struct host1x_cmdbuf is unused, let's remove it.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Check waits in the firewall in a way it is done for relocations.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
If intel_crtc_disable_noatomic() were to ever get called during resume
we'd end up deadlocking since resume has its own acqcuire_ctx but
intel_crtc_disable_noatomic() still tries to use the
mode_config.acquire_ctx. Pass down the correct acquire ctx from the top.
Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b8701e ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Several channels could be made to write the same unit concurrently via
the SETCLASS opcode, trusting userspace is a bad idea. It should be
possible to drop the per-client channel reservation and add a per-unit
locking by inserting MLOCK's to the command stream to re-allow the
SETCLASS opcode, but it will be much more work. Let's forbid the
unit-unrelated class changes for now.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The RESTART opcode terminates the gather and restarts the CDMA fetching
from a specified word << 2 relative to the CDMA start address. That
shouldn't be allowed to be done by userspace.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Incorrectly shifted relocation address will cause a lower memory
corruption and likely a hang on a write or a read of an arbitrary data
in case of IOMMU absence. As of now, there is no known use for the
address shifting and adding a proper shifts / sizes validation is a much
more work. Let's forbid shifts in the firewall till a proper validation
is implemented.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Perform gathers coping before patching them, so that original gathers are
left untouched. That's not as bad as leaking kernel addresses, but still
doesn't feel right.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In case of relocations / waitchecks patching failure the jobs pins stay
referenced till DRM file get closed, wasting memory. Add the missed
unpinning.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The commands stream is prepended by the jobs class on the CDMA
submission, so that explicitly setting a module class in the commands
stream isn't necessary. The firewall initializes its class to 0 and the
command stream that doesn't explicitly specify the class effectively
bypasses the firewall.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
On Tegra20 if plane has width or height equal to 0, it will be infinitely
wide or tall. Let's disable the plane if it is invisible on atomic state
committing to fix the issue. The Rockchip DRM driver does the same.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
On Tegra20 an overlay plane should be clipped, otherwise its output is
distorted once plane crosses display boundary.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Pass down the correct acquire context to the pipe A quirk load detect
hack during display resume. Avoids deadlocking the entire thing.
Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b8701e ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Commit 33a8eb8d40 ("drm/tegra: dc: Implement runtime PM") introduced
HW reset control. It causes a hang on Tegra20 if both display
controllers are utilized (RGB panel and HDMI). The TRM suggests that
each display controller has its own reset control, apparently it is not
correct.
Fixes: 33a8eb8d40 ("drm/tegra: dc: Implement runtime PM")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In case of invalid syncpoint ID, the host1x_syncpt_get() returns NULL and
none of its users perform a check of the returned pointer later. Let's bail
out until it's too late.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The waitchecks along with multiple syncpoints per submit are not ready
for use yet, let's forbid them for now.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
If commands buffer claims a number of words that is higher than its BO can
fit, a kernel OOPS will be fired on the out-of-bounds BO access. This was
triggered by an opentegra Xorg driver that erroneously pushed too many
commands to the pushbuf.
The CDMA commands buffer address is 4 bytes aligned, so check its
alignment.
The maximum number of the CDMA gather fetches is 16383, add a check for
it.
Add a sanity check for the relocations in a same way.
[ 46.829393] Unable to handle kernel paging request at virtual address f09b2000
...
[<c04a3ba4>] (host1x_job_pin) from [<c04dfcd0>] (tegra_drm_submit+0x474/0x510)
[<c04dfcd0>] (tegra_drm_submit) from [<c04deea0>] (tegra_submit+0x50/0x6c)
[<c04deea0>] (tegra_submit) from [<c04c07c0>] (drm_ioctl+0x1e4/0x3ec)
[<c04c07c0>] (drm_ioctl) from [<c02541a0>] (do_vfs_ioctl+0x9c/0x8e4)
[<c02541a0>] (do_vfs_ioctl) from [<c0254a1c>] (SyS_ioctl+0x34/0x5c)
[<c0254a1c>] (SyS_ioctl) from [<c0107640>] (ret_fast_syscall+0x0/0x3c)
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Improve kerneldoc for the public parts of the host1x infrastructure in
preparation for adding driver-specific part to the GPU documentation.
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Currently the vma has one link member that is used for both holding its
place in the execbuf reservation list, and in any eviction list. This
dual property is quite tricky and error prone.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-3-chris@chris-wilson.co.uk
This has the benefit of not requiring us to manipulate the
vma->exec_link list when tearing down the execbuffer, and is a
marginally cheaper test to detect the user error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-2-chris@chris-wilson.co.uk
Add OA support for Kabylake (pretty much identical to Skylake), and
also add the associated OA configurations.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Add macros to detect GT2/GT3 skus so we can apply the proper OA
configuration later.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
In earlier iterations of the i915-perf driver we had a number of
callbacks/hooks from other parts of the i915 driver to e.g. notify us
when a legacy context was pinned and these could run asynchronously with
respect to the stream file operations and might also run in atomic
context.
dev_priv->perf.hook_lock had been for serialising access to state needed
within these callbacks, but as the code has evolved some of the hooks
have gone away or are implemented to avoid needing to lock any state.
The remaining use of this lock was actually redundant considering how
the gen7 oacontrol state used to be updated as part of a context pin
hook.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
An oa_exponent_to_ns() utility and per-gen timebase constants where
recently removed when updating the tail pointer race condition WA, and
this restores those so we can update the _PROP_OA_EXPONENT validation
done in read_properties_unlocked() to not assume we have a 12.5MHz
timebase as we did for Haswell.
Accordingly the oa_sample_rate_hard_limit value that's referenced by
proc_dointvec_minmax defining the absolute limit for the OA sampling
frequency is now initialized to (timestamp_frequency / 2) instead of the
6.25MHz constant for Haswell.
v2:
Specify frequency of 19.2MHz for BXT (Ville)
Initialize oa_sample_rate_hard_limit per-gen too (Lionel)
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
These are auto generated from an XML description of metric sets,
currently maintained in gputop, ref:
https://github.com/rib/gputop
> gputop-data/oa-*.xml
> scripts/i915-perf-kernelgen.py
$ make -C gputop-data -f Makefile.xml
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all
share (more-or-less) the same OA unit design.
Of particular note in comparison to Haswell: some OA unit HW config
state has become per-context state and as a consequence it is somewhat
more complicated to manage synchronous state changes from the cpu while
there's no guarantee of what context (if any) is currently actively
running on the gpu.
The periodic sampling frequency which can be particularly useful for
system-wide analysis (as opposed to command stream synchronised
MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to
have become per-context save and restored (while the OABUFFER
destination is still a shared, system-wide resource).
This support for gen8+ takes care to consider a number of timing
challenges involved in synchronously updating per-context state
primarily by programming all config state from the cpu and updating all
current and saved contexts synchronously while the OA unit is still
disabled.
The driver intentionally avoids depending on command streamer
programming to update OA state considering the lack of synchronization
between the automatic loading of OACTXCONTROL state (that includes the
periodic sampling state and enable state) on context restore and the
parsing of any general purpose BB the driver can control. I.e. this
implementation is careful to avoid the possibility of a context restore
temporarily enabling any out-of-date periodic sampling state. In
addition to the risk of transiently-out-of-date state being loaded
automatically; there are also internal HW latencies involved in the
loading of MUX configurations which would be difficult to account for
from the command streamer (and we only want to enable the unit when once
the MUX configuration is complete).
Since the Gen8+ OA unit design no longer supports clock gating the unit
off for a single given context (which effectively stopped any progress
of counters while any other context was running) and instead supports
tagging OA reports with a context ID for filtering on the CPU, it means
we can no longer hide the system-wide progress of counters from a
non-privileged application only interested in metrics for its own
context. Although we could theoretically try and subtract the progress
of other contexts before forwarding reports via read() we aren't in a
position to filter reports captured via MI_REPORT_PERF_COUNT commands.
As a result, for Gen8+, we always require the
dev.i915.perf_stream_paranoid to be unset for any access to OA metrics
if not root.
v5: Drain submitted requests when enabling metric set to ensure no
lite-restore erases the context image we just updated (Lionel)
v6: In addition to drain, switch to kernel context & update all
context in place (Chris)
v7: Add missing mutex_unlock() if switching to kernel context fails
(Matthew)
v8: Simplify OA period/flex-eu-counters programming by using the
batchbuffer instead of modifying ctx-image (Lionel)
v9: Back to updating the context image (due to erroneous testing,
batchbuffer programming the OA unit doesn't actually work)
(Lionel)
Pin context before updating context image (Chris)
Drop MMIO programming now that we switch to a kernel context with
right values in initial context image (Chris)
v10: Just pin_map the contexts we want to modify or let the
configuration happen on first use (Chris)
v11: Update kernel context OA config through the batchbuffer rather
than on the fly ctx-image update (Lionel)
v12: Rework OA context registers update again by swithing away from
user contexts and reconfiguring the kernel context through the
batchbuffer and updating all the other contexts' context image.
Also take care to lock slice/subslice configuration when OA is
on. (Lionel)
v13: Request rpcs updates on all engine when updating the OA config
(Lionel)
v14: Drop any kind of rpcs management now that we monitor sseu
configuration changes in a later patch (Lionel)
Remove usleep after programming the NOA configs on Gen8+, this
doesn't seem to be needed (Lionel)
v15: Respect coding style for block comments (Chris)
v16: Add missing i915_add_request() in case we fail to emit OA
configuration (Matthew)
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> \o/
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Adds a static OA unit, MUX, B Counter + Flex EU configurations for basic
render metrics on Broadwell, Cherryview, Skylake and Broxton. These are
auto generated from an XML description of metric sets, currently
maintained in gputop, ref:
https://github.com/rib/gputop
> gputop-data/oa-*.xml
> scripts/i915-perf-kernelgen.py
$ make -C gputop-data -f Makefile.xml WHITELIST=RenderBasic
v2: add newlines to debug messages + fix comment (Matthew Auld)
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Gen8+ might have mux configurations per slices/subslices. Depending on
whether slices/subslices have been fused off, only part of the
configuration needs to be applied. This change reworks the mux
configurations query mechanism to allow more than one set of registers
to be programmed.
v2: s/n_mux_regs/n_mux_configs/ (Matthew)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Assuming a uniform mask across all slices, this enables userspace to
determine the specific sub slices can be enabled. This information is
required, for example, to be able to analyse some OA counter reports
where the counter configuration depends on the HW sub slice
configuration.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Enables userspace to determine the maximum number of slices that can
be enabled on the device and also know what specific slices can be
enabled. This information is required, for example, to be able to
analyse some OA counter reports where the counter configuration
depends on the HW slice configuration.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
This patch supports TM2e panel and the panel has 1600x2560 resolution
in 5.65" physical.
This identify panel type with compatibility string, also invoke
display mode that matches the type. So add the check code for s6e3ha2
compatibility and s6e3hf2 type and select the drm_display_mode of
default and edge type.
Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Inki Dae <inki.dae@samsung.com>
[treding@nvidia.com: fixup checkpatch warnings]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1492504836-19225-3-git-send-email-hoegeun.kwon@samsung.com
Without the dependency, we run into a link error:
drivers/gpu/drm/panel/panel-sitronix-st7789v.o: In function `st7789v_probe':
panel-sitronix-st7789v.c:(.text.st7789v_probe+0xc0): undefined reference to `of_find_backlight_by_node'
Fixes: 7142afb3a1 ("drm/panel: Add driver for sitronix ST7789V LCD controller")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170419180326.303994-1-arnd@arndb.de
The new S6E3HA2 driver fails to link when backlight is disabled:
ERROR: "backlight_device_register" [drivers/gpu/drm/panel/panel-samsung-s6e3ha2.ko] undefined!
ERROR: "backlight_device_unregister" [drivers/gpu/drm/panel/panel-samsung-s6e3ha2.ko] undefined!
This adds a Kconfig dependency like we have it for some other panel drivers.
Fixes: ed29f9426d ("drm/panel: Add support for S6E3HA2 panel driver on TM2 board")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170419175939.189098-2-arnd@arndb.de
This adds support for the AU Optronics Corporation 31.5"
FHD (1920x1080) LVDS TFT LCD panel, which can be supported
by the simple panel driver
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608180758.31020-4-l.stach@pengutronix.de
This adds support for the NLT Technologies NL192108AC18-02D
15.6" LVDS FullHD TFT LCD panel, which can be supported
by the simple panel driver.
Timings are taken from the preliminary datasheet, as a final
one is not yet available.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608180758.31020-3-l.stach@pengutronix.de
I removed the zapping of the reservation_object->fence array of shared
fences prematurely. We don't yet have the code to zap that array when
retiring the object, and so currently it remains possible to continually
grow the shared array trapping requests when reusing the batch_pool
object across many timelines.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170518094638.5469-4-chris@chris-wilson.co.uk
Having resolved whether or not we would deadlock upon a call to
mutex_lock(&dev->struct_mutex), we can then spin for the contended
struct_mutex if we are not the owner. We cannot afford to simply block
and wait for the mutex, as the owner may itself be waiting for the
allocator -- i.e. a cyclic deadlock. This should significantly improve
the chance of running the shrinker for other processes whilst the GPU is
busy.
A more balanced approach would be to optimistically spin whilst the
mutex owner was on the cpu and there was an opportunity to acquire the
mutex for ourselves quickly. However, that requires support from
kernel/locking/ and a new mutex_spin_trylock() primitive.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-4-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In our first pass, we do not want to use reclaim at all as we want to
solely reap the i915 buffer caches (its purgeable pages). But we don't
mind it initiates IO or pulls via the FS (but it shouldn't anyway as we
say no to reclaim!). Just drop the GFP_IO constraint for simplicity.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
struggles with handling reclaim of our dirty buffers and relies on
reclaim via kswapd. As a result, a single pass of direct reclaim is
unreliable when i915 occupies the majority of available memory, and the
only means of effectively waiting on kswapd to amke progress is by not
setting the __GFP_NORETRY flag and lopping. That leaves us with the
dilemma of invoking the oomkiller instead of propagating the allocation
failure back to userspace where it can be handled more gracefully (one
hopes). In the future we may have __GFP_MAYFAIL to allow repeats up until
we genuinely run out of memory and the oomkiller would have been invoked.
Until then, let the oomkiller wreck havoc.
v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL
v3: Update comments that direct reclaim only appears to be ignoring our
dirty buffers!
Fixes: 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Testcase: igt/gem_tiled_swapping
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Hocko <mhocko@suse.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Commit 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than
incur the oom for gfx allocations") made the bold decision to try and
avoid the oomkiller by reporting -ENOMEM to userspace if our allocation
failed after attempting to free enough buffer objects. In short, it
appears we were giving up too easily (even before we start wondering if
one pass of reclaim is as strong as we would like). Part of the problem
is that if we only shrink just enough pages for our expected allocation,
the likelihood of those pages becoming available to us is less than 100%
To counter-act that we ask for twice the number of pages to be made
available. Furthermore, we allow the shrinker to pull pages from the
active list in later passes.
v2: Be a little more cautious in paging out gfx buffers, and leave that
to a more balanced approach from shrink_slab(). Important when combined
with "drm/i915: Start writeback from the shrinker" as anything shrunk is
immediately swapped out and so should be more conservative.
Fixes: 24f8e00a8a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-1-chris@chris-wilson.co.uk
This interface allows importing the fence from a sync_file into
an existing drm sync object, or exporting the fence attached to
an existing drm sync object into a new sync file object.
This should only be used to interact with sync files where necessary.
v1.1: fence put fixes (Chris), drop fence from ioctl names (Chris)
fixup for new fence replace API.
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Sync objects are new toplevel drm object, that contain a
pointer to a fence. This fence can be updated via command
submission ioctls via drivers.
There is also a generic wait obj API modelled on the vulkan
wait API (with code modelled on some amdgpu code).
These objects can be converted to an opaque fd that can be
passes between processes.
v2: rename reference/unreference to put/get (Chris)
fix leaked reference (David Zhou)
drop mutex in favour of cmpxchg (Chris)
v3: cleanups from danvet, rebase on drm_fops rename
check fd_flags is 0 in ioctls.
v4: export find/free, change replace fence to take a
syncobj. In order to support lookup first, replace
later semantics which seem in the end to be cleaner.
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If one 'drm_gem_handle_create()' fails, we leak somes handles and some
memory.
In order to fix it:
- move the 'free(bo_state)' at the end of the function so that it is also
called in the eror handling path. This has the side effect to also try
to free it if the first 'kcalloc' fails. This is harmless.
- add a new label, err_delete_handle, in order to delete already
allocated handles in error handling path
- remove the now useless 'err' label
The way the code is now written will also delete the handles if the
'copy_to_user()' call fails.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512123803.1886-1-christophe.jaillet@wanadoo.fr
The bo->resv pointer could be NULL, leading to kernel oopses
like the one below.
This patch ensures that bo->resv is always set in vc4_create_object
ensuring that it is never NULL.
Thanks to Eric Anholt for pointing to the correct solution.
[ 19.738487] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 19.746805] pgd = ffff8000275fc000
[ 19.750319] [00000000] *pgd=0000000000000000
[ 19.754715] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 19.760369] Modules linked in: smsc95xx usbnet vc4 drm_kms_helper drm pwm_bcm2835 i2c_bcm2835 bcm2835_rng rng_core bcm2835_dma virt_dma
[ 19.772767] CPU: 0 PID: 1297 Comm: Xorg Not tainted 4.12.0-rc1-rpi3 #58
[ 19.779476] Hardware name: Raspberry Pi 3 Model B (DT)
[ 19.784688] task: ffff800028268000 task.stack: ffff800026c08000
[ 19.790705] PC is at ww_mutex_lock_interruptible+0x14/0xc0
[ 19.796329] LR is at vc4_submit_cl_ioctl+0x4fc/0x998 [vc4]
...
[ 20.240855] [<ffff0000088975f4>] ww_mutex_lock_interruptible+0x14/0xc0
[ 20.247528] [<ffff0000009b3ea4>] vc4_submit_cl_ioctl+0x4fc/0x998 [vc4]
[ 20.254372] [<ffff0000008f75f8>] drm_ioctl+0x180/0x438 [drm]
[ 20.260120] [<ffff00000821383c>] do_vfs_ioctl+0xa4/0x7d0
[ 20.265510] [<ffff000008213fe4>] SyS_ioctl+0x7c/0x98
[ 20.270550] [<ffff000008082f30>] el0_svc_naked+0x24/0x28
[ 20.275941] Code: d2800002 d5384103 910003fd f9800011 (c85ffc04)
[ 20.282527] ---[ end trace 1f6bd640ff32ae12 ]---
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/14e68768-6c92-2d74-92fd-196dbc50d8f7@xs4all.nl
All here is pretty much like Kabylake.
Including CFL-U has to use same ddi translation table
as KBL-U for now.
v2: Include missed IS_COFFEELAKE on edp trans table. (DK)
Handle CFL-U with same translation table as KBL-U. (DK and
confirmed with HW engineers)
v3: Adding missed case for IS_CFL_ULT. (DK).
v4: Duh! Now with the real IS_CFL_ULT instead of KBL one. (DK)
Also use IS_GEN9_BC when possible. (DK)
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497045770-21302-1-git-send-email-rodrigo.vivi@intel.com
Enable wrpll computation for Cannonlake platform to support
pll's required for HDMI output. The patch contains the following features
- compute Cannonlake port clock programming
dividers P, Q, and K.
- compute PLL parameters for Cannonlake. These parameters
set the values on DPLL registers.
- find the register values to program wrpll for Cannonlake.
The reference clock can be either 19.2MHz or 24MHz.
v2: rebase
v3: squash wrpll patches into one (Rodrigo)
v4: switch order of getting even dividers (Paulo)
update divider register values for PDiv and KDiv (Paulo)
update wrpll computation algorithm (Paulo)
v5: Remove ref clock division by 1000. (Rodrigo)
v6: Rodrigo rebasing on top of latest code.
Signed-off-by: Kahola, Mika <mika.kahola@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-18-git-send-email-rodrigo.vivi@intel.com
vswing programming sequence step 2 requires the Loadgen_select bit to
be set in PORT_TX_DW4 lane reigsters per table defined by Bit rate and
lane width. Implemented the change that was marked as FIXME in the
driver.
v2: (Rodrigo) checkpatch fixes.
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-12-git-send-email-rodrigo.vivi@intel.com
This is an important part of the DDI initalization as well as
for changing the voltage during DisplayPort link training.
This new sequence for Cannonlake is more like Broxton style
but still with different registers, different table and
different steps.
v2: Do not write to DW4_GRP to avoid overwrite individual loadgen.
Fix PORT_CL_DW5 SUS Clock Config set.
v3: As previous platforms use only eDP table if low voltage was
requested.
v4: fix Werror:maybe uninitialized (Paulo)
v5: Rebase on top of dw2_swing_sel changes
on previous patches.
v6: Using flexible SCALING_MODE_SEL(x).
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-11-git-send-email-rodrigo.vivi@intel.com
These tables are used on voltage wswing sequence initialization
on Cannonlake.
It is a complete new format now in use by the voltage swing team,
not following any other standard in use by any other platform.
Also the registers are different as well. So let's redefine
the translation table for Cannonlake.
The table is huge. So we minimized with the fields that are
different or might be different anytime soon. The common
values will be hardcoded on the voltage swing sequence.
v2: Merge the lower and the upper bits to match the spec table
and make review easier. This was possible with the good
idea for Manasi with a better way to handle it on the bit
macro definition presented on previous patch.
Credits-to: Manasi
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-10-git-send-email-rodrigo.vivi@intel.com
This are the registers and bits needed for the voltage swing
sequence on Cannonlake.
v2: Remove CL_DW5 that was wrongly defined.
v3: Use (1 << 1) instead of (1<<1) as Paulo suggested
Change DW2 swing sel upper and lower macros to do the
bit selection instead of definint a table that doesn't
match the spec. It is based on a Manasi version of it.
Credits-to: Manasi.
v4: Let SCALING_MODE_SEL flexible. (Manasi)
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-9-git-send-email-rodrigo.vivi@intel.com
Also new registers can have different mmio offsets
per different lane per port.
v2: Use _PICK as PORT3 instead of creating a new
macro with if per port.
v3: Use _PICK directly on MMIO_PORT6. While MMIO_PORT
isn't flexible enough let's continue with MMIO_PORT6
as we have MMIO_PORT3.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-8-git-send-email-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Although CNL follows PLL initialization more like Skylake
than Broxton we have a completely different initialization
sequence and registers used.
One big difference from SKL is that CDCLK PLL is now
exclusive (ADPLL) and for DDIs and MIPI we need to use
DFGPLLs 0, 1 or 2.
v2: Accept all Ander's suggestions and fixes:
- Registers and bits names prefix
- Group pll functions
- bits masks fixes
- remove read and modify on cfgcr1
- fix cfgcr0 setup
v3: Set SSC_ENABLE for DP.
Fix HDMI_MODE cfgcr0.
Avoid touch cfgcr0 on DP.
Add missed else on dpll_mgr definition so we use cnl one, not hsw.
v3: Centra freq should be always set to default and change bits
definitions to (1 << 1) instead of (1<<1). (by Paulo)
v4: Rebased.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kahola, Mika <mika.kahola@intel.com>
Reviewed-by: Ander Conselvan De Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-7-git-send-email-rodrigo.vivi@intel.com
DPLL's are defined in DPCLKA_CFGCR0 register (0x6C200). Let's use these
definitions when computing dpll's for ddi ports.
v2: (Rodrigo) Remove register that was defined in another patch with
fixed name and more bits.
Signed-off-by: Kahola, Mika <mika.kahola@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-6-git-send-email-rodrigo.vivi@intel.com
One of the steps for PLL (un)initialization is to (un)map
the correspondent DDI that is actually using that PLL.
So, let's do this step following the places already stablished
and used so far, although spec put this as part of PLL
initialization sequences.
v2: Use proper prefix on bits names as suggested by Ander.
v3: Add missed "~". Without that the logic was inverted
so we were disabling interrupts.
Credits-to: Clinton
Credits-to: Art
v4: Spec is getting updated to do DDI -> PLL mapping
and clock on in 2 separated reg writes. (Paulo)
Also update bits definitions to use space
(1 << 1) instead of (1<<1). (Paulo)
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Art Runyan <arthur.j.runyan@intel.com>
Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kahola, Mika <mika.kahola@intel.com>
Cc: Ander Conselvan De Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Kahola, Mika <mika.kahola@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-5-git-send-email-rodrigo.vivi@intel.com
All the low level cdclk bits are present, so let's add the required
hooks to reconfigure cdclk on the fly.
Cannonlake also needs to adjust the minimal pixel rate
as gen9 platforms. Specially for the Azalia audio case.
v2: Rebase due to cnl_sanitize_cdclk()
v3: Rebased by Rodrigo on top of Ville's cdclk rework.
v4: Rebase moving cnl_calc_cdclk up to follow same order
as previous platforms.
v2: Squash drm/i915/cnl: Adjust min pixel rate. to address
the current limitation where CDCLK cannot be set to 168MHz
if audio is used with 96MHz. (Imre)
v3: adjust some of the clock limits within
bdw_adjust_min_pipe_pixel_rate. (Ville/DK/Imre).
Fix commit message messed by squash.
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-4-git-send-email-rodrigo.vivi@intel.com
Implement the CNL display init/uninit sequence as outlined in Bspec.
Quite similar to SKL/BXT. The main complicaiton is probably the extra
procmon setup we must do based on the process/voltage information we
can read out from some register.
v2: s/skl_dbuf/gen9_dbuf/ to follow upstream
bxt needed a cdclk sanitize step, so let's add it for cnl too
v3: s/CHICKEN_MISC_1/CHICKEN_MISC_2/ (Ander)
v4: Rebased by Rodrigo after Ville's cdclk rework
v5: Removed unecessary Aux IO forced enable/disable, Fix DW10 setup
Fix procpon Mask. (Credits-to Paulo and Clint)
Remove A0 workaround.
v6: Rebased on top of recent code (Rodrigo).
v7: Respect the order of sanitize_ after set_
(Done by Rodrigo, Requested by Ville)
v8: Commit message updated to matvh v5 changes besides
Remove unused DW8 and an extra blank line. (all noticed
by Imre).
v9: Remove __attribute__((unused)) added on latest version
of drm/i915/cnl: Implement .set_cdclk() for CNL.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-3-git-send-email-rodrigo.vivi@intel.com
Add support for changing the cdclk frequency on CNL. Again, quite
similar to BXT, but there are some annoying differences which means
trying to share more code might not be feasible:
* PLL ratio now lives in the PLL enable register
* pcode came from SKL, not from BXT
We support three cdclk frequencies: 168,336,528 Mhz. The first two
use the same PLL frequency, the last one uses a different one meaning
we once again may need to toggle the PLL off and on when changing
cdclk.
v2: Rebased by Rodrigo on top of Ville's cdclk rework.
v3: Respect order of set_ bellow get_ (Ville)
v4: Added __attribute__((unused)) to avoid broken compilation with Werror.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-2-git-send-email-rodrigo.vivi@intel.com
Add support for reading out the cdclk frequency from the hardware on
CNL. Very similar to BXT, with a few new twists and turns:
* the PLL is now called CDCLK PLL, not DE PLL
* reference clock can be 24 MHz in addition to the 19.2 MHz BXT had
* the ratio now lives in the PLL enable register
* Only 1x and 2x CD2X dividers are supported
v2: Deal with PLL lock bit the same way as BXT/SKL do now
v3: DSSM refclk indicator is bit 31 not 24 (Ander)
v4: Rebased by Rodrigo after Ville's cdclk rework.
v5: Set cdclk to the ref clock as previous platforms. (Imre)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497047175-27250-1-git-send-email-rodrigo.vivi@intel.com
Current it's strictly checked if PVINFO version matches 1.0
for GVT-g i915 guest which doesn't help for compatibility at
all and forces GVT-g host can't extend PVINFO easily with version
bump for real compatibility check.
This fixes that to check minimal required PVINFO version instead.
v2:
- drop unneeded version macro
- use only major version for sanity check
v3:
- fix up PVInfo value with kernel type
- one indent fix
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609074805.5101-1-zhenyuw@linux.intel.com
The interrupt registers are not indexed.
Fixes: 763a47b8e (drm/amdgpu: teach amdgpu how to enable interrupts for any pipe v3)
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: fix typos.
should disable led dpm feature when stop dpm.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are two identical function prototypes in same header file
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: Fix logical mistake. If CPU update failed amdgpu_vm_bo_update_mapping()
would not return and instead fall through to SDMA update. Minor change due to
amdgpu_vm_bo_wait() prototype change
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If amdgpu.vm_update_context param is set to use CPU, then Page
Directories will be updated by CPU instead of SDMA
v2: Call amdgpu_vm_bo_wait before updating the page tables to ensure the
PD/PT BOs are free
v3: Minor changes - due to amdgpu_vm_bo_wait() prototype change, local
variable declaration order and function comments.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: Add intr option
Helper function useful for CPU update of VM page tables. Also useful if
kernel have to synchronously wait till VM page tables are updated.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add VM update mode module param (amdgpu.vm_update_mode) that can used to
control how VM pde/pte are updated for Graphics and Compute.
BIT0 controls Graphics and BIT1 Compute.
BIT0 [= 0] Graphics updated by SDMA [= 1] by CPU
BIT1 [= 0] Compute updated by SDMA [= 1] by CPU
By default, only for large BAR system vm_update_mode = 2, indicating
that Graphics VMs will be updated via SDMA and Compute VMs will be
updated via CPU. And for all all other systems (by default)
vm_update_mode = 0
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For planes handled by a VSP instance, map the framebuffer memory through
the VSP to ensure proper IOMMU handling.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
[Kieran: Fix infinite loop on fail]
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Acked-by: Mauro Cavalho Chehab <mchehab@s-opensource.com>
A bunch of fixes for vmwgfx 4.12 regressions and older stuff. In the latter
case either trivial, cc'd stable or requiring backports for stable.
* 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Bump driver minor and date
drm/vmwgfx: Remove unused legacy cursor functions
drm/vmwgfx: fix spelling mistake "exeeds" -> "exceeds"
drm/vmwgfx: Fix large topology crash
drm/vmwgfx: Make sure to update STDU when FB is updated
drm/vmwgfx: Make sure backup_handle is always valid
drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
drm/vmwgfx: Don't create proxy surface for cursor
drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
drm/i915 fixes for v4.12-rc5
* tag 'drm-intel-fixes-2017-06-08' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: fix warning for unused variable
drm/i915: Fix 90/270 rotated coordinates for FBC
drm/i915: Restore has_fbc=1 for ILK-M
drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail
drm/i915: Fix logical inversion for gen4 quirking
drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally
drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2.
drm/i915: Prevent the system suspend complete optimization
drm/i915/psr: disable psr2 for resolution greater than 32X20
drm/i915: Hold a wakeref for probing the ring registers
drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle
drm/i915: Disable decoupled MMIO
drm/i915/guc: Remove stale comment for q_fail
drm/i915: Serialize GTT/Aperture accesses on BXT
Driver Changes:
- kirin: Use correct dt port for the bridge (John)
- meson: Fix regression caused by adding HDMI support to allow board
configurations without HDMI (Neil)
Cc: John Stultz <john.stultz@linaro.org>
Cc: Neil Armstrong <narmstrong@baylibre.com>
* tag 'drm-misc-fixes-2017-06-07' of git://anongit.freedesktop.org/git/drm-misc:
drm/meson: Fix driver bind when only CVBS is available
drm: kirin: Fix drm_of_find_panel_or_bridge conversion
- Keep the external clock input to the PRE ungated and only use the internal
soft reset to keep the module in low power state, to avoid sporadic startup
failures.
- Ignore -ENODEV return values from drm_of_find_panel_or_bridge in the LDB
driver to fix probing for devices that still do not specify a panel in the
device tree.
- Fix the CSI input selection to the VDIC. According to experiments, the real
behaviour differs a bit from the documentation.
-----BEGIN PGP SIGNATURE-----
iQJLBAABCAA1FiEEBsBxhV1FaKwXuCOBUMKIHHCeYOsFAlk49O8XHHAuemFiZWxA
cGVuZ3V0cm9uaXguZGUACgkQUMKIHHCeYOsYjhAAtb34B6bpkuG8iZeHdKQ2kwd1
wjfHQpKH9q2oMzRwWbUOAkCR95Cgs1GMAMowUVpulT0HMN/epGjHODfSnpl/AdUq
WNdWMT44/GtS26umjjWIFBizdrdqxsaF735gb+1QB8QaGnGiQt8xMVaw1kdR6X1v
fJ1zo7fL8rcEW6W0lLDEHDJdCoQJyi/j+w4CN6RQn4KXH2O8z/SIIVmx18nsIoRE
eKlepcPClBDHfiBAtSkpS3ZDQkjsP1QBcdvinU/e/BIDymCVMVuhQTi7HhpMR5zO
ga08r5AwaQR3+hgyl2WgD1ex+oPUqWvBS9uzKy92IExjncTfWBDFpJT8hp8CsQt5
P5e57mafz9tB8T0B5kNnRE+Nwh0IYZCuSREGQF7hBcUDIMKbrjcZDeZp6IyMIG2z
7FEYAuu/bz7CAAJMJtJskLbPWM0x4ZQk5rzH0lZ3wp0l80jlMiFTb3RiFBTFU83o
m8SJLVDmhtUUVAlRuwGrlZd8hAZGTdFiry7cs0zNbqAgkcHlPWNX+ErY4QGDRPWY
L7G2I3F4TAtEzMXwcYUuISBiwUp2U/Z7m2eLTJ3o9hsclSnup4/XSCobGEHDsNVQ
wFRh+ho0UqkxCv29rNpLMiUHcGGnKdYAsqE01f9/D4tj/T9sJaUV0+TYO0Wq/EYH
HojaENfhPwPkpn26MLE=
=sqiE
-----END PGP SIGNATURE-----
Merge tag 'imx-drm-fixes-2017-06-08' of git://git.pengutronix.de/git/pza/linux into drm-fixes
imx-drm: PRE clock gating, panelless LDB, and VDIC CSI selection fixes
- Keep the external clock input to the PRE ungated and only use the internal
soft reset to keep the module in low power state, to avoid sporadic startup
failures.
- Ignore -ENODEV return values from drm_of_find_panel_or_bridge in the LDB
driver to fix probing for devices that still do not specify a panel in the
device tree.
- Fix the CSI input selection to the VDIC. According to experiments, the real
behaviour differs a bit from the documentation.
* tag 'imx-drm-fixes-2017-06-08' of git://git.pengutronix.de/git/pza/linux:
gpu: ipu-v3: Fix CSI selection for VDIC
drm/imx: imx-ldb: Accept drm_of_find_panel_or_bridge failure
gpu: ipu-v3: pre: only use internal clock gating
Commit 18dddadc78 ("drm/atomic: Introduce drm_atomic_helper_shutdown")
introduced a new helper to shutdown all CRTCs to replace the buggy
drm_crtc_force_disable_all() function. Make use of the new atomic
helper drm_atomic_helper_shutdown() to shutdown CRTCs.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Make use of the irq_preinstall/uninstall callback to clear and
mask all interrupts. Use write 1 to clear as documented by the
data sheet (writing a 0 seems to have cleared interrupt status
too). Remove fsl_dcu_drm_irq_init and call drm_irq_install
directly from fsl_dcu_load makes error handling a bit simpler.
Do not set irq_enabled since drm_irq_install is taking care of
it.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Again cleanup before irq disabling doesn't really stop the races,
so just drop it. Proper fix would be to put drm_atomic_helper_shutdown
before everything gets cleaned up.
Cc: Stefan Agner <stefan@agner.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Stefan Agner <stefan@agner.ch>
The whole Display engine for Coffee Lake is pretty much
identical to the Kabylake. For this reason let's reuse
all display related production workardounds here even though
CFL is not explicit listed at Display workarounds page at Spec.
v2: moved intel_pm.c chunck to this patch in order to address
all display related w/a in a single place.
Cc: Arthur Runyan <arthur.j.runyan@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496937000-8450-3-git-send-email-rodrigo.vivi@intel.com
So let's force it on the virtual detection.
Also it is still the only silicon for now on this PCH,
so WARN otherwise.
v2: Rebased on top of Cannonlake and added the missed
debug message as pointed by DK.
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496937000-8450-2-git-send-email-rodrigo.vivi@intel.com
Coffee Lake is a Intel® Processor containing Intel® HD Graphics
following Kabylake.
It is Gen9 graphics based platform on top of CNP PCH.
Let's start by adding the platform definition based on previous
platforms but yet as preliminary_hw_support.
On following patches we will start adding PCI IDs and the
platform specific changes.
v2: Also add BS2 ring that is present on GT3. As on KBL, according
spec: "GT3 also has additional media blocks with second instance
of VEBox and VDBox each", i.e. BSD2 ring in our case. Noticed
when reviewing PCI ID patches.
v3: CFL_PLATFORM instead for CFL_FEATURES because it contains
Platform information and no new features when compared to
BDW_FEATURES definition.
v4: Rebased on top of Cannonlake patches.
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496937000-8450-1-git-send-email-rodrigo.vivi@intel.com
Open code them so we can adjust the order in the
driver more easily.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rather than calling the deprecated drm_pci_init() and
drm_pci_exit() which just wrapped the pci functions
anyway.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Even if CONFIG_DRM_AMDGPU_CIK is enabled.
There is no feature parity yet for CIK, in particular amdgpu doesn't
support HDMI/DisplayPort audio without DC.
v2:
* Clarify the lack of feature parity being related to HDMI/DP audio.
* Fix "SI" typo in DRM_AMDGPU_CIK help entry.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
This will allow amdgpu-pro / other out-of-tree amdgpu builds to make use
of these options for using the out-of-tree amdgpu driver instead of the
in-tree radeon driver in a clean way.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
If AMDGPU supports SI, add a module parameter to control SI
support. It's off by default in AMDGPU as long as SI suppost is
experimental, while it is on by default in radeon.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
[ Michel Dänzer: Squash in amdgpu_si_support initialization fix ]
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
If AMDGPU supports SI, add a module parameter to control SI
support in radeon. It's on by default in radeon, while it will be
off by default in AMDGPU as long as SI support is experimental.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
If AMDGPU supports CIK, add a module parameter to control CIK
support. It's on by default in AMDGPU, while it is off by default
in radeon.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
drivers/gpu/drm/i915/intel_engine_cs.c: In function ‘intel_engine_is_idle’:
drivers/gpu/drm/i915/intel_engine_cs.c:1103:27: error: unused variable ‘dev_priv’ [-Werror=unused-variable]
struct drm_i915_private *dev_priv = engine->i915;
^~~~~~~~
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Originally we would enable and disable the breadcrumb interrupt
immediately on demand. This was slow enough to have a large impact
(>30%) on tasks that hopped between engines. However, by using a shadow
to keep the irq alive for an extra interrupt (see commit 67b807a892
("drm/i915: Delay disabling the user interrupt for breadcrumbs")) and
by recently reducing the cost in adding ourselves to the signal tree, we
no longer need to spin-request during await_request to avoid delays in
throughput tests. Without the earlier patches to stop the wakeup when
signaling if the irq was already active, we saw no improvement in
execbuf overhead (and corresponding contention in other clients) despite
the removal of the spinner in a simple test like glxgears. This means
there will be scenarios where now we spend longer enabling the interrupt
than we would have spent spinning, but these are not likely to have as
noticeable an impact as the high frequency test cases (where there
should not be any regression).
Ulterior motive: generalising the engine->sync_to to handle different
types of semaphores and non-semaphores.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608111405.16466-4-chris@chris-wilson.co.uk
Enabling the interrupt for the signaler takes a finite amount of time (a
few microseconds) during which it is possible for the request to
complete. Check afterwards and skip adding the request to the signal
rbtree if it complete.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608111405.16466-3-chris@chris-wilson.co.uk
The important condition that we need to check after enabling the
interrupt for signaling is whether the request completed in the process
(and so we missed that interrupt). A large cost in enabling the
signaling (rather than waiters) is in waking up the auxiliary signaling
thread, but we only need to do so to catch that missed interrupt. If we
know we didn't miss any interrupts (because we didn't arm the interrupt)
then we can skip waking the auxiliary thread.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608111405.16466-2-chris@chris-wilson.co.uk
Setting up the irq to signal the request completion takes a finite
amount of time, during which it is possible that the request already
completed. Check afterwards, just in case, so that we can respond
immediately.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608111405.16466-1-chris@chris-wilson.co.uk
And prevent calling i915_ggtt_disable_guc twice (the first when GuC init
failed, and the second time during driver unload / intel_uc_fini_hw),
and hitting the GEM_BUG_ON.
v2: Clear enable_guc_loading unconditionally (Michal)
Make sure guc_free_load_err_log is still called (Daniele)
Don't shoot the messenger (Chris)
Fixes: 3950bf3dbf ("drm/i915/guc: Add onion teardown to the GuC
setup")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170605171251.9905-1-michel.thierry@intel.com
Replace the large comment about requiring the powerwell for
intel_uncore_arm_unclaimed_mmio_detection() by moving the arming of the
mmio error detection into the powerwell held for modesetting. Thereby
also accomplishing the goal of only arming the mmio detection after a
full modeset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170504115508.13571-1-chris@chris-wilson.co.uk
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The field order selection in VDIC_C register uses different bits
depending on whether the VDIC is receiving from a CSI ("AUTO") or
from memory ("MAN"). Since the VDIC cannot receive from both CSI
and memory at the same time, set or clear both field order bits to
cover both cases.
Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
This is not used anymore since commit eb8c88808c ("drm/imx: add
deferred plane disabling"), remove it.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Most of the 64 IPUv3 DMA channels are never used, some of them (channels
16, 30, 32, 34-39, and 53-63) are even marked as reserved.
Allocate the channel control structure only when a channel is actually
requested, replace the fixed size array with a list, and remove the
unused enabled and busy fields from the ipuv3_channel structure.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Allow to skip writing odd chroma rows by setting the RDRW bit for
4:2:0 chroma subsampled formats for any IDMAC write channel. This
also allows to skip reading odd rows for the VDIC read channel.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
The counter load enable bit has no effect when the shadow register
set is activated. As we always operate the PRG with shadow enabled
it is safe to remove this.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
during the emulation of virtual reset:
1. only reset the engine related mmio ending with MMIO
offset Master_IRQ, not include display stuff.
2. fences are not required to set default
value as well to prevent screen flicking.
this will fix the issue of Guest screen hang while running
Force tdr in Linux guest.
v2:
- only reset the engine related mmio. (Zhenyu & Zhiyuan)
v3:
- IMR/Ring mode registers are not save/restored. (Changbin)
v4:
- redefine the MMIO reset offset for easy understanding. (Zhenyu)
- pvinfo can be reset. (Zhenyu)
v5:
- add more comments for mmio reset. (Zhenyu)
Cc: Changbin Du <changbin.du@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Lv zhiyuan <zhiyuan.lv@intel.com>
Cc: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Emulating the GDRST read behavior correctly to ack the
guest reset request.
v2:
- split the original patch into two:
GDRST read handler and virtual gpu reset. (Zhenyu)
v3:
- emulate the GDRST read right after write. (Zhenyu)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
On Skylake platform, The traced virtual mmio registers are up to 2039.
So tuning the hash table size to improve lookup performance.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We count all the tracked virtual MMIO registers, which can help us to
tune the MMIO hash table.
v2: Move num_tracked_mmio into gvt structure.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Function calls are expensive. I have see obvious overhead call to
these wrappers in perf data, especially from the cmd parser side.
So make these simple wrappers be inline to kill them all.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Type u8 is big enough to contain all MMIO attribute flags. As the
total MMIO size is 2MB so we saved 1.5MB memory.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The size, length, addr_mask fields actually are not necessary. Every
tracked mmio has DWORD size, and addr_mask is a legacy field.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Some of traced MMIO registers are a large continuous section. These
stuffed the MMIO lookup hash table and so waste lots of memory and
get much lower lookup performance.
Here we picked out these sections by special handling. These sections
include:
o Display pipe registers, total 768.
o The PVINFO page, total 1024.
o MCHBAR_MIRROR, total 65536.
o CSR_MMIO, total 3072.
So we removed 70,400 items from the hash table, and speed up guest
boot time by ~500ms.
v2:
o add a local function find_mmio_block().
o fix comments.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
add gtt_invalidate API to handle the GTT TLB flush instead of
hiding in write_pte64 function. This can avoid overkill when using
write_pte64
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
In some cases, GVT-g is accessing MMIO without holding runtime_pm
and this patch can add the inline API for doing the runtime_pm get/put
to make sure when accessing HW MMIO the i915 HW is really powered on.
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This flag is already set in the top level Makefile of the kernel.
Also, by having set CONFIG_DRM_I915_GVT, thereby appending -Wall to
ccflags, you undo all the -Wno-* cflags previously set in the Make
variable KBUILD_CFLAGS.
For example:
cc foo.c -Wall -Wno-format -Wall
resets -Wformat.
Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
remove all the legacy pre-BDW mmio handlers and the corresponding
usage/definition since pre-BDW platforms are not supported in GVT
environment.
v2:
- clean up all the left dirty code before BDW, e.g
all D_HSW usage and itself, D_IVB, D_PRE_BDW. (Zhenyu)
v3:
- change is based on gvt-staging. (Zhenyu)
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The time based scheduler poll context busy status at every
micro-second during vGPU switch, it will make GPU idle for a while
when the context is very small and completed before the next
micro-second arrival. Trigger scheduling immediately after context
complete will eliminate GPU idle and improve performance.
Create two vGPU with same type, run Heaven simultaneously:
Before this patch:
+---------+----------+----------+
| | vGPU1 | vGPU2 |
+---------+----------+----------+
| Heaven | 357 | 354 |
+-------------------------------+
After this patch:
+---------+----------+----------+
| | vGPU1 | vGPU2 |
+---------+----------+----------+
| Heaven | 397 | 398 |
+-------------------------------+
v2: Let need_reschedule protect by gvt-lock.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch decouple the time slice calculation and scheduler, let
other event be able to trigger scheduling without impact the
calculation for QoS.
v2: add only one new enum definition.
v3: fix typo.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Since cmd message have been recorded in trace, gvt_dbg_cmd isn't
necessary. This will reduce much of dmesg as gvt_dbg_cmd is repeated
on each workload.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Currently gvt dmesg is so heavy at drm.debug=0x2 that guest and
host almost couldn't run on xengt.
This patch transfer these repeated messages into trace, so dmesg
is light at drm.debug=0x2, and user could get the target message through
trace event and trace filter.
Suggested-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
kernel hangcheck needs to check RING_INSTDONE and SC_INSTDONE registers'
state to know if hardware is still running. In GVT-g environment, we need
to emulate these registers changing for all the guests although they are
not render owner. Here we return the physical state for all the guests,
then if INSTDONE is changing guest can know hardware is still running
although its workload is pending.
Read INSTDONE isn't one correct way to know if guest trigger gfx reset,
especially with Linux guest, it will read ACTH first, then check INSTDONE
and SUBSLICE registers to check if hardware is still running, at last
trigger gfx reset when it finds all the registers is frozen. In Windows
guest, read INSTDONE usually happens when OS detect TDR.
With the difference between Windows and Linux guest, "disable_warn_untrack"
may let debug log run into wrong state(Linux guest trigger hangcheck
with no ACTHD changed, then check INSTDONE), but actually there is no TDR
happened.
The new policy is always WARN with untrack MMIO r/w. Bad effect is many
noisy untrack mmio warning logs exist when real TDR happen. Even so you can
control the log output or not by setting the debug mask bit.
v2: remove log in instdone_mmio_read
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Commit ab9da627906a ("drm/i915: make context status notifier head be
per engine") gives us a chance to inspect every single request. Then
we can eliminate unnecessary mmio switching for same vGPU. We only
need mmio switching for different VMs (including host).
This patch introduced a new general API intel_gvt_switch_mmio() to
replace the old intel_gvt_load/restore_render_mmio(). This function
can be further optimized for vGPU to vGPU switching.
To support individual ring switch, we track the owner who occupy
each ring. When another VM or host request a ring we do the mmio
context switching. Otherwise no need to switch the ring.
This optimization is very useful if only one guest has plenty of
workloads and the host is mostly idle. The best case is no mmio
switching will happen.
v2:
o fix missing ring switch issue. (chuanxiao)
o support individual ring switch.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The function intel_vgpu_submit_execlist could be more simpler. It
actually does:
1) validate the submission. The first context must be valid,
and all two must be privilege_access.
2) submit valid contexts. The first one need emulate schedule_in.
We do not need a bitmap, valid desc copy valid_desc. Local variable
emulate_schedule_in also can be optimized out.
v2: dump desc content in err msg (Zhi Wang)
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The gvt:gvt_command trace involve unnecessary overhead even this trace is
not enabled. We need improve it.
The kernel trace infrastructure provide a full api to define a trace event.
We should leverage them if possible. And one important thing is that a trace
point should store raw data but not format string.
This patch include two part work:
1) Refactor the gvt_command trace definition, including:
o only store raw trace data.
o use __dynamic_array() to declare a variable size buffer.
o use __print_array() to format raw cmd data.
o rename vm_id as vgpu_id.
2) Improve the trace invoking, including:
o remove the cycles calculation for handler. We can get this data
by any perf tool.
o do not make a backup for raw cmd data which just doesn't make sense.
With this patch, this trace has no overhead if it is not enabled. And we are
trace style now.
The final output example:
gvt workload 0-211 [000] ...1 120.555964: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e880, raw cmd {0x4000000}
gvt workload 0-211 [000] ...1 120.556014: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e884, raw cmd {0x7a000004,0x1004000,0xe1511018,0x0,0x7d,0x0}
gvt workload 0-211 [000] ...1 120.556062: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e89c, raw cmd {0x7a000004,0x140000,0x0,0x0,0x0,0x0}
gvt workload 0-211 [000] ...1 120.556110: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e8b4, raw cmd {0x10400002,0xe1511018,0x0,0x7d}
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This overrode what queue was actually assigned for kiq.
Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This was missed when Andres' queue patches were rebased.
Fixes: 42794b27 (drm/amdgpu: take ownership of per-pipe configuration v3)
Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes hangs on single MEC asics.
Fixes: 2ed286fb434 (drm/amdgpu: new queue policy, take first 2 queues of each pipe v2)
Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If src_x/y were nonzero, we failed to shift them down by 16 to get the
pixel offset. The recent CMA helper function gets it right.
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: bed41005e6 ("drm/pl111: Initial drm/kms driver for pl111")
Reported-by: Mircea Carausu <mircea.carausu@broadcom.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170603015733.13266-1-eric@anholt.net
Reviewed-by: Sean Paul <seanpaul@chromium.org>
This fixes the following depmod error when building drm as a module:
depmod: ERROR: Found 6 modules in dependency cycles!
depmod: ERROR: Cycle detected: drm -> drm_kms_helper -> drm
Fixes: 13dfc0540a ("drm/bridge: Refactor out the panel wrapper from the lvds-encoder bridge.")
Tested-by: Lofstedt, Marta <marta.lofstedt@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/3fd262cf-1db6-4335-320c-af92f9014502@linux.intel.com
This patch clean up a bit the platform definition block in
a way to avoid duplications and to let clear that GT3 for
the current platform only have the extra Media engine (BSD2).
v2: Kabylake IS_KABYLAKE as Anusha noticed.
v3: Avoid EXTRA_ENGINE_MASK and list rings out on GT3 to
make it more clear.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496765166-7068-1-git-send-email-rodrigo.vivi@intel.com
Let's be picky and just use PICK directly.
So we can extend this later without creating
a new PORT_X por every new number of ports we
have to handle.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496700722-13755-1-git-send-email-rodrigo.vivi@intel.com
The workaround added in
commit c6782b76d3 ("drm/i915/gen9: Reset secondary power well
equests left on by DMC/KVMR")
needs to be applied on Cannonlake as well.
So let's assume any platform using this power well setup
will also need and let's just go ahead and remove if condition.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-11-git-send-email-rodrigo.vivi@intel.com
CNL power wells are very similar to SKL, with the exception that the
misc IO well has been split into separate AUX IO wells.
Not sure if DMC is supposed to manage the AUX wells for us or not.
Let's assume so for now.
v2: DDI A power well wants DDI A domains, not DDI B domains
v3: s/BIT/BIT_ULL and add proper Aux IO domains. (Rodrigo)
v4: Remove PW_DDI_E. Not supported on Current CNL SKUs. (Rodrigo).
v5: Removed DDI_E_IO_DOMAINS and moved PORT_DDI_E_IO to DDI_A_IO
for the same reasons as v4 when we found out that current CNL
SKUs don't have the full port E split.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-10-git-send-email-rodrigo.vivi@intel.com
Indirect Context Offset Pointer has changed for Cannonlake.
INDIRECT_CTX_OFFSET[15:6] valid value for CNL is 19h per Spec.
v2: rebased to intel_lr_indirect_ctx_offset
v3: Commit message added per Tvrtko request.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-9-git-send-email-rodrigo.vivi@intel.com
All registers and default configuration are the same for Skylake
and Cannonlake.
v2: Don't apply Wa for platforms without MOCS. (Paulo)
v3: Removed WaDisableSkipCaching that Joonas noticed that
according to spec it is not applicable to CNL.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-8-git-send-email-rodrigo.vivi@intel.com
Platform enabling and its power-on are organized in different
skus (U x Y x S x H, etc). So instead of organizing it in
GT1 x GT2 x GT3 let's also use the platform sku.
This is also the new Spec style what makes the review much
more easy and straightforward.
v2: Really include the PCI IDs to the picidlist[];
v3: Remove PCI IDs not present in spec.
v4: Rebase.
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Clinton Taylor <clinton.a.taylor@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-3-git-send-email-rodrigo.vivi@intel.com
Cannonlake is a Intel® Processor containing Intel® HD Graphics
following Kabylake.
It is Gen10.
Let's start by adding the platform definition based on previous
platforms but yet as alpha_support.
On following patches we will start adding PCI IDs and the
platform specific changes.
CNL has an increased DDB size as Damien had previously
noticed and provided a separated patch that got squashed here.
v2: Squash DDB size here per Ander request.
Credits-to: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1496781040-20888-1-git-send-email-rodrigo.vivi@intel.com
While introducing HDMI support, component matching on connectors node
were bypassed since no driver would actually bind on the DT node.
But when only a CVBS connector is present, only a single node is found
in the graph, but ignored and a NULL match table is given to the
component code.
This code permits bypassing the components framework by binding directly
the DRM driver when no components needs to be loaded.
Fixes: a41e82e6c4 ("drm/meson: Add support for components")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1496067352-8733-1-git-send-email-narmstrong@baylibre.com
The clipped src coordinates have already been rotated by 270 degrees for
when the plane rotation is 90/270 degrees, hence the FBC code should no
longer swap the width and height.
Cc: stable@vger.kernel.org
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Fixes: b63a16f6cd ("drm/i915: Compute display surface offset in the plane check hook for SKL+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170331180056.14086-4-ville.syrjala@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit 73714c05df)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Restore the lost has_fbc flag for mobile ILK.
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: a132338046 ("drm/i915: Introduce GEN5_FEATURES for device info")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170606133229.12439-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit c2d1a0ced2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The scanline counter is bonkers on VLV/CHV DSI. The scanline counter
increment is not lined up with the start of vblank like it is on
every other platform and output type. This causes problems for
both the vblank timestamping and atomic update vblank evasion.
On my FFRD8 machine at least, the scanline counter increment
happens about 1/3 of a scanline ahead of the start of vblank (which
is where all register latching happens still). That means we can't
trust the scanline counter to tell us whether we're in vblank or not
while we're on that particular line. In order to keep vblank
timestamping in working condition when called from the vblank irq,
we'll leave scanline_offset at one, which means that the entire
line containing the start of vblank is considered to be inside
the vblank.
For the vblank evasion we'll need to consider that entire line
to be bad, since we can't tell whether the registers already
got latched or not. And we can't actually use the start of vblank
interrupt to get us past that line as the interrupt would fire
too soon, and then we'd up waiting for the next start of vblank
instead. One way around that would using the frame start
interrupt instead since that wouldn't fire until the next
scanline, but that would require some bigger changes in the
interrupt code. So for simplicity we'll just poll until we get
past the bad line.
v2: Adjust the comments a bit
Cc: stable@vger.kernel.org
Cc: Jonas Aaberg <cja@gmx.net>
Tested-by: Jonas Aaberg <cja@gmx.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99086
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161215174734.28779-1-ville.syrjala@linux.intel.com
Tested-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
(cherry picked from commit ec1b4ee283)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The assertion that we want to make before disabling the pin of the pages
for the unknown swizzling quirk is that the quirk is indeed active, and
that the quirk is disabled before we do apply it to the pages.
Fixes: 2c3a3f44dc ("drm/i915: Fix pages pin counting around swizzle quirk")
Fixes: 957870f934 ("drm/i915: Split out i915_gem_object_set_tiling()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170521124014.27678-1-chris@chris-wilson.co.uk
Reviewed-bhy: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 20bb377106)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Commit 7c3f86b6dc ("drm/i915: Invalidate the guc ggtt TLB upon
insertion") added the restoration of the invalidation routine after the
GuC was disabled, but missed that the GuC was unconditionally disabled
when not used. This then overwrites the invalidate routine for the older
chipsets, causing havoc and breaking resume as the most obvious victim.
We place the guard inside i915_ggtt_disable_guc() to be backport
friendly (the bug was introduced into v4.11) but it would be preferred
to be in more control over when this was guard (i.e. do not try and
teardown the data structures before we have enabled them). That should
be true with the reorganisation of the guc loaders.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 7c3f86b6dc ("drm/i915: Invalidate the guc ggtt TLB upon insertion")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: <stable@vger.kernel.org> # v4.11+
Link: http://patchwork.freedesktop.org/patch/msgid/20170531190514.3691-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
(cherry picked from commit cb60606d83)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
On some systems there can be a race condition in which no crtc state is
added to the first atomic commit. This results in all crtc's having a
null DDB allocation, causing a FIFO underrun on any update until the
first modeset.
Changes since v1:
- Do not take the connection_mutex, this is already done below.
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Inspired-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 98d39494d3 ("drm/i915/gen9: Compute DDB allocation at atomic
check time (v4)")
Cc: <stable@vger.kernel.org> # v4.8+
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170531154236.27180-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 367d73d280)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Since
commit bac2a909a0
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Wed Jan 21 02:17:42 2015 +0100
PCI / PM: Avoid resuming PCI devices during system suspend
PCI devices will default to allowing the system suspend complete
optimization where devices are not woken up during system suspend if
they were already runtime suspended. This however breaks the i915/HDA
drivers for two reasons:
- The i915 driver has system suspend specific steps that it needs to
run, that bring the device to a different state than its runtime
suspended state.
- The HDA driver's suspend handler requires power that it will request
from the i915 driver's power domain handler. This in turn requires the
i915 driver to runtime resume itself, but this won't be possible if the
suspend complete optimization is in effect: in this case the i915
runtime PM is disabled and trying to get an RPM reference returns
-EACCESS.
Solve this by requiring the PCI/PM core to resume the device during
system suspend which in effect disables the suspend complete optimization.
Regardless of the above commit the optimization stayed disabled for DRM
devices until
commit d14d2a8453
Author: Lukas Wunner <lukas@wunner.de>
Date: Wed Jun 8 12:49:29 2016 +0200
drm: Remove dev_pm_ops from drm_class
so this patch is in practice a fix for this commit. Another reason for
the bug staying hidden for so long is that the optimization for a device
is disabled if it's disabled for any of its children devices. i915 may
have a backlight device as its child which doesn't support runtime PM
and so doesn't allow the optimization either. So if this backlight
device got registered the bug stayed hidden.
Credits to Marta, Tomi and David who enabled pstore logging,
that caught one instance of this issue across a suspend/
resume-to-ram and Ville who rememberd that the optimization was enabled
for some devices at one point.
The first WARN triggered by the problem:
[ 6250.746445] WARNING: CPU: 2 PID: 17384 at drivers/gpu/drm/i915/intel_runtime_pm.c:2846 intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746448] pm_runtime_get_sync() failed: -13
[ 6250.746451] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core snd_pcm lpc_ich mei prime_
numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915]
[ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G U W 4.11.0-rc5-CI-CI_DRM_334+ #1
[ 6250.746515] Hardware name: /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
[ 6250.746521] Workqueue: events_unbound async_run_entry_fn
[ 6250.746525] Call Trace:
[ 6250.746530] dump_stack+0x67/0x92
[ 6250.746536] __warn+0xc6/0xe0
[ 6250.746542] ? pci_restore_standard_config+0x40/0x40
[ 6250.746546] warn_slowpath_fmt+0x46/0x50
[ 6250.746553] ? __pm_runtime_resume+0x56/0x80
[ 6250.746584] intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746610] intel_display_power_get+0x1b/0x40 [i915]
[ 6250.746646] i915_audio_component_get_power+0x15/0x20 [i915]
[ 6250.746654] snd_hdac_display_power+0xc8/0x110 [snd_hda_core]
[ 6250.746661] azx_runtime_resume+0x218/0x280 [snd_hda_intel]
[ 6250.746667] pci_pm_runtime_resume+0x76/0xa0
[ 6250.746672] __rpm_callback+0xb4/0x1f0
[ 6250.746677] ? pci_restore_standard_config+0x40/0x40
[ 6250.746682] rpm_callback+0x1f/0x80
[ 6250.746686] ? pci_restore_standard_config+0x40/0x40
[ 6250.746690] rpm_resume+0x4ba/0x740
[ 6250.746698] __pm_runtime_resume+0x49/0x80
[ 6250.746703] pci_pm_suspend+0x57/0x140
[ 6250.746709] dpm_run_callback+0x6f/0x330
[ 6250.746713] ? pci_pm_freeze+0xe0/0xe0
[ 6250.746718] __device_suspend+0xf9/0x370
[ 6250.746724] ? dpm_watchdog_set+0x60/0x60
[ 6250.746730] async_suspend+0x1a/0x90
[ 6250.746735] async_run_entry_fn+0x34/0x160
[ 6250.746741] process_one_work+0x1f2/0x6d0
[ 6250.746749] worker_thread+0x49/0x4a0
[ 6250.746755] kthread+0x107/0x140
[ 6250.746759] ? process_one_work+0x6d0/0x6d0
[ 6250.746763] ? kthread_create_on_node+0x40/0x40
[ 6250.746768] ret_from_fork+0x2e/0x40
[ 6250.746778] ---[ end trace 102a62fd2160f5e6 ]---
v2:
- Use the new pci_dev->needs_resume flag, to avoid any overhead during
the ->pm_prepare hook. (Rafael)
v3:
- Update commit message to reference the actual regressing commit.
(Lukas)
v4:
- Rebase on v4 of patch 1/2.
Fixes: d14d2a8453 ("drm: Remove dev_pm_ops from drm_class")
References: https://bugs.freedesktop.org/show_bug.cgi?id=100378
References: https://bugs.freedesktop.org/show_bug.cgi?id=100770
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org
Cc: <stable@vger.kernel.org> # v4.10.x: 4d071c3 - PCI/PM: Add needs_resume flag
Cc: <stable@vger.kernel.org> # v4.10.x
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1493726649-32094-2-git-send-email-imre.deak@intel.com
(cherry picked from commit adfdf85d79)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
psr1 is also disabled for panel resolution greater than 32X20.
Added psr2 check to disable only for psr2 panels having resolution
greater than 32X20.
issue was introduced by
commit-id : "acf45d11050abd751dcec986ab121cb2367dcbba"
commit message: "PSR2 is restricted to work with panel resolutions
upto 3200x2000, move the check to intel_psr_match_conditions and fully
block psr."
v2: (Rodrigo)
Add previous commit details which introduced the issue
Fixes: acf45d1105 ("drm/i915/psr: disable psr2 for resolution greater than 32X20")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jim Bride <jim.bride@linux.intel.com>
Cc: Yaroslav Shabalin <yaroslav.shabalin@gmail.com>
Reported-by: Yaroslav Shabalin <yaroslav.shabalin@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: vathsala nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/49935bdff896ee3140bed471012b9f9110a863a4.1495729964.git.vathsala.nagaraju@intel.com
(cherry picked from commit bef8c056fb)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>