The amdgpu_ucode_fini_bo should be called after gfx_v8_0_hw_fini,
or it will have KCQ disable failed issue.
For Tonga, as it firstly finishes SMC block, and the SMC hw fini
will call amdgpu_ucode_fini, which will lead the amdgpu_ucode_fini_bo
called before gfx_v8_0_hw_fini, this is incorrect.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The amdgpu_pm_sysfs_fini should call before amdgpu_device_ip_fini,
or the adev->pm.dpm_enabled would be set to 0, then the device files
related to pp won't be removed by amdgpu_pm_sysfs_fini when unload
driver.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't need the page array for prime shared BOs, stop allocating it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows drivers to only allocate dma addresses, but not a page
array.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Let's stop mangling everything in a single header and create one header
per object instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch replace instances of drm_framebuffer_unreference with _put()
suffix, because it is shorter and consistent with the kernel use of
*_get/put() suffixes.
This was done with the following Coccinelle script:
@r@
expression e;
@@
(
-drm_framebuffer_reference(e);
+drm_framebuffer_get(e);
|
-drm_framebuffer_unreference(e);
+drm_framebuffer_put(e);
)
Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
Acked-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180311233313.GA19721@Haneen
The rockchip DRM driver is quite careful to disable interrupts
when taking a lock that is also taken in interrupt context,
which is a good thing.
What is a bit over the top is to use spin_lock_irqsave when
already in interrupt context, as you cannot take another
interrupt again, and disabling interrupt is just pure
overhead.
Switching to the non _irqsave version in interrupt context is
more logical, and less heavy handed.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-4-marc.zyngier@arm.com
memcpy is only meant to be used for memory, and only that.
MMIO accessors should be used to access MMIO regions, preferably
the ones that correspond to the size of the register accessed.
Let's convert the bulk register copy to writel/readl_relaxed,
which is the correct API.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-3-marc.zyngier@arm.com
Calling request_irq() followed by disable_irq() is usually a bad idea,
specially if the interrupt can be pending, and you're not yet in a
position to handle it.
This is exactly what happens on my kevin system when rebooting in a
second kernel using kexec: Some interrupt is left pending from
the previous kernel, and we take it too early, before disable_irq()
could do anything.
Let's clear the pending interrupts as we initialize the HW, and move
the interrupt request after that point. This ensures that we're in
a sane state when the interrupt is requested.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[adapted to recent rockchip-drm changes]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-2-marc.zyngier@arm.com
Internally Mali DP uses an RGB pipeline so video layers that support
YUV input buffers need to convert the input data to RGB. The YUV
buffers can have various encodings and this patch introduces support
for BT.601, BT.709 and BT.2020 encodings, both limited and full ranges.
This patch adds support for specifying the color encoding of the
input buffers for the planes that are backed by the video layers
and programs the YUV2RGB coefficients into hardware based on the
selected encoding.
Signed-off-by: Mihail Atanassov <mihail.atanassov@arm.com>
[updated to use standard properties]
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
When unbinding the mali-dp driver the drm_vblank_cleanup() function
warns us that the vblanks are still enabled. Fix that by calling
drm_crtc_vblank_off() in the malidp_unbind() function.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
The plane cleanup handler currently calls drm_plane_helper_disable(),
which is a legacy helper function. Replace it with a call to
drm_atomic_helper_shutdown() at removal time.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
The top-level error handler calls drm_mode_config_cleanup() which will
destroy all planes. There's no need to destroy them manually in lower
error handlers.
As plane cleanup is now handled entirely by drm_mode_config_cleanup(),
we must ensure that the plane .destroy() handler frees allocated memory
for the plane object that was freed by malidp_de_planes_destroy(). Do so
by replacing the call to devm_kfree() in the .destroy() handler by
kfree(). devm_kfree() is currently a no-op as the plane memory is
allocated with kzalloc(), not devm_kzalloc().
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Mali DP hardware has a 'go' bit (config_valid) for making the new scene
parameters active at the next page flip. The problem with the current
code is that the driver first sets this bit and then proceeds to wait
for confirmation from the hardware that the configuration has been
updated before arming the vblank event. As config_valid is actually
asserted by the hardware after the vblank event, during the prefetch
phase, when we get to arming the vblank event we are going to send it
at the next vblank, in effect halving the vblank rate from the userspace
perspective.
Fix it by sending the userspace event from the IRQ handler, when we
handle the config_valid interrupt, which syncs with the time when the
hardware is active with the new parameters.
Reported-by: Alexandru-Cosmin Gheorghe <alexandru-cosmin.gheorghe@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Mali dp needs to disable pixel alpha blending (use layer alpha blending) to
display color formats that do not contain alpha bits per pixel
This patch depends on:
"[PATCH v2 01/19] drm/fourcc: Add a alpha field to drm_format_info"
Signed-off-by: Ayan Kumar Halder <ayan.halder@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
In the case, when the user wants to scale and rotate a layer by 90/270
degrees, the scaling engine input dimensions' parameters ie width and
height needs to be swapped with respect to the layer's input dimensions.
This means scaling engine input height should be set to layer's input
width and scaling engine input width should be set to
layer's input height.
Signed-off-by: Ayan Halder <ayan.halder@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Currently the scaling engine gets enabled for a plane where the input
size differs from the composition size. As rotation is done natively
by the plane's hardware layer, we don't need the scaling engine to be
enabled.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
We use "mc" without initializing it if scaling is not necessary.
Fixes: 28ce675b74 ("drm: mali-dp: Add plane upscaling support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mihail Atanassov <Mihail.Atanassov@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Mali DP hardware needs pitch line sizes aligned to the bus burst
size for reads, so take that into consideration when allocating dumb
buffers. If the layer is rotated then the stride size requirement is
even larger for some hardware versions, so allocate for the worst case
scenario. Update the ->dumb_create() hook to a driver specific function
that sets the correct pitch size.
Reported-by: Ayan Halder <ayan.halder@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Rotated planes need a pitch size that is aligned to 8 bytes
for older DP500 and DP550 and at least 64 bytes for DP650. Replace
the malidp_hw_pitch_valid() function with one that calculates
the correct pitch alignment to take into account rotation.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
We currently wait for the panel to mirror our intended PSR state
before continuing on both PSR enter and PSR exit. This is really
only important to do when we're entering PSR, since we want to
be sure the last frame we pushed is being served from the panel's
internal fb before shutting down the soc blocks (vop/analogix).
This patch changes the behavior such that we only wait for the
panel to complete the PSR transition when we're entering PSR, and
to skip verification when we're exiting.
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-7-enric.balletbo@collabora.com
Like many other panel drivers, this one fails to build when backlight
support is disabled:
drivers/gpu/drm/panel/panel-raydium-rm68200.o: In function `rm68200_probe':
panel-raydium-rm68200.c:(.text+0x14a): undefined reference to `devm_of_find_backlight'
This adds the appropriate dependency.
Note that while include/linux/backlight.h provides a stub inline when
backlight support is not enabled, this isn't enough to deal with the
case where backlight support is built as a module but the panel driver
is built-in, in which case linking will still fail as above.
One way to avoid this is to add a dependency such as this:
depends on BACKLIGHT_CLASS_DEVICE || BACKLIGHT_CLASS_DEVICE=n
but that is rather complex and misses the point that the panel support
is mostly useless without backlight support.
Fixes: 2b7ed18bed ("drm/panel: Add support for Raydium RM68200 panel driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[treding@nvidia.com: clarify the need for the dependency]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313210015.3344380-1-arnd@arndb.de
Add a lock to vop to avoid disabling the crtc while waiting for a line
flag while enabling psr. If we disable in the middle of waiting for the
line flag, we'll end up timing out or worse.
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-5-enric.balletbo@collabora.com
We would meet a short black screen when exit PSR with the full link
training, In this case, we should use fast link train instead of full
link training.
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
[dropped header reordering]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-6-enric.balletbo@collabora.com
There is a race between AUX CH bring-up and enabling bridge which will
cause link training to fail. To avoid hitting it, don't change psr state
while enabling the bridge.
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
[seanpaul fixed up the commit message a bit and renamed *_supported to *_enabled]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-4-enric.balletbo@collabora.com
Now that the spinlocks and timers are gone, we can remove the psr
worker located in rockchip's analogix driver and do the enable/disable
directly. This should simplify the code and remove races on disable.
Cc: 征增 王 <wzz@rock-chips.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-3-enric.balletbo@collabora.com
Make sure the request PSR state takes effect in analogix_dp_send_psr_spd()
function, or print the sink PSR error state if we failed to apply the
requested PSR setting.
Cc: 征增 王 <wzz@rock-chips.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Yakir Yang <ykk@rock-chips.com>
[seanpaul changed timeout loop to a readx poll]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-2-enric.balletbo@collabora.com
When CONFIG_OMAP2_DSS_DPI is disabled, compilation fails due to:
drivers/gpu/drm/omapdrm/dss/dss.h:388:25: error: conflicting types for ‘port’
struct device_node *port,
^~~~
Fix this by renaming the first parameter correctly.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
When compiling with CONFIG_OMAP2_DSS_DEBUGFS disabled, build fails due
to:
drivers/gpu/drm/omapdrm/dss/dss.c:1474:10: error: ‘dss_debug_dump_clocks’ undeclared (first use in this function); did you mean ‘dispc_dump_clocks’?
dss_debug_dump_clocks, dss);
^~~~~~~~~~~~~~~~~~~~~
dispc_dump_clocks
Fix this by moving the required functions outside #if
defined(CONFIG_OMAP2_DSS_DEBUGFS).
In the long term, we perhaps want to try to get all the debugfs support
left out if debugfs is not enabled.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Handle both positive and negative dclk polarity,
according to bus_flags, taking care of this:
On A20 and similar SoCs, the only way to achieve Positive Edge
(Rising Edge), is setting dclk clock phase to 2/3(240°).
By default TCON works in Negative Edge(Falling Edge), this is why phase
is set to 0 in that case.
Unfortunately there's no way to logically invert dclk through IO_POL
register.
The only acceptable way to work, triple checked with scope,
is using clock phase set to 0° for Negative Edge and set to 240° for
Positive Edge.
On A33 and similar SoCs there would be a 90° phase option, but it divides
also dclk by 2.
This patch is a way to avoid quirks all around TCON and DOTCLOCK drivers
for using A33 90° phase divided by 2 and consequently increase code
complexity.
Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520963677-124239-1-git-send-email-giulio.benetti@micronovasrl.com
mode_valid function must be connected to encoder.
Otherwise it could get not be called by drm in the case there's a
bridge connected to encoder instead of a panel.
Move mode_valid function pointer to encoder helper functions,
changing its prototype according to encoder helper function pointer.
Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520941017-81177-1-git-send-email-giulio.benetti@micronovasrl.com
UAPI Changes:
- Query uAPI interface (used for GPU topology information currently)
* Mesa: https://patchwork.freedesktop.org/series/38795/
Driver Changes:
- Increase PSR2 size for CNL (DK)
- Avoid retraining LSPCON link unnecessarily (Ville)
- Decrease request signaling latency (Chris)
- GuC error capture fix (Daniele)
* tag 'drm-intel-next-2018-03-08' of git://anongit.freedesktop.org/drm/drm-intel: (127 commits)
drm/i915: Update DRIVER_DATE to 20180308
drm/i915: add schedule out notification of preempted but completed request
drm/i915: expose rcs topology through query uAPI
drm/i915: add query uAPI
drm/i915: add rcs topology to error state
drm/i915/debugfs: add rcs topology entry
drm/i915/debugfs: reuse max slice/subslices already stored in sseu
drm/i915: store all subslice masks
drm/i915/guc: work around gcc-4.4.4 union initializer issue
drm/i915/cnl: Add Wa_2201832410
drm/i915/icl: Gen11 forcewake support
drm/i915/icl: Add Indirect Context Offset for Gen11
drm/i915/icl: Enhanced execution list support
drm/i915/icl: new context descriptor support
drm/i915/icl: Correctly initialize the Gen11 engines
drm/i915: Assert that the request is indeed complete when signaled from irq
drm/i915: Handle changing enable_fbc parameter at runtime better.
drm/i915: Track whether the DP link is trained or not
drm/i915: Nuke intel_dp->channel_eq_status
drm/i915: Move SST DP link retraining into the ->post_hotplug() hook
...
Major points for this pull request:
- Add dGPU support for amdkfd initialization code and queue handling. It's
not complete support since the GPUVM part is missing (the under debate stuff).
- Enable PCIe atomics for dGPU if present
- Various adjustments to the amdgpu<-->amdkfd interface for dGPUs
- Refactor IOMMUv2 code to allow loading amdkfd without IOMMUv2 in the system
- Add HSA process eviction code in case of system memory pressure
- Various fixes and small changes
* tag 'drm-amdkfd-next-2018-03-11' of git://people.freedesktop.org/~gabbayo/linux: (24 commits)
uapi: Fix type used in ioctl parameter structures
drm/amdkfd: Implement KFD process eviction/restore
drm/amdkfd: Add GPUVM virtual address space to PDD
drm/amdkfd: Remove unaligned memory access
drm/amdkfd: Centralize IOMMUv2 code and make it conditional
drm/amdgpu: Add submit IB function for KFD
drm/amdgpu: Add GPUVM memory management functions for KFD
drm/amdgpu: add amdgpu_sync_clone
drm/amdgpu: Update kgd2kfd_shared_resources for dGPU support
drm/amdgpu: Add KFD eviction fence
drm/amdgpu: Remove unused kfd2kgd interface
drm/amdgpu: Fix wrong mask in get_atc_vmid_pasid_mapping_pasid
drm/amdgpu: Fix header file dependencies
drm/amdgpu: Replace kgd_mem with amdgpu_bo for kernel pinned gtt mem
drm/amdgpu: remove useless BUG_ONs
drm/amdgpu: Enable KFD initialization on dGPUs
drm/amdkfd: Add dGPU device IDs and device info
drm/amdkfd: Add dGPU support to kernel_queue_init
drm/amdkfd: Add dGPU support to the MQD manager
drm/amdkfd: Add dGPU support to the device queue manager
...
Commit 5addcf0a5f ("nouveau: add runtime PM support (v0.9)") prevents
runtime suspend of the GPU if its integrated HDA controller is not bound
to a driver. The rationale appears to be that probing the HDA fails if
the GPU is in D3cold.
However we now use a device link to ensure that the GPU is runtime
resumed while the HDA controller is probed, rendering this safety
measure obsolete. Remove it.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/77e0ab74f3377ea9b6acf8fab624acfb4f7dbeca.1520068884.git.lukas@wunner.de
When switching the display on muxed machines, we currently force the HDA
controller into runtime suspend on the previously used GPU and into
runtime active state on the newly used GPU.
That's unnecessary if the GPU uses driver power control, we can just let
the audio device autosuspend or autoresume as it sees fit.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/098ed883460eb4976a899eac6f5192fefc877c0f.1520068884.git.lukas@wunner.de
Back in 2013, runtime PM for GPUs with integrated HDA controller was
introduced with commits 0d69704ae3 ("gpu/vga_switcheroo: add driver
control power feature. (v3)") and 246efa4a07 ("snd/hda: add runtime
suspend/resume on optimus support (v4)").
Briefly, the idea was that the HDA controller is forced on and off in
unison with the GPU.
The original code is mostly still in place even though it was never a
100% perfect solution: E.g. on access to the HDA controller, the GPU
is powered up via vga_switcheroo_runtime_resume_hdmi_audio() but there
are no provisions to keep it resumed until access to the HDA controller
has ceased: The GPU autosuspends after 5 seconds, rendering the HDA
controller inaccessible.
Additionally, a kludge is required when hda_intel.c probes: It has to
check whether the GPU is powered down (check_hdmi_disabled()) and defer
probing if so.
However in the meantime (in v4.10) the driver core has gained a feature
called device links which promises to solve such issues in a clean way:
It allows us to declare a dependency from the HDA controller (consumer)
to the GPU (supplier). The PM core then automagically ensures that the
GPU is runtime resumed as long as the HDA controller's ->probe hook is
executed and whenever the HDA controller is accessed.
By default, the HDA controller has a dependency on its parent, a PCIe
Root Port. Adding a device link creates another dependency on its
sibling:
PCIe Root Port
^ ^
| |
| |
HDA ===> GPU
The device link is not only used for runtime PM, it also guarantees that
on system sleep, the HDA controller suspends before the GPU and resumes
after the GPU, and on system shutdown the HDA controller's ->shutdown
hook is executed before the one of the GPU. It is a complete solution.
Using this functionality is as simple as calling device_link_add(),
which results in a dmesg entry like this:
pci 0000:01:00.1: Linked as a consumer to 0000:01:00.0
The code for the GPU-governed audio power management can thus be removed
(except where it's still needed for legacy manual power control).
The device link is added in a PCI quirk rather than in hda_intel.c.
It is therefore legal for the GPU to runtime suspend to D3cold even if
the HDA controller is not bound to a driver or if CONFIG_SND_HDA_INTEL
is not enabled, for accesses to the HDA controller will cause the GPU to
wake up regardless if they're occurring outside of hda_intel.c (think
config space readout via sysfs).
Contrary to the previous implementation, the HDA controller's power
state is now self-governed, rather than GPU-governed, whereas the GPU's
power state is no longer fully self-governed. (The HDA controller needs
to runtime suspend before the GPU can.)
It is thus crucial that runtime PM is always activated on the HDA
controller even if CONFIG_SND_HDA_POWER_SAVE_DEFAULT is set to 0 (which
is the default), lest the GPU stays awake. This is achieved by setting
the auto_runtime_pm flag on every codec and the AZX_DCAPS_PM_RUNTIME
flag on the HDA controller.
A side effect is that power consumption might be reduced if the GPU is
in use but the HDA controller is not, because the HDA controller is now
allowed to go to D3hot. Before, it was forced to stay in D0 as long as
the GPU was in use. (There is no reduction in power consumption on my
Nvidia GK107, but there might be on other chips.)
The code paths for legacy manual power control are adjusted such that
runtime PM is disabled during power off, thereby preventing the PM core
from resuming the HDA controller.
Note that the device link is not only added on vga_switcheroo capable
systems, but for *any* GPU with integrated HDA controller. The idea is
that the HDA controller streams audio via connectors located on the GPU,
so the GPU needs to be on for the HDA controller to do anything useful.
This commit implicitly fixes an unbalanced runtime PM ref upon unbind of
hda_intel.c: On ->probe, a runtime PM ref was previously released under
the condition "azx_has_pm_runtime(chip) || hda->use_vga_switcheroo", but
on ->remove a runtime PM ref was only acquired under the first of those
conditions. Thus, binding and unbinding the driver twice on a
vga_switcheroo capable system caused the runtime PM refcount to drop
below zero. The issue is resolved because the AZX_DCAPS_PM_RUNTIME flag
is now always set if use_vga_switcheroo is true.
For more information on device links please refer to:
https://www.kernel.org/doc/html/latest/driver-api/device_link.html
Documentation/driver-api/device_link.rst
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/51bd38360ff502a8c42b1ebf4405ee1d3f27118d.1520068884.git.lukas@wunner.de
If DRM drivers use runtime PM, they currently notify vga_switcheroo
whenever they ->runtime_suspend or ->runtime_resume to update
vga_switcheroo's internal power state tracking.
That's essentially a duplication of a functionality performed by the
PM core as it already tracks the GPU's power state and vga_switcheroo
can always query it.
Introduce a new internal helper vga_switcheroo_pwr_state() which does
just that if runtime PM is used, or falls back to vga_switcheroo's
internal power state tracking if manual power control is used.
Drop a redundant power state check in set_audio_state() while at it.
This removes one of the two purposes of the notification mechanism
implemented by vga_switcheroo_set_dynamic_switch(). The other one is
power management of the audio device and we'll remove that next.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/0aa49d735b988aa04524a8dc339582ace33f0f94.1520068884.git.lukas@wunner.de
When cutting power to a GPU and its integrated HDA controller, their
cached current_state should be updated to D3cold to reflect reality.
We currently rely on the DRM and HDA drivers to do that, however:
- The HDA driver updates the current_state in azx_vs_set_state(), which
will no longer be called with driver power control once we migrate to
device links. (It will still be called with manual power control.)
- If the HDA device is not bound, its current_state remains at D0 even
though the GPU driver may decide to go to D3cold.
- The DRM drivers update the current_state using pci_set_power_state()
which can't put the device into a deeper power state than D3hot if the
GPU is not deemed power-manageable by the platform (even though it
*is* power-manageable by some nonstandard means, such as a _DSM).
Centralize updating the current_state of the GPU and HDA controller in
vga_switcheroo's ->runtime_suspend hook to overcome these deficiencies.
The GPU and HDA controller are two functions of the same PCI device
(VGA class device on function 0 and audio device on function 1) and
no other PCI devices reside on the same bus since this is a PCIe
point-to-point link, so we can just walk the bus and update the
current_state of all devices.
On ->runtime_resume, the HDA controller is in D0uninitialized state.
Resume to D0active and then let it autosuspend as it sees fit.
Note that vga_switcheroo_init_domain_pm_ops() is not supposed to be
called by hybrid graphics laptops which power down the GPU via its root
port's _PR3 resources and consequently vga_switcheroo_runtime_suspend()
is not used. On those laptops, the root port is power-manageable by the
platform (instead of by a nonstandard means) and the current_state is
therefore updated by the PCI core through the following call chain:
pci_set_power_state()
__pci_complete_power_transition()
pci_bus_set_current_state()
Resuming to D0active happens through:
pci_set_power_state()
__pci_start_power_transition()
pci_wakeup_bus()
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/8416958482c8c42d6f311ea5c1e5a65ccf21f5db.1520068884.git.lukas@wunner.de
This patch adds support for DMT display modes over HDMI.
The modes timings configurations are from the Amlogic Vendor linux tree
and tested over multiples monitors.
Previously only a selected number of CEA modes were supported.
Only these following modes are supported with these changes:
- 640x480@60Hz
- 800x600@60Hz
- 1024x768@60Hz
- 1152x864@75Hz
- 1280x1024@60Hz
- 1600x1200@60Hz
- 1920x1080@60Hz
The associated code to handle the clock rates is also added.
Acked-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520935670-14187-1-git-send-email-narmstrong@baylibre.com
This patch adds support for AUO G104SN02 V2 800x600 10.4" panel to DRM
simple panel driver.
Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Stefan Riedmueller <s.riedmueller@phytec.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1513430016.1930.4.camel@googlemail.com
Fixes the following sparse warnings:
drivers/gpu/drm/panel/panel-ilitek-ili9322.c:182:12: warning:
symbol 'ili9322_inputs' was not declared. Should it be static?
drivers/gpu/drm/panel/panel-ilitek-ili9322.c:343:28: warning:
symbol 'ili9322_regmap_config' was not declared. Should it be static?
Also change ili9322_inputs to 'const char * const' to avoid
chackpatch warning.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1514948938-19996-1-git-send-email-weiyongjun1@huawei.com
Add support for the optional power-supply.
Note: A "dummy regulator" is returned by devm_regulator_get()
if the optional regulator is not present in the device tree,
simplifying the source code when enabling/disabling the regulator.
Signed-off-by: Philippe Cornu <philippe.cornu@st.com>
Reviewed-by: Yannick Fertré <yannick.fertre@st.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205094532.23547-3-philippe.cornu@st.com
Convert the sharp lq123p1jx31 from using a fixed mode to specifying a
display timing with min/typ/max values. This allows us to capture the
timings set forth in the datasheet as well as the additional values that
we've cleared with the display vendor to avoid interference with the
digitizer on the Samsung Chromebook Plus (kevin).
A follow-on patch will specify the override mode for kevin devices.
Changes in v2:
- None
Changes in v3:
- None
Cc: Doug Anderson <dianders@chromium.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Jeffy Chen <jeffy.chen@rock-chips.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: devicetree@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rockchip@lists.infradead.org
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180208174855.55620-6-seanpaul@chromium.org
This fixes bad color output. When I was first testing the device I
had the DPI hardware set to 666 mode, but apparently in the refactor
to use the bus_format information from the panel driver, I failed to
actually update the panel.
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: e8b6f561b2 ("drm/panel: simple: Add the 7" DPI panel from Adafruit")
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309233332.1769-1-eric@anholt.net
Using the hint from the plane state, we turn on the background color
to avoid display corruption from planes blending with the background.
Changes from v1:
- Use needs_bg_fill from plane state
Signed-off-by: Stefan Schake <stschake@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1520556817-97297-5-git-send-email-stschake@gmail.com
Considering a single plane only, we have to enable background color
when the plane has an alpha format and could be blending from the
background or when it doesn't cover the entire screen.
Changes from v1:
- Drop unrelated change
- Move needs_bg_fill to plane state
Signed-off-by: Stefan Schake <stschake@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1520556817-97297-3-git-send-email-stschake@gmail.com
Alpha formats in DRM are assumed to be premultiplied, so we should be
setting the PREMULT bit in the plane configuration for HVS.
Changes from v1:
- Use correct has_alpha
Signed-off-by: Stefan Schake <stschake@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1520556817-97297-2-git-send-email-stschake@gmail.com
More stuff for 4.17. Highlights:
- More fixes for "wattman" like functionality (fine grained clk/voltage control)
- Add more power profile infrastucture (context based dpm)
- SR-IOV fixes
- Add iomem debugging interface for use with umr
- Powerplay and cgs cleanups
- DC fixes and cleanups
- ttm improvements
- Misc cleanups all over
* 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux: (143 commits)
drm/amdgpu:Always save uvd vcpu_bo in VM Mode
drm/amdgpu:Correct max uvd handles
drm/amdgpu: replace iova debugfs file with iomem (v3)
drm/amd/display: validate plane format on primary plane
drm/amdgpu: Clean sdma wptr register when only enable wptr polling
drm/amd/amdgpu: re-add missing GC 9.1 and SDMA0 4.1 sh_mask header files
drm/amdgpu: give warning before sleep in kiq_r/wreg
drm/amdgpu: further mitigate workaround for i915
drm/amdgpu: drop gtt->adev
drm/amdgpu: add amdgpu_evict_gtt debugfs entry
drm/amd/pp: Add #ifdef checks for CONFIG_ACPI
drm/amd/pp: fix "Delete the wrapper layer of smu_allocate/free_memory"
drm/amd/pp: Drop wrapper functions for upper/lower_32_bits
drm/amdgpu: Delete cgs wrapper functions for gpu memory manager
drm/amd/pp: Delete the wrapper layer of smu_allocate/free_memory
drm/amd/pp: Remove cgs wrapper function for temperature update
Revert "drm/amd/pp: Add a pp feature mask bit for AutoWattman feature"
drm/amd/pp: Add auto power profilng switch based on workloads (v2)
drm/amd/pp: Revert gfx/compute profile switch sysfs
drm/amd/pp: Fix sclk in highest two levels when compute on smu7
...
Instead of using timer and spinlocks, use delayed_work and
mutexes for rockchip psr. This allows us to make blocking
calls when enabling/disabling psr (which is sort of important
given we're talking over dpcd to the display).
Cc: Caesar Wang <wxt@rock-chips.com>
Cc: 征增 王 <wzz@rock-chips.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222324.5872-3-enric.balletbo@collabora.com
There's a race between when bridge_disable and when vop_crtc_disable
are called. If the flush timer triggers a new psr work between these,
we will operate eDP without power shutdowned by bridge_disable. In this
case, moving activate/deactivate to enable/disable bridge to avoid it.
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222324.5872-2-enric.balletbo@collabora.com
The HDMI vpll clock should be enabled when bind() is called. So move the
clk_prepare_enable of that clock to bind() function and add the missing
clk_disable_unprepare() required in error handling path and unbind().
Fixes: 12b9f204e8 ("drm: bridge/dw_hdmi: add rockchip rk3288 support")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302175757.28192-5-enric.balletbo@collabora.com
In bind the clk_prepare_enable of the HDMI pclk is called before adding the
i2c_adapter. So it should be the other way around in unbind, first remove
the i2c_adapter and then call the clk_disable_unprepare.
Fixes: 412d4ae6b7 ("drm/rockchip: hdmi: add Innosilicon HDMI support")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302175757.28192-4-enric.balletbo@collabora.com
In bind()'s error handling path call destroy functions instead of
cleanup functions for encoder and connector and reorder to match how is
called in bind().
In unbind() call the connector and encoder destroy functions.
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302175757.28192-2-enric.balletbo@collabora.com
Replace the ad-hoc iturbt_709 property with the new standard
COLOR_ENCODING property. Compiles, but not tested.
v2: Fix typos (Ilia)
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: nouveau@lists.freedesktop.org
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220134816.15229-1-ville.syrjala@linux.intel.com
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Ben Skeggs <bskeggs@redhat.com> #irc
This fixes hangs with legacy applications that use the mmap() syscall on
the fbdev device to map framebuffer memory. The fbdev implementation for
mmap() creates a mapping that conflicts with DRM usage and causes a hang
when the memory is accessed through the mapping.
Reported-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Stefan Agner <stefan@agner.ch>
Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Reported-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This function allows mapping a GEM object into a virtual memory address
space, which makes it useful outside of the GEM code.
While at it, rename the function so it doesn't clash with the function
that implements the DRM_TEGRA_GEM_MMAP IOCTL.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Move declarations in the gem.h header file into the same order as the
corresponding definitions in gem.c.
Signed-off-by: Thierry Reding <treding@nvidia.com>
There is one corner case missing schedule out notification of the preempted
request. The preempted request is just completed when preemption happen,
then it will be canceled and won't be resubmitted later, GVT-g will lost
the schedule out notification.
Here add schedule out notification if found the preempted request has been
completed.
v2:
- refine description, add completed check and notification in
execlists_cancel_port_requests. (Chris)
v3:
- use ternary confitional, remove local variable. (Tvrtko)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520302557-25079-1-git-send-email-weinan.z.li@intel.com
With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available. Here we introduce a more detailed way of querying the
Gen's GPU topology that doesn't aggregate numbers.
This is essential for monitoring parts of the GPU with the OA unit,
because counters need to be normalized to the number of
EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
not gives us sufficient information.
The Mesa series making use of this API is :
https://patchwork.freedesktop.org/series/38795/
As a bonus we can draw representations of the GPU :
https://imgur.com/a/vuqpa
v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
Add uapi macros to read data from *_info structs (Tvrtko)
v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko)
v4: factorize query item writting (Tvrtko)
tweak uapi struct/define names (Tvrtko)
v5: Replace ALIGN() macro (Chris)
v6: Updated uapi comments (Tvrtko)
Moved flags != 0 checks into vfuncs (Tvrtko)
v7: Use access_ok() before copying anything, to avoid overflows (Chris)
Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)
v8: Tweak uapi comments style to match the coding style (Lionel)
v9: Fix error in comment about computation of enabled subslice (Tvrtko)
v10: Fix/update comments in uAPI (Sagar)
v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
drm_i915_query_topology_info (Joonas)
v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas)
v13: Fix comment in uAPI (Joonas)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-7-lionel.g.landwerlin@intel.com
There are a number of information that are readable from hardware
registers and that we would like to make accessible to userspace. One
particular example is the topology of the execution units (how are
execution units grouped in subslices and slices and also which ones
have been fused off for die recovery).
At the moment the GET_PARAM ioctl covers some basic needs, but
generally is only able to return a single value for each defined
parameter. This is a bit problematic with topology descriptions which
are array/maps of available units.
This change introduces a new ioctl that can deal with requests to fill
structures of potentially variable lengths. The user is expected fill
a query with length fields set at 0 on the first call, the kernel then
sets the length fields to the their expected values. A second call to
the kernel with length fields at their expected values will trigger a
copy of the data to the pointed memory locations.
The scope of this uAPI is only to provide information to userspace,
not to allow configuration of the device.
v2: Simplify dispatcher code iteration (Tvrtko)
Tweak uapi drm_i915_query_item structure (Tvrtko)
v3: Rename pad fields into flags (Chris)
Return error on flags field != 0 (Chris)
Only copy length back to userspace in drm_i915_query_item (Chris)
v4: Use array of functions instead of switch (Chris)
v5: More comments in uapi (Tvrtko)
Return query item errors in length field (All)
v6: Tweak uapi comments style to match the coding style (Lionel)
v7: Add i915_query.h (Joonas)
v8: (Lionel) Change the behavior of the item iterator to report
invalid queries into the query item rather than stopping the
iteration. This enables userspace applications to query newer
items on older kernels and only have failure on the items that are
not supported.
v9: Edit copyright headers (Joonas)
v10: Typos & comments in uapi (Joonas)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-6-lionel.g.landwerlin@intel.com
This might be useful information for developers looking at an error
state.
v2: Place topology towards the end of the error state (Chris)
v3: Reuse common printing code (Michal)
v4: Make this a one-liner (Chris)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-5-lionel.g.landwerlin@intel.com
While the end goal is to make this information available to userspace
through a new ioctl, there is no reason we can't display it in a human
readable fashion through debugfs.
slice0: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice1: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice2: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
v2: Reformat debugfs printing (Tvrtko)
Use the new EU mask helper (Tvrtko)
v3: Move printing code to intel_device_info.c to be shared with error
state (Michal)
v4: Bump u8 to u16 when using sseu_get_eus() (Lionel)
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-4-lionel.g.landwerlin@intel.com
Up to now, subslice mask was assumed to be uniform across slices. But
starting with Cannonlake, slices can be asymmetric (for example slice0
has different number of subslices as slice1+). This change stores all
subslices masks for all slices rather than having a single mask that
applies to all slices.
v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
Readability changes (Tvrtko)
Add EU index helper (Tvrtko)
v3: Turn ALIGN(v, 8) / 8 into DIV_ROUND_UP(v, BITS_PER_BYTE) (Tvrtko)
Reuse sseu_eu_idx() for setting eu_mask on CHV (Tvrtko)
Reformat debug prints for subslices (Tvrtko)
v4: Change eu_mask helper into sseu_set_eus() (Tvrtko)
v5: With Haswell reporting masks & counts, bump sseu_*_eus() functions
to use u16 (Lionel)
v6: Fix sseu_get_eus() for > 8 EUs per subslice (Lionel)
v7: Change debugfs enabels for number of subslices per slice, will
need a small igt/pm_sseu change (Lionel)
Drop subslice_total field from sseu_dev_info, rely on
sseu_subslice_total() to recompute the value instead (Lionel)
v8: Remove unused function compute_subslice_total() (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-2-lionel.g.landwerlin@intel.com
gcc-4.4.4 has problems with initalizers of anon unions.
drivers/gpu/drm/i915/intel_guc_log.c: In function 'guc_log_control':
drivers/gpu/drm/i915/intel_guc_log.c:64: error: unknown field 'logging_enabled' specified in initializer
Work around this.
Fixes: 35fe703c31 ("drm/i915/guc: Change values for i915_guc_log_control")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308001333.rI2vrNRTY%akpm@linux-foundation.org
"Clock gating bug in GWL may not clear barrier state when an EOT
is received, causing a hang the next time that barrier is used."
HSDES: 2201832410
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307220912.3681-1-rodrigo.vivi@intel.com
We were previously selecting 1024x768 and 32BPP as the default
set-up for the PL111 consumers.
This does not work on elder systems: the device tree bindings
support a property "max-memory-bandwidth" in bytes/second that
states that if you exceed this the memory bus will saturate.
The result is flickering and unstable images.
Parse the "max-memory-bandwidth" and respect it when
intializing the driver. On the RealView PB11MP, Versatile and
Integrator/CP we get a nice console as default with this code.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307215819.15814-1-linus.walleij@linaro.org
The following happens when connection a DVI output driven
from the SiI9022 using a DVI-to-VGA adapter plug:
i2c i2c-0: sendbytes: NAK bailout.
i2c i2c-0: sendbytes: NAK bailout.
Then no picture. Apparently the I2C engine inside the SiI9022
is not smart enough to try to fall back to DDC I2C. Or the
vendor have not integrated the electronics properly. I don't
know which one it is.
After this, the I2C bus seems stalled and the first attempt to
read the status register fails, and the code returns with
negative return value, and the display fails to initialized.
Instead, retry status readout five times and continue even
if this fails.
Tested on the ARM Versatile Express with a DVI-to-VGA
connector, it now gives picture.
Introduce a helper struct device *dev variable to make
the code more readable.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305101702.13441-1-linus.walleij@linaro.org
We want to cut down the default bpp to 16 on the RealView so
we can have a 1024x768 framebuffer console by default. The
memory bandwidth limitations makes this not work with the
PL111 default of 32bpp.
This builds on top of the earlier patches making the
framebuffer default bpp a per-variant variable.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302090948.6399-4-linus.walleij@linaro.org
When UVD is in VM mode, there is not uvd handle exchanged,
uvd.handles are always 0. So vcpu_bo always need save,
Otherwise amdgpu driver will fail during suspend/resume.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105021
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Max uvd handles should use adev->uvd.max_handles instead of
AMDGPU_MAX_UVD_HANDLES here.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This allows access to pages allocated through the driver with optional
IOMMU mapping.
v2: Fix number of bytes copied and add write method
v3: drop check for kmap return
Original-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In dce110, the plane configuration is such that plane 0
or the primary plane should be rendered with only RGB data.
This patch adds the validation to ensure that no video data
is rendered on plane 0.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The sdma wptr polling memory is not fast enough, then the sdma
wptr register will be random, and not equal to sdma rptr, which
will cause sdma engine hang when load driver, so clean up the sdma
wptr directly to fix this issue.
v2:add comment above the code and correct coding style
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
These are required by umr to properly parse bitfield offsets.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
to catch error that may schedule in atomic context early on
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Disable the workaround on imported BOs as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We can use ttm->bdev instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow evicting all BOs from the GTT domain.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix compiling error when CONFIG_ACPI not enabled.
Change-Id: I5f901adbc799c10b30e5ea79f8f44760e749fae1
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
For amdgpu_bo_create_kernel to work the handle must be NULL initialized,
otherwise we only try to pin and map the BO.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Rex Zhu <rezhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
replace smu_upper_32_bits/smu_lower_32_bits with
the standard kernel macros
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
delete those cgs interfaces:
amdgpu_cgs_alloc_gpu_mem
amdgpu_cgs_free_gpu_mem
amdgpu_cgs_gmap_gpu_mem
amdgpu_cgs_gunmap_gpu_mem
amdgpu_cgs_kmap_gpu_mem
amdgpu_cgs_kunmap_gpu_mem
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
use amdgpu_bo_create/free_kernel instand.
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit e429ea87b2939c4cce1b439baf6d76535a0767f2.
Implement Workload Aware Dynamic power management instand of
AutoWattman feature in linux.
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add power profiling mode dynamic switch based on the workloads.
Currently, support Cumpute, VR, Video, 3D,power saving with Cumpute
have highest prority, power saving have lowest prority.
in manual dpm mode, driver will stop auto switch, just save the client's
requests. user can set power profiling mode through sysfs.
when exit manual dpm mode, driver will response the client's requests.
switch based on the client's prority.
v2: squash in fixes from Rex
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support for the R-Car V3M (R8A77970) SoC to the LVDS encoder driver.
Note that there are some differences with the other R-Car gen3 SoCs, e.g.
LVDPLLCR has the same layout as in the R-Car gen2 SoCs.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Add support for the R-Car V3M (R8A77970) SoC to the R-Car DU driver.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Pimp drm_property_type_valid() to check for more fails with the
property flags. Also make the check before adding the property,
and bail out if things look bad.
Since we're now chekcing for more than the type let's also
change the function name to drm_property_flags_valid().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306164849.2862-6-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If the property already has the enum value WARN and bail.
Replacing enum values doesn't make sense to me.
Throw out the pointless list_empty() while at it.
Cc: Daniel Vetter <daniel@ffwll.ch>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306164849.2862-1-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The LVDS encoders used to be described in DT as part of the DU. They now
have their own DT node, linked to the DU using the OF graph bindings.
This allows moving internal LVDS encoder support to a separate driver
modelled as a DRM bridge. Backward compatibility is retained as legacy
DT is patched live to move to the new bindings.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
The internal LVDS encoders now have their own DT bindings. Before
switching the driver infrastructure to those new bindings, implement
backward-compatibility through live DT patching.
Patching is disabled and will be enabled along with support for the new
DT bindings in the DU driver.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
If there is another bridge after analogix_dp, then the connector object
should not be created. This fixes following timeouts on Exynos5420-based
Chromebook2 Peach-PIT board during boot:
exynos-dp 145b0000.dp-controller: AUX CH cmd reply timeout!
exynos-dp 145b0000.dp-controller: AUX CH enable timeout!
exynos-dp 145b0000.dp-controller: AUX CH enable timeout!
exynos-dp 145b0000.dp-controller: AUX CH enable timeout!
exynos-dp 145b0000.dp-controller: AUX CH enable timeout!
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305085741.18896-4-m.szyprowski@samsung.com
The bridge does not need to be powered in analogix_dp_bind(), so
remove the calls to pm_runtime_get()/phy_power_on()/analogix_dp_init_dp()
as well as their power-off counterparts.
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
[the patch originally just removed the power_on portion, seanpaul removed
the power off code as well as improved the commit message]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305085741.18896-2-m.szyprowski@samsung.com
The main difference with previous GENs is that starting from Gen11
each VCS and VECS engine has its own power well, which only exist
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.
BSpec: 18331
v2: fix fwtable, use array to test shadow tables, create new
accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
- Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
- Use the BIT macro for forcewake domains (Daniele)
- Add a comment about the range ordering (Oscar)
- Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
to GEN >= 11
v10: Generate is_genX_shadowed with a macro (Daniele)
Include gen11_fw_ranges in the selftest (Michel)
v11: Simplify FORCEWAKE_ALL, new line between NEEDS_FORCEWAKEs (Tvrtko)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-6-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue
(the ExecLists Submission Queue - ELSQ), which is then loaded on the
HW. When writing to the ELSP register, the lrcs are written cyclically
in the queue from position 0 to position 7. Alternatively, it is
possible to write directly in the individual positions of the queue
using the ELSQC registers. To be able to re-use all the existing code
we're using the latter method and we're currently limiting ourself to
only using 2 elements.
v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
preemption (Daniele), reuse engine->execlists.elsp (Chris)
v6: use has_logical_ring_elsq to differentiate the new paths
v7: add preemption support, rename els to submit_reg (Chris)
v8: save the ctrl register inside the execlists struct, drop CSB
handling updates (superseded by preempt_complete_status) (Chris)
v9: s/drm_i915_gem_request/i915_request (Mika)
v10: resolved conflict in inject_preempt_context (Mika)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-4-mika.kuoppala@linux.intel.com
Starting from Gen11 the context descriptor format has been updated in
the HW. The hw_id field has been considerably reduced in size and engine
class and instance fields have been added.
There is a slight name clashing issue because the field that we call
hw_id is actually called SW Context ID in the specs for Gen11+.
With the current size of the hw_id field we can have a maximum of 2k
contexts at any time, but we could use the sw_counter field (which is sw
defined) to increase that because the HW requirement is that
engine_id + sw id + sw_counter is a unique number.
GuC uses a similar method to support more contexts but does its tracking
at lrc level. To avoid doing an implementation that will need to be
reworked once GuC support lands, defer it for now and mark it as TODO.
v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
v3: rebased, bring back lost code from i915_gem_context.c
v4: make TODO comment more generic
v5: be consistent with bit ordering, add extra checks (Chris)
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Gen11 has up to 4 VCS and up to 2 VECS engines, this patch adds mmio
base definitions for all of them.
Bspec: 20944
Bspec: 7021
v2: Set the correct mmio_base in intel_engines_init_mmio; updating the
base mmio values any later would cause incorrect reads in
i915_gem_sanitize (Michel).
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-2-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
If i915.enable_fbc is cleared at runtime, but FBC was previously enabled
then we don't disable FBC until the next time the crtc is disabled.
Make sure that if the module param is changed, we disable FBC in
intel_fbc_post_update so we never have to worry about disabling.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305123608.20665-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
LSPCON likes to throw short HPDs during the enable seqeunce prior to the
link being trained. These obviously result in the channel CR/EQ check
failing and thus we schedule a pointless hotplug work to retrain the
link. Avoid that by ignoring the bad CR/EQ status until we've actually
initially trained the link.
I've not actually investigated to see what LSPCON is trying to signal
with the short pulse. But as long as it signals anything I think we're
supposed to check the link status anyway, so I don't really see other
good ways to solve this. I've not seen these short pulses being
generated by normal DP sinks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-5-ville.syrjala@linux.intel.com
intel_dp->channel_eq_status is used in exactly one function, and we
don't need it to persist between calls. So just go back to using a
local variable instead.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-4-ville.syrjala@linux.intel.com
Doing link retraining from the short pulse handler is problematic since
that might introduce deadlocks with MST sideband processing. Currently
we don't retrain MST links from this code, but we want to change that.
So better to move the entire thing to the hotplug work. We can utilize
the new encoder->hotplug() hook for this.
The only thing we leave in the short pulse handler is the link status
check. That one still depends on the link parameters stored under
intel_dp, so no locking around that but races should be mostly harmless
as the actual retraining code will recheck the link state if we
end up there by mistake.
v2: Rebase due to ->post_hotplug() now being just ->hotplug()
Check the connector type to figure out if we should do
the HDMI thing or the DP think for DDI
[pushed with whitespace changes for sparse]
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-3-ville.syrjala@linux.intel.com
The LG 4k TV I have doesn't deassert HPD when I turn the TV off, but
when I turn it back on it will pulse the HPD line. By that time it has
forgotten everything we told it about scrambling and the clock ratio.
Hence if we want to get a picture out if it again we have to tell it
whether we're currently sending scrambled data or not. Implement
that via the encoder->hotplug() hook.
v2: Force a full modeset to not follow the HDMI 2.0 spec more
closely (Shashank)
[pushed with whitespace fixes to make sparse happy]
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com
Allow encoders to customize their hotplug processing by moving the
intel_hpd_irq_event() code into an encoder hotplug vfunc. Currently
only SDVO needs this to re-enable hotplug signalling in the SDVO
chip. We'll use this same hook for DP/HDMI link management later.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com
No functional change since WA is already applied.
But since it has different names on different databases,
let's document it here to avoid future confusion.
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306012812.19779-1-rodrigo.vivi@intel.com
No functional change. WA is already properly applied.
but in different databases it has different names.
Let's document all of them to avoid future confusion.
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306012000.18928-1-rodrigo.vivi@intel.com
In fact, apply the Cannonlake resolution check for all >= Gen-10 platforms
to be safe.
v3: Update GLK too. (Ville)
Longer variable names.
if-else in place of ternary operator.
v2: Use local variables for resolution limits and print them (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Elio Martinez Monroy <elio.martinez.monroy@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306203355.29292-1-dhinakaran.pandiyan@intel.com
The gfx/compute profiling mode switch is only for internally
test. Not a complete solution and unexpectly upstream.
so revert it.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Compute workload tends to be "bursty", Only tune the behavior of
nature dpm don't work well for most of such workloads. From test
results, Fix sclk in highest two levels can get better performance.
so add min sclk setting into the default cumpute workload policy on
smu7.
user still can change sclk range through sysfs pp_dpm_sclk
for better perf/watt.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It show what parameters can be configured to tune
the behavior of natural dpm for perf/watt on smu7.
user can select the mode per workload, but even the default per
workload settings are not bulletproof.
user can configure custom settings per different use case
for better perf or better perf/watt.
cat pp_power_profile_mode
NUM MODE_NAME SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL
0 3D_FULL_SCREEN: 0 100 30 0 100 10
1 POWER_SAVING: 10 0 30 - - -
2 VIDEO: - - - 10 16 31
3 VR: 0 11 50 0 100 10
4 COMPUTE: 0 5 30 - - -
5 CUSTOM: 0 0 0 0 0 0
* CURRENT: 0 100 30 0 100 10
Under manual dpm level,
user can echo "0/1/2/3/4">pp_power_profile_mode
to select 3D_FULL_SCREEN/POWER_SAVING/VIDEO/VR/COMPUTE
mode.
echo "5 * * * * * * * *">pp_power_profile_mode
to set custom settings.
"5 * * * * * * * *" mean "CUSTOM enable_sclk SCLK_UP_HYST
SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL enable_mclk MCLK_UP_HYST
MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL"
if the parameter enable_sclk/enable_mclk is true,
driver will update the following parameters to dpm table.
if false, ignore the following parameters.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
use SW method to update DPM settings by updating SRAM
directly on CI.
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
use SW method to update DPM settings by updating SRAM
directly on Tonga.
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
use SW method to update DPM settings by updating SRAM
directly on Fiji.
Reviewed-by: Alex Deucher <alexdeucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Previously, we would spin waiting for all waiters to wake up and notice
their request had completed before we would reset the seqno upon
wraparound. However, we can mark their waits as complete and wake them
up directly using the existing machinery for handling the flushing of
missed wakeups when idling.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-2-chris@chris-wilson.co.uk
Since commit fd10e2ce99 ("drm/i915/breadcrumbs: Ignore unsubmitted
signalers"), we cancel the signaler when retiring the request and so
upon wraparound, where we wait for all requests to be retired, we no
longer need to spin waiting for the signaling thread to release its
references to the in-flight requests, and so we can assert that the
signaler is idle.
References: fd10e2ce99 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-1-chris@chris-wilson.co.uk
Reject requests to add properties/enums with an overly long name.
Previously we would have just silently truncated the string and exposed
it userspace.
v2: drm_property_create() returns a pointer
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302140300.31110-1-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When parking the engines and their breadcrumbs, if we have waiters left
then they missed their wakeup. Verify that each waiter's seqno did
complete.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-2-chris@chris-wilson.co.uk
The goal here is to try and reduce the latency of signaling additional
requests following the wakeup from interrupt by reducing the list of
to-be-signaled requests from an rbtree to a sorted linked list. The
original choice of using an rbtree was to facilitate random insertions
of request into the signaler while maintaining a sorted list. However,
if we assume that most new requests are added when they are submitted,
we see those new requests in execution order making a insertion sort
fast, and the reduction in overhead of each signaler iteration
significant.
Since commit 56299fb7d9 ("drm/i915: Signal first fence from irq handler
if complete"), we signal most fences directly from notify_ring() in the
interrupt handler greatly reducing the amount of work that actually
needs to be done by the signaler kthread. All the thread is then
required to do is operate as the bottom-half, cleaning up after the
interrupt handler and preparing the next waiter. This includes signaling
all later completed fences in a saturated system, but on a mostly idle
system we only have to rebuild the wait rbtree in time for the next
interrupt. With this de-emphasis of the signaler's role, we want to
rejig it's datastructures to reduce the amount of work we require to
both setup the signal tree and maintain it on every interrupt.
References: 56299fb7d9 ("drm/i915: Signal first fence from irq handler if complete")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-1-chris@chris-wilson.co.uk
The Amlogic Meson GX SoCs, embedded the v2.01a controller, has been also
identified needing this workaround.
This patch adds the corresponding version to enable a single iteration for
this specific version.
Fixes: be41fc55f1 ("drm: bridge: dw-hdmi: Handle overflow workaround based on device version")
Acked-by: Archit Taneja <architt@codeaurora.org>
[narmstrong: s/identifies/identified and rebased against Jernej's change]
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1519386277-25902-1-git-send-email-narmstrong@baylibre.com
error->device_info.has_guc, which we check in capture_uc_state, is set
in capture_gen_state, so the latter needs to be performed first.
v2: rebased
Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Fixes: 7d41ef3479 (drm/i915: Add Guc/HuC firmware details to error state)
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-3-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
some of the static functions used from capture() have the "i915_"
prefix while other don't; most of them take i915 as a parameter, but one
of them derives it internally from error->i915. Let's be consistent by
avoiding prefix for static functions and by getting i915 from
error->i915. While at it, s/dev_priv/i915 in functions that don't
perform register reads.
v2: take i915 from error->i915 (Michal), s/dev_priv/i915,
update commit message
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-2-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The Parfait (version 2.1.0) static code analysis tool found the
following NULL pointer dereference problem.
- drivers/gpu/drm/drm_drv.c
Any calls to drm_minor_get_slot() could result in the return of a NULL
pointer when an invalid DRM device type is encountered. The
return of NULL was removed with BUG() from drm_minor_get_slot().
Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220191157.100960-3-joe.moriarty@oracle.com
The Parfait (version 2.1.0) static code analysis tool found the
following NULL pointer derefernce problem.
- drivers/gpu/drm/drm_vblank.c
Null pointer checks were added to return values from calls to
drm_crtc_from_index(). There is a possibility, however minute, that
crtc->index may not be found when trying to find the struct crtc
from it's assigned index given in drm_crtc_init_with_planes().
3 return checks for NULL where added with a call to
WARN_ON(!crtc).
Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220191157.100960-2-joe.moriarty@oracle.com
In XenGT, ioreq copy is used to trap mmio write and ppgtt write. Both
of them are memory write, ioreq handler couldn't distinguish them. So
ioreq handler probe the ppgtt write handler, if it is succuess, this
ioreq is ppgtt write, otherwise it is mmio write.
So ppgtt write handler should return an error at the failure of finding
page track, it is fatal to implement ioreq handler in XenGT.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
page_track_handler take lock at the beginning, the lock should be released
at the failure of finding page track. Otherwise deadlock will happen.
Fixes: e502a2af4c ("drm/i915/gvt: Provide generic page_track infrastructure for write-protected page")
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Add a new debugfs entry kvmgt_nr_cache_entries under vgpu which shows
the number of entry in dma cache.
$ cat /sys/kernel/debug/gvt/vgpu1/kvmgt_nr_cache_entries
10101
v3: fix compiling error for some configuration. (Xiong Zhang <xiong.y.zhang@intel.com>)
v2: keep debugfs layout flat.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The implementation of current kvmgt implicitly setup dma mapping at MPT
API gfn_to_mfn. First this design against the API's original purpose.
Second, there is no unmap hit in this design. The result is that the
dma mapping keep growing larger and larger. For mutl-vm case, they will
consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree
entries crated in the IOMMU IOVA allocator. Finally, single IOVA
allocation can take as long as ~70ms. Such latency is intolerable.
To address both above issues, this patch introduced two new MPT API:
o dma_map_guest_page - setup dma map for guest page
o dma_unmap_guest_page - cancel dma map for guest page
The kvmgt implements these 2 API. And to reduce dma setup overhead for
duplicated pages (eg. scratch pages), two caches are used: one is for
mapping gfn to struct gvt_dma, another is for mapping dma addr to
struct gvt_dma.
With these 2 new API, the gtt now is able to cancel dma mapping when page
table is invalidated. The dma mapping is not in a gradual increase now.
v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point.
Cc: Hang Yuan <hang.yuan@intel.com>
Cc: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fix below error with minor code refactor.
CHECK drivers/gpu/drm/i915//gvt/handlers.c
drivers/gpu/drm/i915//gvt/handlers.c:203 sanitize_fence_mmio_access() error: 'vgpu' dereferencing possible ERR_PTR()
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fix check error at
CHECK drivers/gpu/drm/i915//gvt/kvmgt.c
drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454)
For failed vgpu create, just show error return in failure message.
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g,
vgpu can't get the correct default context by updating the registers before
inhibit context submission. It always get back the hardware default value
unless the inhibit context submission happened before the 1st time
forcewake put. With this wrong default context, vgpu will run with
incorrect state and meet unknown issues.
The solution is initialize these mmios by adding lri command in ring buffer
of the inhibit context, then gpu hardware has no chance to go down RC6 when
lri commands are right being executed, and then vgpu can get correct
default context for further use.
v3:
- fix code fault, use 'for' to loop through mmio render list(Zhenyu)
v4:
- save the count of engine mmio need to be restored for inhibit context and
refine some comments. (Kevin)
v5:
- code rebase
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
No functional change, just for easy to use.
v4:
- refine comment (Kevin)
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
No functional change. This defination will also be used in future patchesi.
v4:
- refine patch description (Kevin)
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We don't know how many page tables will be shadowed. It varies
considerably corresponding to guest load. Radix tree is a better
choice for us. Since Page Frame Number is used as key so most of
the bits are common.
Here is some performance data (duration in us) of looking up a
element:
Before: (aka. ppgtt_find_shadow_page)
0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325
After: (aka. intel_vgpu_find_spt_by_mfn)
0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108
This time I didn't get the early data of hash table. The data is
measured when desktop is shown.
As last change, the overall benchmark almost is not changed, but
we get better scalability.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch provide generic page_track infrastructure for write-protected
guest page. The old page_track logic gets rewrote and now stays in a new
standalone page_track.c. This page track infrastructure can be both used
by vGUC and GTT shadowing.
The important change is that it uses radix tree instead of hash table.
We don't have a predictable number of pages that will be tracked.
Here is some performance data (duration in us) of looking up a element:
Before: (aka. intel_vgpu_find_tracked_page)
0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291
After: (aka. intel_vgpu_find_page_track)
0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105
The hash table has good performance at beginning, but turns bad with
more pages being tracked even no 3D applications are running. As
expected, radix tree has stable duration and very quick.
The overall benchmark (tested with Heaven Benchmark) marginally improved
since this is not the bottleneck. What we benefit more from this change
is scalability.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Don't extend page_track to mpt layer. Keep MPT simple and clean.
Meanwhile remove gtt.n_tracked_guest_page which doesn't make much
sense.
v2: clean up gtt.n_tracked_guest_page.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The kvmgt's implementation of mpt api {set,unset}_wp_page is not real
write-protection - the data get written before invoke this two api.
As discussed, change the mpt api to match the real behavior.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The target structure of some functions is struct intel_vgpu_ppgtt_spt and
their names are xxx_shadow_page. It should be xxx_shadow_page_table. Let's
use short name 'spt' instead to reduce the length. As well as the hash
table name.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This is a another big one and the GVT shadow page management code is
heavily refined.
The new code only use struct intel_vgpu_ppgtt_spt to represent a vgpu
shadow page table - w/ or wo/ a guest page associated with. A pure shadow
page (no guest page associated) will be used to shadow splited 2M huge
gtt. In this case, the spt.guest_page.gfn should be a zero.
To search a existed shadow page table, we have two new interfaces:
- intel_vgpu_find_spt_by_gfn(), find a spt by guest gfn. It must not
be a pure spt.
- intel_vgpu_find_spt_by_mfn, Find the spt using shadow page mfn in
shadowed PTE.
The oos_page management is remained as what is was.
v2: Split some changes into small standalone patches.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Make the shadow PTE population code clear. Later we will add huge gtt
support based on this.
v2:
- rebase to latest code.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
GTT entry has similar format with the CPU PTE. We'd prefer named macro
instead of hardcode.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Factor out these two interfaces so we can kill some duplicated code in
scheduler.c.
v2:
- rename to intel_vgpu_{get,put}_ppgtt_mm
- refine handle_g2v_notification
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Accurate names help to avoid confusing so improve readability.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This add a new macro gvt_vdbg_mm() to print more verbose logs for
gtt shadowing. The added verbose logs are very useful for debugging.
gvt_vdbg_mm() only comes into effect if VERBOSE_DEBUG is defined by
the developer.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Less code and use existed helper ggtt_set_host_entry.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Separate ggtt and ppgtt since they are different. A little more code but
straightforward.
And move these helpers to gtt.c since that is the only client.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
If we manage an object with a reference count, then its life cycle
must flow the reference count operations. Meanwhile, change the
operation functions to generic name *put* and *get*.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This is a big one and the GVT shadow graphic memory management code is
heavily refined. The new code is more straightforward with less code.
The struct intel_vgpu_mm is restructured to be clearly defined, use
accurate names and some of the original fields are removed which are
really redundant.
Now we only manage ppgtt mm object with mm->ppgtt_mm.lru_list. No need
to mix ppgtt and ggtt together, since one vGPU only has one ggtt object.
v4: Don't invoke ppgtt_free_all_shadow_page before intel_vgpu_destroy_all_ppgtt_mm.
v3: Add GVT_RING_CTX_NR_PDPS to avoid confusing about the PDPs.
v2: Split some changes into small standalone patches.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Always set the graphics values to the max for the
asic type. E.g., some 1 RB chips are actually 1 RB chips,
others are actually harvested 2 RB chips.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99353
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Always set the graphics values to the max for the
asic type. E.g., some 1 RB chips are actually 1 RB chips,
others are actually harvested 2 RB chips.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99353
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
v2: lock dpm level when update pptable by SW method
use SW method to update DPM settings by updating SRAM
directly on Polaris.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
it is used for adjust part of dpm settigs per workloads
to change the natural dpm behavior for better perf or perf/watt.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This features controls vega peak current protection to allow
for a wider compatibility with power supplies.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
used to set PccThrottleLevel and PccResidencyThreshold
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Get gpu info through adev directly in powerplay
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
it is required if a platform supports PCIe root complex
core voltage reduction. After receiving this notification,
SBIOS can apply default PCIe root complex power policy.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Include adev in powerplay instance.
so can visit adev directly instand of through cgs interface.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
use adev as input parameter to create powerplay instance
directly. delete cgs wrap layer for power play create.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
avoid build error:
drivers/gpu/drm/amd/amdgpu/../powerplay/inc/smu9_driver_if.h:342:3: error: redeclaration of enumerator ‘WM_COUNT’
WM_COUNT,
^
In file included from drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:32:0,
from drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:35,
from drivers/gpu/drm/amd/amdgpu/../display/modules/inc/mod_freesync.h:57,
from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_mode.h:48,
from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:55,
from drivers/gpu/drm/amd/amdgpu/../powerplay/inc/amd_powerplay.h:33,
from drivers/gpu/drm/amd/amdgpu/../powerplay/inc/smumgr.h:26,
from drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/vega10_smumgr.c:24:
drivers/gpu/drm/amd/amdgpu/../display/dc/dm_pp_smu.h:43:2: note: previous definition of ‘WM_COUNT’ was here
WM_COUNT,
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The below commit
"drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2"
introduces a slight behavioral change to rmfb. Instead of disabling a crtc
when the primary plane is disabled, it now preserves it.
This change leads to BUG hit while performing atomic commit on amd driver.
As a fix this patch ensures that we disable the CRTC's with NULL FB by returning
-EINVAL and hence triggering fall back to the old behavior and turning off the
crtc in atomic_remove_fb().
V2: Added error check for plane_state and removed sanity check for crtc.
Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For consistency with other DCE generations.
HPD IRQs appear to be working fine.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The ring status can change during GPU reset, but we still need to be
able to schedule TTM buffer moves in the meantime.
Otherwise we can ran into problems because of aborted move/fill
operations during GPU resets.
v2: still check if ring is available during direct submit.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>