In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.
v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
Mixed C99 and kernel types use is getting ugly. Prefer kernel types.
sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g'
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Use INTEL_GEN to simplify the code for SKL+ platforms.
v2:
- split the enabling code into final one to identify any regression.
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Fei Jiang <fei.jiang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
It doesn't need to be changed, make it const. The string literals should
anyway be referred to as const data.
The following gets moved to rodata section:
0000000000000080 l O .rodata 0000000000001c00 cmd_info
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
It doesn't need to be changed, make it const. The string literals should
anyway be referred to as const data.
The following gets moved to rodata section:
0000000000000410 l O .rodata 0000000000000018 decode_info_mi
0000000000000390 l O .rodata 0000000000000018 decode_info_3d_media
00000000000003e0 l O .rodata 0000000000000018 decode_info_2d
0000000000000330 l O .rodata 0000000000000018 decode_info_mfx_vc
00000000000002e0 l O .rodata 0000000000000018 decode_info_vebox
0000000000000300 l O .rodata 0000000000000028 sub_op_vebox
0000000000000360 l O .rodata 0000000000000028 sub_op_mfx_vc
00000000000003c0 l O .rodata 0000000000000020 sub_op_3d_media
0000000000000400 l O .rodata 0000000000000010 sub_op_2d
0000000000000430 l O .rodata 0000000000000010 sub_op_mi
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
s/ME_SEMAPHORE_/MI_SEMAPHORE_
Signed-off-by: Xinyun Liu <xinyun.liu@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Merge tag 'gvt-next-2018-09-04'
drm-intel-next-2018-09-06-1:
UAPI Changes:
- GGTT coherency GETPARAM: GGTT has turned out to be non-coherent for some
platforms, which we've failed to communicate to userspace so far. SNA was
modified to do extra flushing on non-coherent GGTT access, while Mesa will
mitigate by always requiring WC mapping (which is non-coherent anyway).
- Neuter Resource Streamer uAPI: There never really were users for the feature,
so neuter it while keeping the interface bits for compatibility. This is a
long due item from past.
Cross-subsystem Changes:
- Backmerge of branch drm-next-4.19 for DP_DPCD_REV_14 changes
Core Changes:
- None
Driver Changes:
- A load of Icelake (ICL) enabling patches (Paulo, Manasi)
- Enabled full PPGTT for IVB,VLV and HSW (Chris)
- Bugzilla #107113: Distribute DDB based on display resolutions (Mahesh)
- Bugzillas #100023,#107476,#94921: Support limited range DP displays (Jani)
- Bugzilla #107503: Increase LSPCON timeout (Fredrik)
- Avoid boosting GPU due to an occasional stall in interactive workloads (Chris)
- Apply GGTT coherency W/A only for affected systems instead of all (Chris)
- Fix for infinite link training loop for faulty USB-C MST hubs (Nathan)
- Keep KMS functional on Gen4 and earlier when GPU is wedged (Chris)
- Stop holding ppGTT reference from closed VMAs (Chris)
- Clear error registers after error capture (Lionel)
- Various Icelake fixes (Anusha, Jyoti, Ville, Tvrtko)
- Add missing Coffeelake (CFL) PCI IDs (Rodrigo)
- Flush execlists tasklet directly from reset-finish (Chris)
- Fix LPE audio runtime PM (Chris)
- Fix detection of out of range surface positions (GLK/CNL) (Ville)
- Remove wait-for-idle for PSR2 (Dhinakaran)
- Power down existing display hardware resources when display is disabled (Chris)
- Don't allow runtime power management if RC6 doesn't exist (Chris)
- Add debugging checks for runtime power management paths (Imre)
- Increase symmetry in display power init/fini paths (Imre)
- Isolate GVT specific macros from i915_reg.h (Lucas)
- Increase symmetry in power management enable/disable paths (Chris)
- Increase IP disable timeout to 100 ms to avoid DRM_ERROR (Imre)
- Fix memory leak from HDMI HDCP write function (Brian, Rodrigo)
- Reject Y/Yf tiling on interlaced modes (Ville)
- Use a cached mapping for the physical HWS on older gens (Chris)
- Force slow path of writing relocations to buffer if unable to write to userspace (Chris)
- Do a full device reset after being wedged (Chris)
- Keep forcewake counts over reset (in case of debugfs user) (Imre, Chris)
- Avoid false-positive errors from power wells during init (Imre)
- Reset engines forcibly in exchange of declaring whole device wedged (Mika)
- Reduce context HW ID lifetime in preparation for Icelake (Chris)
- Attempt to recover from module load failures (Chris)
- Keep select interrupts over a reset to avoid missing/losing them (Chris)
- GuC submission backend improvements (Jakub)
- Terminate context images with BB_END (Chris, Lionel)
- Make GCC evaluate GGTT view struct size assertions again (Ville)
- Add selftest to exercise suspend/hibernate code-paths for GEM (Chris)
- Use a full emulation of a user ppgtt context in selftests (Chris)
- Exercise resetting in the middle of a wait-on-fence in selftests (Chris)
- Fix coherency issues on selftests for Baytrail (Chris)
- Various other GEM fixes / self-test updates (Chris, Matt)
- GuC doorbell self-tests (Daniele)
- PSR mode control through debugfs for IGTs (Maarten)
- Degrade expected WM latency errors to DRM_DEBUG_KMS (Chris)
- Cope with errors better in MST link training (Dhinakaran)
- Fix WARN on KBL external displays (Azhar)
- Power well code cleanups (Imre)
- Fixes to PSR debugging (Dhinakaran)
- Make forcewake errors louder for easier catching in CI (WARNs) (Chris)
- Fortify tiling code against programmer errors (Chris)
- Bunch of fixes for CI exposed corner cases (multiple authors, mostly Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180907105446.GA22860@jlahtine-desk.ger.corp.intel.com
If a register is not cmd accessible, should not just print error
message. Return error here so as not to deliver this cmd.
v2: return -EBADRQC to align with return value elsewhere. (kevin tian)
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
MI_NOOP is a common command appearing in almost all command buffers, put it
into a fastpath can improve perfomance, especially in command buffers
contains lots of MI_NOOPs (0s).
Take glmark2 as an example, 3% performance increase is observed after
introduced this patch. Meanwhile, in case where abundant in MI_NOOPs,
up to 12% performance increase is measured.
v2: use lowercase for index of MI_NOOP in cmd_info (zhenyu wang)
Signed-off-by: Li Weinan <weinan.z.li@intel.com>
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Commit cd7e 61b9"init mmio by lri command in vgpu inhibit context"
initializes registers saved/restored in context with its vreg value
through lri command in ring buffer. It relies on vreg got updated
on every guest access. There is a case found that Linux guest uses
lri command in inhibit-ctx to update the register. This patch adds
vreg update on this case.
v2: move mmio_attribute functions to gvt.h (Zhenyu)
v3: use mask_mmio_write in vreg update
v4: refine codes and add more comments (Zhenyu)
Fixes: cd7e61b9("drm/i915/gvt: init mmio by lri command in vgpu inhibit context")
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
on all platforms back to HSW. As well many other fix and improvements,
Including:
- Use GEM suspend when aborting initialization (Chris)
- Change i915_gem_fault to return vm_fault_t (Chris)
- Expand VMA to Non gem object entities (Chris)
- Improve logs for load failure, but quite logging on fault injection to avoid noise on CI (Chris)
- Other page directory handling fixes and improvements for gen6 (Chris)
- Other gtt clean-up removing redundancies and unused checks (Chris)
- Reorder aliasing ppgtt fini (Chris)
- Refactor of unsetting obg->mm.pages (Chris)
- Apply batch location restrictions before pinning (Chris)
- Ringbuffer fixes for context restore (Chris)
- Execlist fixes on freeing error pointer on allocation error (Chris)
- Make closing request flush mandatory (Chris)
- Move GEM sanitize from resume_early to resume (Chris)
- Improve debug dumps (Chris)
- Silent compiler for selftest (Chris)
- Other execlists changes to improve hangcheck and reset.
- Many gtt page directory fixes and improvements (Chris)
- Reorg context workarounds (Chris)
- Avoid ERR_PTR dereference on selftest (Chris)
Other GEM related work:
- Stop trying to reset GPU if reset failed (Mika)
- Add HW workaround for KBL to fix GPU reset (Mika)
- Fix context ban and hang accounting for client (Mika)
- Fixes on OA perf (Michel, Jani)
- Refactor on GuC log mechanisms (Piotr)
- Enable provoking vertex fix on Gen9 system (Kenneth)
More ICL patches for Display enabling:
- ICL - 10-bit support for HDMI (RK)
- ICL - Start adding TBT PLL (Paulo)
- ICL - DDI HDMK level selection (Manasi)
- ICL - GMBUS GPIO pin mapping fix (Mahesh)
- ICL - Adding DP_AUX_E support (James)
- ICL - Display interrupts handling (DK)
Other display fixes and improvements:
- Fix sprite destination color keying on SKL+ (Ville)
- Fixes and improvements on PCH detection, specially for non PCH systems (Jani)
- Document PCH_NOP (Lucas)
- Allow DBLSCAN user modes with eDP/LVDS/DSI (Ville)
- Opregion and ACPI cleanup and organization (Jani)
- Kill delays when activation psr (Rodrigo)
- ...and a consequent fix of the psr activation flow (DK)
- Fix HDMI infoframe setting (Imre)
- Fix Display interrupts and modes on old gens (Ville)
- Start switching to kernel unsigned int types (Jani)
- Introduction to Amber Lake and Whiskey Lake platforms (Jose)
- Audio clock fixes for HBR3 (RK)
- Standardize i915_reg.h definitions according to our doc and checkpatch (Paulo)
- Remove unused timespec_to_jiffies_timeout function (Arnd)
- Increase the scope of PSR wake fix for other VBTs out there (Vathsala)
- Improve debug msgs with prop name/id (Ville)
- Other clean up on unecessary cursor size defines (Ville)
- Enforce max hdisplay/hblank_start limits on HSW/BDW (Ville)
- Make ELD pointers constant (Jani)
- Fix for PSR VBT parse (Colin)
- Add warn about unsupported CDCLK rates (Imre)
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbKsMqAAoJEPpiX2QO6xPKI64H/0dHkMxw7/D83eODTJteDFBN
h3tdBnLFlPfeG3ZWDeSs04/dM4e9YacMN7v53j1ia4eW/F1ms0TLcegcuPqYafTW
H8fhwGB2B5gmr5hLfh5joQkxvaucQMFdg95fWRqir93VrKvVJAJEYNcaiGniejDf
qqiZue6DgAzli0zjAprfbQsnJ17TyRtnxm8lLIcFcHPoayHBzAUBZQEP6cA5qe/Y
/2ahGfkYOVVWY08DHaioDBOLUEUbxCC1AvMlv9VbtKmyPoQjTIW/1iTq0RRxDoGb
BwfDvigSiFAmpYEfVENB0qUd9e/0WhMboSnMrfzEcF2yUn4xoJx5nbmkRFkr1jI=
=mfO6
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2018-06-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Chris is doing many reworks that allow us to get full-ppgtt supported
on all platforms back to HSW. As well many other fix and improvements,
Including:
- Use GEM suspend when aborting initialization (Chris)
- Change i915_gem_fault to return vm_fault_t (Chris)
- Expand VMA to Non gem object entities (Chris)
- Improve logs for load failure, but quite logging on fault injection to avoid noise on CI (Chris)
- Other page directory handling fixes and improvements for gen6 (Chris)
- Other gtt clean-up removing redundancies and unused checks (Chris)
- Reorder aliasing ppgtt fini (Chris)
- Refactor of unsetting obg->mm.pages (Chris)
- Apply batch location restrictions before pinning (Chris)
- Ringbuffer fixes for context restore (Chris)
- Execlist fixes on freeing error pointer on allocation error (Chris)
- Make closing request flush mandatory (Chris)
- Move GEM sanitize from resume_early to resume (Chris)
- Improve debug dumps (Chris)
- Silent compiler for selftest (Chris)
- Other execlists changes to improve hangcheck and reset.
- Many gtt page directory fixes and improvements (Chris)
- Reorg context workarounds (Chris)
- Avoid ERR_PTR dereference on selftest (Chris)
Other GEM related work:
- Stop trying to reset GPU if reset failed (Mika)
- Add HW workaround for KBL to fix GPU reset (Mika)
- Fix context ban and hang accounting for client (Mika)
- Fixes on OA perf (Michel, Jani)
- Refactor on GuC log mechanisms (Piotr)
- Enable provoking vertex fix on Gen9 system (Kenneth)
More ICL patches for Display enabling:
- ICL - 10-bit support for HDMI (RK)
- ICL - Start adding TBT PLL (Paulo)
- ICL - DDI HDMK level selection (Manasi)
- ICL - GMBUS GPIO pin mapping fix (Mahesh)
- ICL - Adding DP_AUX_E support (James)
- ICL - Display interrupts handling (DK)
Other display fixes and improvements:
- Fix sprite destination color keying on SKL+ (Ville)
- Fixes and improvements on PCH detection, specially for non PCH systems (Jani)
- Document PCH_NOP (Lucas)
- Allow DBLSCAN user modes with eDP/LVDS/DSI (Ville)
- Opregion and ACPI cleanup and organization (Jani)
- Kill delays when activation psr (Rodrigo)
- ...and a consequent fix of the psr activation flow (DK)
- Fix HDMI infoframe setting (Imre)
- Fix Display interrupts and modes on old gens (Ville)
- Start switching to kernel unsigned int types (Jani)
- Introduction to Amber Lake and Whiskey Lake platforms (Jose)
- Audio clock fixes for HBR3 (RK)
- Standardize i915_reg.h definitions according to our doc and checkpatch (Paulo)
- Remove unused timespec_to_jiffies_timeout function (Arnd)
- Increase the scope of PSR wake fix for other VBTs out there (Vathsala)
- Improve debug msgs with prop name/id (Ville)
- Other clean up on unecessary cursor size defines (Ville)
- Enforce max hdisplay/hblank_start limits on HSW/BDW (Ville)
- Make ELD pointers constant (Jani)
- Fix for PSR VBT parse (Colin)
- Add warn about unsupported CDCLK rates (Imre)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# gpg: Signature made Thu 21 Jun 2018 07:12:10 AM AEST
# gpg: using RSA key FA625F640EEB13CA
# gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>"
# gpg: aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C E2A3 FA62 5F64 0EEB 13CA
Link: https://patchwork.freedesktop.org/patch/msgid/20180625165622.GA21761@intel.com
Handle BXT cmd_parser as SKL/KBL.
v2: All supported platforms share the same routines.
Remove the platform check by now and let is_supported_device()
be the gate keeper.
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
the cmd_reg_handler() is called by cmds LRM, PIPE_CTRL, SRM...
for LRM, SRM, we cannot get write data in a simple way.
On other side, the force_to_nonpriv reigsters will only be written in LRI
in current drivers. so we don't want to bother the handler to handle those
memory access cmds, just leave a print message here.
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Return error will cause vm hang and enter failsafe mode.
However, we don't want that happen on detecting an wrong force_to_nonpriv
register write.
Therefore, we just omit the wrong write or patch it to default value.
v2: only return 0 on detecting lri write of registers outside whitelist,
but still return error on other error conditions. (zhenyu wang)
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Reviewed-by: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Each ring has a NOPID register and currently they are regarded as default
value of force_to_nonpriv registers in guest drivers
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
For perfomance purpose, scanning of non-privileged batch buffer is turned
off by default. But for debugging purpose, it can be turned on via debugfs.
After scanning, we submit the original non-privileged batch buffer into
hardware, so that the scanning is only a peeking window of guest submitted
commands and will not affect the execution results.
v4:
- refine debugfs print format&content (zhenyu wang)
- print engine id instread of engine name to prevent potential memory leak
in debugfs warning message. (zhenyu wang)
v3:
- change vgpu->scan_nonprivbb from type bool to u32, so it is able to
selectively turn on/off scanning of non-privileged batch buffer on engine
level. e.g.
if vgpu->scan_nonprivbb=3, then it will scan non-privileged batch buffer
on engine 0 and 1.
- in debugfs interface to set vgpu->scan_nonprivbb, print warning message
to warn user and explicitly tell state change in kernel log (zhenyu wang)
v2:
- rebase
- update comments for start_gma_offset (henry)
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
The error exit path when a duplicate is found does not kfree and cmd_entry
struct and hence there is a small memory leak. Fix this by kfree'ing it.
Detected by CoverityScan, CID#1370198 ("Resource Leak")
Fixes: be1da7070a ("drm/i915/gvt: vGPU command scanner")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
GVT-g dispatches request to host i915 and depends on i915 notify
ring interrupt mechanism to check completion of request.
For now MI_USER_INTERRUPT in guest requests is passed through
in GVT-g cmd parser and i915 does not use it, which causes
unnecessary interrupt handling in i915.
On the other hand, if several requests from guest are combined into
one request in and contain MI_USER_INTERRUPT in the middle of
combined request. GVT-g still has to wait on the whole request to
complete to inject user interrupts to guest.
This patch makes all the MI_USER_INTERRUPT nop to save some interrupt
handling.
Here is test result to run glmark2 on guest for 10 seconds:
host master interrupts number is reduced from 16021 to 11162
host user interrupts number is reduced from 7936 to 3536
v2:
- revise commit message. (Kevin)
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Once the ring buffer is copied to ring_scan_buffer and scanned,
the shadow batch buffer start address is only updated into
ring_scan_buffer, not the real ring address allocated through
intel_ring_begin in later copy_workload_to_ring_buffer.
This patch is only to set the right shadow batch buffer address
from Ring buffer, not include the shadow_wa_ctx.
v2:
- refine some comments. (Zhenyu)
v3:
- fix typo in title. (Zhenyu)
v4:
- remove the unnecessary comments. (Zhenyu)
- add comments in bb_start_cmd_va update. (Zhenyu)
Fixes: 0a53bc07f0 ("drm/i915/gvt: Separate cmd scan from request allocation")
Cc: stable@vger.kernel.org # v4.15
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yulei Zhang <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaW+iVAAoJEHm+PkMAQRiGCDsIAJALNpX7odTx/8y+yCSWbpBH
E57iwr4rmnI6tXJY6gqBUWTYnjAcf4b8IsHGCO6q3WIE3l/kt+m3eA21a32mF2Db
/bfPGTOWu5LoOnFqzgH2kiFuC3Y474toxpld2YtkQWYxi5W7SUtIHi/jGgkUprth
g15yPfwYgotJd/gpmPfBDMPlYDYvLlnPYbTG6ZWdMbg39m2RF2m0BdQ6aBFLHvbJ
IN0tjCM6hrLFBP0+6Zn60pevUW9/AFYotZn2ankNTk5QVCQm14rgQIP+Pfoa5WpE
I25r0DbkG2jKJCq+tlgIJjxHKD37GEDMc4T8/5Y8CNNeT9Q8si9EWvznjaAPazw=
=o5gx
-----END PGP SIGNATURE-----
BackMerge tag 'v4.15-rc8' into drm-next
Linux 4.15-rc8
Daniel requested this for so the intel CI won't fall over on drm-next
so often.
We had previous hack that tried to accept either i915_reg_t or offset
value to access vGPU virtual/shadow regs which broke that purpose to
be type safe in context. This one trys to explicitly separate the usage
of typed mmio reg with real offset.
Old vgpu_vreg(offset) helper is used only for offset now with new
vgpu_vreg_t(reg) is used for i915_reg_t only. Convert left usage
of that to new helper.
Also fixed left KASAN warning issues caused by previous hack.
v2: rebase, fixup against recent mmio switch change
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Save and restore the mocs regs of one VM in GVT-g burning too much CPU
utilization. Add LRI command scan to monitor the change of mocs registers,
save the state in vreg, and use delta update policy to restore them.
It can obviously reduce the MMIO r/w count, and improve the performance
of context switch.
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
An earlier fix changed the return type from find_bb_size however the
integer return is being assigned to a unsigned int so the -ve error
check will never be detected. Make bb_size an int to fix this.
Detected by CoverityScan CID#1456886 ("Unsigned compared against 0")
Fixes: 1e3197d6ad ("drm/i915/gvt: Refine error handling for perform_bb_shadow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit 24f8a29af4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
I have seen the cmd parser dump partial odd info. Stop that and only dump
the full verbose info when debug enabled.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
As there is already an I915_GTT_PAGE_SIZE marco in i915, let GVT-g use it
as well. Also this patch re-names some GTT marcos with additional prefix.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
1) Use standard i915 GEM object sequence to access the shadow batch buffer.
2) Manage i915 vma life cycle to solve one FIXME.
v2:
- Refine code structure.
- Refine the usage of GEM APIs.
- Add the missing lock/unlock in release_shadow_batch_buffer.
Test on my SKL NuC.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Returns the error code if something is wrong and the size of batch buffer
is passed through the pointer.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
An earlier fix changed the return type from find_bb_size however the
integer return is being assigned to a unsigned int so the -ve error
check will never be detected. Make bb_size an int to fix this.
Detected by CoverityScan CID#1456886 ("Unsigned compared against 0")
Fixes: 1e3197d6ad ("drm/i915/gvt: Refine error handling for perform_bb_shadow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Generally, there are 3 types of errors during command scan: a) some
commands might be unknown with EBADRQC; b) some cmd access invalid
address with EFAULT; c) some unexpected force nonpriv cmd with EPERM.
later the healthy state can be judged through the return error.
v2:
- remove some internal i915 errors rating. (Zhenyu)
v3:
- the healthy state is judged through the internal defined return
error. (Zhenyu)
- force non priv cmd error can be ignored. (Kevin)
v4:
- reuse standard defined errno instead of recreate, e.g EBADRQC for
unknown cmd, EFAULT for invalid address, EPERM for nonpriv. (Zhenyu)
v5:
- remove some irrelevant code for the patch.
- fix typo of vgpu_is_vm_unhealthy. (Zhenyu)
v6:
- move the healthy check and failsafe code into another patch. (Zhenyu)
v7:
- polish title and commit message. (Zhenyu)
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Move ring scan buffers into intel_vgpu_submission since they belongs to
a part of vGPU submission stuffs.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
"reserved" means reserve something from somewhere. Actually they are
buffers used by command scanner. Rename it to ring_scan_buffer.
v2:
- Remove the usage of an extra variable. (Zhenyu)
Fixes: 0a53bc07f0 ("drm/i915/gvt: Separate cmd scan from request allocation")
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZ9kEFAAoJEHm+PkMAQRiGw6wH/0j197qyGd0hkVFMJO6LAgN3
KQWS4nZ5BkVDocwv0RVnUJTtXqU1eozFgdVEtSoaFXpzlHGuptR2Tau9efDCJ7w3
/utZxqvhGebZd2T+j+/o/LE8BRQxhADBNJq2D/o0WNt8ecxuG0GIkhkEYt/o3z1v
/sxlwVwzXB7Dc/h1WcgGJG7cS6L9KzzAzGAS/iNvdFrPOygHBv8c0MxVZIiBIeeK
1nZdyvbyM8uenSyG+prGt9ENrqXZxxfwUxIchi2V7A9m1WmD5zijNkf1JCWji/O+
UsA1auxna7MwoxjxqZuGm4MlKOwZ+8xutk4JGgc+aP/ulndJbJYu+4op/3vaFBM=
=Mhx+
-----END PGP SIGNATURE-----
Backmerge tag 'v4.14-rc7' into drm-next
Linux 4.14-rc7
Requested by Ben Skeggs for nouveau to avoid major conflicts,
and things were getting a bit conflicty already, esp around amdgpu
reverts.
Need to check valid state for per_ctx bb and bypass batch buffer
combine for scan if necessary. Otherwise adding invalid MI batch
buffer start cmd for per_ctx bb will cause scan failure, which is
taken as -EFAULT now so vGPU would be put in failsafe. This trys
to fix that by checking per_ctx bb valid state. Also remove old
invalid WARNING that indirect ctx bb shouldn't depend on valid
per_ctx bb.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
fix the wrong return type and return error once the unknown
command is scanned.
v2:
- separate this error handle from healthy rating code. (Zhenyu)
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Currently i915 request structure and shadow ring buffer are allocated
before command scan, so it will have to restore to previous states once
any error happens afterwards in the long dispatch_workload path.
This patch is to introduce a reserved ring buffer created at the beginning
of vGPU initialization. Workload will be coped to this reserved buffer and
be scanned first, the i915 request and shadow ring buffer are only
allocated after the result of scan is successful.
To balance the memory usage and buffer alloc time, the coming bigger ring
buffer will be reallocated and kept until more bigger buffer is coming.
v2:
- use kmalloc for the smaller ring buffer, realloc if required. (Zhenyu)
v3:
- remove the dynamically allocated ring buffer. (Zhenyu)
v4:
- code style polish.
- kfree previous allocated buffer once kmalloc failed. (Zhenyu)
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZpRPIAAoJEAx081l5xIa+kCIP/2m2q0jBmCATvXXwrMBH0zNk
4lm9yIfl9pmluJP97aklvkeKF77chhost76+hv+0sQ9ZsJD8koHWv5WyTHEs7Cfn
NpmtGPqYlIZsWNSwW0OFF/XzllgLCVEWa+W/7ryYzPZrSEZr6Ge4HE0qS3LfuLJv
K89amZWHkP5ysPZ1uxRBzHtZfNAhdyjYVTUntCR7gj3DYv3yNdeZu+/epfcWK2w/
Q+ggoy644vX/yzy5L5zCGL/J1BjStDuec7sgAKTlNx4TwBUmp2wsfhEdovQBGFiu
t5PHMajvrBRqSJWDIAZSUfjQzIMSz517J9LWeChU7KtAClNJQJEabbu4CoX4aEmG
UbSzEe0IxnxQ4842jcqQXZ+mevlNIEIBVSNR7dXi17jL3Ts+APQgrYjRJYVk2ipg
uQ9TwkeVVu2WRGyU8iRQrXAZI7+O3p4UnbNPjeG2qACD2Ur7Z3n7b0mhNFPOLzO4
gbIv4D6CcUB/vltl+vhZTW3P50oMCVSq8ScCpY8CGo29mZ5vypj5PTS+W8FsyY3Z
ypyMqWg/DyxKlOoO+aK8EmXuZmgtDR4kb8asltH/S1A0NZkzjrFkKgs10Cp6EjJy
Zz1BWa1KKEpdN6yp+jrbJKjf9MJ7K2RPGv3bxWnCCdNv4j49rk4t3IHqvcihddsd
XXFQB5zE7Pz0ROi/VkXR
=5fxW
-----END PGP SIGNATURE-----
Merge tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This is the main drm pull request for 4.14 merge window.
I'm sending this early, as my continuing journey into fatherhood is
occurring really soon now, I'm going to be mostly useless for the next
couple of weeks, though I may be able to read email, I doubt I'll be
doing much patch applications or git sending. If anything urgent pops
up I've asked Daniel/Jani/Alex/Sean to try and direct stuff towards
you.
Outside drm changes:
Some rcar-du updates that touch the V4L tree, all acks should be in
place. It adds one export to the radix tree code for new i915 use
case. There are some minor AGP cleanups (don't see that too often).
Changes to the vbox driver in staging to avoid breaking compilation.
Summary:
core:
- Atomic helper fixes
- Atomic UAPI fixes
- Add YCBCR 4:2:0 support
- Drop set_busid hook
- Refactor fb_helper locking
- Remove a bunch of internal APIs
- Add a bunch of better default handlers
- Format modifier/blob plane property added
- More internal header refactoring
- Make more internal API names consistent
- Enhanced syncobj APIs (wait/signal/reset/create signalled)
bridge:
- Add Synopsys Designware MIPI DSI host bridge driver
tiny:
- Add Pervasive Displays RePaper displays
- Add support for LEGO MINDSTORMS EV3 LCD
i915:
- Lots of GEN10/CNL support patches
- drm syncobj support
- Skylake+ watermark refactoring
- GVT vGPU 48-bit ppgtt support
- GVT performance improvements
- NOA change ioctl
- CCS (color compression) scanout support
- GPU reset improvements
amdgpu:
- Initial hugepage support
- BO migration logic rework
- Vega10 improvements
- Powerplay fixes
- Stop reprogramming the MC
- Fixes for ACP audio on stoney
- SR-IOV fixes/improvements
- Command submission overhead improvements
amdkfd:
- Non-dGPU upstreaming patches
- Scratch VA ioctl
- Image tiling modes
- Update PM4 headers for new firmware
- Drop all BUG_ONs.
nouveau:
- GP108 modesetting support.
- Disable MSI on big endian.
vmwgfx:
- Add fence fd support.
msm:
- Runtime PM improvements
exynos:
- NV12MT support
- Refactor KMS drivers
imx-drm:
- Lock scanout channel to improve memory bw
- Cleanups
etnaviv:
- GEM object population fixes
tegra:
- Prep work for Tegra186 support
- PRIME mmap support
sunxi:
- HDMI support improvements
- HDMI CEC support
omapdrm:
- HDMI hotplug IRQ support
- Big driver cleanup
- OMAP5 DSI support
rcar-du:
- vblank fixes
- VSP1 updates
arcgpu:
- Minor fixes
stm:
- Add STM32 DSI controller driver
dw_hdmi:
- Add support for Rockchip RK3399
- HDMI CEC support
atmel-hlcdc:
- Add 8-bit color support
vc4:
- Atomic fixes
- New ioctl to attach a label to a buffer object
- HDMI CEC support
- Allow userspace to dictate rendering order on submit ioctl"
* tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linux: (1074 commits)
drm/syncobj: Add a signal ioctl (v3)
drm/syncobj: Add a reset ioctl (v3)
drm/syncobj: Add a syncobj_array_find helper
drm/syncobj: Allow wait for submit and signal behavior (v5)
drm/syncobj: Add a CREATE_SIGNALED flag
drm/syncobj: Add a callback mechanism for replace_fence (v3)
drm/syncobj: add sync obj wait interface. (v8)
i915: Use drm_syncobj_fence_get
drm/syncobj: Add a race-free drm_syncobj_fence_get helper (v2)
drm/syncobj: Rename fence_get to find_fence
drm: kirin: Add mode_valid logic to avoid mode clocks we can't generate
drm/vmwgfx: Bump the version for fence FD support
drm/vmwgfx: Add export fence to file descriptor support
drm/vmwgfx: Add support for imported Fence File Descriptor
drm/vmwgfx: Prepare to support fence fd
drm/vmwgfx: Fix incorrect command header offset at restart
drm/vmwgfx: Support the NOP_ERROR command
drm/vmwgfx: Restart command buffers after errors
drm/vmwgfx: Move irq bottom half processing to threads
drm/vmwgfx: Don't use drm_irq_[un]install
...
once error happens in shadow_indirect_ctx function, the variable
wa_ctx->indirect_ctx.obj is not initialized but accessed, so the
kernel null point panic occurs.
Fixes: 894cf7d156 ("drm/i915/gvt: i915_gem_object_create() returns an error pointer")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Use the exist function intel_gvt_ggtt_validate_range to replace
these duplicated code that do the same thing.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
To perform the workload scan and shadow in ELSP writing stage for
performance consideration, the workload scan and shadow stuffs
should be factored out from dispatch_workload().
v2:Put context pin before i915_add_request;
Refine the comments;
Rename some APIs;
v3:workload->status should set only when error happens.
v4:i915_add_request is must to have after i915_gem_request_alloc.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>