For the vGPU workloads, now GVT-g use per vGPU scheduler, the per-ring
work_thread only pick workload belongs to the current vGPU. And with time
slice based scheduler, it waits all the engines become idle before do vGPU
switch. So we can run free dispatch in per-ring work_thread, different ring
running in different 'vGPU' won't happen.
For the workloads between vGPU and Host, this scheduler_mutex can't block
host to dispatch workload into other ring engines.
Here remove this mutex since it impacts the performance when applications
use more than 1 ring engines in 1 vgpu.
ring0 running in vGPU1, ring1 running in Host. Will happen.
ring0 running in vGPU1, ring1 running in vGPU2. Won't happen.
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The command buffer address in context like ring buffer base address
and wa_ctx address need to be audit to make sure they are in the
valid GGTT range.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
It will causes memory leak, if the function setup_spt_oos() fail,
in the function intel_gvt_init_gtt(),
which allocated by get_zeroed_page() and mapped by dma_map_page().
Unmap and free the page, after STP oos initialize fail,
it will fix this issue.
Signed-off-by: Zhou, Wenjia <zhiyuan_zhu@htc.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Resync with the main drm-next pull request for 4.13. What we really
need is to fully resync with pending drm-misc, but that's not yet
possible due to the still ongoing merge window.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Pull drm updates from Dave Airlie:
"This is the main pull request for the drm, I think I've got one later
driver pull for mediatek SoC driver, I'm undecided on if it needs to
go to you yet.
Otherwise summary below:
Core drm:
- Atomic add driver private objects
- Deprecate preclose hook in modern drivers
- MST bandwidth tracking
- Use kvmalloc in more places
- Add mode_valid hook for crtc/encoder/bridge
- Reduce sync_file construction time
- Documentation updates
- New DRM synchronisation object support
New drivers:
- pl111 - pl111 CLCD display controller
Panel:
- Innolux P079ZCA panel driver
- Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels
- panel-samsung-s6e3ha2: Add s6e3hf2 panel support
i915:
- SKL+ watermark fixes
- G4x/G33 reset improvements
- DP AUX backlight improvements
- Buffer based GuC/host communication
- New getparam for (sub)slice infomation
- Cannonlake and Coffeelake initial patches
- Execbuf optimisations
radeon/amdgpu:
- Lots of Vega10 bug fixes
- Preliminary raven support
- KIQ support for compute rings
- MEC queue management rework
- DCE6 Audio support
- SR-IOV improvements
- Better radeon/amdgpu selection support
nouveau:
- HDMI stereoscopic support
- Display code rework for >= GM20x GPUs
msm:
- GEM rework for fine-grained locking
- Per-process pagetable work
- HDMI fixes for Snapdragon 820.
vc4:
- Remove 256MB CMA limit from vc4
- Add out-fence support
- Add support for cygnus
- Get/set tiling ioctls support
- Add T-format tiling support for scanout
zte:
- add VGA support.
etnaviv:
- Thermal throttle support for newer GPUs
- Restore userspace buffer cache performance
- dma-buf sync fix
stm:
- add stm32f429 display support
exynos:
- Rework vblank handling
- Fixup sw-trigger code
sun4i:
- V3s display engine support
- HDMI support for older SoCs
- Preliminary work on dual-pipeline SoCs.
rcar-du:
- VSP work
imx-drm:
- Remove counter load enable from PRE
- Double read/write reduction flag support
tegra:
- Documentation for the host1x and drm driver.
- Lots of staging ioctl fixes due to grate project work.
omapdrm:
- dma-buf fence support
- TILER rotation fixes"
* tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linux: (1270 commits)
drm: Remove unused drm_file parameter to drm_syncobj_replace_fence()
drm/amd/powerplay: fix bug fail to remove sysfs when rmmod amdgpu.
amdgpu: Set cik/si_support to 1 by default if radeon isn't built
drm/amdgpu/gfx9: fix driver reload with KIQ
drm/amdgpu/gfx8: fix driver reload with KIQ
drm/amdgpu: Don't call amd_powerplay_destroy() if we don't have powerplay
drm/ttm: Fix use-after-free in ttm_bo_clean_mm
drm/amd/amdgpu: move get memory type function from early init to sw init
drm/amdgpu/cgs: always set reference clock in mode_info
drm/amdgpu: fix vblank_time when displays are off
drm/amd/powerplay: power value format change for Vega10
drm/amdgpu/gfx9: support the amdgpu.disable_cu option
drm/amd/powerplay: change PPSMC_MSG_GetCurrPkgPwr for Vega10
drm/amdgpu: Make amdgpu_cs_parser_init static (v2)
drm/amdgpu/cs: fix a typo in a comment
drm/amdgpu: Fix the exported always on CU bitmap
drm/amdgpu/gfx9: gfx_v9_0_enable_gfx_static_mg_power_gating() can be static
drm/amdgpu/psp: upper_32_bits/lower_32_bits for address setup
drm/amd/powerplay/cz: print message if smc message fails
drm/amdgpu: fix typo in amdgpu_debugfs_test_ib_init
...
The late 2009, 27 inch Apple iMac10,1 has an
internal eDP display and an external Mini-
Displayport output, driven by a DCE-3.2, RV730
Radeon Mobility HD-4670.
The machine worked fine in a dual-display setup
with eDP panel + externally connected HDMI
or DVI-D digital display sink, connected via
MiniDP to DVI or HDMI adapter.
However, booting the machine single-display with
only eDP panel results in a completely black
display - even backlight powering off, as soon as
the radeon modesetting driver loads.
This patch fixes the single dispay eDP case by
assigning encoders based on dig->linkb, similar
to DCE-4+. While this should not be generally
necessary (Alex: "...atom on normal boards
should be able to handle any mapping."), Apple
seems to use some special routing here.
One remaining problem not solved by this patch
is that an external Minidisplayport->DP sink
does still not work on iMac10,1, whereas external
DVI and HDMI sinks continue to work.
The problem affects at least all tested kernels
since Linux 3.13 - didn't test earlier kernels, so
backporting to stable probably makes sense.
v2: With the original patch from 2016, Alex was worried it
will break other DCE3.2 systems. Use dmi_match() to
apply this special encoder assignment only for the
Apple iMac 10,1 from late 2009.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull DRM compat ioctl handling updates from Al Viro:
"This kills the double-copies in there and tons of field-by-field
copyin/copyout.
Several dead ioctls put to rest, while we are at it - the native
counterparts had been gone for a decade, so we can bloody well fail
early on the compat side. No point rearranging the 32bit structure
into 64bit one (and back) only to be told "piss off, I don't know that
ioctl" by the native code..."
* 'work.drm' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (29 commits)
Fix trivial misannotations
mga: switch compat ioctls to drm_ioctl_kernel()
radeon: take out dead compat ioctls
drm compat: ia64 is not biarch
drm_compat_ioctl(): tidy up a bit
switch compat_drm_mapbufs() to drm_ioctl_kernel()
switch compat_drm_rmmap() to drm_ioctl_kernel()
switch compat_drm_mode_addfb2() to drm_ioctl_kernel()
switch compat_drm_wait_vblank() to drm_ioctl_kernel()
switch compat_drm_update_draw()
compat_drm: switch sg ioctls
compat_drm: switch AGP compat ioctls to drm_ioctl_kernel()
switch compat_drm_dma() to drm_ioctl_kernel()
switch compat_drm_resctx() to drm_ioctl_kernel()
switch compat_drm_getsareactx() to drm_ioctl_kernel()
switch compat_drm_setsareactx() to drm_ioctl_kernel()
switch compat_drm_freebufs() to drm_ioctl_kernel()
switch compat_drm_markbufs() to drm_ioctl_kernel()
switch compat_drm_addmap() to drm_ioctl_kernel()
switch compat_drm_getstats() to drm_ioctl_kernel()
...
Pull dma-mapping infrastructure from Christoph Hellwig:
"This is the first pull request for the new dma-mapping subsystem
In this new subsystem we'll try to properly maintain all the generic
code related to dma-mapping, and will further consolidate arch code
into common helpers.
This pull request contains:
- removal of the DMA_ERROR_CODE macro, replacing it with calls to
->mapping_error so that the dma_map_ops instances are more self
contained and can be shared across architectures (me)
- removal of the ->set_dma_mask method, which duplicates the
->dma_capable one in terms of functionality, but requires more
duplicate code.
- various updates for the coherent dma pool and related arm code
(Vladimir)
- various smaller cleanups (me)"
* tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits)
ARM: dma-mapping: Remove traces of NOMMU code
ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
ARM: NOMMU: Introduce dma operations for noMMU
drivers: dma-mapping: allow dma_common_mmap() for NOMMU
drivers: dma-coherent: Introduce default DMA pool
drivers: dma-coherent: Account dma_pfn_offset when used with device tree
dma: Take into account dma_pfn_offset
dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs
dma-mapping: remove dmam_free_noncoherent
crypto: qat - avoid an uninitialized variable warning
au1100fb: remove a bogus dma_free_nonconsistent call
MAINTAINERS: add entry for dma mapping helpers
powerpc: merge __dma_set_mask into dma_set_mask
dma-mapping: remove the set_dma_mask method
powerpc/cell: use the dma_supported method for ops switching
powerpc/cell: clean up fixed mapping dma_ops initialization
tile: remove dma_supported and mapping_error methods
xen-swiotlb: remove xen_swiotlb_set_dma_mask
arm: implement ->dma_supported instead of ->set_dma_mask
mips/loongson64: implement ->dma_supported instead of ->set_dma_mask
...
Pull sound updates from Takashi Iwai:
"This development cycle resulted in a fair amount of changes in both
core and driver sides. The most significant change in ALSA core is
about PCM. Also the support of of-graph card and the new DAPM widget
for DSP are noteworthy changes in ASoC core. And there're lots of
small changes splat over the tree, as you can see in diffstat.
Below are a few highlights:
ALSA core:
- Removal of set_fs() hackery from PCM core stuff, and the code
reorganization / optimization thereafter
- Improved support of PCM ack ops, and a new ABI for improved
control/status mmap handling
- Lots of constifications in various codes
ASoC core:
- The support of of-graph card, which may work as a better generic
device for a replacement of simple-card
- New widget types intended mainly for use with DSPs
ASoC drivers:
- New drivers for Allwinner V3s SoCs
- Ensonic ES8316 codec support
- More Intel SKL and KBL works
- More device support for Intel SST Atom (mostly for cheap tablets
and 2-in-1 devices)
- Support for Rockchip PDM controllers
- Support for STM32 I2S and S/PDIF controllers
- Support for ZTE AUD96P22 codecs
HD-audio:
- Support of new Realtek codecs (ALC215/ALC285/ALC289), more quirks
for HP and Dell machines
- A few more fixes for i915 component binding"
* tag 'sound-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (418 commits)
ALSA: hda - Fix unbalance of i915 module refcount
ASoC: Intel: Skylake: Remove driver debugfs exit
ASoC: Intel: Skylake: explicitly add the headers sst-dsp.h
ALSA: hda/realtek - Remove GPIO_MASK
ALSA: hda/realtek - Fix typo of pincfg for Dell quirk
ALSA: pcm: add a documentation for tracepoints
ALSA: atmel: ac97c: fix error return code in atmel_ac97c_probe()
ALSA: x86: fix error return code in hdmi_lpe_audio_probe()
ASoC: Intel: Skylake: Add support to read firmware registers
ASoC: Intel: Skylake: Add sram address to sst_addr structure
ASoC: Intel: Skylake: Debugfs facility to dump module config
ASoC: Intel: Skylake: Add debugfs support
ASoC: fix semicolon.cocci warnings
ASoC: rt5645: Add quirk override by module option
ASoC: rsnd: make arrays path and cmd_case static const
ASoC: audio-graph-card: add widgets and routing for external amplifier support
ASoC: audio-graph-card: update bindings for amplifier support
ASoC: rt5665: calibration should be done before jack detection
ASoC: rsnd: constify dev_pm_ops structures.
ASoC: nau8825: change crosstalk-bypass property to bool type
...
So far in an attempt to make sure all power wells get disabled during
display uninitialization the driver removed any secondary request bits
(BIOS, KVMR, DEBUG) that were set for a given power well. The known
source for these requests was DMC's request on power well 1 and the misc
IO power well. Since DMC is inactive (DC states are disabled) at the
point we disable these power wells, there shouldn't be any reason to
leave them on. However there are two problems with the above
assumption: Bspec requires that the misc IO power well stays enabled
(without providing a reason) and there can be KVMR requests that we
can't remove anyway (the KVMR request register is R/O). Atm, a KVMR
request can trigger a timeout WARN when trying to disable power wells.
To make the code aligned to Bspec and to get rid of the KVMR WARN, don't
try to remove the secondary requests, only detect them and stop polling
for the power well disabled state when any one is set.
Also add a comment about the timeout values required by Bspec when
enabling power wells and the fact that waiting for them to get disabled
is not required by Bspec.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98564
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1498750622-14023-5-git-send-email-imre.deak@intel.com
Currently, we move all unreferenced contexts to an RCU free list and
then onto a worker for eventual reaping. To compensate against this
growing into a long list with frequent allocations starving the system
of available memory, before we allocate a new context we reap all the
stale contexts. This puts all the cost of destroying the context into
the next allocator, which is presumably more sensitive to syscall
latency and unfair. We can limit the number of contexts being freed by
the new allocator to both keep the list trimmed and to allow the
allocator to be reasonably fast.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705142634.18554-4-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Before we create a new context, we try and reap all the stale contexts
(i.e. those that are freed but waiting for a worker to come and return
their allocations to the system). Before we do this, we retire all
requests so that we clear any inflight no longer used contexts (who are
only being kept alived by those inflght requests). However, any context
that is finally unreferenced by this retirement is put onto an RCU list
and not available for immediately reaping, we stall for no immediate
benefit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705142634.18554-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
In a IGD passthrough environment, the real ISA bridge may doesn't exist.
then pch_id couldn't be correctly gotten from ISA bridge, but pch_id is
used to identify LPT_H and LPT_LP. Currently i915 treat all LPT pch as
LPT_H,then errors occur when i915 runs on LPT_LP machines with igd
passthrough.
This patch set pch_id for HSW/BDW according to IGD type and isn't fully
correct. But it solves such issue on HSW/BDW ult/ulx machines.
QA CI system is blocked by this issue for a long time, it's better that
we could merge it to unblock QA CI system.
We know the root cause is in device model of virtual passthrough, and
will resolve it in the future with several parts cooperation in kernel,
qemu and xen.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99938
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1497496305-5364-1-git-send-email-xiong.y.zhang@intel.com
Like with panning and modesetting, and like with those, stick with
simple drm_modeset_locking_all for the legacy path, and the full
atomic dance for atomic drivers.
This means a bit more boilerplate since setting up the atomic state
machinery is rather verbose, but then this is shared code for 30+
drivers or so, so meh.
After this patch there's only the LUT/cmap path which is still using
drm_modeset_lock_all for an atomic driver. But Peter is already
locking into reworking that, so I'll leave that code as-is for now.
v2: Squash in patches from Maarten to unify all the various atomic
paths into just one atomic update function for fbdev overall. On top
do one s/restore_fbdev_mode/restore_fbdev_mode_atomic/ so that we have
all-atomic callchains after the first check.
Cc: Peter Rosin <peda@axentia.se>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170704151833.17304-10-daniel.vetter@ffwll.ch