We have seen a case of a bad reference count for vblanks with the
Rockchip VOP:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 383 at drivers/gpu/drm/drm_irq.c:1198 drm_vblank_put+0x40/0xcc
Modules linked in: brcmfmac brcmutil
CPU: 1 PID: 383 Comm: kworker/u8:2 Not tainted 4.9.75-rt60 #1
Hardware name: Rockchip (Device Tree)
Workqueue: events_unbound flip_worker
Backtrace:
[<c010b7b0>] (dump_backtrace) from [<c010ba4c>] (show_stack+0x18/0x1c)
r7:c0b1b13c r6:600b0013 r5:00000000 r4:c0b1b13c
[<c010ba34>] (show_stack) from [<c032d248>] (dump_stack+0x78/0x94)
[<c032d1d0>] (dump_stack) from [<c011e6e8>] (__warn+0xe4/0x104)
r7:00000009 r6:c03cf26c r5:00000000 r4:00000000
[<c011e604>] (__warn) from [<c011e7c0>] (warn_slowpath_null+0x28/0x30)
r9:eeb443a0 r8:eeb443c8 r7:ee8a5ec0 r6:ee8a5ec0 r5:edb47f00 r4:ee096200
[<c011e798>] (warn_slowpath_null) from [<c03cf26c>] (drm_vblank_put+0x40/0xcc)
[<c03cf22c>] (drm_vblank_put) from [<c03cf310>] (drm_crtc_vblank_put+0x18/0x1c)
r5:edb47f00 r4:ee3c8a80
[<c03cf2f8>] (drm_crtc_vblank_put) from [<c03ef9b4>] (vop_fb_unref_worker+0x18/0x24)
[<c03ef99c>] (vop_fb_unref_worker) from [<c03df194>] (flip_worker+0x98/0xb4)
r5:edb47f00 r4:eeb443a8
[<c03df0fc>] (flip_worker) from [<c0134808>] (process_one_work+0x1a8/0x2fc)
r9:00000000 r8:ee807d00 r7:00000000 r6:ee809c00 r5:eeb443a8 r4:edfe5f80
[<c0134660>] (process_one_work) from [<c01358ec>] (worker_thread+0x2ac/0x458)
r10:00000088 r9:edfe5f98 r8:ee809c2c r7:c0b04100 r6:ee809c00 r5:ee809c00
r4:edfe5f80
[<c0135640>] (worker_thread) from [<c013a0bc>] (kthread+0xfc/0x10c)
r10:00000000 r9:00000000 r8:c0135640 r7:edfe5f80 r6:00000000 r5:edf0e240
r4:ee8a4000 r3:ed194e00
[<c0139fc0>] (kthread) from [<c0107cb8>] (ret_from_fork+0x14/0x3c)
r8:00000000 r7:00000000 r6:00000000 r5:c0139fc0 r4:edf0e240
---[ end trace 0000000000000002 ]---
It seems that this is caused by unfortunate timing between
vop_crtc_atomic_flush() and vop_handle_vblank() given the following
ordering:
atomic_flush handle_vblank
------------ -------------
drm_flip_work_queue
set_bit
if (test_and_clear_bit(...))
drm_flip_work_commit
drm_vblank_get
This results in vop_fb_unref_worker (called as flip work) decrementing
the vblank refcount before it has been incremented.
Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Sandy huang <hjc@rock-chips.com>
Signed-off-by: Sandy Huang <hjc@rock-chips.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180328160351.23763-1-john@metanate.com
Add support for Xen para-virtualized frontend display driver.
Accompanying backend [1] is implemented as a user-space application
and its helper library [2], capable of running as a Weston client
or DRM master.
Configuration of both backend and frontend is done via
Xen guest domain configuration options [3].
Driver limitations:
1. Only primary plane without additional properties is supported.
2. Only one video mode supported which resolution is configured
via XenStore.
3. All CRTCs operate at fixed frequency of 60Hz.
1. Implement Xen bus state machine for the frontend driver according to
the state diagram and recovery flow from display para-virtualized
protocol: xen/interface/io/displif.h.
2. Read configuration values from Xen store according
to xen/interface/io/displif.h protocol:
- read connector(s) configuration
- read buffer allocation mode (backend/frontend)
3. Handle Xen event channels:
- create for all configured connectors and publish
corresponding ring references and event channels in Xen store,
so backend can connect
- implement event channels interrupt handlers
- create and destroy event channels with respect to Xen bus state
4. Implement shared buffer handling according to the
para-virtualized display device protocol at xen/interface/io/displif.h:
- handle page directories according to displif protocol:
- allocate and share page directories
- grant references to the required set of pages for the
page directory
- allocate xen balllooned pages via Xen balloon driver
with alloc_xenballooned_pages/free_xenballooned_pages
- grant references to the required set of pages for the
shared buffer itself
- implement pages map/unmap for the buffers allocated by the
backend (gnttab_map_refs/gnttab_unmap_refs)
5. Implement kernel modesetiing/connector handling using
DRM simple KMS helper pipeline:
- implement KMS part of the driver with the help of DRM
simple pipepline helper which is possible due to the fact
that the para-virtualized driver only supports a single
(primary) plane:
- initialize connectors according to XenStore configuration
- handle frame done events from the backend
- create and destroy frame buffers and propagate those
to the backend
- propagate set/reset mode configuration to the backend on display
enable/disable callbacks
- send page flip request to the backend and implement logic for
reporting backend IO errors on prepare fb callback
- implement virtual connector handling:
- support only pixel formats suitable for single plane modes
- make sure the connector is always connected
- support a single video mode as per para-virtualized driver
configuration
6. Implement GEM handling depending on driver mode of operation:
depending on the requirements for the para-virtualized environment,
namely requirements dictated by the accompanying DRM/(v)GPU drivers
running in both host and guest environments, number of operating
modes of para-virtualized display driver are supported:
- display buffers can be allocated by either
frontend driver or backend
- display buffers can be allocated to be contiguous
in memory or not
Note! Frontend driver itself has no dependency on contiguous memory for
its operation.
6.1. Buffers allocated by the frontend driver.
The below modes of operation are configured at compile-time via
frontend driver's kernel configuration.
6.1.1. Front driver configured to use GEM CMA helpers
This use-case is useful when used with accompanying DRM/vGPU driver
in guest domain which was designed to only work with contiguous
buffers, e.g. DRM driver based on GEM CMA helpers: such drivers can
only import contiguous PRIME buffers, thus requiring frontend driver
to provide such. In order to implement this mode of operation
para-virtualized frontend driver can be configured to use
GEM CMA helpers.
6.1.2. Front driver doesn't use GEM CMA
If accompanying drivers can cope with non-contiguous memory then, to
lower pressure on CMA subsystem of the kernel, driver can allocate
buffers from system memory.
Note! If used with accompanying DRM/(v)GPU drivers this mode of operation
may require IOMMU support on the platform, so accompanying DRM/vGPU
hardware can still reach display buffer memory while importing PRIME
buffers from the frontend driver.
6.2. Buffers allocated by the backend
This mode of operation is run-time configured via guest domain
configuration through XenStore entries.
For systems which do not provide IOMMU support, but having specific
requirements for display buffers it is possible to allocate such buffers
at backend side and share those with the frontend.
For example, if host domain is 1:1 mapped and has DRM/GPU hardware
expecting physically contiguous memory, this allows implementing
zero-copying use-cases.
Note, while using this scenario the following should be considered:
a) If guest domain dies then pages/grants received from the backend
cannot be claimed back
b) Misbehaving guest may send too many requests to the
backend exhausting its grant references and memory
(consider this from security POV).
Note! Configuration options 1.1 (contiguous display buffers) and 2
(backend allocated buffers) are not supported at the same time.
7. Handle communication with the backend:
- send requests and wait for the responses according
to the displif protocol
- serialize access to the communication channel
- time-out used for backend communication is set to 3000 ms
- manage display buffers shared with the backend
[1] https://github.com/xen-troops/displ_be
[2] https://github.com/xen-troops/libxenbe
[3] https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/man/xl.cfg.pod.5.in;h=a699367779e2ae1212ff8f638eff0206ec1a1cc9;hb=refs/heads/master#l1257
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180403112317.28751-2-andr2000@gmail.com
drm_atomic_helper_shutdown() needs to release the reference held by
plane->fb. Since commit 49d70aeaec ("drm/atomic-helper: Fix leak in
disable_all") we're doing that by calling drm_atomic_clean_old_fb() in
drm_atomic_helper_disable_all(). This also leaves plane->fb == NULL
afterwards. However, since drm_atomic_helper_disable_all() is also
used by the i915 gpu reset code
drm_atomic_helper_commit_duplicated_state() then has to undo the
damage and put the correct plane->fb pointers back in (and also
adjust the ref counts to match again as well).
That approach doesn't work so well for load detection as nothing
sets up the plane->old_fb pointers for us. This causes us to
leak an extra reference for each plane->fb when
drm_atomic_helper_commit_duplicated_state() calls
drm_atomic_clean_old_fb() after load detection.
To fix this let's call drm_atomic_clean_old_fb() only for
drm_atomic_helper_shutdown() as that's the only time we need to
actually drop the plane->fb references. In all the other cases
(load detection, gpu reset) we want to leave plane->fb alone.
v2: Don't inflict the clean_old_fbs bool to drivers (Daniel)
v3: Squash in the revert and rewrite the commit msg (Daniel)
Cc: martin.peres@free.fr
Cc: chris@chris-wilson.co.uk
Cc: Dave Airlie <airlied@gmail.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322152313.6561-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> #pre-squash
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- Mask mode type garbage from userspace (Ville)
Something went wrong on the misc tree side, but I'll pull the patch directly.
* 'drm-misc-next-fixes' of git://anongit.freedesktop.org/drm/drm-misc:
drm: Fix uabi regression by allowing garbage mode->type from userspace
mipi_dbi_enable_flush() wants to call the fb->dirty() hook from the
bowels of the .atomic_enable() hook. That prevents us from taking the
plane mutex in fb->dirty() unless we also plumb down the acquire
context.
Instead it seems simpler to split the fb->dirty() into a tinydrm
specific lower level hook that can be called from
mipi_dbi_enable_flush() and from a generic higher level
tinydrm_fb_dirty() helper. As we don't have a tinydrm specific
vfuncs table we'll just stick it into tinydrm_device directly
for now.
v2: Deal with the fb->dirty() in tinydrm_display_pipe_update() as well (Noralf)
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Cc: David Lechner <david@lechnology.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180323153509.15287-1-ville.syrjala@linux.intel.com
Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Tested-by: Noralf Trønnes <noralf@tronnes.org>
It's only used to protect our page list, and only when we know we have
a full reference. This means none of these code paths can ever race
with the final unref, and hence we do not need dev->struct_mutex
serialization and can simply switch to our own locking.
For more context the only magic the locked gem_free_object provides is
that it prevents concurrent final unref (and destruction) of gem
objects while anyone is holding dev->struct_mutex. This was used by
i915 (and other drivers) to implement eviction handling with less
headaches.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180327082356.24516-3-daniel.vetter@ffwll.ch
- GPUVM support for dGPUs
- KFD events support for dGPUs
- Fix live-lock situation when restoring multiple evicted processes
- Fix VM page table allocation on large-bar systems
- Fix for build failure on frv architecture
* tag 'drm-amdkfd-next-2018-03-27' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Use ordered workqueue to restore processes
drm/amdgpu: Fix acquiring VM on large-BAR systems
drm/amdkfd: Add module option for testing large-BAR functionality
drm/amdkfd: Kmap event page for dGPUs
drm/amdkfd: Add ioctls for GPUVM memory management
drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
drm/amdkfd: Add per-process IDR for buffer handles
drm/amdkfd: Aperture setup for dGPUs
drm/amdkfd: Remove limit on number of GPUs
drm/amdkfd: Populate DRM render device minor
drm/amdkfd: Create KFD VMs on demand
drm/amdgpu: Add kfd2kgd interface to acquire an existing VM
drm/amdgpu: Add helper to turn an existing VM into a compute VM
drm/amdgpu: Fix initial validation of PD BO for KFD VMs
drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm
drm/amdkfd: fix uninitialized variable use
drm/amdkfd: add missing include of mm.h
- Display fixes for booting with MST hub lid closed and display
freezing after hibernation (fd.o bugs 105470 & 105196)
- Fix for a very rare interrupt handling race resulting in GPU hang
* tag 'drm-intel-next-fixes-2018-03-27' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Fix hibernation with ACPI S0 target state
drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
drm/i915: Specify which engines to reset following semaphore/event lockups
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
After
commit dd9f31c7a3
Author: Imre Deak <imre.deak@intel.com>
Date: Wed Aug 16 17:46:07 2017 +0300
drm/i915/gen9+: Set same power state before hibernation image
save/restore
during hibernation/suspend the power domain functionality got disabled,
after which resume could leave it incorrectly disabled if the ACPI
target state was S0 during suspend and i915 was not loaded by the loader
kernel.
This was caused by not considering if we resumed from hibernation as the
condition for power domains reiniting.
Fix this by simply tracking if we suspended power domains during system
suspend and reinit power domains accordingly during resume. This will
result in reiniting power domains always when resuming from hibernation,
regardless of the platform and whether or not i915 is loaded by the
loader kernel.
The reason we didn't catch this earlier is that the enabled/disabled
state of power domains during PMSG_FREEZE/PMSG_QUIESCE is platform
and kernel config dependent: on my SKL the target state is S4
during PMSG_FREEZE and (with the driver loaded in the loader kernel)
S0 during PMSG_QUIESCE. On the reporter's machine it's S0 during
PMSG_FREEZE but (contrary to this) power domains are not initialized
during PMSG_QUIESCE since i915 is not loaded in the loader kernel, or
it's loaded but without the DMC firmware being available.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105196
Reported-and-tested-by: amn-bas@hotmail.com
Fixes: dd9f31c7a3 ("drm/i915/gen9+: Set same power state before hibernation image save/restore")
Cc: amn-bas@hotmail.com
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322143642.26883-1-imre.deak@intel.com
(cherry picked from commit 0f90603c33)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>