In order to implement plane leasing we need to count things,
just make the code consistent with the counting code currently
used for counting crtcs/encoders/connectors and drop the need
for num_overlay_planes.
v2: don't forget to assign plane_ptr. (keithp)
v3: use correct bounds check, found by igt.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
s/amdgpu_dm_find_first_crct_matching_connector/
amdgpu_dm_find_first_crtc_matching_connector/
And while here, make it static.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use dm_new_*_state and dm_old_*_state for their respective amdgpu_dm new
and old object states. Helps with readability, and enforces use of new
DRM api (choose either new, or old).
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To conform to DRM's new API, we should not be accessing a DRM object's
internal state directly. Rather, the DRM for_each_old/new_* iterators,
and drm_atomic_get_old/new_* interface should be used.
This is an ongoing process. For now, update the DRM-facing atomic
functions, where the atomic state object is given.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the correct for_each_new/old_* iterators instead of for_each_*
The following functions were considered:
amdgpu_dm_find_first_crtc_matching_connector: use for_each_new
- Old from_state_var flag was always choosing the new state
amdgpu_dm_display_resume: use for_each_new
- drm_atomic_helper_duplicate_state is called during suspend to
cache the state
- It sets 'state' within the state triplet to 'new_state'
amdgpu_dm_commit_planes: use for_each_old
- Called after the state was swapped (via atomic commit tail)
amdgpu_dm_atomic_commit: use for_each_new
- Called before the state is swapped
amdgpu_dm_atomic_commit_tail: use for_each_old
- Called after the state was swapped
dm_update_crtcs_state: use for_each_new
- Called before the state is swapped (via atomic check)
amdgpu_dm_atomic_check: use for_each_new
- Called before the state is swapped
v2: Split out typo fixes to a new patch.
v3: Say "functions considered" instead of "affected functions". The
latter implies that changes are made to each.
[airlied: squashed with my hacks]
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
tilcdc changes for v4.15
* tag 'tilcdc-4.15' of https://github.com/jsarha/linux:
drm/tilcdc: Remove redundant OF_DETACHED flag setting
drm/tilcdc: Precalculate total frametime in tilcdc_crtc_set_mode()
drm/tilcdc: Use tilcdc_crtc_shutdown() in tilcdc_crtc_destroy()
drm/tilcdc: Remove WARN_ON(!drm_modeset_is_locked(&crtc->mutex)) checks
drm/tilcdc: Turn raster off in crtc reset, if it was on in the HW
drm/tilcdc: switch to drm_*{get,put} helpers
drm/tilcdc: tilcdc_tfp410: make of_device_ids const.
drm/tilcdc: tilcdc_panel: make of_device_ids const.
In the full-ppgtt world, we can fill the GGTT full of context objects.
These context objects are currently implicitly tracked by the requests
that pin them i.e. they are only unpinned when the request is completed
and retired, but we do not have the link from the vma to the request
(anymore). In order to unpin those contexts, we have to issue another
request and wait upon the switch to the kernel context.
The bug during eviction was that we assumed that a full GGTT meant we
would have requests on the GGTT timeline, and so we missed situations
where those requests where merely in flight (and when even they have not
yet been submitted to hw yet). The fix employed here is to change the
already-is-idle test to no look at the execution timeline, but count the
outstanding requests and then check that we have switched to the kernel
context. Erring on the side of overkill here just means that we stall a
little longer than may be strictly required, but we only expect to hit
this path in extreme corner cases where returning an erroneous error is
worse than the delay.
v2: Logical inversion when swapping over branches.
Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-1-chris@chris-wilson.co.uk
(cherry picked from commit 55b4f1ce2f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Quick 4.15 misc pull for the build fix:
Cross-subsystem Changes:
- piles an piles of misc/trivial patches all over, some more from
outreachy applicants
Core Changes:
- build fix for the bridge/of cleanup (Maarten)
- fix vblank count in arm_vblank_event (Ville)
- some kerneldoc typo fixes from Thierry
Driver Changes:
- vc4: Fix T-format tiling scanout, cleanup clock divider w/a (Anholt)
- sun4i: small cleanups and improved code comments all over (Chen-Yu
Tsai)
* tag 'drm-misc-next-2017-10-16' of git://anongit.freedesktop.org/drm/drm-misc: (21 commits)
drm/via: use ARRAY_SIZE
drm/gma500: use ARRAY_SIZE
drm/sun4i: hdmi: Move PAD_CTRL1 setting to mode_set function
drm/sun4i: hdmi: Document PAD_CTRL1 output invert bits
drm/sun4i: backend: Add comment explaining why registers are cleared
drm/sun4i: backend: Use drm_fb_cma_get_gem_addr() to get display memory
drm/sun4i: backend: Create regmap after access is possible
drm/sun4i: don't add components that are already in the queue
drm/vc4: Fix pitch setup for T-format scanout.
drm/vc4: Move the DSI clock divider workaround closer to the clock call.
drm: Replace kzalloc with kcalloc
drm/tinydrm: Remove explicit .best_encoder assignment
drm/tinydrm: Replace dev_error with DRM_DEV_ERROR
drm/drm_of: Move drm_of_panel_bridge_remove_function into header.
drm/atomic-helper: Fix reference to drm_crtc_send_vblank_event()
drm/atomic-helper: Fix typo
drm: Add missing __user annotation to drm_syncobj_array_find()
drm/rockchip: add PINCTRL dependency for LVDS
drm/kirin: Checking for IS_ERR() instead of NULL
driver:gpu: return -ENOMEM on allocation failure.
...
We free objects in bulk after they wait for their RCU grace period.
Currently, we take struct_mutex and unbind all the objects. This can lead
to a long lock duration during which time those objects have their pages
unfreeable (i.e. the shrinker is prevented from reaping those pages). If
we only process a single object under the struct_mutex and then free the
pages, the number of objects locked away from the shrinker is minimal
and we allow regular clients better access to struct_mutex if they need
it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-9-chris@chris-wilson.co.uk
Inspired by Tvrtko's critique of the reaping of the stale contexts
before allocating a new one, also limit the freed object reaping to the
oldest stale object before allocating a fresh object. Unlike contexts,
objects may have radically different sizes of backing storage, but
similar to contexts, while we want to prevent starvation due to
excessive freed lists, we also do not want to delay fresh allocations
for too long. Only freeing the oldest on the freed object list before
each allocation is a reasonable compromise.
v2: Only a single consumer of llist_del_first() is allowed (although
multiple llist_add are still allowed in parallel). Unlike
i915_gem_context, i915_gem_flush_free_objects() is itself not serialized
and so we need to add our own spinlock. Otherwise KASAN eventually spots
a use-after-free for the race on *first->next.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-8-chris@chris-wilson.co.uk
Prefer to defer activating our GEM shrinker until we have a few
megabytes to free; or we have accumulated sufficient mempressure by
deferring the reclaim to force a shrink. The intent is that because our
objects may typically be large, we are too effective at shrinking and
are not rewarded for freeing more pages than the batch. It will also
defer the initial shrinking to hopefully put it at a lower priority than
say the buffer cache (although it will balance out over a number of
reclaims, with GEM being more bursty).
v2: Give it a feedback system to try and tune the batch size towards
an effective size for the available objects.
v3: Start keeping track of shrinker stats in debugfs
v4: Protect against finding no shrinkable objects (div-by-zero)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-7-chris@chris-wilson.co.uk
shrink_slab() allows us to report back the number of objects we
successfully scanned (out of the target shrinkctl->nr_to_scan). As
report the number of pages owned by each GEM object as a separate item
to the shrinker, we cannot precisely control the number of shrinker
objects we scan on each pass; and indeed may free more than requested.
If we fail to tell the shrinker about the number of objects we process,
it will continue to hold a grudge against us as any objects left
unscanned are added to the next reclaim -- and so we will keep on
"unfairly" shrinking our own slab in comparison to other slabs.
v2: fixup the misplaced addition, we want to count everything we scan
(to match the number we reported earlier) not just the objects we
successfully validated and freed.
References: 912d572d63 ("drm/i915: wire up shrinkctl->nr_scanned")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-6-chris@chris-wilson.co.uk
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).
v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
In the next patch, we want to reduce the lock coverage within the
shrinker, and one of the dangerous walks we have is over obj->vma_list.
We are only walking the obj->vma_list in order to check whether it has
been permanently pinned by HW access, typically via use on the scanout.
But we have a couple of other long term pins, the context objects for
which we currently have to check the individual vma pin_count. If we
instead mark these using obj->pin_display, we can forgo the dangerous
and sometimes slow list iteration.
v2: Rearrange code to try and avoid confusion from false associations
due to arrangement of whitespace along with rebasing on obj->pin_global.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-4-chris@chris-wilson.co.uk
This patch replace instances of drm_gem_object_reference/unreference with
*_get/put() suffixes, because get/put is shorter and consistent with the
kernel use of *_get/put() suffixes.
This was done with the following Coccinelle script:
@r1@
expression e;
@@
(
-drm_gem_object_reference(e);
+drm_gem_object_get(e);
|
-drm_gem_object_unreference(e);
+drm_gem_object_put(e);
|
-drm_gem_object_unreference_unlocked(e);
+drm_gem_object_put_unlocked(e);
)
Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
[resolved small conflict with removed armada_gem_dumb_map_offset]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/a59ef1ed109ade897bcffcb01b33214262db8942.1505932812.git.hamohammed.sa@gmail.com
We have implemented delayed ring mmio switch mechanism to reduce
unnecessary mmio switch. While the vGPU is being destroyed or
detached from VM, we need to force the ring switch to host context.
The later deadline is missed. Then it got a chance that word load
from VM2 might execute under the ring context of VM1 which was
attached to a same vGPU instance. Finally, the GPU is hang.
This patch guarantee the two deadline are performed.
v2: Remove unused variable 'scheduler'
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>