linux/include/drm
Lyude Paul ebcc0e6b50 drm/dp_mst: Introduce new refcounting scheme for mstbs and ports
The current way of handling refcounting in the DP MST helpers is really
confusing and probably just plain wrong because it's been hacked up many
times over the years without anyone actually going over the code and
seeing if things could be simplified.

To the best of my understanding, the current scheme works like this:
drm_dp_mst_port and drm_dp_mst_branch both have a single refcount. When
this refcount hits 0 for either of the two, they're removed from the
topology state, but not immediately freed. Both ports and branch devices
will reinitialize their kref once it's hit 0 before actually destroying
themselves. The intended purpose behind this is so that we can avoid
problems like not being able to free a remote payload that might still
be active, due to us having removed all of the port/branch device
structures in memory, as per:

commit 91a25e4631 ("drm/dp/mst: deallocate payload on port destruction")

That may have worked, but it then caused use-after-free errors. Being
new to MST at the time, I tried fixing it:

commit 263efde31f ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()")

But, that was broken: both drm_dp_mst_port and drm_dp_mst_branch structs
are validated in almost every DP MST helper function. Simply put, this
means we go through the topology and try to see if the given
drm_dp_mst_branch or drm_dp_mst_port is still attached to something
before trying to use it in order to avoid dereferencing freed memory
(something that has happened a LOT in the past with this library).
Because of this, it doesn't actually matter whether or not we keep the
ports and branches around in memory: any function that validates the
branches and ports passed to it will still reject them anyway, since
they're no longer in the topology structure. So, use-after-free errors
were fixed but payload deallocation was completely broken.

Two years later, AMD informed me about this issue and I attempted to
come up with a temporary fix, pending a long-overdue cleanup of this
library:

commit c54c7374ff ("drm/dp_mst: Skip validating ports during destruction, just ref")

But then that introduced use-after-free errors, so I quickly reverted
it:

commit 9765635b30 ("Revert "drm/dp_mst: Skip validating ports during destruction, just ref"")

And in the process, learned that there is just no simple fix for this:
the design is just broken. Unfortunately, the usage of these helpers is
quite broken as well. Some drivers like i915 have been smart enough to
avoid accessing any kind of information from MST port structures, but
others like nouveau have assumed, understandably so, that
drm_dp_mst_port structures are normal and can just be accessed at any
time without worrying about use-after-free errors.

After a lot of discussion, Daniel Vetter and I came up with a better
idea to replace all of this.

To summarize, since this is documented far more in depth in the
documentation this patch introduces, we make it so that drm_dp_mst_port
and drm_dp_mst_branch structures have two different classes of
refcounts: topology_kref, and malloc_kref. topology_kref corresponds to
the lifetime of the given drm_dp_mst_port or drm_dp_mst_branch in its
given topology. Once it hits zero, any associated connectors are removed
and the branch or port can no longer be validated. malloc_kref
corresponds to the lifetime of the memory allocation for the actual
structure, and will always be non-zero so long as the topology_kref is
non-zero. This gives us a way to allow callers to hold onto port and
branch device structures past their topology lifetime, and dramatically
simplifies the lifetimes of both structures. This also finally fixes the
port deallocation problem, properly.
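
The two-refcount idea above can be sketched in plain userspace C. This
is only an illustration, not the code from this patch: every sketch_*
name is hypothetical, plain int counters stand in for the krefs, and
there is no locking, so it is only meaningful single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a port or branch device; not kernel code. */
struct sketch_port {
	int topology_ref;  /* lifetime of the port inside the topology */
	int malloc_ref;    /* lifetime of the memory allocation itself */
	bool in_topology;  /* stands in for connectors/topology linkage */
};

/* Instrumentation only: how many sketch_port allocations are alive. */
static int sketch_live_allocs;

struct sketch_port *sketch_port_new(void)
{
	struct sketch_port *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->topology_ref = 1;
	/* One malloc ref is held on behalf of the topology refs as a whole,
	 * so malloc_ref stays non-zero while topology_ref is non-zero. */
	p->malloc_ref = 1;
	p->in_topology = true;
	sketch_live_allocs++;
	return p;
}

void sketch_get_malloc(struct sketch_port *p)
{
	assert(p->malloc_ref > 0);
	p->malloc_ref++;
}

void sketch_put_malloc(struct sketch_port *p)
{
	assert(p->malloc_ref > 0);
	if (--p->malloc_ref == 0) {
		sketch_live_allocs--;
		free(p);
	}
}

void sketch_put_topology(struct sketch_port *p)
{
	assert(p->topology_ref > 0);
	if (--p->topology_ref == 0) {
		/* The port leaves the topology and can no longer be validated... */
		p->in_topology = false;
		/* ...but the memory survives until the last malloc ref is dropped. */
		sketch_put_malloc(p);
	}
}
```

A caller that takes a malloc ref before the last topology ref is
dropped can still safely read the structure afterwards, and the
allocation is only freed once that caller drops its malloc ref.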

Additionally, since this now means that we can keep ports and branch
devices allocated in memory for however long we need, we no longer need
a significant amount of the port validation that we currently do.

Finally, there is one last scenario that this fixes, which couldn't
have been fixed properly beforehand:

- CPU1 unrefs port from topology (refcount 1->0)
- CPU2 refs port in topology (refcount 0->1)

Since we now can guarantee memory safety for ports and branches
as-needed, we also can make our main reference counting functions fix
this problem by using kref_get_unless_zero() internally so that topology
refcounts can only ever reach 0 once.
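
The "get unless zero" semantics can be sketched with C11 atomics. This
is a userspace approximation of what kref_get_unless_zero() guarantees,
not the kernel implementation; sketch_get_unless_zero() and sketch_put()
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count has not already reached zero. */
static bool sketch_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is updated to the current value and we retry. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	/* The count hit zero: the object is already on its way out. */
	return false;
}

/* Drop a reference; returns true if this call dropped the last one. */
static bool sketch_put(atomic_int *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}
```

With this shape, the CPU2 step in the scenario above fails instead of
moving the count from 0 back to 1, so the release path runs only once.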

Changes since v4:
* Change the kernel-figure summary for dp-mst/topology-figure-1.dot a
  bit - danvet
* Remove figure numbers - danvet

Changes since v3:
* Remove rebase detritus - danvet
* Split out purely style changes into separate patches - hwentlan

Changes since v2:
* Fix commit message - checkpatch
* s/)-1/) - 1/g - checkpatch

Changes since v1:
* Remove forward declarations - danvet
* Move "Branch device and port refcounting" section from documentation
  into kernel-doc comments - danvet
* Export internal topology lifetime functions into their own section in
  the kernel-docs - danvet
* s/@/&/g for struct references in kernel-docs - danvet
* Drop the "when they are no longer being used" bits from the kernel
  docs - danvet
* Modify diagrams to show how the DRM driver interacts with the topology
  and payloads - danvet
* Make suggested documentation changes for
  drm_dp_mst_topology_get_mstb() and drm_dp_mst_topology_get_port() -
  danvet
* Better explain the relationship between malloc refs and topology krefs
  in the documentation for drm_dp_mst_topology_get_port() and
  drm_dp_mst_topology_get_mstb() - danvet
* Fix "See also" in drm_dp_mst_topology_get_mstb() - danvet
* Rename drm_dp_mst_topology_get_(port|mstb)() ->
  drm_dp_mst_topology_try_get_(port|mstb)() and
  drm_dp_mst_topology_ref_(port|mstb)() ->
  drm_dp_mst_topology_get_(port|mstb)() - danvet
* s/should/must in docs - danvet
* WARN_ON(refcount == 0) in topology_get_(mstb|port) - danvet
* Move kdocs for mstb/port structs inline - danvet
* Split drm_dp_get_last_connected_port_and_mstb() changes into their own
  commit - danvet

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111005343.17443-7-lyude@redhat.com
2019-01-10 20:12:19 -05:00
bridge drm: remove include of drmP.h from bridge/dw_hdmi.h 2019-01-09 22:27:44 +01:00
i2c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
tinydrm drm/tinydrm: Use DRM_GEM_CMA_VMAP_DRIVER_OPS 2018-11-20 14:58:19 +01:00
ttm drm: Remove drm_global.{c,h} v2 2018-11-05 14:21:21 -05:00
amd_asic_type.h drm/amdgpu: simplify Raven, Raven2, and Picasso handling 2018-09-14 09:38:03 -05:00
ati_pcigart.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
drm_agpsupport.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
drm_atomic_helper.h drm: Fix up drm_atomic_state_helper.[hc] extraction 2018-11-30 16:37:52 +01:00
drm_atomic_state_helper.h drm: Fix up drm_atomic_state_helper.[hc] extraction 2018-11-30 16:37:52 +01:00
drm_atomic_uapi.h drm: extract drm_atomic_uapi.c 2018-09-09 14:19:18 +02:00
drm_atomic.h drm/atomic: integrate modeset lock with private objects 2018-12-11 15:24:30 +01:00
drm_audio_component.h ALSA: hda: Make audio component support more generic 2018-07-17 22:25:48 +02:00
drm_auth.h drm: Add drm_object lease infrastructure [v5] 2017-10-25 16:31:29 +10:00
drm_blend.h drm: Add per-plane pixel blend mode property 2018-08-24 17:31:37 +01:00
drm_bridge.h drm/bridge: Move the struct drm_bridge member kerneldoc inline. 2018-06-21 14:00:06 -07:00
drm_cache.h drm: add func to get max iomem address v2 2018-02-13 11:57:59 -05:00
drm_client.h drm/cma-helper: Fix crash in fbdev error path 2018-10-02 13:03:34 +02:00
drm_color_mgmt.h drm: drop drmP.h include from drm_plane.c 2018-09-09 14:19:17 +02:00
drm_connector.h drm/edid: Add display_info.rgb_quant_range_selectable 2019-01-10 19:01:06 +02:00
drm_crtc_helper.h drm: Remove transitional helpers 2018-10-05 18:04:10 +02:00
drm_crtc.h drm/crc: Cleanup crtc_crc_open function 2018-08-22 09:47:58 -07:00
drm_debugfs_crc.h drm/crc: Only report a single overflow when a CRC fd is opened 2018-07-06 14:57:03 +02:00
drm_debugfs.h drm/debugfs: Add kerneldoc 2017-03-24 09:36:06 +01:00
drm_device.h drm: move DRM_SWITCH_POWER defines to drm_device.h 2019-01-09 22:11:18 +01:00
drm_displayid.h
drm_dp_dual_mode_helper.h drm: Fix LSPCON kernel-doc 2016-10-19 18:20:40 +03:00
drm_dp_helper.h Merge drm/drm-next into drm-intel-next-queued 2018-11-20 13:14:08 +02:00
drm_dp_mst_helper.h drm/dp_mst: Introduce new refcounting scheme for mstbs and ports 2019-01-10 20:12:19 -05:00
drm_drv.h drm: Improve dumb callback docs 2018-11-27 15:23:18 +01:00
drm_edid.h drm/edid: Add display_info.rgb_quant_range_selectable 2019-01-10 19:01:06 +02:00
drm_encoder_slave.h drm: remove include of drmP.h from drm_encoder_slave.h 2019-01-09 22:35:35 +01:00
drm_encoder.h drm: Add drm/drm_util.h header file 2018-09-09 14:18:11 +02:00
drm_fb_cma_helper.h drm/rcar-du: Convert drm_atomic_helper_suspend/resume() 2018-10-23 15:59:01 +02:00
drm_fb_helper.h drm/fb-helper: document remove*_conflicting_framebuffers() 2018-09-07 22:07:49 +02:00
drm_file.h drm: include idr.h from drm_file.h 2019-01-02 11:37:56 +02:00
drm_fixed.h
drm_flip_work.h drm/kms-helpers: Use recommened kerneldoc for struct member refs 2017-01-25 16:18:57 +01:00
drm_fourcc.h drm/fourcc: Add char_per_block, block_w and block_h in drm_format_info 2018-11-02 09:55:27 +00:00
drm_framebuffer.h drm: make drm_framebuffer.h self contained 2019-01-09 22:11:28 +01:00
drm_gem_cma_helper.h drm: remove drmP.h from drm_gem_cma_helper.h 2019-01-09 22:54:08 +01:00
drm_gem_framebuffer_helper.h drm: Move simple_display_pipe prepare_fb helper into gem fb helpers 2018-04-24 13:57:22 +02:00
drm_gem.h drm: remove deprecated "[__]drm_gem_object_[un]reference[_locked]" functions 2018-11-24 22:12:54 +01:00
drm_hashtab.h drm: drop extern from function decls 2017-03-24 09:36:06 +01:00
drm_hdcp.h drm: include types.h from drm_hdcp.h 2019-01-02 11:38:01 +02:00
drm_ioctl.h drm: remove all control node code 2018-05-03 21:26:32 +02:00
drm_irq.h drm: Extract drm_vblank.[hc] 2017-06-01 08:02:14 +02:00
drm_lease.h drm: Add four ioctls for managing drm mode object leases [v7] 2017-10-25 16:31:30 +10:00
drm_legacy.h drm: un-inline drm_legacy_findmap() 2019-01-02 11:37:11 +02:00
drm_mipi_dsi.h drm: dsi: Add lane clock rate fields to DSI device 2018-10-24 16:26:35 +02:00
drm_mm.h drm/mm: Add a search-by-address variant to only inspect a single hole 2018-05-24 15:04:30 +01:00
drm_mode_config.h drm/connector: Clarify the unit of TV margins 2018-12-19 14:38:35 +01:00
drm_mode_object.h drm: remove drm_mode_object_{un/reference} aliases 2018-03-19 09:09:46 -04:00
drm_modes.h drm: drop _mode_ from remaining connector functions 2018-07-13 18:40:27 +02:00
drm_modeset_helper_vtables.h drm: drop _mode_ from update_edit_property() 2018-07-13 18:40:27 +02:00
drm_modeset_helper.h drm/modeset-helper: Add simple modeset suspend/resume helpers 2017-11-30 18:18:08 +01:00
drm_modeset_lock.h drm: Add DRM_MODESET_LOCK_BEGIN/END helpers 2018-11-29 10:48:31 -05:00
drm_of.h drm: of: Export and rename drm_crtc_port_mask() 2018-06-27 21:44:04 +02:00
drm_os_linux.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
drm_panel.h This is the 4.19-rc6 release 2018-10-04 11:03:34 +10:00
drm_pci.h drm: drop drm_pcie_get_speed_cap_mask and drm_pcie_get_max_link_width 2018-07-05 16:40:00 -05:00
drm_pciids.h drm/radeon: change SPDX identifier to MIT 2018-10-15 16:16:12 -05:00
drm_plane_helper.h drm: Unexport drm_plane_helper_check_update 2018-10-05 22:45:19 +02:00
drm_plane.h drm: Add drm_any_plane_has_format() 2018-11-06 21:34:22 +02:00
drm_prime.h drm/prime: Add drm_gem_prime_mmap() 2018-11-20 14:54:53 +01:00
drm_print.h Merge drm/drm-next into drm-misc-next 2018-08-27 10:00:03 -04:00
drm_property.h drm: Fix kernel doc for DRM_MODE_PROP_IMMUTABLE 2018-10-03 13:05:12 -07:00
drm_rect.h drm/rect: Handle rounding errors in drm_rect_clip_scaled, v3. 2018-05-04 11:09:54 +02:00
drm_scdc_helper.h drm: Fix warning when building docs for scdc_helper 2017-07-31 14:24:14 +02:00
drm_simple_kms_helper.h drm: Move simple_display_pipe prepare_fb helper into gem fb helpers 2018-04-24 13:57:22 +02:00
drm_syncobj.h Make some drm headers self-contained with includes and forward declarations 2019-01-07 16:43:24 +01:00
drm_sysfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
drm_util.h drm: Add drm/drm_util.h header file 2018-09-09 14:18:11 +02:00
drm_utils.h drm: Add panel orientation quirks, v6. 2017-12-04 23:03:21 +01:00
drm_vblank.h drm/vblank: Remove old-style comments 2018-10-05 17:39:28 +02:00
drm_vma_manager.h drm/i915: Prevent writing into a read-only object via a GGTT mmap 2018-07-13 16:14:04 +01:00
drm_writeback.h drm/atomic: Avoid connector to writeback_connector casts 2018-07-07 07:51:19 +02:00
drmP.h drm: move DRM_SWITCH_POWER defines to drm_device.h 2019-01-09 22:11:18 +01:00
gma_drm.h
gpu_scheduler.h drm/scheduler: Add drm_sched_job_cleanup 2018-11-05 14:21:27 -05:00
i915_component.h drm/i915: Split audio component to a generic type 2018-07-17 22:25:19 +02:00
i915_drm.h x86/gpu: reserve ICL's graphics stolen memory 2018-07-10 16:28:47 -07:00
i915_pciids.h drm/i915/aml: Add new Amber Lake PCI ID 2018-10-11 10:59:34 -07:00
intel_lpe_audio.h ALSA: x86: Register multiple PCM devices for the LPE audio card 2017-05-03 16:24:00 +03:00
intel-gtt.h drm: include kernel.h and agp_backend.h from intel-gtt.h 2019-01-02 11:37:47 +02:00
spsc_queue.h drm: move amd_gpu_scheduler into common location 2017-12-07 11:51:56 -05:00