drm/panfrost: Add fdinfo support GPU load metrics

The drm-stats fdinfo tags made available to user space are drm-engine,
drm-cycles, drm-max-freq and drm-curfreq, one per job slot.

This deviates from standard practice in other DRM drivers, where a single
set of key:value pairs is provided for the whole render engine. However,
Panfrost has separate queues for fragment and vertex/tiler jobs, so a
decision was made to calculate bus cycles and workload times separately.

Maximum operating frequency is calculated at devfreq initialisation time.
Current frequency is made available to user space because nvtop uses it
when performing engine usage calculations.

It is important to bear in mind that both GPU cycle and kernel time numbers
provided are at best rough estimations, and always reported in excess from
the actual figure because of two reasons:
 - Excess time because of the delay between the end of a job processing,
   the subsequent job IRQ and the actual time of the sample.
 - Time spent in the engine queue waiting for the GPU to pick up the next
   job.

To avoid race conditions during enablement/disabling, a reference counting
mechanism was introduced, and a job flag that tells us whether a given job
increased the refcount. This is necessary, because user space can toggle
cycle counting through a debugfs file, and a given job might have been in
flight by the time cycle counting was disabled.

The main goal of the debugfs cycle counter knob is letting tools like nvtop
or IGT's gputop switch it at any time, to avoid power waste in case no
engine usage measuring is necessary.

Also add a documentation file explaining the possible values for fdinfo's
engine keystrings and Panfrost-specific drm-curfreq-<keystr> pairs.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-3-adrian.larumbe@collabora.com
2023-09-29 18:14:28 +00:00
.. SPDX-License-Identifier: GPL-2.0+

=========================
 drm/Panfrost Mali Driver
=========================

.. _panfrost-usage-stats:

Panfrost DRM client usage stats implementation
==============================================

The drm/Panfrost driver implements the DRM client usage stats specification as
documented in :ref:`drm-client-usage-stats`.

Example output showing the implemented key-value pairs and all of the
currently possible format options:

::

    pos: 0
    flags: 02400002
    mnt_id: 27
    ino: 531
    drm-driver: panfrost
    drm-client-id: 14
    drm-engine-fragment: 1846584880 ns
    drm-cycles-fragment: 1424359409
    drm-maxfreq-fragment: 799999987 Hz
    drm-curfreq-fragment: 799999987 Hz
    drm-engine-vertex-tiler: 71932239 ns
    drm-cycles-vertex-tiler: 52617357
    drm-maxfreq-vertex-tiler: 799999987 Hz
    drm-curfreq-vertex-tiler: 799999987 Hz
    drm-total-memory: 290 MiB
    drm-shared-memory: 0 MiB
    drm-active-memory: 226 MiB
    drm-resident-memory: 36496 KiB
    drm-purgeable-memory: 128 KiB

Possible `drm-engine-` key names are `fragment` and `vertex-tiler`.
`drm-curfreq-` values convey the current operating frequency for that engine.
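
As an illustration of how a monitoring tool might consume these keys, the
following Python sketch parses an fdinfo dump and estimates how busy one engine
was between two samples. The helper names `parse_fdinfo` and
`engine_utilisation` are hypothetical and are not part of the driver or of any
existing tool; the calculation (busy-time delta divided by wall-clock delta) is
an assumption about how such a tool could work, not a description of what nvtop
or gputop actually do.

.. code-block:: python

    def parse_fdinfo(text):
        """Return {key: (value, unit)} for every drm-* line of an fdinfo dump."""
        stats = {}
        for line in text.splitlines():
            key, sep, rest = line.partition(":")
            key = key.strip()
            # Skip generic fdinfo fields (pos, flags, ...) and malformed lines.
            if not sep or not key.startswith("drm-"):
                continue
            fields = rest.split()
            if not fields:
                continue
            # Numeric values become int; anything else (driver name) stays str.
            value = int(fields[0]) if fields[0].isdigit() else fields[0]
            unit = fields[1] if len(fields) > 1 else None
            stats[key] = (value, unit)
        return stats


    def engine_utilisation(before, after, elapsed_ns, engine):
        """Fraction of elapsed_ns that `engine` was busy between two samples."""
        key = "drm-engine-" + engine
        busy_ns = after[key][0] - before[key][0]
        return busy_ns / elapsed_ns

A caller would read `/proc/<pid>/fdinfo/<fd>` for a DRM file descriptor twice,
measure the wall-clock time between the two reads, and pass both parsed samples
to `engine_utilisation` with `fragment` or `vertex-tiler` as the engine name.
Because the driver's time and cycle figures are rough overestimates, the result
should be treated as an upper bound rather than an exact load figure.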

Engine and cycle sampling are disabled by default to save power. Tools and
benchmark applications that read the fdinfo file must first enable job
profiling in the driver by writing to the appropriate sysfs node::

    echo <N> > /sys/bus/platform/drivers/panfrost/[a-f0-9]*.gpu/profiling

where `N` is `1` to enable profiling or `0` to disable it.
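
A tool that prefers not to shell out could perform the same write directly.
The sketch below is a minimal Python equivalent of the `echo` command; the
function name `set_profiling` and the `pattern` parameter are hypothetical and
exist only for illustration. It assumes the caller has sufficient privileges to
write to the sysfs node.

.. code-block:: python

    import glob

    # Same wildcard path as the shell example; matches each Panfrost GPU node.
    PROFILING_GLOB = "/sys/bus/platform/drivers/panfrost/[a-f0-9]*.gpu/profiling"


    def set_profiling(enabled, pattern=PROFILING_GLOB):
        """Write 1 or 0 to every matching profiling node; return paths written."""
        paths = sorted(glob.glob(pattern))
        for path in paths:
            with open(path, "w") as node:
                node.write("1" if enabled else "0")
        return paths

Calling `set_profiling(True)` before sampling and `set_profiling(False)`
afterwards keeps profiling enabled only while measurements are being taken. An
empty return value means that no matching node was found.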