Commit Graph

37970 Commits

Author SHA1 Message Date
Rex Zhu
841e3be124 drm/amd/powerplay: notify smu once display changed on Rv.
when User turn off display or screen idle timeout,
smu need this message to start S0i2 entry.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:28:01 -04:00
Rex Zhu
3b4ca9e649 drm/amd/powerplay: add dummy pp table for raven. (v2)
As there is no PPTable in RV, it is difficult to
cleanly decouple PPTABLE functionality in existing
codes.

v2: agd: squash in clean build fix

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:28:01 -04:00
Rex Zhu
e154162ef7 drm/amd/powerplay: refine pp code for raven
delete useless code.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:28:00 -04:00
Alex Deucher
ffe6d881e9 drm/amdgpu/gfx9: adjust mqd allocation size
To allocate additional space for the dynamic cu masks.
Confirmed with the hw team that we only need 1 dword
for the mask.  The mask is the same for each SE so
you only need 1 dword.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:28:00 -04:00
Alex Deucher
29696bd680 drm/amdgpu/gfx9: update mqd to include dynamic CU mask
Necessary for proper operation with KIQ.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:59 -04:00
Alex Deucher
31bf29ab39 drm/amdgpu/gfx8: drop cz mqd
It was unused and according to hw team, it's the same for
all asics in a gfx family so remove it.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:58 -04:00
Alex Deucher
925d5d798f drm/amdgpu/gfx8: apply dynamic cu mask to APUs as well
Confirmed with the hw team.  It's the same for all asics.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:58 -04:00
Alex Deucher
ecf9d34485 drm/amdgpu/powerplay/vega10: fix typo in register base index
Probably a copy pasta.  No functional difference, both have
the same value.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reported-by: Michael von Khurja <mvonkhurja@techpowerup.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:57 -04:00
Christian König
6ac7defb5c drm/amdgpu: cleanup GWS, GDS and OA allocation
Those are certainly not kernel allocations, instead set the NO_CPU_ACCESS flag.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:57 -04:00
Christian König
34d7be5dc2 drm/amdgpu: fix and cleanup VM ready check
Stop checking the mapped BO itself, cause that one is
certainly not a page table.

Additional to that move the code into amdgpu_vm.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:56 -04:00
Christian König
87f64a76b3 drm/amdgpu: fix amdgpu_vm_bo_map trace point
That somehow got lost.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:55 -04:00
Kent Russell
5b41d94cc4 drm/amdgpu: Move VBIOS version to sysfs
sysfs is more stable, and doesn't require root to access

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:55 -04:00
Kent Russell
db95e21855 drm/amdgpu: Add debugfs file for VBIOS and version
Add 2 debugfs files, one that contains the VBIOS version, and one that
contains the VBIOS itself. These won't change after initialization,
so we can add the VBIOS version when we parse the atombios information.

This ensures that we can find out the VBIOS version, even when the dmesg
buffer fills up, and makes it easier to associate which VBIOS version is
for which GPU on mGPU configurations. Set the size to 20 characters in
case of some weird VBIOS version that exceeds the expected 17 character
format (3-8-3\0). The VBIOS dump also allows for easy debugging

    v2: Move to debugfs, clarify commit message, add VBIOS dump file

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:54 -04:00
Tom St Denis
f7871fd193 drm/radeon: use new TTM populate/dma map helper functions
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:53 -04:00
Tom St Denis
7405e0dad4 drm/amd/amdgpu: Use new TTM populate/map helper function
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:52 -04:00
Tom St Denis
a4dec819c8 drm/ttm: Add helper functions to populate/map in one call (v2)
These functions replace a section of common code found
in radeon/amdgpu drivers (and possibly others) as part
of the ttm_tt_*populate() callbacks.

v2: squash in fix for sw iommu from Tom

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:52 -04:00
Harry Wentland
e719d5169f drm/amd/include: Add hdmi_redriver_set to atomfirmware
We'll need this for a some upcoming display changes

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:51 -04:00
Tom St Denis
ca3670aa37 drm/amd/amdgpu: Remove AMDGPU tracepoint and use new TTM tracepoint (v2)
Switches the AMDGPU driver over to the TTM tracepoint and removes
our old one.  Now you can enable traces before loading the module
and trace all mappings.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

(v2): Use struct device instead of pci in trace.
2017-08-29 15:27:51 -04:00
Tom St Denis
a92e145059 drm/ttm: Add DMA map/unmap tracepoint (v3)
Also exports two functions that vendor drivers can call
to trace DMA mappings.  This is meant to help translate
IOMMU mappings of bus addresses back to physical pages.

Used by the umr amdgpu debugger for instance.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

(v2): Use dev_name() to get PCI path instead.
(v3): Use correct types for dma/phys addresses
2017-08-29 15:27:50 -04:00
Evan Quan
727030b0c6 drm/amdgpu: support polaris10/11/12 new cp firmwares
Newer versions of the CP firmware require changes in how the driver
initializes the hw block.
Change the firmware name for new firmware to maintain compatibility with
older kernels.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:50 -04:00
Colin Ian King
fd4b5f54e1 drm/amdgpu: remove duplicate return statement
Remove a redundant identical return statement, it has no use.

Detected by CoverityScan, CID#1454586 ("Structurally dead code")

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:49 -04:00
Christophe JAILLET
06f10a537e drm/amdgpu: check memory allocation failure
Check memory allocation failure and return -ENOMEM in such a case.

'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.

The calling graph would be, in such a case!
   failure in amdgpu_cs_process_syncobj_out_dep()
      ---> error code returned by amdgpu_cs_dependencies()
         --> amdgpu_cs_parser_fini() is called

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:48 -04:00
Roger He
a3ce364558 drm/amd/amdgpu: fix BANK_SELECT on Vega10 (v2)
BANK_SELECT should always be FRAGMENT_SIZE + 3 due to 8-entry (2^3)
per cache line in L2 TLB for Vega10.

v2: agd: fix warning

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:48 -04:00
Christian König
1cacc86a63 drm/amdgpu: inline amdgpu_ttm_do_bind again
The function is called only once and doesn't do anything special.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:47 -04:00
Christian König
9b0655e3ad drm/amdgpu: fix amdgpu_ttm_bind
Use ttm_bo_mem_space instead of manually allocating GART space.

This allows us to evict BOs when there isn't enought GART space any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:46 -04:00
Christian König
febb84a603 drm/amdgpu: remove the GART copy hack
This isn't used since we don't map evicted BOs to GART any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:46 -04:00
Monk Liu
172423bcc7 drm/ttm:fix wrong decoding of bo_count
we observe abnormal number from:
/sys/devices/virtual/drm/amdttm/buffer_objects/bo_count

bo_count is atomic_inc which is "int" type,
shouldn't explicitly turn it to unsigned long.

Signed-off-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:45 -04:00
Monk Liu
7e96a13523 drm/ttm: fix missing inc bo_count
Signed-off-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:45 -04:00
Alex Deucher
b249e18df1 drm/amdgpu: set sched_hw_submission higher for KIQ (v3)
KIQ doesn't really use the GPU scheduler.  The base
drivers generally use the KIQ ring directly rather than
submitting IBs.  However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2.  KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.

v2: move some expressions to one line
    change KIQ sched_hw_submission to at least 16
v3: bump to 256

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:44 -04:00
Alex Deucher
c3db7b5a55 drm/amdgpu: move default gart size setting into gmc modules
Move the asic specific code into the IP modules.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:43 -04:00
Alex Deucher
a4da14cc62 drm/amdgpu: refine default gart size
Be more explicit and add comments explaining each case.
Also s/gart/GART/ in the parameter string as per Felix'
suggestion.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:43 -04:00
Evan Quan
84d43463a2 drm/amd/powerplay: ACG frequency added in PPTable
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:42 -04:00
Christian König
f0694d3b8a drm/amdgpu: discard commands of killed processes
When a process is killed we shouldn't submit all waiting jobs, but instead
clean up as fast as possible.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:42 -04:00
Christian König
cf273a59ca drm/amdgpu: fix and cleanup shadow handling
Set the shadow flag on the shadow and not the parent, always bind shadow BOs
during allocation instead of manually, use the reservation_object wrappers
to grab the lock.

This fixes a couple of issues with binding the shadow BOs as well as correctly
evicting them when memory becomes tight.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:40 -04:00
Alex Deucher
83e74db6a8 drm/amdgpu: add automatic per asic settings for gart_size
We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode.  Change the default size based on
the asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:40 -04:00
Alex Deucher
2d6fb10565 drm/amdgpu/gfx8: fix spelling typo in mqd allocation
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:39 -04:00
Evan Quan
9dd73b1e89 drm/amd/powerplay: unhalt mec after loading
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:38 -04:00
Emily Deng
ddbb531350 drm/amdgpu/virtual_dce: Virtual display doesn't support disable vblank immediately
For virtual display, it uses software timer to emulate the vsync interrupt,
it doesn't have high precision, so doesn't support disable vblank immediately.

BUG: SWDEV-129274

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:38 -04:00
Felix Kuehling
38a8791aa7 drm/amdgpu: Fix huge page updates with CPU
Correctly detect system memory mappings when using CPU and don't use
huge pages for them.

Avoid incorrectly translating a physical page table GPU address when
splitting a huge page while mapping system memory.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-29 15:27:37 -04:00
Dave Airlie
7846b12fe0 Merge branch 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux into drm-next
vmwgfx add fence fd support.

* 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux:
  drm/vmwgfx: Bump the version for fence FD support
  drm/vmwgfx: Add export fence to file descriptor support
  drm/vmwgfx: Add support for imported Fence File Descriptor
  drm/vmwgfx: Prepare to support fence fd
  drm/vmwgfx: Fix incorrect command header offset at restart
  drm/vmwgfx: Support the NOP_ERROR command
  drm/vmwgfx: Restart command buffers after errors
  drm/vmwgfx: Move irq bottom half processing to threads
  drm/vmwgfx: Don't use drm_irq_[un]install
2017-08-29 10:38:14 +10:00
Dave Airlie
7ebdb0dd52 ummary:
- Provide NV12MT pixel format support of Mixer driver in generic way.
 - Refactor Exynos KMS drivers
   . Refactoring to panel detection way
   . Refactoring to setting up possible_crtcs
   . Refactoring to video and command mode support
 - Some cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZn77sAAoJEFc4NIkMQxK4TaAP/1jb9CO2+gMnTTNdjJU+tCEx
 D/bztIDG0bxltpGjm7cDTe0S71GEfdoQ2rN75SCWTAkofVfe9bUUCecpCusWTchF
 lgkF3eTEVCSWw+7qko7sDvxmdC+8p0yZ4LHziozaB2Kd2yvIYLlkfiJeAHF30MpG
 tA2AErKJVOQxOS+z2/BHI7q4T9q5cdON5CW4j2OYQjzuOP2F/62RQlde48BG/WgA
 m9qK4zg4wVGkzadKTtBrK134girceAlC27gLabrLpsz6sv/EwYMtGFkAs4C4P/N5
 fDJKNjaiSphMwLJI9m4y9Q8mSvJWydDvr8JqO0Y3u2MPF6k2e7xOGTEsnqkBGTip
 vNoX1j6qHSC7DnXUCrvSqVJ+GDZZQWGnX1ggOtatNc38+oVnd8k3WIEJkFrKA5ap
 M5/0l2n01AnBbT1U+/N0a3dkHUd3Ecg+s+cSaOIe7aEMuUrM1hTAkQFHEUcPV54S
 5bqj9HquQcXeZdtbhB4X9b7/i+Aexj6YPm/Tv9aTn7cz4MJrB2N5hhdp5tt2Mqpj
 8+kZwGNi54AXB5Q+L6RFlefelWVxjGtmsoEp4M+wxZqP31+CeektoaxO0Cgfn0iJ
 JJOfpPIhHEUGE8pHH6TZWd8yFhB8oH2OAg7uZwHWgneJHZs3lQFmebwOfKl5p9cz
 tPyND6oasX8KouRIM/T5
 =JZ8z
 -----END PGP SIGNATURE-----

Merge tag 'exynos-drm-next-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

Summary:
- Provide NV12MT pixel format support of Mixer driver in generic way.
- Refactor Exynos KMS drivers
  . Refactoring to panel detection way
  . Refactoring to setting up possible_crtcs
  . Refactoring to video and command mode support
- Some cleanups

* tag 'exynos-drm-next-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: simplify set_pixfmt() in DECON and FIMD drivers
  drm/exynos: consistent use of cpp
  drm/exynos: mixer: remove src offset from mixer_graph_buffer()
  drm/exynos: mixer: simplify mixer_graph_buffer()
  drm/exynos: mixer: simplify vp_video_buffer()
  drm/exynos: mixer: enable NV12MT support for the video plane
  drm/exynos: mixer: fix chroma comment in vp_video_buffer()
  arm64: dts: exynos: remove i80-if-timings nodes
  dt-bindings: exynos5433-decon: remove i80-if-timings property
  drm/exynos/decon5433: use mode info stored in CRTC to detect i80 mode
  drm/exynos: add mode_valid callback to exynos_drm
  drm/exynos/decon5433: refactor irq requesting code
  drm/exynos/mic: use mode info stored in CRTC to detect i80 mode
  drm/exynos/dsi: propagate info about command mode from panel
  drm/exynos/dsi: refactor panel detection logic
  drm/exynos: use helper to set possible crtcs
  drm/exynos/decon5433: use readl_poll_timeout helpers
2017-08-29 10:37:36 +10:00
Jason Ekstrand
ffa9443fb3 drm/syncobj: Add a signal ioctl (v3)
This IOCTL provides a mechanism for userspace to trigger a sync object
directly.  There are other ways that userspace can trigger a syncobj
such as submitting a dummy batch somewhere or hanging on to a triggered
sync_file and doing an import.  This just provides an easy way to
manually trigger the sync object without weird hacks.

The motivation for this IOCTL is Vulkan fences.  Vulkan lets you create
a fence already in the signaled state so that you can wait on it
immediatly without stalling.  We could also handle this with a new
create flag to ask the driver to create a syncobj that is already
signaled but the IOCTL seemed a bit cleaner and more generic.

v2:
 - Take an array of sync objects (Dave Airlie)
v3:
 - Throw -EINVAL if pad != 0

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 10:16:25 +10:00
Jason Ekstrand
aa4035d2c7 drm/syncobj: Add a reset ioctl (v3)
This just resets the dma_fence to NULL so it looks like it's never been
signaled.  This will be useful once we add the new wait API for allowing
wait on "submit and signal" behavior.

v2:
 - Take an array of sync objects (Dave Airlie)
v3:
 - Throw -EINVAL if pad != 0

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 10:16:19 +10:00
Jason Ekstrand
3e6fb72d6c drm/syncobj: Add a syncobj_array_find helper
The wait ioctl has a bunch of code to read an syncobj handle array from
userspace and turn it into an array of syncobj pointers.  We're about to
add two new IOCTLs which will need to work with arrays of syncobj
handles so let's make some helpers.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:28:23 +10:00
Jason Ekstrand
e7aca5031a drm/syncobj: Allow wait for submit and signal behavior (v5)
Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

v2:
 - Fix a bug in the invalid syncobj error path
 - Unify the wait-all and wait-any cases
v3:
 - Unify the timeout == 0 case a bit with the timeout > 0 case
 - Use wait_event_interruptible_timeout
v4:
 - Use proxy fence
v5:
 - Revert to a combination of v2 and v3
 - Don't use proxy fences
 - Don't use wait_event_interruptible_timeout because it just adds an
   extra layer of callbacks

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:28:17 +10:00
Jason Ekstrand
1fc08218ed drm/syncobj: Add a CREATE_SIGNALED flag
This requests that the driver create the sync object such that it
already has a signaled dma_fence attached.  Because we don't need
anything in particular (just something signaled), we use a dummy null
fence.  This is useful for Vulkan which has a similar flag that can be
passed to vkCreateFence.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:27:41 +10:00
Jason Ekstrand
9c19fb10a5 drm/syncobj: Add a callback mechanism for replace_fence (v3)
It is useful in certain circumstances to know when the fence is replaced
in a syncobj.  Specifically, it may be useful to know when the fence
goes from NULL to something valid.  This does make syncobj_replace_fence
a little more expensive because it has to take a lock but, in the common
case where there is no callback list, it spends a very short amount of
time inside the lock.

v2:
 - Don't lock in drm_syncobj_fence_get.  We only really need to lock
   around fence_replace to make the callback work.
v3:
 - Fix the cb_list comment to make kbuild happy

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:26:42 +10:00
Dave Airlie
5e60a10eae drm/syncobj: add sync obj wait interface. (v8)
This interface will allow sync object to be used to back
Vulkan fences. This API is pretty much the vulkan fence waiting
API, and I've ported the code from amdgpu.

v2: accept relative timeout, pass remaining time back
to userspace.
v3: return to absolute timeouts.
v4: absolute zero = poll,
    rewrite any/all code to have same operation for arrays
    return -EINVAL for 0 fences.
v4.1: fixup fences allocation check, use u64_to_user_ptr
v5: move to sec/nsec, and use timespec64 for calcs.
v6: use -ETIME and drop the out status flag. (-ETIME
is suggested by ickle, I can feel a shed painting)
v7: talked to Daniel/Arnd, use ktime and ns everywhere.
v8: be more careful in the timeout calculations
    use uint32_t for counter variables so we don't overflow
    graciously handle -ENOINT being returned from dma_fence_wait_timeout

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:26:32 +10:00
Jason Ekstrand
afca4216b8 i915: Use drm_syncobj_fence_get
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:20:31 +10:00
Jason Ekstrand
309a5482fa drm/syncobj: Add a race-free drm_syncobj_fence_get helper (v2)
The atomic exchange operation in drm_syncobj_replace_fence is sufficient
for the case where it races with itself.  However, if you have a race
between a replace_fence and dma_fence_get(syncobj->fence), you may end
up with the entire replace_fence happening between the point in time
where the one thread gets the syncobj->fence pointer and when it calls
dma_fence_get() on it.  If this happens, then the reference may be
dropped before we get a chance to get a new one.  The new helper uses
dma_fence_get_rcu_safe to get rid of the race.

This is also needed because it allows us to do a bit more than just get
a reference in drm_syncobj_fence_get should we wish to do so.

v2:
 - RCU isn't that scary
 - Call rcu_read_lock/unlock
 - Don't rename fence to _fence
 - Make the helper static inline

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:20:30 +10:00