Use pre-computed iova when unmapping, to reduce the places we assume iova
and mmap offset are (at the moment) the same. And get rid of an extra
drm_gem_free_mmap_offset() call (since it is already called from
drm_gem_object_release()).
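Roughly the shape of the change (a sketch with made-up names, not the
actual driver code):

  #include <linux/iommu.h>

  /* sketch only: the iova is recorded in the bo at map time */
  struct sketch_bo {
      uint64_t iova;
      size_t size;
  };

  static void sketch_unmap(struct iommu_domain *domain, struct sketch_bo *bo)
  {
      /* use the pre-computed iova rather than deriving it from the
       * mmap offset; and no drm_gem_free_mmap_offset() here, since
       * drm_gem_object_release() already calls it */
      iommu_unmap(domain, bo->iova, bo->size);
  }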
Signed-off-by: Rob Clark <robdclark@gmail.com>
Atomic wants to split the prepare/pin from where we actually program the
scanout address (so that any part that can fail is done synchronously).
Add some fb/gem apis to make this easier to use from the kms parts.
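Shape-wise, something along these lines (illustrative prototypes, not
necessarily the exact api added):

  /* synchronous, fallible step: pin bo's and resolve iovas */
  int msm_framebuffer_prepare(struct drm_framebuffer *fb);

  /* later, infallible: fetch the pre-resolved address and program hw */
  uint32_t msm_framebuffer_iova(struct drm_framebuffer *fb);

  /* cleanup/unpin counterpart */
  void msm_framebuffer_cleanup(struct drm_framebuffer *fb);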
Signed-off-by: Rob Clark <robdclark@gmail.com>
Give ourselves a way to wait for a certain fence #.. this makes it easier to
wait on a set of bo's, which we'll need for atomic.
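A minimal sketch of such a wait, assuming a driver-wide completed_fence
counter bumped from the irq path (names made up):

  static bool fence_completed(struct msm_drm_private *priv, uint32_t fence)
  {
      /* signed subtraction so the comparison survives fence # wraparound */
      return (int32_t)(priv->completed_fence - fence) >= 0;
  }

  int sketch_wait_fence(struct msm_drm_private *priv, uint32_t fence,
          unsigned long timeout)
  {
      int ret = wait_event_interruptible_timeout(priv->fence_event,
              fence_completed(priv, fence), timeout);
      return (ret == 0) ? -ETIMEDOUT : (ret < 0) ? ret : 0;
  }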
Signed-off-by: Rob Clark <robdclark@gmail.com>
* 'drm-next' of git://people.freedesktop.org/~dvdhrm/linux:
drm/omap: remove null test before kfree
drm/bochs: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN
drm/ttm: recognize ARM arch in ioprot handler
drm: enable render-nodes by default
drm/ttm: remove declaration of ttm_tt_cache_flush
drm/gem: remove misleading gfp parameter to get_pages()
drm/omap: use __GFP_DMA32 for shmem-backed gem
drm/i915: use shmem helpers if possible
Conflicts:
drivers/gpu/drm/drm_stub.c
drm_gem_get_pages() currently allows passing a 'gfp' parameter that is
combined with mapping_gfp_mask() and passed on to shmem. Given that the default
mapping_gfp_mask() is GFP_HIGHUSER, it is _very_ unlikely that anyone will
ever make use of that parameter. In fact, all drivers currently pass
redundant flags or 0.
This patch removes the 'gfp' parameter. The only reason to keep it is to
remove flags like __GFP_WAIT. But in its current form, it can only be used
to add flags. So to remove __GFP_WAIT, you'd have to drop it from the
mapping_gfp_mask, which again is stupid as this mask is used by shmem-core
for other allocations, too.
If any driver ever requires that parameter, we can introduce a new helper
that takes the raw 'gfp' parameter. The caller would be responsible for combining
it with mapping_gfp_mask() in a suitable way. The current
drm_gem_get_pages() helper would then simply use mapping_gfp_mask() and
call the new helper. This is what shmem_read_mapping_pages{_gfp,} does
right now.
Moreover, the gfp-zone flag-usage is not obvious: If you pass a modified
zone, shmem core will WARN() or even BUG(). In other words, the following
must be true for 'gfp' passed to shmem_read_mapping_pages_gfp():
gfp_zone(mapping_gfp_mask(mapping)) == gfp_zone(gfp)
Add a comment to drm_gem_get_pages() explaining that constraint.
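In sketch form, the constraint is:

  /* any gfp handed to shmem_read_mapping_pages_gfp() must keep the
   * mapping's zone, else shmem core will WARN() or BUG(): */
  static bool gfp_ok_for_shmem(struct address_space *mapping, gfp_t gfp)
  {
      return gfp_zone(mapping_gfp_mask(mapping)) == gfp_zone(gfp);
  }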
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
If probe fails after IOMMU is attached, we need to detach in order to
clean up properly. Before this change, IOMMU faults would occur if the
probe failed (-EPROBE_DEFER).
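The unwind ordering, sketched (sketch_setup_display() stands in for the
rest of probe):

  static int sketch_probe(struct iommu_domain *domain,
          struct platform_device *pdev)
  {
      int ret = iommu_attach_device(domain, &pdev->dev);
      if (ret)
          return ret;

      ret = sketch_setup_display(pdev);   /* may return -EPROBE_DEFER */
      if (ret)
          goto fail;

      return 0;

  fail:
      iommu_detach_device(domain, &pdev->dev);  /* the missing cleanup */
      return ret;
  }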
Signed-off-by: Stephane Viau <sviau@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add a VRAM carveout that is used for systems which do not have an IOMMU.
The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
device (preferably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool). The user can configure the VRAM pool size using the msm.vram
module param.
Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.
It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet. So for now
the GPU will refuse to load if there is no mmu of some sort. Once
address-based limits are supported and tested to confirm that we aren't
giving the GPU access to arbitrary memory, this restriction can be
lifted.
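The param plumbing would look roughly like this (the "16m" default is
illustrative):

  static char *vram = "16m";
  module_param(vram, charp, 0);
  MODULE_PARM_DESC(vram, "VRAM carveout size for systems without IOMMU");

  /* at init: parse the size, then allocate from the CMA pool that
   * the arch code reserved for the device */
  size_t size = memparse(vram, NULL);

So e.g. booting with msm.vram=64m would select a 64 megabyte pool.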
Signed-off-by: Rob Clark <robdclark@gmail.com>
vm_insert_pfn() returning -EBUSY for subsequent threads was not
handled correctly. As a result, concurrent access from new threads
to mmapped data caused SIGBUS.
See e79e0fe3
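The fix amounts to treating -EBUSY as success in the fault handler,
sketched here with the pfn lookup elided into a parameter:

  static int sketch_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
          unsigned long pfn)
  {
      int ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);

      switch (ret) {
      case 0:
      case -EBUSY:   /* pte already present: another thread won the race */
          return VM_FAULT_NOPAGE;
      case -ENOMEM:
          return VM_FAULT_OOM;
      default:
          return VM_FAULT_SIGBUS;  /* -EBUSY used to fall through to here */
      }
  }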
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: David Brown <davidb@codeaurora.org>
Re-arrange things a bit so that we can get work requested after a bo
fence passes, like pageflip, done before retiring bo's. Without any
sort of bo cache in userspace, some games can trigger hundreds of
transient bo's, which can cause retire to take a long time (5-10ms).
Obviously we want a bo cache.. but this cleanup will make things a
bit easier for atomic as well, and makes things a bit cleaner.
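The resulting ordering, roughly (helper names made up):

  static void sketch_fence_worker(struct msm_drm_private *priv, uint32_t fence)
  {
      /* cheap, latency-sensitive work first (e.g. pageflip): */
      sketch_dispatch_fence_cbs(priv, fence);

      /* then the potentially slow part (5-10ms with many transient bo's): */
      sketch_retire_bos(priv, fence);
  }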
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: David Brown <davidb@codeaurora.org>
When we CPU_PREP a bo with the NOSYNC flag (for example, to implement
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE), an -EBUSY return indicates to
userspace that the bo is still busy. Previously it was incorrectly
returning 0 in this case.
And while we're in there, throw in a bit of extra sanity checking in
case userspace tries to wait for a bogus fence.
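In sketch form (names approximate):

  int sketch_cpu_prep(struct msm_drm_private *priv,
          struct msm_gem_object *msm_obj, uint32_t op)
  {
      /* sanity check: reject fences that were never submitted */
      if (msm_obj->fence > priv->submitted_fence)
          return -EINVAL;

      if (sketch_is_busy(msm_obj)) {
          if (op & MSM_PREP_NOSYNC)
              return -EBUSY;  /* was incorrectly returning 0 */
          return sketch_wait_fence(priv, msm_obj->fence);
      }

      return 0;
  }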
Signed-off-by: Rob Clark <robdclark@gmail.com>
In case of error, the function drm_prime_pages_to_sg() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
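I.e. the call-site pattern becomes:

  sgt = drm_prime_pages_to_sg(pages, npages);
  if (IS_ERR(sgt)) {       /* was: if (!sgt) */
      ret = PTR_ERR(sgt);
      goto fail;
  }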
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
The userspace API already had everything needed to handle read vs write
synchronization. This patch actually bothers to hook it up properly, so
that we don't need to (for example) stall on userspace read access to a
buffer that the gpu is also still reading.
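Conceptually the bo tracks separate read and write fences, and which
one to wait on depends on the requested access (a sketch, field names
assumed):

  /* per-bo: */
  uint32_t read_fence;   /* last submit that reads the bo */
  uint32_t write_fence;  /* last submit that writes the bo */

  static uint32_t sketch_fence_to_wait(struct msm_gem_object *msm_obj,
          uint32_t op)
  {
      if (op & MSM_PREP_WRITE)   /* writer waits for readers and writers */
          return max(msm_obj->read_fence, msm_obj->write_fence);
      return msm_obj->write_fence;  /* reader only waits for writers */
  }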
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add initial support for a3xx 3d core.
With the hardware I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)
Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
  + adreno_gpu
    + a3xx_gpu
    + a2xx_gpu
  + z180_gpu
This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.
Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.
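In C the hierarchy falls out of struct embedding, roughly:

  struct msm_gpu {
      const struct msm_gpu_funcs *funcs;  /* common vtable */
      /* ... ringbuffer, fence state, etc ... */
  };

  struct adreno_gpu {
      struct msm_gpu base;   /* a2xx/a3xx common (shared CP/PM4) */
  };

  struct a3xx_gpu {
      struct adreno_gpu base;
  };

  #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
  #define to_a3xx_gpu(x)   container_of(x, struct a3xx_gpu, base)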
Signed-off-by: Rob Clark <robdclark@gmail.com>
The snapdragon chips have multiple different display controllers,
depending on which chip variant/version. (As far as I can tell, current
devices have either MDP3 or MDP4, and upcoming devices have MDSS.) And
then external to the display controller are HDMI, DSI, etc. blocks which
may be shared across devices which have different display controller
blocks.
To more easily add support for different display controller blocks, the
display controller specific bits are split out into a "kms" module,
which provides the kms plane/crtc/encoder objects.
The external HDMI, DSI, etc. blocks are currently part encoder and part
connector. But I think I will pull in the drm_bridge patches from the
chromeos tree, and split them into a bridge+connector, with the
registers that need to be set in modeset handled by the bridge. This
would remove the 'msm_connector' base class. But some things need to be
double-checked to make sure I can get the correct ON/OFF sequencing..
This patch adds support for mdp4 crtc (including hw cursor), dtv encoder
(part of MDP4 block), and hdmi.
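The kms abstraction boils down to a small vtable, along these lines (a
sketch; the real hook list may differ):

  struct msm_kms_funcs {
      int (*hw_init)(struct msm_kms *kms);
      irqreturn_t (*irq)(struct msm_kms *kms);
      long (*round_pixclk)(struct msm_kms *kms, unsigned long rate,
              struct drm_encoder *encoder);
      void (*destroy)(struct msm_kms *kms);
  };

  struct msm_kms {
      const struct msm_kms_funcs *funcs;
  };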
Signed-off-by: Rob Clark <robdclark@gmail.com>