For a5xx the gpu is 64b so we need to change iova to 64b everywhere. On
the display side, iova is still 32b so it can ignore the upper bits.
(Although all the armv8 devices have an iommu that can map 64b pa to 32b
iova.)
Signed-off-by: Rob Clark <robdclark@gmail.com>
We can have various combinations of 64b and 32b address space, ie. 64b
CPU but 32b display and gpu, or 64b CPU and GPU but 32b display. So
best to decouple the device iova's from mmap offset.
Signed-off-by: Rob Clark <robdclark@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJX6H4uAAoJEHm+PkMAQRiG5sMH/3yzrMiUCSokdS+cvY+jgKAG
JS58JmRvBPz2mRaU3MRPBGRDeCz/Nc9LggL2ZcgM+E1ZYirlYyQfIED3lkqk5R07
kIN1wmb+kQhXyU4IY3fEX7joqyKC6zOy4DUChPkBQU0/0+VUmdVmcJvsuPlnMZtf
g95m0BdYTui+eDezASRqOEp3Lb5ONL4c3ao4yBP0LHF033ctj3VJQiyi5uERPZJ0
5e6Mo7Wxn78t9WqJLQAiEH46kTwT2plNlxf3XXqTenfIdbWhqE873HPGeSMa3VQV
VywXTpCpSPQsA8BYg66qIbebdKOhs9MOviHVfqDtwQlvwhjlBDya0gNHfI5fSy4=
=Y/L5
-----END PGP SIGNATURE-----
Merge tag 'v4.8-rc8' into drm-next
Linux 4.8-rc8
There was a lot of fallout in the imx/amdgpu/i915 drivers, so backmerge
it now to avoid troubles.
* tag 'v4.8-rc8': (1442 commits)
Linux 4.8-rc8
fault_in_multipages_readable() throws set-but-unused error
mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
radix tree: fix sibling entry handling in radix_tree_descend()
radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
fix memory leaks in tracing_buffers_splice_read()
tracing: Move mutex to protect against resetting of seq data
MIPS: Fix delay slot emulation count in debugfs
MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
mm: delete unnecessary and unsafe init_tlb_ubc()
huge tmpfs: fix Committed_AS leak
shmem: fix tmpfs to handle the huge= option properly
blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
MIPS: Fix pre-r6 emulation FPU initialisation
arm64: kgdb: handle read-only text / modules
arm64: Call numa_store_cpu_info() earlier.
locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
nvme-rdma: only clear queue flags after successful connect
i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
perf/core: Limit matching exclusive events to one PMU
...
Since fence_wait_timeout_reservation_object_wait_timeout_rcu() with a
timeout of 0 becomes reservation_object_test_signaled_rcu(), we do not
need to handle such conversion in the caller. The only challenge are
those callers that wish to differentiate the error code between the
nonblocking busy check and potentially blocking wait.
v2: 9 is only 0 in German.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The drm_gem_object_unreference() function tests whether its argument
is NULL and then returns immediately.
Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The drm_gem_object_unreference_unlocked() function tests whether
its argument is NULL and then returns immediately.
Thus the test around the calls is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Before we can add vmap shrinking, we really need to know which vmap'ings
are currently being used. So switch to get/put interface. Stubbed put
fxns for now.
Signed-off-by: Rob Clark <robdclark@gmail.com>
For a first step, only purge obj->madv==DONTNEED objects. We could be
more agressive and next try unpinning inactive objects.. but that is
only useful if you have swap.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Some, but not all, callers of obj->vmap() would check if return
IS_ERR(). So let's actually return an error if vmap() fails. And fixup
the call-sites that were not handling this properly.
Signed-off-by: Rob Clark <robdclark@gmail.com>
drm_gem_object_lookup() has never required the drm_device for its file
local translation of the user handle to the GEM object. Let's remove the
unused parameter and save some space.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Fixup kerneldoc too.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was only used for atomic commit these days. So instead just give
atomic it's own work-queue where we can do a block on each bo in turn.
Simplifies things a whole bunch and makes the 'struct fence' conversion
easier.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Better encapsulate the per-timeline stuff into fence-context. For now
there is just a single fence-context, but eventually we'll also have one
per-CRTC to enable fully explicit fencing.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Convert the raw unsigned long 'pfn' argument to pfn_t for the purpose of
evaluating the PFN_MAP and PFN_DEV flags. When both are set it triggers
_PAGE_DEVMAP to be set in the resulting pte.
There are no functional changes to the gpu drivers as a result of this
conversion.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The atomic commit cannot easily undo and return an error once the
state is swapped. Change to uninterruptible wait, and ignore the
timeout error.
Signed-off-by: Wentao Xu <wentaox@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The 'timeout' value comes from userspace (CLOCK_MONOTONIC), but
converting this directly to jiffies doesn't take into account the
initial jiffies count at boot, which may differ from the base time
of CLOCK_MONOTONIC.
TODO: add ktime_delta_jiffies() when rebasing on 4.1 and use that
instead of ktime_sub/ktime_to_timespec/timespec_to_jiffies combo (as
suggested by Arnd)
v2: switch over from 'struct timespec' to ktime_t throughout, since
'struct timespec' will be deprecated (as suggested by Arnd)
v3: minor cosmetic tweaks
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>
If the GEM object is imported, drm_prime_gem_destroy needs to be
called to clean up dma buffer related information.
Signed-off-by: Jilai Wang <jilaiw@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Avoid casts from pointers to fixed-size integers to prevent the compiler
from warning. Print virtual memory addresses using %p instead. Also turn
a couple of %d/%x specifiers into %zu/%zd/%zx to avoid further warnings
due to mismatched format strings.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Add support to use the VRAM carveout (if specified in dtb) for fbdev
scanout buffer. This allows drm/msm to take over a bootloader splash-
screen, and avoids corruption on screen that results if the kernel uses
memory that is still being scanned out for itself.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The functions framebuffer_release() and vunmap() perform also input
parameter validation. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Use pre-computed iova when unmapping, to reduce the places we assume iova
and mmap offset are (at the moment) the same. And get rid of an extra
drm_gem_free_mmap_offset() call (since it is already called from
drm_gem_object_release())
Signed-off-by: Rob Clark <robdclark@gmail.com>
Atomic wants to split the prepare/pin from where we actually program the
scanout address (so that any part that can fail is done synchronously).
Add some fb/gem apis to make this easier to use from the kms parts.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Give ourselves a way to wait for certain fence #.. makes it easier to
wait on a set of bo's, which we'll need for atomic.
Signed-off-by: Rob Clark <robdclark@gmail.com>
* 'drm-next' of git://people.freedesktop.org/~dvdhrm/linux:
drm/omap: remove null test before kfree
drm/bochs: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN
drm/ttm: recognize ARM arch in ioprot handler
drm: enable render-nodes by default
drm/ttm: remove declaration of ttm_tt_cache_flush
drm/gem: remove misleading gfp parameter to get_pages()
drm/omap: use __GFP_DMA32 for shmem-backed gem
drm/i915: use shmem helpers if possible
Conflicts:
drivers/gpu/drm/drm_stub.c
drm_gem_get_pages() currently allows passing a 'gfp' parameter that is
passed to shmem combined with mapping_gfp_mask(). Given that the default
mapping_gfp_mask() is GFP_HIGHUSER, it is _very_ unlikely that anyone will
ever make use of that parameter. In fact, all drivers currently pass
redundant flags or 0.
This patch removes the 'gfp' parameter. The only reason to keep it is to
remove flags like __GFP_WAIT. But in its current form, it can only be used
to add flags. So to remove __GFP_WAIT, you'd have to drop it from the
mapping_gfp_mask, which again is stupid as this mask is used by shmem-core
for other allocations, too.
If any driver ever requires that parameter, we can introduce a new helper
that takes the raw 'gfp' parameter. The caller'd be responsible to combine
it with mapping_gfp_mask() in a suitable way. The current
drm_gem_get_pages() helper would then simply use mapping_gfp_mask() and
call the new helper. This is what shmem_read_mapping_pages{_gfp,} does
right now.
Moreover, the gfp-zone flag-usage is not obvious: If you pass a modified
zone, shmem core will WARN() or even BUG(). In other words, the following
must be true for 'gfp' passed to shmem_read_mapping_pages_gfp():
gfp_zone(mapping_gfp_mask(mapping)) == gfp_zone(gfp)
Add a comment to drm_gem_read_pages() explaining that constraint.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
If probe fails after IOMMU is attached, we need to detach in order to
clean up properly. Before this change, IOMMU faults would occur if the
probe failed (-EPROBE_DEFER).
Signed-off-by: Stephane Viau <sviau@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add a VRAM carveout that is used for systems which do not have an IOMMU.
The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
device (preferrably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool). The user can configure the VRAM pool size using msm.vram module
param.
Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.
It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet. So for now
the GPU will refuse to load if there is no sort of mmu. Once address
based limits are supported and tested to confirm that we aren't giving
the GPU access to arbitrary memory, this restriction can be lifted
Signed-off-by: Rob Clark <robdclark@gmail.com>
Subsequent threads returning EBUSY from vm_insert_pfn() was not
handled correctly. As a result concurrent access from new threads
to mmapped data caused SIGBUS.
See e79e0fe3
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: David Brown <davidb@codeaurora.org>
Re-arrange things a bit so that we can get work requested after a bo
fence passes, like pageflip, done before retiring bo's. Without any
sort of bo cache in userspace, some games can trigger hundred's of
transient bo's, which can cause retire to take a long time (5-10ms).
Obviously we want a bo cache.. but this cleanup will make things a
bit easier for atomic as well and makes things a bit cleaner.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: David Brown <davidb@codeaurora.org>
When we CPU_PREP a bo with NOSYNC flag (for example, to implement
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE), an -EBUSY return indicates to
userspace that the bo is still busy. Previously it was incorrectly
returning 0 in this case.
And while we're in there throw in an bit of extra sanity checking in
case userspace tries to wait for a bogus fence.
Signed-off-by: Rob Clark <robdclark@gmail.com>
In case of error, the function drm_prime_pages_to_sg() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
The userspace API already had everything needed to handle read vs write
synchronization. This patch actually bothers to hook it up properly, so
that we don't need to (for example) stall on userspace read access to a
buffer that gpu is also still reading.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add initial support for a3xx 3d core.
So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)
Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
+ adreno_gpu
+ a3xx_gpu
+ a2xx_gpu
+ z180_gpu
This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.
Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The snapdragon chips have multiple different display controllers,
depending on which chip variant/version. (As far as I can tell, current
devices have either MDP3 or MDP4, and upcoming devices have MDSS.) And
then external to the display controller are HDMI, DSI, etc. blocks which
may be shared across devices which have different display controller
blocks.
To more easily add support for different display controller blocks, the
display controller specific bits are split out into a "kms" module,
which provides the kms plane/crtc/encoder objects.
The external HDMI, DSI, etc. blocks are part encoder, and part connector
currently. But I think I will pull in the drm_bridge patches from
chromeos tree, and split them into a bridge+connector, with the
registers that need to be set in modeset handled by the bridge. This
would remove the 'msm_connector' base class. But some things need to be
double checked to make sure I could get the correct ON/OFF sequencing..
This patch adds support for mdp4 crtc (including hw cursor), dtv encoder
(part of MDP4 block), and hdmi.
Signed-off-by: Rob Clark <robdclark@gmail.com>