Add ptr to list of interesting registers to 'struct adreno_gpu' and use
that to move most of the debugfs show and register dump bits down into
adreno_gpu. This will avoid duplication as support for additional
adreno generations is added.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Push a few bits down into adreno_gpu so they won't have to be duplicated
as support for additional adreno generations is added.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Somewhere along the way, the firmware loader sprouted another lock
dependency, resulting in possible deadlock scenario:
&dev->struct_mutex --> &sb->s_type->i_mutex_key#2 --> &mm->mmap_sem
which is problematic vs things like gem mmap.
So introduce a separate mutex to synchronize gpu init.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Downstream kernel IOMMU had a non-standard way of dealing with multiple
devices and multiple ports/contexts. We don't need that on upstream
kernel, so rip out the crazy.
Note that we have to move the pinning of the ringbuffer to after the
IOMMU is attached. No idea how that managed to work properly on the
downstream kernel.
For now, I am leaving the IOMMU port name stuff in place, to simplify
things for folks trying to backport latest drm/msm to device kernels.
Once we no longer have to care about pre-DT kernels, we can drop this
and instead backport upstream IOMMU driver.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Some of the w/a or different behavior of userspace blob driver seem to
be keyed to gpu patch revision, rather than gpu-id. So expose the full
chip-id to userspace so it can DTRT.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add support for adreno 330. Not too much different, just a few
differences in initial configuration plus setting OCMEM base.
Userspace support is already in upstream mesa.
Note that the existing DT code is simply using the bindings from
downstream android kernel, to simplify porting of this driver to
existing devices. These do not constitute any committed/stable
DT ABI. The addition of proper DT bindings will be a subsequent
patch, at which point (as best as possible) I will try to support
either upstream bindings or what is found in downstream android
kernel, so that existing device DT files can be used.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add a VRAM carveout that is used for systems which do not have an IOMMU.
The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
device (preferrably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool). The user can configure the VRAM pool size using msm.vram module
param.
Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.
It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet. So for now
the GPU will refuse to load if there is no sort of mmu. Once address
based limits are supported and tested to confirm that we aren't giving
the GPU access to arbitrary memory, this restriction can be lifted
Signed-off-by: Rob Clark <robdclark@gmail.com>
If gpu locks up with the rptr shortly beyond the wrap-around point in
the ringbuffer, because the rptr was not reset (but wptr is, by virtue
of resetting rb->cur), we could end up in a scenario where we think
there is not enough space in the ringbuffer for the next cmds. And
since the CP won't reset rptr until after processing an IB, this leaves
things in a sort of deadlock.
So reset rptr too. And a bit more spiffing up of hangcheck to make
things easier to debug.
Signed-off-by: Rob Clark <robdclark@gmail.com>
A basic, no-frills recovery mechanism in case the gpu gets wedged. We
could try to be a bit more fancy and restart the next submit after the
one that got wedged, but for now keep it simple. This is enough to
recover things if, for example, the gpu hangs mid way through a piglit
run.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add initial support for a3xx 3d core.
So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)
Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
+ adreno_gpu
+ a3xx_gpu
+ a2xx_gpu
+ z180_gpu
This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.
Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.
Signed-off-by: Rob Clark <robdclark@gmail.com>