On the userspace side, all the basics are working, and most of glmark2
is working. I've been working through deqp, and I've got a couple more
things to fix (but we've gone from 70% to 80+% pass in last day, and
current deqp run that is going should pick up another 5-10%). I expect
to push the mesa patches today or tomorrow.
There are a couple more a5xx related patches to take the gpu out of
secure mode (for the devices that come up in secure mode, like the hw
I have), but those depend on an scm patch that would come in through
another tree. If that can land in the next day or two, there might
be a second late pull request for drm/msm.
In addition to the new-shiny, there have also been a lot of overlay/
plane related fixes for issues found using drm-hwc2 (in the process of
testing/debugging the atomic/kms fence patches), resulting in rework
to assign hwpipes to kms planes dynamically (as part of global atomic
state) and also handling SMP (fifo) block allocation atomically as
part of the ->atomic_check() step. All those patches should also help
out atomic weston (when those patches eventually land).
* 'msm-next' of git://people.freedesktop.org/~robclark/linux: (36 commits)
drm/msm: gpu: Add support for the GPMU
drm/msm: gpu: Add A5XX target support
drm/msm: Disable interrupts during init
drm/msm: Remove 'src_clk' from adreno configuration
drm/msm: gpu: Add OUT_TYPE4 and OUT_TYPE7
drm/msm: Add adreno_gpu_write64()
drm/msm: gpu Add new gpu register read/write functions
drm/msm: gpu: Return error on hw_init failure
drm/msm: gpu: Cut down the list of "generic" registers to the ones we use
drm/msm: update generated headers
drm/msm/adreno: move scratch register dumping to per-gen code
drm/msm/rd: support for 64b iova
drm/msm: convert iova to 64b
drm/msm: set dma_mask properly
drm/msm: Remove bad calls to of_node_put()
drm/msm/mdp5: move LM bounds check into plane->atomic_check()
drm/msm/mdp5: dump smp state on errors too
drm/msm/mdp5: add debugfs to show smp block status
drm/msm/mdp5: handle SMP block allocations "atomically"
drm/msm/mdp5: dynamically assign hw pipes to planes
...
Big thing is that drm-misc is now officially a group maintainer/committer
model thing, with MAINTAINERS suitably updated. Otherwise just the usual
pile of misc things all over, nothing that stands out this time around.
* tag 'drm-misc-next-2016-11-29' of git://anongit.freedesktop.org/git/drm-misc: (33 commits)
drm: Introduce drm_framebuffer_assign()
drm/bridge: adv7511: Enable the audio data and clock pads on adv7533
drm/bridge: adv7511: Add Audio support
drm/edid: Consider alternate cea timings to be the same VIC
drm/atomic: Constify drm_atomic_crtc_needs_modeset()
drm: bridge: dw-hdmi: add ASoC dependency
drm: Fix shift operations for drm_fb_helper::drm_target_preferred()
drm: Avoid NULL dereference for DRM_LEGACY debug message
drm: Use u64_to_user_ptr() helper for blob ioctls
drm: Fix conflicting macro parameter in drm_mm_for_each_node_in_range()
drm: Fixup kernel doc for driver->gem_create_object
drm/hisilicon/hibmc: mark PM functions __maybe_unused
drm/hisilicon/hibmc: Checking for NULL instead of IS_ERR()
drm: bridge: add DesignWare HDMI I2S audio support
drm: Check against color expansion in drm_mm_reserve_node()
drm: Define drm_mm_for_each_node_in_range()
drm/doc: Fix links in drm_property.c
MAINTAINERS: Add link to drm-misc documentation
vgaarb: use valid dev pointer in vgaarb_info()
drm/atomic: Unconfuse the old_state mess in commmit_tail
...
drm/qxl: various bugfixes and cleanups,
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJYMzLfAAoJEEy22O7T6HE41rIQANAEl/o8cYUoyYTJlhmmnl2U
K+QBdr7PACdbr8RZrGpwA5ad9ZJGijpZRd2gThrzNS0JBdZI48gPEzU7V206xlyD
AriBeAu6IkoBTEl+GGx2DfvOdLR6+7KlIrDYIpl2vILgkqlHhneXdHR3R03byRHG
2Jrxv2YQxCs8swtAb8FRkVNaUgrfkKOKFFlx1LoLFApYeP02oSxZp0Ve4nuRNj7x
9DCivIw4NyQ9tY1fORapmrEPTerqZnzYdb9RFSv4xilx4Stq1UWdXfTSpwXZHZaG
VroXZb1I0fZEk1aapIxuzLZFGNSM7wLET/nK02sSvzxJJv2PiyVAabIo70nUqsQK
H/iGT2g4MZC1Yvz6evENtckbiA1p3F9jnd+Po9ivDY/RrTpND3hVC2WbcOXWxZkb
m69muvXfrnZwoF9xWPG8aTrCATim++1Ty8/8LoKdVq1d0Dp/Gzk8KnklBPY2vRFt
dpxqH3jLgED/QcO5W/yQdf0kPRsrNwKFNLqP9bCF2hMIw1VHHddZtnBBXDGATXYq
hdFA8EEg3gh/kY7V8b+GyxjRKRbveG208hu+H4EirxHmRn5xJN1VoTLk9va+AJL1
I30l4USLDkTgf1AjYmk7yFIUTemCtwjfa0lsuu4l3rRJ3k1eBrtZe2cpWv2BoQDU
by0sNnDelzJTQ9/v1i3J
=OYiT
-----END PGP SIGNATURE-----
Merge tag 'drm-qemu-20161121' of git://git.kraxel.org/linux into drm-next
drm/virtio: fix busid in a different way, allocate more vbufs.
drm/qxl: various bugfixes and cleanups,
* tag 'drm-qemu-20161121' of git://git.kraxel.org/linux: (224 commits)
drm/virtio: allocate some extra bufs
qxl: Allow resolution which are not multiple of 8
qxl: Don't notify userspace when monitors config is unchanged
qxl: Remove qxl_bo_init() return value
qxl: Call qxl_gem_{init, fini}
qxl: Add missing '\n' to qxl_io_log() call
qxl: Remove unused prototype
qxl: Mark some internal functions as static
Revert "drm: virtio: reinstate drm_virtio_set_busid()"
drm/virtio: fix busid regression
drm: re-export drm_dev_set_unique
Linux 4.9-rc5
gp8psk: Fix DVB frontend attach
gp8psk: fix gp8psk_usb_in_op() logic
dvb-usb: move data_mutex to struct dvb_usb_device
iio: maxim_thermocouple: detect invalid storage size in read()
aoe: fix crash in page count manipulation
lightnvm: invalid offset calculation for lba_shift
Kbuild: enable -Wmaybe-uninitialized warnings by default
pcmcia: fix return value of soc_pcmcia_regulator_set
...
Most 5XX targets have GPMU (Graphics Power Management Unit) that
handles a lot of the heavy lifting for power management including
thermal and limits management and dynamic power collapse. While
the GPMU itself is optional, it is usually nessesary to hit
aggressive power targets.
The GPMU firmware needs to be loaded into the GPMU at init time via a
shared hardware block of registers. Using the GPU to write the microcode
is more efficient than using the CPU so at first load create an indirect
buffer that can be executed during subsequent initalization sequences.
After loading the GPMU gets initalized through a shared register
interface and then we mostly get out of its way and let it do
its thing.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Disable the interrupt during the init sequence to avoid having
interrupts fired for errors and other things that we are not
ready to handle while initializing.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The adreno code inherited a silly workaround from downstream
from the bad old days before decent clock control. grp_clk[0]
(named 'src_clk') doesn't actually exist - it was used as a proxy
for whatever the core clock actually was (usually 'core_clk').
All targets should be able to correctly request 'core_clk' and
get the right thing back so zap the anachronism and directly
use grp_clk[0] to control the clock rate.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add helper functions for TYPE4 and TYPE7 ME opcodes that replace
TYPE0 and TYPE3 starting with the A5XX targets.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add a new generic function to write a "64" bit value. This isn't
actually a 64 bit operation, it just writes the upper and lower
32 bit of a 64 bit value to a specified LO and HI register. If
a particular target doesn't support one of the registers it can
mark that register as SKIP and writes/reads from that register
will be quietly dropped.
This can be immediately put in place for the ringbuffer base and
the RPTR address. Both writes are converted to use
adreno_gpu_write64() with their respective high and low registers
and the high register appropriately marked as SKIP for both 32 bit
targets (a3xx and a4xx). When a5xx comes it will define valid target
registers for the 'hi' option and everything else will just work.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add some new functions to manipulate GPU registers. gpu_read64 and
gpu_write64 can read/write a 64 bit value to two 32 bit registers.
For 4XX and older these are normally perfcounter registers, but
future targets will use 64 bit addressing so there will be many
more spots where a 64 bit read and write are needed.
gpu_rmw() does a read/modify/write on a 32 bit register given a mask
and bits to OR in.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
When the GPU hardware init function fails (like say, ME_INIT timed
out) return error instead of blindly continuing on. This gives us
a small chance of saving the system before it goes boom.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
There are very few register accesses in the common code. Cut down
the list of common registers to just those that are used. This
saves const space and saves us the effort of maintaining registers
for A3XX and A4XX that don't exist or are unused.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
For a5xx the gpu is 64b so we need to change iova to 64b everywhere. On
the display side, iova is still 32b so it can ignore the upper bits.
(Although all the armv8 devices have an iommu that can map 64b pa to 32b
iova.)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Previous value really only made sense on armv7 without LPAE. Everything
that supports more than 4g of memory also has iommu's that can map
anything.
Signed-off-by: Rob Clark <robdclark@gmail.com>
In add_components_mdp, we parse the endpoints in MDP output ports
using the helper for_each_endpoint_of_node(). Our function calls
of_node_put() on the endpoint node before we iterate over the
next one. This is already done by the helper, and results in
trying to decrement the refcount twice.
Remove the extra of_node_put calls. This fixes warnings seen when
we try to insert the driver as a module on IFC6410.
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The mode_config->max_{width,height} is for the maximum size of a fb, not
the max scanout limits (of the layer-mixer). It is legal, and in fact
common, to create a larger fb, only only scan-out a smaller part of it.
For example multi-monitor configurations for x11, or android wallpaper
layer (which is created larger than the screen resolution for fast
scrolling by just changing the src x/y coordinates).
Signed-off-by: Rob Clark <robdclark@gmail.com>
Previously, SMP block allocation was not checked in the plane's
atomic_check() fxn, so we could fail allocation SMP block allocation at
atomic_update() time. Re-work the block allocation to request blocks
during atomic_check(), but not update the hw until committing the atomic
update.
Since SMP blocks allocated at atomic_check() time, we need to manage the
SMP state as part of mdp5_state (global atomic state). This actually
ends up significantly simplifying the SMP management, as the SMP module
does not need to manage the intermediate state between assigning new
blocks before setting flush bits and releasing old blocks after vblank.
(The SMP registers and SMP allocation is not double-buffered, so newly
allocated blocks need to be updated in kms->prepare_commit() released
blocks in kms->complete_commit().)
Signed-off-by: Rob Clark <robdclark@gmail.com>
(re)assign the hw pipes to planes based on required caps, and to handle
situations where we could not modify an in-use plane (ie. SMP block
reallocation).
This means all planes advertise the superset of formats and properties.
Userspace must (as always) use atomic TEST_ONLY step for atomic updates,
as not all planes may be available for use on every frame.
The mapping of hwpipe to plane is stored in mdp5_state, so that state
updates are atomically committed in the same way that plane/etc state
updates are managed. This is needed because the mdp5_plane_state keeps
a pointer to the hwpipe, and we don't want global state to become out
of sync with the plane state if an atomic update fails, we hit deadlock/
backoff scenario, etc. The use of state_lock keeps multiple parallel
updates which both re-assign hwpipes properly serialized.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add basic state duplication/apply mechanism. Following commits will
move actual global hw state into this.
The state_lock allows multiple concurrent updates to proceed as long as
they don't both try to alter global state. The ww_mutex mechanism will
trigger backoff in case of deadlock between multiple threads trying to
update state.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Split out the hardware pipe specifics from mdp5_plane. To start, the hw
pipes are statically assigned to planes, but next step is to assign the
hw pipes during plane->atomic_check() based on requested caps (scaling,
YUV, etc). And then hw pipe re-assignment if required if required SMP
blocks changes.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Just use plane->name now that it is a thing. In a following patch, once
we dynamically assign hw pipes to planes, it won't make sense to name
planes the way we do, so this also partly reduces churn in following
patch.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We can do this all from mdp5_plane_complete_commit(), so simplify things
a bit and drop mdp5_plane_complete_flip().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Plane's (pipes) can be assigned dynamically with atomic, so it doesn't
make much sense to name the pipe after it's primary plane.
Signed-off-by: Rob Clark <robdclark@gmail.com>
These are really plane-id's, not crtc-id's. Only connection to CRTCs is
that they are used as primary-planes.
Current name is just legacy from when we only supported RGB/primary
planes. Lets pick a better name now.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We can have various combinations of 64b and 32b address space, ie. 64b
CPU but 32b display and gpu, or 64b CPU and GPU but 32b display. So
best to decouple the device iova's from mmap offset.
Signed-off-by: Rob Clark <robdclark@gmail.com>
If fb dimensions are larger than what can be scanned out, but the src
dimensions are not, the hw can still handle this. So clip.
Signed-off-by: Rob Clark <robdclark@gmail.com>
If the bottom-most layer is not fullscreen, we need to use the BASE
mixer stage for solid fill (ie. MDP5_CTL_BLEND_OP_FLAG_BORDER_OUT). The
blend_setup() code pretty much handled this already, we just had to
figure this out in _atomic_check() and assign the stages appropriately.
Also fix the case where there are zero enabled planes, where we also
need to enable BORDER_OUT.
Signed-off-by: Rob Clark <robdclark@gmail.com>
It has been suggested that having per-plane modifiers is making life
more difficult for userspace, so let's just retire modifier[1-3] and
use modifier[0] to apply to the entire framebuffer.
Obviosuly this means that if individual planes need different tiling
layouts and whatnot we will need a new modifier for each combination
of planes with different tiling layouts.
For a bit of extra backwards compatilbilty the kernel will allow
non-zero modifier[1+] but it require that they will match modifier[0].
This in case there's existing userspace out there that sets
modifier[1+] to something non-zero with planar formats.
Mostly a cocci job, with a bit of manual stuff mixed in.
@@
struct drm_framebuffer *fb;
expression E;
@@
- fb->modifier[E]
+ fb->modifier
@@
struct drm_framebuffer fb;
expression E;
@@
- fb.modifier[E]
+ fb.modifier
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Cc: dczaplejewicz@collabora.co.uk
Suggested-by: Kristian Høgsberg <hoegsberg@gmail.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1479295996-26246-1-git-send-email-ville.syrjala@linux.intel.com
drm_atomic_set_fence_for_plane() is smart and won't overwrite
plane_state->fence if the user already set an explicit fence there.
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1478513013-3221-3-git-send-email-gustavo@padovan.org
If VRAM allocation fails, the error handling path crashes in
msm_drm_uninit(). The following changes are made to fix this:
msm_gem_shrinker_cleanup() is fixed to unregister the shrinker only
if it was init-ed in the first place.
Before calling kms->funcs->destroy(), we check if kms->funcs is also
non-NULL. This is needed for MDP5, since during msm_drm_int(), priv->kms
becomes non-NULL early, but msm_kms_init() is called on it only later
in mdp5_kms_init().
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
If the bottom-most layer is not fullscreen, we need to use the BASE
mixer stage for solid fill (ie. MDP5_CTL_BLEND_OP_FLAG_BORDER_OUT). The
blend_setup() code pretty much handled this already, we just had to
figure this out in _atomic_check() and assign the stages appropriately.
Also fix the case where there are zero enabled planes, where we also
need to enable BORDER_OUT.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The DSI/HDMI PLLs in MSM require resources like interface clocks, power
domains to be enabled before we can access their registers.
The clock framework doesn't have a mechanism at the moment where we can
tie such resources to a clock, so we make sure that the KMS driver enables
these resources whenever a PLL is expected to be in use.
One place where we can't ensure the resource dependencies are met is when
the clock framework tries to disable unused clocks. The KMS driver doesn't
know when the clock framework calls the is_enabled clk_op, and hence can't
enable interface clocks/power domains beforehand.
We set the CLK_IGNORE_UNUSED flag for PLL clocks for now. This needs to be
revisited, since bootloaders can enable display, and we would want to
disable the PLL clocks if there isn't a display driver using them.
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The msm/dsi host drivers calls drm_helper_hpd_irq_event in the
mipi_dsi_host attach/detatch callbacks.
mipi_dsi_attach()/mipi_dsi_detach() from a panel/bridge
driver could be called from a context where the drm_device's
mode_config.mutex is already held, resulting in a deadlock.
Queue it as work instead.
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>