If the caller allows and we do not have to wait for any signals,
immediately execute the work within the caller's process. By doing so we
avoid the overhead of scheduling a new task, and the latency in
executing it, at the cost of pulling that work back into the immediate
context. (Sometimes we still prefer to offload the task to another cpu,
especially if we plan on executing many such tasks in parallel for this
client.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-2-chris@chris-wilson.co.uk
We dropped calling process_csb prior to handling direct submission in
order to avoid the nesting of spinlocks and lift process_csb() and the
majority of the tasklet out of irq-off. However, we do want to avoid
ksoftirqd latency in the fast path, so try and pull the interrupt-bh
local to direct submission if we can acquire the tasklet's lock.
v2: Document the read of pending[0] from outside the tasklet with
READ_ONCE.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-1-chris@chris-wilson.co.uk
Driver changes for ti-sysc for v5.7 merge window
Driver changes for ti-sysc interconnect target module driver mostly
to be able to probe display subsystem (DSS) without platform data:
- Rename clk_enable/disable quirks to less confusing pre and post
reset quirks
- Enable module reset to work with modules with no sysconfig register
- Also consider non-existing module register when matching quirks
- Don't warn with nested ti-sysc devices
- Implement basic SoC revision handling
- Detect DSS related devices
- Implement DSS reset quirks
Note that there is also a DSS driver specific probe fix to allow
probing devices configured for interconnect target module data that
was agreed to be merged along with the ti-sysc driver changes.
And then there also changes to handle RTC, EDMA and PRUSS:
- Add module unlock quirk for RTC
- Detect EDMA modules
- Add support for handling PRUSS
* tag 'omap-for-v5.7/ti-sysc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
bus: ti-sysc: Add support for PRUSS SYSC type
dt-bindings: bus: ti-sysc: Add support for PRUSS SYSC type
bus: ti-sysc: Detect EDMA and set quirk flags for tptc
bus: ti-sysc: Fix wrong offset for display subsystem reset quirk
bus: ti-sysc: Implement display subsystem reset quirk
bus: ti-sysc: Detect display subsystem related devices
bus: ti-sysc: Handle module unlock quirk needed for some RTC
bus: ti-sysc: Implement SoC revision handling
bus: ti-sysc: Don't warn about legacy property for nested ti-sysc devices
bus: ti-sysc: Consider non-existing registers too when matching quirks
bus: ti-sysc: Improve reset to work with modules with no sysconfig
bus: ti-sysc: Rename clk related quirks to pre_reset and post_reset quirks
bus: ti-sysc: Fix 1-wire reset quirk
drm/omap: Prepare DSS for probing without legacy platform data
Link: https://lore.kernel.org/r/pull-1583511417-919838@atomide.com-3
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
We set the priority hint on execlists to avoid executing the tasklet for
when we know that there will be no change in execution order. However,
as we set it from the virtual engine for all siblings, but only one
physical engine may respond, we leave the hint set on the others
stopping direct submission that could take place.
If we do not set the hint, we may attempt direct submission even if we
don't expect to submit. If we set the hint, we may not do any submission
until the tasklet is run (and sometimes we may park the engine before
that has had a chance). Ergo there's only a minor ill-effect on mixed
virtual/physical engine workloads where we may try and fail to do direct
submission more often than required. (Pure virtual / engine workloads
will have redundant tasklet execution suppressed as normal.)
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1522
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325101358.12231-1-chris@chris-wilson.co.uk
- fix for potential out-of-bounds reads in the perfmon ioctl
implementation from Christian
- override to expose proper feature flags for the GC400 found on the
STM32MP1 SoC, also from Christian
- Guido fixed an issue where we would spuriously fail to enter
runtime suspend due to a new GPU engine status bit on GC7000
- tree-wide change from Gustavo to get rid of zero-length arrays
- fix for missed TS cache flush on GC7000, leading to spurious
MMU faults from me
- request pages from DMA32 zone on systems where we can't address
all present memory from me
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/74d9c6d19099fdba6c6795204a6aa445b7930c79.camel@pengutronix.de
dGFX has local memory so it does not have aperture or support
CPU fences but even for iGFX it have a small number of fences.
As replacement for fences to track frontbuffer modifications by CPU
we have a software tracking that is already in used by FBC and PSR.
PSR don't support fences so it shows that this tracking is reliable.
So lets make fences a nice-to-have to activate FBC for GEN9+, this
will allow us to enable FBC for dGFXs and iGFXs even when there is no
available fence.
We do not set fences to rotated planes but FBC only have restrictions
against 16bpp, so adding it here.
Also adding a new check for the tiling format, fences are only set
to X and Y tiled planes but again FBC don't have any restrictions
against tiling so adding linear as supported as well, other formats
should be added after tested but IGT only supports drawing in thse
3 formats.
intel_fbc_hw_tracking_covers_screen() maybe can also have the same
treatment as fences but BSpec is not clear if the size limitation is
for hardware tracking or general use of FBC and I don't have a 5K
display to test it, so keeping as is for safety.
v2:
- Added tiling and pixel format rotation checks
- Changed the GEN version not requiring fences to 11 from 9, DDX
needs some changes but it don't have support for GEN11+
v3:
- Changed back to GEN9+
- Moved GEN test to inside of tiling_is_valid()
v4:
- moved rotation check to its own functions
v5:
- renamed rotations_is_valid to rotation_is_valid
- moved pre-g4x rotation check to rotation_is_valid()
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319211535.114625-1-jose.souza@intel.com
is_color_space_conversion() is a misnomer. It checks not only if color
space conversion is needed, but also if format conversion is needed.
This is actually desired behaviour because result of this function
determines if CSC block should be enabled or not (CSC block can also do
format conversion).
In order to clear misunderstandings, let's rework
is_color_space_conversion() to do exactly what is supposed to do and add
another function which will determine if CSC block must be enabled or
not.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200304232512.51616-5-jernej.skrabec@siol.net
Using huge page-table entries requires that the physical address of the
start of a buffer object is huge page size aligned.
Make a special version of the TTM range manager that accomplishes this,
but falls back to a smaller page size alignment (PUD->PMD, PMD->NORMAL)
to avoid eviction.
If other drivers want to use it in the future, it can be made a
TTM generic helper. Note that drivers can force eviction for a certain
alignment by assigning the TTM GPU alignment correspondingly.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Support huge (PMD-size and PUD-size) page-table entries by providing a
huge_fault() callback.
We still support private mappings and write-notify by splitting the huge
page-table entries on write-access.
Note that for huge page-faults to occur, either the kernel needs to be
compiled with trans-huge-pages always enabled, or the kernel needs to be
compiled with trans-huge-pages enabled using madvise, and the user-space
app needs to call madvise() to enable trans-huge pages on a per-mapping
basis.
Furthermore huge page-faults will not succeed unless buffer objects and
user-space addresses are aligned on huge page size boundaries.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Transcoder timing calculation differ for command mode.
v2: Use is_vid_mode, and use same I915_WRITE (Jani)
v3: Adjust the calculations to reflect dsc compression ratio
v4: Rearrange the vertical and horizontal timing calc, optimize
local variables usage. (Jani)
v5: Fix the values used for calculation, use afe_clk for
byte clock calculation, use intel_de_write/read (Jani)
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200312053841.2794-3-vandita.kulkarni@intel.com
Surface define v4 added new member buffer_byte_stride. With this patch
add buffer_byte_stride in surface metadata and create surface using new
command if support is available.
Also with this patch replace device specific data types with kernel
types.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Makes surface_define cleaner by sending vmw_surface_metadata instead of
all the arguments individually.
v2: fix uninitialized return value, error message
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Create a new structure vmw_surface_metadata representing the metadata
used for creating surface. With this can make the surface_define_priv
a bit cleaner.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
With SM5 capability a new version of streamoutput is supported by device
which need backing mob and a new field. With this change the new command
is supported in command buffer.
v2: Also track streamoutput context binding in binding manager.
v3: Track only one streamoutput as only one can be set to context.
v4: Fix comment typos
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Previous name vmw_ctx_bindinfo_so is misleading because it actually
represent so target and stream output is a new resource type that needs
tracking for SM5 capable device. Also rename binding type enum and
internal functions to reflect these belongs to so targets.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Virtual device now support new commands to manage unordered access
views. Allow them as part of user-space command buffer. This involves
adding UA view cotable, binding tracker info, new view type and command
verifier functions.
v2: fix comment typo
v3: style fixes (don't use deprecated PTR_RET)
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Virtual device now supports new shader types, allow them as valid shader
type in command buffer. Also add per shader bind info in binding manager
state for new shader type.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Virtual device added new register for suggested GB memory, read the new
register when available.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
A new enum to represent new SM5 graphics context capability in vmwgfx.
v2: use new correct cap bits (merged several later commits into it).
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Signed-off-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Instead of having different bool in device private to represent
incremental graphics context capabilities, add a new sm type enum.
v2: Use enum instead of bit flag.
v3: Incorporated review comments.
Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Roland Scheidegger <sroland@vmware.com>