This should prevent a number of races from occuring, the most obvious of
which will be exposed when we start making use of the "display sync" evo
channel for page flipping. The DS channel will reject any command stream
that doesn't completely agree with the current "master" state.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We need to be able to have the bh run while possibly spinning waiting for
the EVO notifier to signal. This apparently happens in some circumstances
with preempt disabled, so our workqueue was never being run.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The nv50 display isr bh needs to be converted to a tasklet, which means
we can't sleep anymore. The places we execute vbios init tables are
rare, and not in any way performance critical, so this isn't a huge
problem.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
With cmwq, there's no reason for nouveau to use a dedicated workqueue.
Drop dev_priv->wq and use system_wq instead. Each work item is sync
flushed when the containing structure is unregistered/destroyed.
Note that this change also makes sure that nv50_gpio_handler is not
freed while the contained work item is still running.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This gives a small, but noticeable performance gain at lower performance
levels, and unchanged at the higher ones.
With this commit, we're now using the same timeslice size as the NVIDIA
binary driver currently does, and dropping an unknown bit that NVIDIA
no longer appear to set.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We may well be making more use of semaphores in the future, having the
entire VM available makes requiring DMA objects for each and every
semaphore block unnecessary.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
And also, don't disable PFIFO IRQs completely whenever we recieve one,
just when we don't know about it already.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
These are the same semaphores nvc0 will use, and they potentially allow
us to do much cooler things than our current inter-channel sync impl.
Lets switch to them where possible now for some testing.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In preparation for the addition of a new nv40 backend, we'll need to be
able to distinguish between a paged dma object and the on-chip GART.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This reverts commit 5a893fc28f.
This causes a use after free in the ttm free alloc pages path,
when it tries to get the be after the be has been destroyed.
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'stable/ttm.pci-api.v5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
ttm: Include the 'struct dev' when using the DMA API.
nouveau/ttm/PCIe: Use dma_addr if TTM has set it.
radeon/ttm/PCIe: Use dma_addr if TTM has set it.
ttm: Expand (*populate) to support an array of DMA addresses.
ttm: Utilize the DMA API for pages that have TTM_PAGE_FLAG_DMA32 set.
ttm: Introduce a placeholder for DMA (bus) addresses.
Using an order 19 drm_ht for the mmap offsets is a little obscene. That
means that will a fully populated GTT with every single object mmaped at
least once in its lifetime, there will be exactly one object in each
bucket.
Typically systems only have at most a few thousand objects, though you
may see a KDE desktop hit 50000. And most of those should never be
mapped... On my systems, just using an order 10 ht would still have an
average occupancy less than 1, so apply a small safety factor and
use an order 12 ht, like the other mmap offset ht.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
... and fixup some methods to accept the constant argument.
Now that constant module arrays are loaded into read-only memory, using
const appropriately has some benefits beyond warning the programmer
about likely mistakes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
the texture checking code didn't work for block formats like s3tc,
this overhauls it to work for all types.
v2: add texture array support.
v3: add subsampled formats
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nouveau doesn't have enough information at ttm_backend_func.bind() time
to implement things like tiled GART, or to keep a buffer at a constant
address in the GPU virtual address space no matter where in physical
memory it's placed.
To resolve this, nouveau will handle binding of all buffers to the GPU
itself from the move_notify() hook. This commit ensures it's called
for all buffer moves.
Talked to Dave about the impact on radeon, which uses move_notify, it
doesn't look like anything should break there.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If there is an alpha channel, need to mask in 1's in the alpha channel
to prevent the fb from being completely transparent.
Signed-off-by: Rob Clark <rob@ti.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Now all the asic specific stuff ist mostly hid in radeon_asic.*
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
... and switch it to container_of upcasting.
v2: converted new pageflip code-paths.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Unconditionally initialize the drm gem object - it's not
worth the trouble not to for the few kernel objects.
This patch only changes the place of the drm gem object,
access is still done via pointers.
v2: Uncoditionally align the size in radeon_bo_create. At
least the r600/evergreen blit code didn't to this, angering
the paranoid gem code.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With the switch to implicit free space accounting one pointer
got unused when scanning. Use it to create a single-linked list
to ensure correct unwinding of the scan state.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The old api has a two-step process: First search for a suitable
free hole, then allocate from that specific hole. No user used
this to do anything clever. So drop it for the embeddable variant
of the drm_mm api (the old one retains this ability, for the time
being).
With struct drm_mm_node embedded, we cannot track allocations
anymore by checking for a NULL pointer. So keep track of this
and add a small helper drm_mm_node_allocated.
Also add a function to move allocations between different struct
drm_mm_node.
v2: Implement suggestions by Chris Wilson.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The idea is to track free holes implicitly by marking the allocation
immediatly preceeding a hole.
To avoid an ugly corner case add a dummy head_node to struct drm_mm
to track the hole that spans to complete allocation area when the
memory manager is empty.
To guarantee that there's always a preceeding/following node (that might
be marked as hole_follows == 1), move the mm->node_list list_head to the
head_node.
The main allocator and fair-lru scan code actually becomes simpler.
Only the debug code slightly suffers because free areas are no longer
explicit.
Also add drm_mm_for_each_node (which will be much more useful when
struct drm_mm_node is embeddable).
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nouveau was checking drm_mm internals on teardown to see whether the
memory manager was initialized. Hide these internals in a small
inline helper function.
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This makes the accounting when using 'debug_dma_dump_mappings()'
and CONFIG_DMA_API_DEBUG=y be assigned to the correct device
instead of 'fallback'.
No functional change - just cosmetic.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the TTM layer has used the DMA API to setup pages that are
TTM_PAGE_FLAG_DMA32 (look at patch titled: "ttm: Utilize the
DMA API for pages that have TTM_PAGE_FLAG_DMA32 set"), lets
use it when programming the GART in the PCIe type cards.
This patch skips doing the pci_map_page (and pci_unmap_page) if
there is a DMA addresses passed in for that page. If the dma_address
is zero (or DMA_ERROR_CODE), then we continue on with our old
behaviour.
[v2: Added a review-by tag]
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
If the TTM layer has used the DMA API to setup pages that are
TTM_PAGE_FLAG_DMA32 (look at patch titled: "ttm: Utilize the dma_addr_t
array for pages that are to in DMA32 pool."), lets use it
when programming the GART in the PCIe type cards.
This patch skips doing the pci_map_page (and pci_unmap_page) if
there is a DMA addresses passed in for that page. If the dma_address
is zero (or DMA_ERROR_CODE), then we continue on with our old
behaviour.
[v2: Fixed an indentation problem, added reviewed-by tag]
[v3: Added Acked-by Jerome]
Acked-by: Jerome Glisse <j.glisse@gmail.com>
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
We free the temporary binding before leaving this function, so we also have
to wait for the move to actually complete.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Due to the default case handling the older chipsets, a bunch of the newer
ones ended up having the wrong tiling regs used. This commit switches the
default case to handle the newest chipsets.
This also makes nv4e touch the "extra" tiling regs. "nv" doesn't touch
them for C51 but traces of the NVIDIA binary driver show it being done
there.
I couldn't find NV41/NV45 traces to confirm the behaviour there, but an
educated guess was taken at each of them.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>