As a quick hack we make the old intel_gtt structure mutable so we can
fool a bunch of the existing code which depends on elements in that data
structure. We can/should try to remove this in a subsequent patch.
This should preserve the old gtt init behavior which upon writing these
patches seems incorrect. The next patch will fix these things.
The one exception is VLV which doesn't have the preserved flush control
write behavior. Since we want to do that for all GEN6+ stuff, we'll
handle that in a later patch. Mainstream VLV support doesn't actually
exist yet anyway.
v2: Update the comment to remove the "voodoo"
Check that the last pte written matches what we readback
v3: actually kill cache_level_to_agp_type since most of the flags will
disappear in an upcoming patch
v4: v3 was actually not what we wanted (Daniel)
Make the ggtt bind assertions better and stricter (Chris)
Fix some uncaught errors at gtt init (Chris)
Some other random stuff that Chris wanted
v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a
tad more robust by mapping everything != CACHE_NONE to the cached agp
flag - we have a 1:1 uncached mapping, but different modes of
cacheable (at least on later generations). Suggested by Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In order to handle differences in pte encoding between architectures it
is desirable to have one helper function, pte_encode, do it all for us.
As such, this commit moves the code around so we're in good shape to do
that.
Luckily the ppgtt pte and the ggtt pte look very similar.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HSW will change the PTE encoding, and laying this out now will be
helpful when we're ready to implement that. More importantly, GGTT and
PPGTT PTE encoding is quite similar, so moving this out into a helper
function will enable us to lance the AGP layer.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will make the calculations of size easier to read instead of just
assuming uint32_t everywhere.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some subsequent commits will need to know what generation we're running
on to do different pte encoding for the ppgtt. Since it's not much
hassle or overhead to store it in the ppgtt structure, do that.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The mid-level cache or as it's more commonly referred to now as L3, is
not setup this way on HSW.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pull drm merge (part 1) from Dave Airlie:
"So first of all my tree and uapi stuff has a conflict mess, its my
fault as the nouveau stuff didn't hit -next as were trying to rebase
regressions out of it before we merged.
Highlights:
- SH mobile modesetting driver and associated helpers
- some DRM core documentation
- i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
combined pte writing, ilk rc6 support,
- nouveau: major driver rework into a hw core driver, makes features
like SLI a lot saner to implement,
- psb: add eDP/DP support for Cedarview
- radeon: 2 layer page tables, async VM pte updates, better PLL
selection for > 2 screens, better ACPI interactions
The rest is general grab bag of fixes.
So why part 1? well I have the exynos pull req which came in a bit
late but was waiting for me to do something they shouldn't have and it
looks fairly safe, and David Howells has some more header cleanups
he'd like me to pull, that seem like a good idea, but I'd like to get
this merge out of the way so -next dosen't get blocked."
Tons of conflicts mostly due to silly include line changes, but mostly
mindless. A few other small semantic conflicts too, noted from Dave's
pre-merged branch.
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
drm/nv98/crypt: fix fuc build with latest envyas
drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
drm/nv41/vm: fix and enable use of "real" pciegart
drm/nv44/vm: fix and enable use of "real" pciegart
drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
drm/nouveau: store supported dma mask in vmmgr
drm/nvc0/ibus: initial implementation of subdev
drm/nouveau/therm: add support for fan-control modes
drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
drm/nouveau/therm: calculate the pwm divisor on nv50+
drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
drm/nouveau/therm: move thermal-related functions to the therm subdev
drm/nouveau/bios: parse the pwm divisor from the perf table
drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
drm/nouveau/therm: rework thermal table parsing
drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
drm/nouveau: fix pm initialization order
drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
drm/nouveau: log channel debug/error messages from client object rather than drm client
drm/nouveau: have drm debugging macros build on top of core macros
...
Convert #include "..." to #include <path/...> in drivers/gpu/.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Remove redundant DRM UAPI header #inclusions from drivers/gpu/.
Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and
drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding
patch.
Without this patch and the patch to make include the UAPI headers from the core
headers, after the UAPI split, the DRM C sources cannot find these UAPI headers
because the DRM code relies on specific -I flags to make #include "..." work
on headers in include/drm/ - but that does not work after the UAPI split without
adding more -I flags.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s
LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI
u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT
xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU
i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF
GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk=
=0v2l
-----END PGP SIGNATURE-----
Merge tag 'v3.6-rc7' into drm-intel-next-queued
Manual backmerge of -rc7 to resolve a silent conflict leading to
compile failure in drivers/gpu/drm/i915/intel_hdmi.c.
This is due to the bugfix in -rc7:
commit b98b601672
Author: Wang Xingchao <xingchao.wang@intel.com>
Date: Thu Sep 13 07:43:22 2012 +0800
drm/i915: HDMI - Clear Audio Enable bit for Hot Plug
Since this code moved around a lot in -next git put that snippet at
the wrong spot. I've tried to fix this by making the conflict explicit
by merging a version for next with:
commit 3cce574f01
Author: Wang Xingchao <xingchao.wang@intel.com>
Date: Thu Sep 13 11:19:00 2012 +0800
drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally
But that failed to solve the entire problem. To avoid pushing out
further -nightly branch to our QA where this is broken, do the
backmerge and manually add the stuff git adds to -next from the patch
in -fixes.
Note that this doesn't show up in git's merge diff (and hence is also
not handled by git rerere), which adds to the reasons why I'd like to
fix this with a verbose backmerge. The git merge diff only shows a
bunch of trivial conflicts of the "code changed in lines next to each
another" kind.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By providing a callback for when we need to bind the pages, and then
release them again later, we can shorten the amount of time we hold the
foreign pages mapped and pinned, and importantly the dmabuf objects then
behave as any other normal object with respect to the shrinker and
memory management.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rather than have multiple data structures for describing our page layout
in conjunction with the array of pages, we can migrate all users over to
a scatterlist.
One major advantage, other than unifying the page tracking structures,
this offers is that we replace the vmalloc'ed array (which can be up to
a megabyte in size) with a chain of individual pages which helps reduce
memory pressure.
The disadvantage is that we then do not have a simple array to iterate,
or to access randomly. The common case for this is in the relocation
processing, which will typically fit within a single scatterlist page
and so be almost the same cost as the simple array. For iterating over
the array, the extra function call could be optimised away, but in
reality is an insignificant cost of either binding the pages, or
performing the pwrite/pread.
v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the
trivial compile error from rebasing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel writes:
"New stuff for -next. Highlights:
- prep patches for the modeset rework. Note that one of those patches
touches the fb helper in the common drm code.
- hasw hdmi audio support (Wang Xingchao)
- improved instdone dumping for gen7 (Ben)
- unbound tracking and a few follow-up patches from Chris
- dma_buf->begin/end_cpu_access plus fix for drm/udl (Dave)
- improve mmio error reporting for hsw
- prep patch for WQ_NON_REENTRANT removal (Tejun Heo)
"
* 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (41 commits)
drm/i915: Remove __GFP_NO_KSWAPD
drm/i915: disable rc6 on ilk when vt-d is enabled
drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer
drm/i915: Use new INSTDONE registers (Gen7+)
drm/i915: Add new INSTDONE registers
drm/i915: Extract reading INSTDONE
drm/i915: Use a non-blocking wait for set-to-domain ioctl
drm/i915: Juggle code order to ease flow of the next patch
drm/i915: Use cpu relocations if the object is in the GTT but not mappable
drm/i915: Extract general object init routine
drm/i915: Protect private gem objects from truncate (such as imported dmabuf)
drm/i915: Only pwrite through the GTT if there is space in the aperture
i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1
drm/i915: Find unclaimed MMIO writes.
drm/i915: Add ERR_INT to gen7 error state
drm/i915: Cantiga+ cannot handle a hsync front porch of 0
drm/i915: fix reassignment of variable "intel_dp->DP"
drm/i915: Try harder to allocate an mmap_offset
drm/i915: Show pin count in debugfs
drm/i915: Show (count, size) of purgeable objects in i915_gem_objects
...
There was some merge conflicts in -next and they weren't so pretty, so
backmerge now to avoid them.
Conflicts:
drivers/gpu/drm/i915/i915_gem.c
drivers/gpu/drm/i915/intel_modes.c
The current layout is to place the per-process tables at the end of the
GTT. However, this is currently using a hardcoded maximum size for the GTT
and not taking in account limitations imposed by the BIOS. Use the value
for the total number of entries allocated in the table as provided by
the configuration registers.
Reported-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Matthew Garret <mjg@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.
To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.
As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.
Note: The patch also contains a few changes to the last-hope
evict_everything logic in i916_gem_execbuffer.c - we no longer try to
only evict the purgeable stuff in a first try (since that's superflous
and only helps in OOM corner-cases, not fragmented-gtt trashing
situations).
Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.
v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.
v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
They've changed it ... for no apparent reason. Meh.
V2: remove unused 'is_hsw' field.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJQLWtvAAoJEHm+PkMAQRiG/DYH+wd0FqfEuYkYk4KPyAPuhKpX
zX7HYfLvyJE/ZYIdrhjq1E6Xm2KNr7gtX7/Rdzi2W38M9sjbYzwG1UGIw51qnxWy
yZJH9BGkfyQgQPeuDGohfB6DkDy2JWr2eqMDvakjOwgBsIzji0PQD/f3UvndhtUa
c+tTj/kjavHE1Yr2Wy6OnRZz3Uc0hIMn/Q0JqtbCs3LUgEV1KA4OEAe56XNz4Ku4
WE+FFaGFPvtriQsQON+ohPS5IC8jzQGK/0vbrJ4lWjFnZy4gvZXnborTOwD0WSQG
fbsNuxp1AaM2/pqfMwXm1w0ADvwOITHNiwwXf9id6DoK81QwTFpUdvKpn6yB6gQ=
=rurr
-----END PGP SIGNATURE-----
Merge tag 'v3.6-rc2' into drm-intel-next
Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put
even more madness on top:
- drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in
-fixes, that has been changed in a variable-rename in -next, too.
- drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr
(since all their users have been extracted in another fucntion),
-fixes added another user for a hw workaroudn.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The exporter should have given us pages in the correct place, avoid
the prepare object mapping phase on dmar systems.
This fixes an oops on a GM45/R600 machine, when running the intel/radeon
tests.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Several functions of the GPU have the restriction that differing memory
domains cannot be placed next to each other (as the GPU may prefetch
beyond the end of one domain and hang as it crosses into the other
domain). We use the facility of the drm_mm to mark ranges with a
particular color that corresponds to the cache attributes of those pages
in order to prevent allocating adjacent blocks of differing memory
types.
v2: Rebase ontop of drm_mm coloring v2.
v3: Fix rebinding existing gtt_space and add a verification routine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds handle->fd and fd->handle support to i915, this is to allow
for offloading of rendering in one direction and outputs in the other.
v2 from Daniel Vetter:
- fixup conflicts with the prepare/finish gtt prep work.
- implement ppgtt binding support.
Note that we have squat i-g-t testcoverage for any of the lifetime and
access rules dma_buf/prime support brings along. And there are quite a
few intricate situations here.
Also note that the integration with the existing code is a bit
hackish, especially around get_gtt_pages and put_gtt_pages. It imo
would be easier with the prep code from Chris Wilson's unbound series,
but that is for 3.6.
Also note that I didn't bother to put the new prepare/finish gtt hooks
to good use by moving the dma_buf_map/unmap_attachment calls in there
(like we've originally planned for).
Last but not least this patch is only compile-tested, but I've changed
very little compared to Dave Airlie's version. So there's a decent
chance v2 on drm-next works as well as v1 on 3.4-rc.
v3: Right when I've hit sent I've noticed that I've screwed up one
obj->sg_list (for dmar support) and obj->sg_table (for prime support)
disdinction. We should be able to merge these 2 paths, but that's
material for another patch.
v4: fix the error reporting bugs pointed out by ickle.
v5: fix another error, and stop non-gtt mmaps on shared objects
stop pread/pwrite on imported objects, add fake kmap
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This originates from a hack by me to quickly fix a bug in an earlier
patch where we needed control over whether or not waiting on a seqno
actually did any retire list processing. Since the two operations aren't
clearly related, we should pull the parameter out of the wait function,
and make the caller responsible for retiring if the action is desired.
The only function call site which did not get an explicit retire_request call
(on purpose) is i915_gem_inactive_shrink(). That code was already calling
retire_request a second time.
v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't need the pt_addr for the !dmar case, so drop the else and
move the if (dmar) condition out of the loop.
v2: Fixup whitespace damage noticed by Chris Wilson.
v3: Collapse the two identical if blocks. Chris Wilson makes me look
like a moron right now ...
Noticed-by: Konstantin Belousov <kostikbel@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wislon.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter wrote
First pull request for 3.5-next, slightly large than usual because new
things kept coming in since the last pull for 3.4.
Highlights:
- first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci
ids are not yet added, and there's still quite a few patches to merge
(mostly modesetting). To make QA easier I've decided to merge this stuff
in pieces.
- loads of cleanups and prep patches spurred by the above. Especially vlv
is a real frankenstein chip, but also hsw is stretching our driver's
code design. Expect more to come in this area for 3.5.
- more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again,
there are more patches needed (and some already queued up), but I wanted
to split this a bit for better testing.
- pwrite/pread rework and retuning. This series has been in the works for
a few months already and a lot of i-g-t tests have been created for it.
Now it's finally ready to be merged. Note that one patch in this series
touches include/pagemap.h, that patch is acked-by akpm.
- reduce mappable pressure and relocation throughput improvements from
Chris.
- mmap offset exhaustion mitigation by Chris Wilson.
- a start at figuring out which codepaths in our messy dri1/ums+gem/kms
driver we actually need to support by bailing out of unsupported case.
The driver now refuses to load without kms on gen6+ and disallows a few
ioctls that userspace never used in certain cases. More of this will
definitely come.
- More decoupling of global gtt and ppgtt.
- Improved dual-link lvds detection by Takashi Iwai.
- Shut up the compiler + plus fix the fallout (Ben)
- Inverted panel brightness handling (mostly Acer manages to break things
in this way).
- Small fixlets and adjustements and some minor things to help debugging.
Regression-wise QA reported quite a few issues on ivb, but all of them
turned out to be hw stability issues which are already fixed in
drm-intel-fixes (QA runs the nightly regression tests on -next alone,
without -fixes automatically merged in). There's still one issue open on
snb, it looks like occlusion query writes are not quite as cache coherent
as we've expected. With some of the pwrite adjustements we can now
reliably hit this. Kernel workaround for it is in the works."
* 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits)
drm/i915: VCS is not the last ring
drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2
drm/i915: make quirks more verbose
drm/i915: dump the DMA fetch addr register on pre-gen6
drm/i915/sdvo: Include YRPB as an additional TV output type
drm/i915: disallow gem init ioctl on ilk
drm/i915: refuse to load on gen6+ without kms
drm/i915: extract gt interrupt handler
drm/i915: use render gen to switch ring irq functions
drm/i915: rip out old HWSTAM missed irq WA for vlv
drm/i915: open code gen6+ ring irqs
drm/i915: ring irq cleanups
drm/i915: add SFUSE_STRAP registers for digital port detection
drm/i915: add WM_LINETIME registers
drm/i915: add WRPLL clocks
drm/i915: add LCPLL control registers
drm/i915: add SSC offsets for SBI access
drm/i915: add port clock selection support for HSW
drm/i915: add S PLL control
drm/i915: add PIXCLK_GATE register
...
Conflicts:
drivers/char/agp/intel-agp.h
drivers/char/agp/intel-gtt.c
drivers/gpu/drm/i915/i915_debugfs.c
The ppgtt page directory lives in a snatched part of the gtt pte
range. Which naturally gets cleared on hibernate when we pull the
power. Suspend to ram (which is what I've tested) works because
despite the fact that this is a mmio region, it is actually back by
system ram.
Fix this by moving the page directory setup code to the ppgtt init
code (which gets called on resume).
This fixes hibernate on my ivb and snb.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We've lost our guard page somewhere in the gtt rewrite, this patch
here will restore it.
Exercised by i-g-t/tests/gem_cs_prefetch.
v2: Substract the guard page from the range we're supposed to manage
with gem. Suggested by Chris Wilson to increase the odds of old ums +
gem userspace not blowing up. To compensate for the loss of a page,
don't substract the guard page in the modeset init code any longer.
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44748
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... because this is what it actually doesn now that we have the global
gtt vs. ppgtt split.
Also move it to the other global gtt functions in i915_gem_gtt.c
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And track the existence of such a binding similar to the aliasing
ppgtt case. Speeds up binding/unbinding in the common case where we
only need a ppgtt binding (which is accessed in a cpu coherent fashion
by the gpu) and no gloabl gtt binding (which needs uc writes for the
ptes).
This patch just puts the required tracking in place.
v2: Check that global gtt mappings exist in the error_state capture
code (with Chris Wilson's llc reloc patches batchbuffers are no longer
relocated as mappable in all situations, so this matters). Suggested
by Chris Wilson.
v3: Adapted to Chris' latest llc-reloc patches.
v4: Fix a bug in the i915 error state capture code noticed by Chris
Wilson.
Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Note that there's a functional change buried in this patch wrt the ilk
dmar workaround: We now only idle the gpu while tearing down the dmar
mappings, not while clearing the gtt. Keeping the current semantics
would have made for some really ugly code and afaik the issue is only
with the dmar unmapping that needs a fully idle gpu.
Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds support to bind/unbind objects and wires it up. Objects are
only put into the ppgtt when necessary, i.e. at execbuf time.
Objects are still unconditionally put into the global gtt.
v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind
like for the global gtt function. Noticed by Chris Wilson.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.
v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.
v3: Clean up the aperture stealing code as noted by Ben Widawsky.
v4: Paint the init code in a more pleasing colour as suggest by Chris
Wilson.
v5: Explain the magic numbers noticed by Ben Widawsky.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After the ILK vt-d workaround patches it became clear that we had
introduced a bug. Chris Wilson tracked down the issue to recursive
calls to unmap. This happens because we try to optimize waiting on
requests by calling retire requests after the wait, which may drop the
last reference on an object and end up freeing the object (and then
unmap the object from the gtt).
After the last patch we can now choose to defer processing the retire
list.
Kudos to Chris Wilson for tracking this one down.
This patch fixes gem_unref_active_buffers from i-g-t. It was tested by
forcing do_idle_maps to true.
This also fixes tests/gem_linear_blits in intel-gpu-tools.
Reported-by: guang.a.yang@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Sometimes it may be the case when we idle the gpu or wait on something
we don't actually want to process the retiring list. This patch allows
callers to choose the behavior.
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Idle the GPU before doing any unmaps. We know if VT-d is in use through
an exported variable from iommu code.
This should avoid a known HW issue.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
[anholt v2: Don't forget that when going from cached to uncached, we
haven't been tracking the write domain from the CPU perspective, since
we haven't needed it for GPU coherency.]
[ickle v3: We also need to make sure we relinquish any fences on older
chipsets and clear the GTT for sane domain tracking.]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... in preparation for changing the cache level (and thus the flags upon
the PTEs) dynamically.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... to clarify just how we use it inside the driver and remove the
confusion of the poorly matching agp_type names. We still need to
translate through agp_type for interface into the fake AGP driver.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
We can only utilize the stolen portion of the GTT if we are in sole
charge of the hardware. This is only true if using GEM and KMS,
otherwise VESA continues to access stolen memory.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Dave Airlie spotted that his ILK laptop with DMAR enabled was generating
the occasional DMAR warning.
"The ordering in the previous code was to rewrite the GTT table before
unmapping the pages and that makes sense to me."
This is his stable patch ported to d-i-n.
Reported-by: Dave Airlie <airlied@redhat.com>
Original-patch-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This still uses the agp functions to actually reinstate the mappings
(with a gross hack to make agp cooperate), but it wires everything
up correctly for the switchover.
The call to agp_rebind_memory can be dropped because all non-kms drivers
do all their rebinding on EnterVT.
v2: Be more paranoid and flush the chipset cache after restoring gtt
mappings.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>