Commit Graph

244 Commits

Author SHA1 Message Date
Ben Widawsky
230f955f73 drm/i915/bdw: Free correct number of ppgtt pages
I am unclear how this got messed up in the shuffle, but it did.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:10 +01:00
Jesse Barnes
b53c8c3577 drm/i915: drop duplicate ggtt vma list add in setup_global_gtt
Preallocated objects will already have been added to the vma_list when
creating their ggtt vma entry, and coincidentally also marked as holding
a ggtt mapping. Repeating the vma_list manipulation when setting up the
ggtt after preallocation is a recipe for an unhappy kernel.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Use the improve commit message suggest by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-13 01:05:01 +01:00
Ville Syrjälä
b42218c19f drm/i915/bdw: Don't muck with gtt_size on Gen8 when PPGTT setup fails
v2: Resolve rebase conflicts and switch to gen < 8 color for GenX
checking.

v3: Rebase on top of the address space refactoring.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:49 +01:00
Ben Widawsky
28cf541543 drm/i915/bdw: unleash PPGTT
v2: Squash in fix from Ben: Set PPGTT batches as necessary

This fixes the regression in the last couple of days when we enabled
PPGTT.

v3: Squash in fixup to still use GTT for secure batches from Ville:

BDW doesn't have a separate secure vs. non-secure bit in
MI_BATCH_BUFFER_START. So for secure batches we have to simply
leave the PPGTT bit unset. Fortunately older generations (except
HSW) had similar limitations so execbuffer already creates a GTT
mapping for all secure batches.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:48 +01:00
Ben Widawsky
94e409c144 drm/i915/bdw: Implement PPGTT enable
Legacy PPGTT on GEN8 requires programming 4 PDP registers per ring.
Since all rings are using the same address space with the current code
the logic is simply to program all the tables we've setup for the PPGTT.

v2: Turn on PPGTT in GFX_MODE

v3: v2 was the wrong patch

v4: Resolve conflicts due to patch series reordering.

v5: Squash in fixup from Ben: Use LRI to write PDPs

The docs (and simulator seems to back up) suggest that we can only
program legacy PPGTT PDPs with LRI commands.

v6: Rebase around context differences conflicts.

v7: Use #defines for per ring PDPs. (Damien)

v8: Don't use typede'f private_t.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (up to v3 and v7)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky
9df15b499b drm/i915/bdw: Implement PPGTT insert
GEN8 insertion is very similar to GEN6.

v2: Rebase on top of Imre's for_each_sg_page helpers.

v3: Fixup my conversion (spotted by Ville).

v4: Rebase on top of the address space refactoring.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky
459108b8cc drm/i915/bdw: Implement PPGTT clear range
GEN8 PPGTT range clearing is very similar to GEN6 if we assume that our
PDEs are all valid, which they should be.

v2: Rebase on top of the address space refactoring.

v3: Rebase on top of the bool use_scratch addition to the clear_range interface.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky
b1fe667329 drm/i915/bdw: Initialize the PDEs
The upcoming clear and insert routines will expect that PDEs all point
to valid Page Directories. Doing that lazily doesn't really buy us
anything.

The page allocation is done regardless earlier in init so it shouldn't
hurt set the PDEs.

v2: Squash in patches to implement fixed PDE write function:

- If I had done this in the first place, the bug that's going to be
  fixed in an upcoming patch would have been much easier to find.

- Use WB for PDEs.

  The PAT bit is used for page size. 2ME PDEs aren't even supported in
  BDW, so this was completely invalid. The solution is to make our
  PDEs WB+LLC instead of the pervious WB+eLLC. As far as I can guess,
  this change won't matter for performance.

  Thanks to Ville for the quick correction when discussing on IRC.

v3: Return the pde type for pde encoding (Damien)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky
37aca44ad5 drm/i915/bdw: PPGTT init & cleanup
Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.

Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.

v2: Move vtable initialization

v3: Resolve conflicts due to patch series reordering.

v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.

v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky
fbe5d36e77 drm/i915/bdw: Support BDW caching
BDW caching works differently than the previous generations. Instead of
having bits in the PTE which directly control how the page is cached,
the 3 PTE bits PWT PCD and PAT provide an index into a PAT defined by
register 0x40e0. This style of caching is functionally equivalent to how
it works on HSW and before.

v2: Tiny bikeshed as discussed on internal irc.

v3: Squash in patch from Ville to mirror the x86 PAT setup more like
in arch/x86/mm/pat.c. Primarily, the 0th index will be WB, and not
uncached.

v4: Comment for reason to not use a 64b write on the PPAT.

v5: Add a FIXME comment that the caching bits in the PAT registers
might be wrong due to doc confusion.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky
94ec8f6130 drm/i915/bdw: Add GTT functions
With the PTE clarifications, the bind and clear functions can now be
added for gen8.

v2: Use for_each_sg_pages in gen8_ggtt_insert_entries.

v3: Drop dev argument to pte encode functions, upstream lost it. Also
rebase on top of the scratch page movement.

v4: Rebase on top of the new address space vfuncs.

v5: Add the bool use_scratch argument to clear_range and the bool valid argument
to the PTE encode function to follow upstream changes.

v6: Add a FIXME(BDW) about the size mismatch of the readback check
that Jon Bloomfield spotted.

v7: Squash in fixup patch from Ben for the posting read to match the
64bit ptes and so shut up the WARN.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky
d31eb10e6c drm/i915/bdw: Create gen8_gtt_pte_t
With gen6 PTE type in place, pave the way for the new gen8 type.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky
63340133f3 drm/i915/bdw: Make gen8_gmch_probe
Probing gen8 is similar to gen6. To make the code cleaner and more
maintainable however we can use the probe functions to split it out.

v2: Rebased on top of update gtt probe infrastructure.

v3: Rebased on top of Kenneth' Graunke's ->pte_encode refactoring.

V4: Resolve conflicts with Ben's latest ppgtt patches, also switch to
gen < 8 testing instead of gen <= 7.

v5: Resolve conflicts with address space vfunc changes in upstream.

v6: Use 39b DMA mask. At least, for this mode, it is the correct mask.
(Imre)

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:43 +01:00
Ben Widawsky
9459d25237 drm/i915/bdw: support GMS and GGMS changes
All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:39 +01:00
Daniel Vetter
8fe6bd239a drm/i915/bdw: Disable PPGTT for now
This will be changed once the gen8 code is fully implemented.

v2: Use ENOSYS instead of ENXIO as suggested by Chris.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:35 +01:00
Daniel Vetter
7f16e5c141 Merge tag 'v3.12' into drm-intel-next
I want to merge in the new Broadwell support as a late hw enabling
pull request. But since the internal branch was based upon our
drm-intel-nightly integration branch I need to resolve all the
oustanding conflicts in drm/i915 with a backmerge to make the 60+
patches apply properly.

We'll propably have some fun because Linus will come up with a
slightly different merge solution.

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_drv.c
	drivers/gpu/drm/i915/intel_crt.c
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_dp.c
	drivers/gpu/drm/i915/intel_drv.h

All rather simple adjacent lines changed or partial backports from
-next to -fixes, with the exception of the thaw code in i915_dma.c.
That one needed a bit of shuffling to restore the intent.

Oh and the massive header file reordering in intel_drv.h is a bit
trouble. But not much.

v2: Also don't forget the fixup for the silent conflict that results
in compile fail ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-04 16:28:52 +01:00
Ben Widawsky
828c79087c drm/i915: Disable GGTT PTEs on GEN6+ suspend
Once the machine gets to a certain point in the suspend process, we
expect the GPU to be idle. If it is not, we might corrupt memory.
Empirically (with an early version of this patch) we have seen this is
not the case. We cannot currently explain why the latent GPU writes
occur.

In the technical sense, this patch is a workaround in that we have an
issue we can't explain, and the patch indirectly solves the issue.
However, it's really better than a workaround because we understand why
it works, and it really should be a safe thing to do in all cases.

The noticeable effect other than the debug messages would be an increase
in the suspend time. I have not measure how expensive it actually is.

I think it would be good to spend further time to root cause why we're
seeing these latent writes, but it shouldn't preclude preventing the
fallout.

NOTE: It should be safe (and makes some sense IMO) to also keep the
VALID bit unset on resume when we clear_range(). I've opted not to do
this as properly clearing those bits at some later point would be extra
work.

v2: Fix bugzilla link

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321
Tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-By: Todd Previte <tprevite@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:44:47 +02:00
Ben Widawsky
b35b380ed4 drm/i915: Make PTE valid encoding optional
We need this to work around a corruption when the boot kernel image
loads the hibernated kernel image from swap on Haswell systems -
somehow not everything is properly shut off.

This is just the prep work, the next patch will implement the actual
workaround.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a commit message suitable for -fixes and add cc: stable]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:40:21 +02:00
Daniel Vetter
a1e2265332 drm/i915: Use kcalloc more
No buffer overflows here, but better safe than sorry.

v2:
- Fixup the sizeof conversion, I've missed the pointer deref (Jani).
- Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani).
- Use kmalloc_array for the execbuf fastpath to avoid the memset
  (Chris). I've opted to leave all other conversions as-is since they
  aren't in a fastpath and dealing with cleared memory instead of
  random garbage is just generally nicer.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the contentious kmalloc_array hunk in execbuf.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-01 07:45:01 +02:00
Chris Wilson
651d794fae drm/i915: Use Write-Through cacheing for the display plane on Iris
Haswell GT3e has the unique feature of supporting Write-Through cacheing
of objects within the eLLC/LLC. The purpose of this is to enable the display
plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
that we, in theory, get the best of both worlds, perfect display and fast
access.

However, we still need to be careful as the CPU does not see the WT when
accessing the cache. In particular, this means that we need to flush the
cache lines after writing to an object through the CPU, and on
transitioning from a cached state to WT.

v2: Actually do the clflush on transition to WT, nagging by Ville.
v3: Flush the CPU cache after writes into WT objects.
v4: Rease onto LLC updates and report WT as "uncached" for
get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:38 +02:00
Chris Wilson
2c22569bba drm/i915: Update rules for writing through the LLC with the cpu
As mentioned in the previous commit, reads and writes from both the CPU
and GPU go through the LLC. This gives us coherency between the CPU and
GPU irrespective of the attribute settings either device sets. We can
use to avoid having to clflush even uncached memory.

Except for the scanout.

The scanout resides within another functional block that does not use
the LLC but reads directly from main memory. So in order to maintain
coherency with the scanout, writes to uncached memory must be flushed.
In order to optimize writes elsewhere, we start tracking whether an
framebuffer is attached to an object.

v2: Use pin_display tracking rather than fb_count (to ensure we flush
cursors as well etc) and only force the clflush along explicit writes to
the scanout paths (i.e. pin_to_display_plane and pwrite into scanout).

v3: Force the flush after hitting the slowpath in pwrite, as after
dropping the lock the object's cache domain may be invalidated. (Ville)

Based on a patch by Ville Syrjälä.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-10 11:20:49 +02:00
Chris Wilson
350ec881d9 drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge
MLC_LLC was never validated for Sandybridge and was superseded by a new
level of cacheing for the GPU in Ivybridge. Update our names to be
consistent with usage, and in the process stop setting the unwanted bit
on Sandybridge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/BUG/WARN_ON(1) bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-06 16:35:30 +02:00
Ben Widawsky
40d74980d3 drm/i915: Use ggtt_vm to save some typing
Just some small cleanups, and a rename of vm->ggtt_vm requested by
Daniel.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:10 +02:00
Ben Widawsky
a70a3148b0 drm/i915: Make proper functions for VMs
Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debug and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:08 +02:00
Ben Widawsky
87a6b688cc drm/i915/hsw: Change default LLC age to 3
The default LLC age was changed:
commit 0d8ff15e9a
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Jul 4 11:02:03 2013 -0700

drm/i915/hsw: Set correct Haswell PTE encodings.

On the surface it would seem setting a default age wouldn't matter
because all GEM BOs are aged similarly, so the order in which objects
are evicted would not be subject to aging. The current working theory as
to why this caused a regression though is that LLC is a bit special in
that it is shared with the CPU. Presumably (not verified) the CPU
fetches cachelines with age 3, and therefore recently cached GPU objects
would be evicted before similar CPU object first when the LLC is full.
It stands to reason therefore that this would negatively impact CPU
bound benchmarks - but those seem to be low on the priority list.

eLLC OTOH does not have this same property as LLC. It should be used
entirely for the GPU, and so the age really shouldn't matter.
Furthermore, we have no evidence to suggest one is better than another
on eLLC. Since we've never properly supported eLLC before no, there
should be no regression. If the GPU client really wants "younger"
objects, they should use MOCS.

v2: Drop the extra #define (Chad)

v3: Actually git add

v4: Pimped commit message

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:06 +02:00
Chris Wilson
08c45263a6 drm/i915: Use the same pte_encoding for ppgtt as for gtt
The PTE layouts are the same for both ppgtt and gtt, so we can simplify
the setup for ppgtt by copying the encoding function pointer from gtt.
This prevents bugs where we update one function pointer, but forget the
other.

For instance,

commit 4d15c145a6
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jul 4 11:02:06 2013 -0700

    drm/i915: Use eLLC/LLC by default when available

only extends the gtt to use eLLC/LLC cacheing and forgets to also update
the ppgtt function pointer.

v2: Actually mention the bug being fixed (Kenneth)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-04 21:29:57 +02:00
Ben Widawsky
2f63315692 drm/i915: Create VMAs
Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
an area of part of the VM address space. A VMA is similar to the concept
in the linux mm. However, instead of representing regular memory, a VMA
is backed by a GEM BO. There may be many VMAs for a given object, one
for each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

v5: Free vma on error path (Imre)

v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
(Imre)
Fixed vma freeing in stolen preallocation (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Squash in fixup from Ben to not deref a non-existing vma in
set_cache_level, reported by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-18 08:46:13 +02:00
Ben Widawsky
93bd8649db drm/i915: Put the mm in the parent address space
Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction; but we'd hope one day this becomes
almost irrelvant.

v2: Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:23:43 +02:00
Ben Widawsky
853ba5d223 drm/i915: Move gtt and ppgtt under address space umbrella
The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
many of their characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up, it just fits in with the grand scheme of abstracting the GPU
VM operations. GGTT will usually be a special case where we either know
an object must be in the GGTT (dislay engine, workarounds, etc.).

The scratch page is left as part of the VM (even though it's currently
shared with the ppgtt code) because in the future when we have Full
PPGTT, I intend to create a separate scratch page for each.

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

v3: Properly share scratch page (Imre)
Finish commit message (Daniel, Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:21:47 +02:00
Ben Widawsky
4d15c145a6 drm/i915: Use eLLC/LLC by default when available
DRI clients really should be using MOCS to get fine grained streaming
cache controls. With that note, I *hope* that this patch doesn't improve
performance overwhelmingly, because if it does - it means there is a
problem elsewhere.

In any case, the kernel, and old userspace should get some benefit from
this, so let's do it. eLLC is always a good default, and really not
using it is the special case for MOCS.

References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 08:08:39 +02:00
Ben Widawsky
0d8ff15e9a drm/i915/hsw: Set correct Haswell PTE encodings.
The cacheability controls have changed, and the bits have been
rearranged in general.

Note that age 0 is the oldest (most likely to get evicted) and age 3
is the youngest (most likely to stick around for a bit). We've picked
0 for no reason, but atm it shouldn't matter anyway (since we don't
yet try to differentiate between different objects).

v2: Remove comments for snb/ivb cache leves, that's a separate change.

v3: Resolve conflicts due to patch series reordering.

v4: Rebased on top of Kenneth Graunke's ->pte_encode refactoring.

v5: Removed eLLC bits for separate patch.

In the internal repository this was:
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add comment about cache ages as requested by Ben provoked due
to a question from Damien.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 07:57:42 +02:00
Ben Widawsky
c6cfb32567 drm/i915: Embed drm_mm_node in i915 gem obj
Embedding the node in the obj is more natural in the transition to VMAs
which will also have embedded nodes. This change also helps transition
away from put_block to remove node.

Though it's quite an uncommon occurrence, it's somewhat convenient to not
fail at bind time because we cannot allocate the node. Though in
practice there are other allocations (like the request structure) which
would probably make this point not terribly useful.

Quoting Daniel:
Note that the only difference between put_block and remove_node is
that the former fills up the preallocation cache. Which we don't need
anyway and hence is just wasted space.

v2: Clean up the stolen preallocation code.
Rebased on the reserve_node patches
renames ggtt_ stuff to gtt_ stuff
WARN_ON if the object is already bound (which doesn't mean it's in the
bound list, tricky)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:36 +02:00
Ben Widawsky
edd41a870f drm/i915: Kill obj->gtt_offset
With the getters in place from the previous patch this members serves no
purpose other than saving one spare pointer chase, which will be killed
in the next patch anyway.

Moving to VMAs, this members adds unnecessary confusion since an object
may exist at different offsets in different VMs.

v2: Properly preserve the stolen offset. This code is a bit hacky but it
all goes away when we embed the drm_mm_node and removes the need for the
incorrect patch I submitted previously: "Use gtt_space->start for stolen
reservation"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:35 +02:00
Ben Widawsky
f343c5f647 drm/i915: Getter/setter for object attributes
Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:34 +02:00
Ben Widawsky
338710e7af drm: Change create block to reserve node
With the previous patch we no longer actually create a node, we simply
find the correct hole and occupy it. This very well could have been
squashed with the last patch, but since I already had David's review, I
figured it's easiest to keep it distinct.

Also update the users in i915. Conveniently this is the only user of the
interface.

CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:33 +02:00
Ben Widawsky
b3a070cccb drm: pre allocate node for create_block
For an upcoming patch where we introduce the i915 VMA, it's ideal to
have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
will break a bunch of code, but amongst them are 2 callers of
drm_mm_create_block(), both related to stolen memory.

It also allows us to embed the drm_mm_node into the object currently
which provides a nice transition over to the new code.

v2: Reordered to do before ripping out obj->gtt_offset.
Some minor cleanups made available because of reordering.

v3: s/continue/break on failed stolen node allocation (David)
Set obj->gtt_space on failed node allocation (David)
Only unref stolen (fix double free) on failed create_stolen (David)
Free node, and NULL it in failed create_stolen (David)
Add back accidentally removed newline (David)

CC: <dri-devel@lists.freedesktop.org>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:32 +02:00
Ben Widawsky
b2f21b4dfd drm/i915: Use gtt shortform where possible
Just for compactness.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:59 +02:00
Ben Widawsky
80a74f7f9c drm/i915: Drop dev from pte_encode
The original pte_encode function needed the dev argument so we could do
platform specific handling via IS_GENX, etc. With the merging of a pte
encoding function there should never been a need to quirk away gen
specific details.

The patch doesn't do much but makes the upcoming reworks in gtt/ppgtt/mm
slightly (albeit, ever so) easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:59 +02:00
Ben Widawsky
6716724006 drm/i915: Combine scratch members into a struct
There isn't any special reason to do this other than it makes it obvious
that the two members are connected.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:58 +02:00
Ben Widawsky
84f1356058 drm/i915: Really share scratch page
A previous patch had set up the ppgtt and ggtt to use the same scratch
page, but still kept around both pointers. Kill it, it's not needed and
gets in our way for upcoming cleanups.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:57 +02:00
Ben Widawsky
6670a5a5c7 drm/i915: make PDE|PTE platform specific
Nothing outside of i915_gem_gtt.c and more specifically, the relevant
gen specific init function should need to know about number of PDEs, or
PTEs per PD. Exposing this will only lead to circumventing using the
upcoming VM abstraction.

To accomplish this, move the defines into the .c file, rename the PDE
define to be GEN6, and make the PTE count less of a magic number.

The remaining code in the global gtt setup is a bit messy, but an
upcoming patch will clean that one up.

v2: Don't hardcode number of PDEs (Daniel + Jesse)
Reworded commit message to reflect change.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:57 +02:00
Ben Widawsky
35c20a60c7 drm/i915: Rename the gtt_list to global_list
Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 10:51:14 +02:00
Daniel Vetter
e1b73cba13 Linux 3.10-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRmpexAAoJEHm+PkMAQRiGrRIH/1uWFW38RvaCV/PXm/ia6Z+x
 BfBJfBIvPxGwb4n7aQNQlhU25xkfrPZ6szO4WiBH5/KPH3xYi2I2OZ1AzffkYqMF
 BWkPmsPK6EsTdp16zsi6JtH2aXArG4SpYA7ZamPvDkmfigHuiZg7GlL/9eHTRPNV
 P7Q8JToOrcnP8RoGgNj0uFiQeQbc62Kmoq7WuPtUhVlpQCCCknXgOJiYgz9w6Xe9
 /i79YFS8WRrzAquExT1NbIOh4ZMqB9MvuroaVWy8JDDLUyz7QUvOCe3tCDNguwgi
 FdWvU6nfkdQq5SLaWCWXDE9Rp/pL1MvfBn9vCOwFcp42aw0aQ0PgJVIXvsqufd0=
 =jgDI
 -----END PGP SIGNATURE-----

Merge tag 'v3.10-rc2' into drm-intel-next-queued

Backmerge Linux 3.10-rc2 since the various (rather trivial) conflicts
grew a bit out of hand. intel_dp.c has the only real functional
conflict since the logic changed while dev_priv->edp.bpp was moved
around.

Also squash in a whitespace fixup from Ben Widawsky for
i915_gem_gtt.c, git seems to do something pretty strange in there
(which I don't fully understand tbh).

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_dp.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-21 09:52:16 +02:00
Ben Widawsky
c4ae25ecdf Revert "drm/i915: Calculate correct stolen size for GEN7+"
This reverts commit 03752f5b7b.

This revert requires a bit of explanation on how I understand things
work. Internally the architects/designers decide how the stolen encoding
works. We put it in a doc. BIOS writers take these docs and implement
it. Driver writers read the doc too, and read the value left by the BIOS
writers, and then we make magic.

The failing here is that in the docs we had[1] contained two different
definitions for this register for Gen7. (We have both a PCI register,
and an MMIO, and each of these were different). At the time [2] of
03752f5, we asked the architects what the correct value should be; but
that doesn't match the reality (BIOS) unfortunately.

So on all machines I can get my hands on, this revert is the right thing
to do. I've also worked with the product group to confirm that they
agree this revert is what we should do. People using HW made my "people"
who both write their own BIOS, and have access to our docs (Apple?).
Investigations are still ongoing about whether we need to add a list
of machines needing special handling, but this patch should be the
right thing for pretty much everyone.

[1] The docs are still wrong on this one. Now instead of two registers with
two definitions, we have one register with BOTH definitions, progress?
[2] The open source PRMs have the "wrong" definitions in chapter Volume
1 part6, section 1.1.12.

This digging was inspired by Paulo.

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Augment the patch saying that it's still a bit unclear
whether there are any machines out there with "wrong" firmware and
whether we need to add a list to handle them specially.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-07 18:59:09 +02:00
Ben Widawsky
3e30254205 drm/i915: Extract PDE writes
It also makes some sense IMO to have these two functions separate
irrespective of the number of callers.

Only the single caller for now, but that will change as we add more
PPGTTs.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:49:27 +02:00
Ben Widawsky
0a73287060 drm/i915: BUG_ON bad PPGTT offset
Because PPGTT PDEs within the GTT are calculated in cachelines
(HW guys consistency ftw) we do a divide which will wreak havoc if this
is wrong, and I know that from experience).

If/when we move to multiple PPGTTs this will have to become a WARN, and
return an error. For now however it should always be considered fatal,
and only a developer could hit it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: s/BUG/WARN]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:40:47 +02:00
Zhang, Xiong Y
43b27290dd drm/i915: correct the calculation of first_pd_entry_in_global_pt
When ppgtt is enabled, dev_priv->gtt.total has excluded the gtt space
occupied by ppgtt table in i915_gem_init_global_gtt() function. So the
calculation of first_pd_entry_in_global_pt doesn't need to subtract
I915_PPGTT_PD_ENTRIES again. Or else PPGTT directory table will be
destroyed by global gtt allocation.

This regression has been introduced in

commit a54c0c279f
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jan 24 14:45:00 2013 -0800

    drm/i915: remove intel_gtt structure

The breakage is pretty subtile since the old gtt_total_entries
included the pde range, whereas the new on did not.

Cc: stable@vger.kernel.org
Signed-off-by: Xiong Zhang<xiong.y.zhang@intel.com>
[danvet: Add regression citation and cc: stable. Thanks to Chris for
correcting my wrong guess about which commit broke things.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-27 14:07:16 +02:00
Kenneth Graunke
9119708cd4 drm/i915: Split out Haswell code from gen6_pte_encode.
Now that we have function pointers, it's cleaner to just create a new
per-platform PTE encoding function.

This should be identical in behavior to the previous code.

v2: Drop accidental inline keyword on hsw_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:44:21 +02:00
Kenneth Graunke
93c34e70eb drm/i915: Fix page table entries for Bay Trail.
On Bay Trail, bit 1 means "writeable by the GPU."  Failing to set that
means basically anything using the GPU will cause hangs.

v2: Drop accidental inline keyword on byt_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:44:11 +02:00
Kenneth Graunke
2d04befb94 drm/i915: Add PTE encoding function to the gtt/ppgtt vtables.
Sandybridge/Ivybridge, Bay Trail, and Haswell all have slightly
different page table entry formats.  Rather than polluting one function
with generation checks, simply use a function pointer and set up the
correct PTE encoding function at startup.

v2: Move the gen6_gtt_pte_t typedef to i915_drv.h so that the function
    pointers and implementations have identical signatures.  Also remove
    inline keyword on gen6_pte_encode.  Both suggested by Jani Nikula.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:20:15 +02:00
Ben Widawsky
6a99476180 drm/i915: Remove stale code
Looks like a some remnant from a rebase.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:20 +02:00
Ville Syrjälä
a6f429a5a2 drm/i915: Configure GAM_ECOCHK appropriatly for Gen7
IVB and HSW use different encodings for the PPGTT cacheability bits in
the GAM_ECOCHK register.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:19 +02:00
Ville Syrjälä
a65c2fcd00 drm/i915: Set GAC_ECO_BITS register on Gen7+
According to BSpec GAC_ECO_BITS register exists on Gen7 platforms as
well. Configure it accordingly.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:18 +02:00
Ville Syrjälä
3b9d7888df drm/i915: Add ECOBITS_SNB_BIT
GAC_ECO_BITS has a bit similar to GAM_ECOCHK's ECOCHK_SNB_BIT. Add
the define, and enable it on SNB.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:18 +02:00
Ben Widawsky
b7c36d2546 drm/i915: Allow PPGTT enable to fail
I'm really not happy that we have to support this, but this will be the
simplest way to handle cases where PPGTT init can fail, which I promise
will be coming in the future.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:16 +02:00
Ben Widawsky
5963cf049a drm/i915: NULL aliasing_ppgtt on cleanup
This will allow us to carry on if we've cleaned up the PPGTT. The usage
for this is coming up - it simplifies handling a failed PPGTT init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Spill the secrets about failing ppgtt init.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:15 +02:00
Ben Widawsky
6197349bde drm/i915: Abstract PPGTT enabling
Since we've already set up a nice vtable to abstract other PPGTT
functions, also abstract the actual register programming to enable
things.

This function will probably need to change a bit as we implement real
processes.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:15 +02:00
Ben Widawsky
3ed124b21e drm/i915: Rework PPGTT init code
This rework will help if future platforms choose to be a bit different.
Should have no functional impact.

v2: Don't move around the vtable setup (Daniel)

v3: Squash in the disable-by-default patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:14 +02:00
Ben Widawsky
3eb1c005c6 drm/i915: Conditionally carve out GGTT PDE
It only works that way on GEN6 and GEN7. Let's not assume GENn will be
the same.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:14 +02:00
Ben Widawsky
1e7d12d467 drm/i915/ppgtt: Set scratch page "globally"
The PPGTT scratch page is used for all gens, and doing it in the global
part of our PPGTT setup makes the code a bit nicer.

This was in a patch submitted earlier as part of the PPGTT cleanups.
Grumpy maintainer must have missed it, and I didn't yell when
appropriate. Apologies for everyone :-)

v2: Update commit message

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:13 +02:00
Ben Widawsky
c81dbe0563 drm/i915: random checkpatch fixes
There used to be other fixes in this patch but they've slowly disappeared as
other parts have been fixed.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:13 +02:00
Ben Widawsky
e7c2b58b70 drm/i915: Call out GEN6 PTE specificity
We can assume that the PTE layout, and size changes for future
generations. To avoid confusion with the existing GEN6 PTE typedef, give
it a GEN6_ prefix.

v2: Fixup checkpatch warning and bikeshed commit message slightly.

v3: Rebase on top of Imre's for_each_sg_pages rework.

v4: Fixup conflicts in patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:12 +02:00
Ben Widawsky
a93e41618e drm/i915: generalize pte vs. register BAR allocation
All gen6+ parts so far have 1 BAR which holds both the register space
and the GTT PTEs. Up until now, that was a 4MB BAR with half allocated
to each.

I have a strong hunch (wink, nod, wink) that future gens will also keep
a similar 50-50 split though the sizes may change. To help this along
change the code to obey the rule of half the total size instead of a
hard-coded 2MB.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:11 +02:00
Imre Deak
2db76d7c3c lib/scatterlist: sg_page_iter: support sg lists w/o backing pages
The i915 driver uses sg lists for memory without backing 'struct page'
pages, similarly to other IO memory regions, setting only the DMA
address for these. It does this, so that it can program the HW MMU
tables in a uniform way both for sg lists with and without backing pages.

Without a valid page pointer we can't call nth_page to get the current
page in __sg_page_iter_next, so add a helper that relevant users can
call separately. Also add a helper to get the DMA address of the current
page (idea from Daniel).

Convert all places in i915, to use the new API.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-27 17:13:44 +01:00
Daniel Vetter
a15326a57c drm/i915: fixup pd vs pt confusion in gen6 ppgtt code
The index variable points at a page table, not a page directory or a
pde. Ben Widawsky fix this up correctly in his ppgtt cleanup, but I've
botched the job and copy&pasted the old confusion from the original
gen6 ppgtt code in

commit def886c376
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jan 24 14:44:56 2013 -0800

    drm/i915: vfuncs for ppgtt

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23 12:18:03 +01:00
Daniel Vetter
6ddc4fc70a style nit: Align function parameter continuation properly. 2013-03-23 12:18:02 +01:00
Imre Deak
6e995e231a drm/i915: use for_each_sg_page for setting up the gtt ptes
The existing gtt setup code is correct - and so doesn't need to be fixed to
handle compact dma scatter lists similarly to the previous patches. Still,
take the for_each_sg_page macro into use, to get somewhat simpler code.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23 12:17:31 +01:00
Jesse Barnes
086ddccec4 drm/i915: use gen6 stolen check on VLV
It uses the same bit definitions.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-06 20:07:03 +01:00
Ben Widawsky
41907ddc1b drm/i915: Fix gen2 mappable calculations
When I refactored the code initially, I forgot that gen2 uses a
different bar for the CPU mappable aperture. The agp-less code knows
nothing of generations less than 5, so we have to expand the gtt_probe
function to include the mappable base and end.

It was originally broken by me:
commit baa09f5fd8
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jan 24 13:49:57 2013 -0800

    drm/i915: Add probe and remove to the gtt ops

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-02-15 10:30:38 +01:00
Changlong Xie
d93c623354 drm/i915: gen6_gmch_remove can be static
Signed-off-by: Changlong Xie <changlongx.xie@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:12 +01:00
Ben Widawsky
e78891ca76 drm/i915: Reclaim GTT space for failed PPGTT
When the PPGTT init fails, we may as well reuse the space that we were
reserving for the PPGTT PDEs.

This also fixes an extraneous mutex_unlock.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:08 +01:00
Ben Widawsky
a54c0c279f drm/i915: remove intel_gtt structure
With the probe call in our dispatch table, we can now cut away the
last three remaining members in the intel_gtt shared struct and so
remove it completely.

v2: Rebased on top of Daniel's series

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: bikeshed commit message a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:07 +01:00
Ben Widawsky
baa09f5fd8 drm/i915: Add probe and remove to the gtt ops
The idea, and much of the code came originally from:

commit 0712f0249c3148d8cf42a3703403c278590d4de5
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Jan 18 17:23:16 2013 -0800

    drm/i915: Create a vtable for i915 gtt

Daniel didn't like the color of that patch series, and so I asked him to
start something which appealed to his sense of color. The preceding
patches are those, and now this is going on top of that.

[extracted from the original commit message]

One immediately obvious thing to implement is our gmch probing. The init
function was getting massively bloated. Fundamentally, all that's needed
from GMCH probing is the GTT size, and the stolen size. It makes design
sense to put the mappable calculation in there as well, but the code
turns out a bit nicer without it (IMO)

The intel_gtt bridge thing is still here, but the subsequent patches
will finish ripping that out.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Bikeshedded one comment (GMADR is just the PCI aperture, we
use it for other things than just accessing tiled surfaces through a
linear view) and cut the newly added long lines a bit. Also one
checkpatch error.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:07 +01:00
Daniel Vetter
3440d26585 drm/i915: extract hw ppgtt setup/cleanup code
At the moment only cosmetics, but being able to initialize/cleanup
arbitrary ppgtt address spaces paves the way to have more than one of
them ... Just in case we ever get around to implementing real
per-process address spaces. Note that in that case another vfunc for
ppgtt would be beneficial though. But that can wait until the code
grows a second place which initializes ppgtts.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:06 +01:00
Daniel Vetter
960e3e429f drm/i915: pte_encode is gen6+
All the other gen6+ hw code has the gen6_ prefix, so be consistent
about it.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:06 +01:00
Daniel Vetter
def886c376 drm/i915: vfuncs for ppgtt
Like for the global gtt we want a notch more flexibility here. Only
big change (besides a few tiny function parameter adjustments) was to
move gen6_ppgtt_insert_entries up (and remove _sg_ from its name, we
only have one kind of insert_entries since the last gtt cleanup).

We could also extract the platform ppgtt setup/teardown code a bit
better, but I don't care that much.

With this we have the hw details of pte writing nicely hidden away
behind a bit of abstraction. Which should pave the way for
different/multiple ppgtts (e.g. what we need for real ppgtt support).

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:05 +01:00
Daniel Vetter
7faf1ab2ff drm/i915: vfuncs for gtt_clear_range/insert_entries
We have a few too many differences here, so finally take the prepared
abstraction and run with it. A few smaller changes are required to get
things into shape:

- move i915_cache_level up since we need it in the gt funcs
- split up i915_ggtt_clear_range and move the two functions down to
  where the relevant insert_entries functions are
- adjustments to a few function parameter lists

Now we have 2 functions which deal with the gen6+ global gtt
(gen6_ggtt_ prefix) and 2 functions which deal with the legacy gtt
code in the intel-gtt.c fake agp driver (i915_ggtt_ prefix).

Init is still a bit a mess, but honestly I don't care about that.

One thing I've thought about while deciding on the exact interfaces is
a flag parameter for ->clear_range: We could use that to decide
between writing invalid pte entries or scratch pte entries. In case we
ever get around to fixing all our bugs which currently prevent us from
filling the gtt with empty ptes for the truly unused ranges ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwidawsk: Moved functions to the gtt struct]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:05 +01:00
Ben Widawsky
8d2e630899 drm/i915: Needs_dmar, not
The reasoning behind our code taking two paths depending upon whether or
not we may have been configured for IOMMU isn't clear to me. It should
always be safe to use the pci mapping functions as they are designed to
abstract the decision we were handling in i915.

Aside from simpler code, removing another member for the intel_gtt
struct is a nice motivation.

I ran this by Chris, and he wasn't concerned about the extra kzalloc,
and memory references vs. page_to_phys calculation in the case without
IOMMU.

v2: Update commit message

v3: Remove needs_dmar addition from Zhenyu upstream

This reverts (and then other stuff)
commit 20652097da
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date:   Thu Dec 13 23:47:47 2012 +0800

    drm/i915: Fix missed needs_dmar setting

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in follow-up fix to remove the bogus hunk which
deleted the dma_mask configuration for gen6+.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:11:12 +01:00
Ben Widawsky
9c61a32d31 drm/i915: Remove scratch page from shared
We already had a mapping in both (minus the phys_addr in AGP).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:11:11 +01:00
Ben Widawsky
a81cc00c11 drm/i915: Cut out the infamous ILK w/a from AGP layer
And, move it to where the rest of the logic is.

There is some slight functionality changes. There was extra paranoid
checks in AGP code making sure we never do idle maps on gen2 parts. That
was not duplicated as the simple PCI id check should do the right thing.

v2: use IS_GEN5 && IS_MOBILE check instead. For now, this is the same as
IS_IRONLAKE_M but is more future proof. The workaround docs hint that
more than one platform may be effected, but we've never seen such a
platform in the wild. (Rodrigo, Daniel)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v1)
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:11:11 +01:00
Ben Widawsky
93d187993b drm/i915: Remove use of gtt_mappable_entries
Mappable_end, ie. size is almost always what you want as opposed to the
number of entries. Since we already have that information, we can scrap
the number of entries and only calculate it when needed.

If gtt_start is !0, this will have slightly different behavior. This
difference can only occur in DRI1, and exists when we try to kick out
the firmware fb. The new code seems like a bugfix to me.

The other case where we've changed the behavior is during init we check
the mappable region against our current known upper and lower limits
(64MB, and 512MB). This now matches the comment, and makes things more
convenient after removing gtt_mappable_entries.

Also worth noting is the setting of mappable_end is taken out of setup
because we do it earlier now in the DRI2 case and therefore need to add
that tiny hunk to support the DRI1 IOCTL.

v2: Move up mappable end to before legacy AGP init

v3: Add the dev_priv inclusion here from previous rebase error in patch
5

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: squash in fix for a printk format flag mismatch warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:09:20 +01:00
Ben Widawsky
dabb7a91ae drm/i915: Remove use on gma_bus_addr on gen6+
We have enough info to not use the intel_gtt bridge stuff.

v2: Move setup of mappable_base above the legacy init stuff because we
still need that on older platforms. (Daniel)

v3: Remove the dev_priv hunk which was rebased in by accident

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:47:03 +01:00
Ben Widawsky
5d4545aef5 drm/i915: Create a gtt structure
The purpose of the gtt structure is to help isolate our gtt specific
properties from the rest of the code (in doing so it help us finish the
isolation from the AGP connection).

The following members are pulled out (and renamed):
gtt_start
gtt_total
gtt_mappable_end
gtt_mappable
gtt_base_addr
gsm

The gtt structure will serve as a nice place to put gen specific gtt
routines in upcoming patches. As far as what else I feel belongs in this
structure: it is meant to encapsulate the GTT's physical properties.
This is why I've not added fields which track various drm_mm properties,
or things like gtt_mtrr (which is itself a pretty transient field).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit messages]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:56 +01:00
Ben Widawsky
00fc2c3c53 drm/i915: Remove gtt_mappable_total
With the assertion from the previous patch in place, it should be safe
to get rid gtt_mappable_total. Keeps things saner to not have to track
the same info in two places.

In order to keep the diff as simple as possible and keep with the
existing gtt_setup semantics we opt to keep gtt_mappable_end. It's not
as consistent with the 'total' used in the previous patch, but that can
be fixed later.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit message]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:41 +01:00
Ben Widawsky
35451cb6fb drm/i915: Mappable_end can't ever be > end
Both DRI1 and DRI2 can never specify a mappable size which goes past the
GTT size.  Don't pretend otherwise.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:02 +01:00
Ben Widawsky
c1fc6521ef drm/i915: Kill gtt_end
It's duplicated in the more useful gtt_total.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:27:31 +01:00
Dave Airlie
b5cc6c0387 Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
- seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris
  Wilson
- some leftover kill-agp on gen6+ patches from Ben
- hotplug improvements from Damien
- clear fb when allocated from stolen, avoids dirt on the fbcon (Chris)
- Stolen mem support from Chris Wilson, one of the many steps to get to
  real fastboot support.
- Some DDI code cleanups from Paulo.
- Some refactorings around lvds and dp code.
- some random little bits&pieces

* tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits)
  drm/i915: Return the real error code from intel_set_mode()
  drm/i915: Make GSM void
  drm/i915: Move GSM mapping into dev_priv
  drm/i915: Move even more gtt code to i915_gem_gtt
  drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno
  drm/i915: Introduce i915_gem_set_seqno()
  drm/i915: Always clear semaphore mboxes on seqno wrap
  drm/i915: Initialize hardware semaphore state on ring init
  drm/i915: Introduce ring set_seqno
  drm/i915: Missed conversion to gtt_pte_t
  drm/i915: Bug on unsupported swizzled platforms
  drm/i915: BUG() if fences are used on unsupported platform
  drm/i915: fixup overlay stolen memory leak
  drm/i915: clean up PIPECONF bpc #defines
  drm/i915: add intel_dp_set_signal_levels
  drm/i915: remove leftover display.update_wm assignment
  drm/i915: check for the PCH when setting pch_transcoder
  drm/i915: Clear the stolen fb before enabling
  drm/i915: Access to snooped system memory through the GTT is incoherent
  drm/i915: Remove stale comment about intel_dp_detect()
  ...

Conflicts:
	drivers/gpu/drm/i915/intel_display.c
2013-01-17 20:34:08 +10:00
Ben Widawsky
1c45140d3d drm/i915: Make GSM void
The iomapping of the register region has historically been a uint32_t
for the obvious reason that our PTE size was always 4b. In the future
however, we cannot make this assumption.

By making the type void, it makes the upcoming pointer math we will do
much easier, and hopefully gives the compiler opportunities to warn us
when we do stupid things.

v2: Cast to __iomem, caught by Ville

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Fixup __iomem issue for real.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-20 16:32:04 +01:00
Ben Widawsky
06e5598fce drm/i915: Move GSM mapping into dev_priv
This removes an unused field from the AGP structure and moves it into
the dev_priv structure (with a slightly better name). This builds upon
the kill-agp series already merged.

GSM is a well defined term in the bspec:
GSM: Graphics Stolen Memory

GTT stolen space is defined for storage of the GFX GTT entries in
physical memory. IA can not access GSM directly , it can only access via
GTTMMADR. GT can access GSM directly or through GTTMMADR.

This is not the entire stolen space.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-20 16:28:42 +01:00
Ben Widawsky
d7e5008f7c drm/i915: Move even more gtt code to i915_gem_gtt
This really should have been part of the kill agp series.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-20 16:27:35 +01:00
Ben Widawsky
079a43f67f drm/i915: Missed conversion to gtt_pte_t
commit f61c060907
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Mon Oct 22 11:44:43 2012 -0700

    drm/i915: introduce gtt_pte_t

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-18 22:32:03 +01:00
Zhenyu Wang
20652097da drm/i915: Fix missed needs_dmar setting
From Ben's AGP dependence removal change, "needs_dmar" flag has not
been properly setup for new chips using new GTT init function. This
one adds missed setting of that flag to make sure we do pci mappings
with IOMMU enabled.

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-13 21:40:24 +01:00
Chris Wilson
ed2f345267 drm/i915: Avoid clearing preallocated regions from the GTT
As yet we do not do any preallocation (chicken-and-egg problem), but we
may like to preserve anything already allocated by the BIOS or grub and
reuse for own purposes after initialising the driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-30 23:24:49 +01:00
Ben Widawsky
2ff4aeac39 drm/i915: Fix pte updates in ggtt clear range
This bug was introduced by me:
commit e76e9aebcd
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Sun Nov 4 09:21:27 2012 -0800

    drm/i915: Stop using AGP layer for GEN6+

The existing code uses memset_io which follows memset semantics in only
guaranteeing a write of individual bytes. Since a PTE entry is 4 bytes,
this can only be correct if the scratch page address is 0.

This caused unsightly errors when we clear the range at load time,
though I'm not really sure what the heck is referencing that memory
anyway. I caught this is because I believe we have some other bug where
the display is doing reads of memory we feel should be cleared (or we
are relying on scratch pages to be a specific value).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-29 11:14:44 +01:00
Ben Widawsky
b5c621584b drm/i915: Use pci_resource functions for BARs.
This was leftover crap from kill-agp. The current code is theoretically
broken for 64b bars. (I resist removing theoretically because I am too
lazy to test).

We still need to ioremap things ourselves because we want to ioremap_wc
the PTEs.

v2: Forgot to kill the tmp variable in v1

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-21 17:47:14 +01:00
Chris Wilson
d640c4b09a drm/i915: Report amount of usable graphics memory in MiB
...rather than kilo-PTE.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Apply s/Usabel/usable/ bikeshed suggested by Ben Widawsky.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-13 16:10:56 +01:00
Ben Widawsky
ccdf56cdb2 drm/i915: Fix sparse warnings in from AGP kill code
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:45 +01:00
Ben Widawsky
26b1ff35c8 drm/i915: Move the remaining gtt code
It's pretty much all consolidated now that we've killed AGP. We can move
the one outlier, and defines too.

(Kill some unused defines in the process)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:44 +01:00
Ben Widawsky
0f9b91c754 drm/i915: flush system agent TLBs on SNB
This allows us to map the PTEs WC. I've not done thorough testing or
performance measurements with this patch, but it should be decent.

This is based on a patch from Jesse with the original commit message
> I've only lightly tested this so far, but the corruption seems to be
> gone if I write the GFX_FLSH_CNTL reg after binding an object.  This
> register should control the TLB for the system agent, which is what CPU
> mapped objects will go through.

It has been updated for the new AGP-less code by me, and included with
it is feedback from the original patch.

v2: Updated to reflect paranoia on pte updates/register posting reads.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v1]: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:44 +01:00
Ben Widawsky
03752f5b7b drm/i915: Calculate correct stolen size for GEN7+
This bug existed in the old code, but was easier to fix here in the
rework. Unfortunately gen7 doesn't have a nice way to figure out the
size and we must use a lookup table.

As Jesse pointed out, there is some confusion in the docs about these
definitions. We're picking the one which seems more accurate, but we
really aren't certain.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:43 +01:00
Ben Widawsky
e76e9aebcd drm/i915: Stop using AGP layer for GEN6+
As a quick hack we make the old intel_gtt structure mutable so we can
fool a bunch of the existing code which depends on elements in that data
structure. We can/should try to remove this in a subsequent patch.

This should preserve the old gtt init behavior which upon writing these
patches seems incorrect. The next patch will fix these things.

The one exception is VLV which doesn't have the preserved flush control
write behavior. Since we want to do that for all GEN6+ stuff, we'll
handle that in a later patch. Mainstream VLV support doesn't actually
exist yet anyway.

v2: Update the comment to remove the "voodoo"
Check that the last pte written matches what we readback

v3: actually kill cache_level_to_agp_type since most of the flags will
disappear in an upcoming patch

v4: v3 was actually not what we wanted (Daniel)
Make the ggtt bind assertions better and stricter (Chris)
Fix some uncaught errors at gtt init (Chris)
Some other random stuff that Chris wanted

v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a
tad more robust by mapping everything != CACHE_NONE to the cached agp
flag - we have a 1:1 uncached mapping, but different modes of
cacheable (at least on later generations). Suggested by Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:42 +01:00
Ben Widawsky
e7210c3c4f drm/i915: move more pte encoding to pte encode
In order to handle differences in pte encoding between architectures it
is desirable to have one helper function, pte_encode, do it all for us.
As such, this commit moves the code around so we're in good shape to do
that.

Luckily the ppgtt pte and the ggtt pte look very similar.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:11 +01:00
Ben Widawsky
54d125274f drm/i915: Extract PPGTT pte encoding
HSW will change the PTE encoding, and laying this out now will be
helpful when we're ready to implement that. More importantly, GGTT and
PPGTT PTE encoding is quite similar, so moving this out into a helper
function will enable us to lance the AGP layer.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:11 +01:00
Ben Widawsky
f61c060907 drm/i915: introduce gtt_pte_t
This will make the calculations of size easier to read instead of just
assuming uint32_t everywhere.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:10 +01:00
Ben Widawsky
8f2c59f0aa drm/i915: Add dev to ppgtt
Some subsequent commits will need to know what generation we're running
on to do different pte encoding for the ppgtt. Since it's not much
hassle or overhead to store it in the ppgtt structure, do that.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:10 +01:00
Ben Widawsky
8693607ae4 drm/i915: No LLC_MLC for HSW.
The mid-level cache or as it's more commonly referred to now as L3, is
not setup this way on HSW.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:09 +01:00
Linus Torvalds
612a9aab56 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm merge (part 1) from Dave Airlie:
 "So first of all my tree and uapi stuff has a conflict mess, its my
  fault as the nouveau stuff didn't hit -next as were trying to rebase
  regressions out of it before we merged.

  Highlights:
   - SH mobile modesetting driver and associated helpers
   - some DRM core documentation
   - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
     combined pte writing, ilk rc6 support,
   - nouveau: major driver rework into a hw core driver, makes features
     like SLI a lot saner to implement,
   - psb: add eDP/DP support for Cedarview
   - radeon: 2 layer page tables, async VM pte updates, better PLL
     selection for > 2 screens, better ACPI interactions

  The rest is general grab bag of fixes.

  So why part 1? well I have the exynos pull req which came in a bit
  late but was waiting for me to do something they shouldn't have and it
  looks fairly safe, and David Howells has some more header cleanups
  he'd like me to pull, that seem like a good idea, but I'd like to get
  this merge out of the way so -next dosen't get blocked."

Tons of conflicts mostly due to silly include line changes, but mostly
mindless.  A few other small semantic conflicts too, noted from Dave's
pre-merged branch.

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
  drm/nv98/crypt: fix fuc build with latest envyas
  drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
  drm/nv41/vm: fix and enable use of "real" pciegart
  drm/nv44/vm: fix and enable use of "real" pciegart
  drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
  drm/nouveau: store supported dma mask in vmmgr
  drm/nvc0/ibus: initial implementation of subdev
  drm/nouveau/therm: add support for fan-control modes
  drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
  drm/nouveau/therm: calculate the pwm divisor on nv50+
  drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
  drm/nouveau/therm: move thermal-related functions to the therm subdev
  drm/nouveau/bios: parse the pwm divisor from the perf table
  drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
  drm/nouveau/therm: rework thermal table parsing
  drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
  drm/nouveau: fix pm initialization order
  drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
  drm/nouveau: log channel debug/error messages from client object rather than drm client
  drm/nouveau: have drm debugging macros build on top of core macros
  ...
2012-10-03 23:29:23 -07:00
David Howells
760285e7e7 UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/
Convert #include "..." to #include <path/...> in drivers/gpu/.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-02 18:01:07 +01:00
David Howells
4126d5d61f UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/.
Remove redundant DRM UAPI header #inclusions from drivers/gpu/.

Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and
drm_sarea.h).  They are now #included via drmP.h and drm_crtc.h via a preceding
patch.

Without this patch and the patch to make include the UAPI headers from the core
headers, after the UAPI split, the DRM C sources cannot find these UAPI headers
because the DRM code relies on specific -I flags to make #include "..."  work
on headers in include/drm/ - but that does not work after the UAPI split without
adding more -I flags.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-02 18:01:05 +01:00
Daniel Vetter
398b7a1b88 Linux 3.6-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s
 LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI
 u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT
 xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU
 i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF
 GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk=
 =0v2l
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc7' into drm-intel-next-queued

Manual backmerge of -rc7 to resolve a silent conflict leading to
compile failure in drivers/gpu/drm/i915/intel_hdmi.c.

This is due to the bugfix in -rc7:

commit b98b601672
Author: Wang Xingchao <xingchao.wang@intel.com>
Date:   Thu Sep 13 07:43:22 2012 +0800

    drm/i915: HDMI - Clear Audio Enable bit for Hot Plug

Since this code moved around a lot in -next git put that snippet at
the wrong spot. I've tried to fix this by making the conflict explicit
by merging a version for next with:

commit 3cce574f01
Author: Wang Xingchao <xingchao.wang@intel.com>
Date:   Thu Sep 13 11:19:00 2012 +0800

    drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally

But that failed to solve the entire problem. To avoid pushing out
further -nightly branch to our QA where this is broken, do the
backmerge and manually add the stuff git adds to -next from the patch
in -fixes.

Note that this doesn't show up in git's merge diff (and hence is also
not handled by git rerere), which adds to the reasons why I'd like to
fix this with a verbose backmerge. The git merge diff only shows a
bunch of trivial conflicts of the "code changed in lines next to each
another" kind.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-24 18:17:12 +02:00
Chris Wilson
2f745ad3d3 drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops
By providing a callback for when we need to bind the pages, and then
release them again later, we can shorten the amount of time we hold the
foreign pages mapped and pinned, and importantly the dmabuf objects then
behave as any other normal object with respect to the shrinker and
memory management.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-20 14:23:10 +02:00
Chris Wilson
9da3da660d drm/i915: Replace the array of pages with a scatterlist
Rather than have multiple data structures for describing our page layout
in conjunction with the array of pages, we can migrate all users over to
a scatterlist.

One major advantage, other than unifying the page tracking structures,
this offers is that we replace the vmalloc'ed array (which can be up to
a megabyte in size) with a chain of individual pages which helps reduce
memory pressure.

The disadvantage is that we then do not have a simple array to iterate,
or to access randomly. The common case for this is in the relocation
processing, which will typically fit within a single scatterlist page
and so be almost the same cost as the simple array. For iterating over
the array, the extra function call could be optimised away, but in
reality is an insignificant cost of either binding the pages, or
performing the pwrite/pread.

v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the
trivial compile error from rebasing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-20 14:22:57 +02:00
Dave Airlie
65983bd605 Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
"New stuff for -next. Highlights:
- prep patches for the modeset rework. Note that one of those patches
  touches the fb helper in the common drm code.
- hasw hdmi audio support (Wang Xingchao)
- improved instdone dumping for gen7 (Ben)
- unbound tracking and a few follow-up patches from Chris
- dma_buf->begin/end_cpu_access plus fix for drm/udl (Dave)
- improve mmio error reporting for hsw
- prep patch for WQ_NON_REENTRANT removal (Tejun Heo)
"

* 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (41 commits)
  drm/i915: Remove __GFP_NO_KSWAPD
  drm/i915: disable rc6 on ilk when vt-d is enabled
  drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer
  drm/i915: Use new INSTDONE registers (Gen7+)
  drm/i915: Add new INSTDONE registers
  drm/i915: Extract reading INSTDONE
  drm/i915: Use a non-blocking wait for set-to-domain ioctl
  drm/i915: Juggle code order to ease flow of the next patch
  drm/i915: Use cpu relocations if the object is in the GTT but not mappable
  drm/i915: Extract general object init routine
  drm/i915: Protect private gem objects from truncate (such as imported dmabuf)
  drm/i915: Only pwrite through the GTT if there is space in the aperture
  i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1
  drm/i915: Find unclaimed MMIO writes.
  drm/i915: Add ERR_INT to gen7 error state
  drm/i915: Cantiga+ cannot handle a hsync front porch of 0
  drm/i915: fix reassignment of variable "intel_dp->DP"
  drm/i915: Try harder to allocate an mmap_offset
  drm/i915: Show pin count in debugfs
  drm/i915: Show (count, size) of purgeable objects in i915_gem_objects
  ...
2012-09-03 12:05:01 +10:00
Dave Airlie
93bb70e0c0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
There was some merge conflicts in -next and they weren't so pretty, so
backmerge now to avoid them.

Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/intel_modes.c
2012-08-27 16:22:20 +10:00
Chris Wilson
9a0f938bde drm/i915: Use the correct size of the GTT for placing the per-process entries
The current layout is to place the per-process tables at the end of the
GTT. However, this is currently using a hardcoded maximum size for the GTT
and not taking in account limitations imposed by the BIOS. Use the value
for the total number of entries allocated in the table as provided by
the configuration registers.

Reported-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Matthew Garret <mjg@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-24 11:07:59 +02:00
Chris Wilson
6c085a728c drm/i915: Track unbound pages
When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.

As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.

Note: The patch also contains a few changes to the last-hope
evict_everything logic in i916_gem_execbuffer.c - we no longer try to
only evict the purgeable stuff in a first try (since that's superflous
and only helps in OOM corner-cases, not fragmented-gtt trashing
situations).

Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.

v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.

v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-21 14:34:11 +02:00
Daniel Vetter
a843af186c drm/i915: fix hsw uncached pte
They've changed it ... for no apparent reason. Meh.

V2: remove unused 'is_hsw' field.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-17 09:21:35 +02:00
Daniel Vetter
a22ddff8be Linux 3.6-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQLWtvAAoJEHm+PkMAQRiG/DYH+wd0FqfEuYkYk4KPyAPuhKpX
 zX7HYfLvyJE/ZYIdrhjq1E6Xm2KNr7gtX7/Rdzi2W38M9sjbYzwG1UGIw51qnxWy
 yZJH9BGkfyQgQPeuDGohfB6DkDy2JWr2eqMDvakjOwgBsIzji0PQD/f3UvndhtUa
 c+tTj/kjavHE1Yr2Wy6OnRZz3Uc0hIMn/Q0JqtbCs3LUgEV1KA4OEAe56XNz4Ku4
 WE+FFaGFPvtriQsQON+ohPS5IC8jzQGK/0vbrJ4lWjFnZy4gvZXnborTOwD0WSQG
 fbsNuxp1AaM2/pqfMwXm1w0ADvwOITHNiwwXf9id6DoK81QwTFpUdvKpn6yB6gQ=
 =rurr
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc2' into drm-intel-next

Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put
even more madness on top:

- drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in
  -fixes, that has been changed in a variable-rename in -next, too.

- drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr
  (since all their users have been extracted in another fucntion),
  -fixes added another user for a hw workaroudn.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-17 09:01:08 +02:00
Dave Airlie
f00f979145 i915: don't map imported dma-bufs for dmar.
The exporter should have given us pages in the correct place, avoid
the prepare object mapping phase on dmar systems.

This fixes an oops on a GM45/R600 machine, when running the intel/radeon
tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-05 22:53:59 +02:00
Chris Wilson
42d6ab4839 drm/i915: Segregate memory domains in the GTT using coloring
Several functions of the GPU have the restriction that differing memory
domains cannot be placed next to each other (as the GPU may prefetch
beyond the end of one domain and hang as it crosses into the other
domain). We use the facility of the drm_mm to mark ranges with a
particular color that corresponds to the cache attributes of those pages
in order to prevent allocating adjacent blocks of differing memory
types.

v2: Rebase ontop of drm_mm coloring v2.
v3: Fix rebinding existing gtt_space and add a verification routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-26 12:56:25 +02:00
Daniel Vetter
1286ff7397 i915: add dmabuf/prime buffer sharing support.
This adds handle->fd and fd->handle support to i915, this is to allow
for offloading of rendering in one direction and outputs in the other.

v2 from Daniel Vetter:
- fixup conflicts with the prepare/finish gtt prep work.
- implement ppgtt binding support.

Note that we have squat i-g-t testcoverage for any of the lifetime and
access rules dma_buf/prime support brings along. And there are quite a
few intricate situations here.

Also note that the integration with the existing code is a bit
hackish, especially around get_gtt_pages and put_gtt_pages. It imo
would be easier with the prep code from Chris Wilson's unbound series,
but that is for 3.6.

Also note that I didn't bother to put the new prepare/finish gtt hooks
to good use by moving the dma_buf_map/unmap_attachment calls in there
(like we've originally planned for).

Last but not least this patch is only compile-tested, but I've changed
very little compared to Dave Airlie's version. So there's a decent
chance v2 on drm-next works as well as v1 on 3.4-rc.

v3: Right when I've hit sent I've noticed that I've screwed up one
obj->sg_list (for dmar support) and obj->sg_table (for prime support)
disdinction. We should be able to merge these 2 paths, but that's
material for another patch.

v4: fix the error reporting bugs pointed out by ickle.

v5: fix another error, and stop non-gtt mmaps on shared objects
stop pread/pwrite on imported objects, add fake kmap

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-23 10:47:10 +01:00
Ben Widawsky
b2da9fe5d5 drm/i915: remove do_retire from i915_wait_request
This originates from a hack by me to quickly fix a bug in an earlier
patch where we needed control over whether or not waiting on a seqno
actually did any retire list processing. Since the two operations aren't
clearly related, we should pull the parameter out of the wait function,
and make the caller responsible for retiring if the action is desired.

The only function call site which did not get an explicit retire_request call
(on purpose) is i915_gem_inactive_shrink(). That code was already calling
retire_request a second time.

v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:20 +02:00
Daniel Vetter
211c568bc6 drm/i915: simplify ppgtt setup
We don't need the pt_addr for the !dmar case, so drop the else and
move the if (dmar) condition out of the loop.

v2: Fixup whitespace damage noticed by Chris Wilson.

v3: Collapse the two identical if blocks. Chris Wilson makes me look
like a moron right now ...

Noticed-by: Konstantin Belousov <kostikbel@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wislon.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-12 21:14:10 +02:00
Dave Airlie
effbc4fd8e Merge branch 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next
Daniel Vetter wrote
First pull request for 3.5-next, slightly large than usual because new
things kept coming in since the last pull for 3.4.
Highlights:
- first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci
 ids are not yet added, and there's still quite a few patches to merge
 (mostly modesetting). To make QA easier I've decided to merge this stuff
 in pieces.
- loads of cleanups and prep patches spurred by the above. Especially vlv
 is a real frankenstein chip, but also hsw is stretching our driver's
 code design. Expect more to come in this area for 3.5.
- more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again,
 there are more patches needed (and some already queued up), but I wanted
 to split this a bit for better testing.
- pwrite/pread rework and retuning. This series has been in the works for
 a few months already and a lot of i-g-t tests have been created for it.
 Now it's finally ready to be merged.  Note that one patch in this series
 touches include/pagemap.h, that patch is acked-by akpm.
- reduce mappable pressure and relocation throughput improvements from
 Chris.
- mmap offset exhaustion mitigation by Chris Wilson.
- a start at figuring out which codepaths in our messy dri1/ums+gem/kms
 driver we actually need to support by bailing out of unsupported case.
 The driver now refuses to load without kms on gen6+ and disallows a few
 ioctls that userspace never used in certain cases. More of this will
 definitely come.
- More decoupling of global gtt and ppgtt.
- Improved dual-link lvds detection by Takashi Iwai.
- Shut up the compiler + plus fix the fallout (Ben)
- Inverted panel brightness handling (mostly Acer manages to break things
 in this way).
- Small fixlets and adjustements and some minor things to help debugging.

Regression-wise QA reported quite a few issues on ivb, but all of them
turned out to be hw stability issues which are already fixed in
drm-intel-fixes (QA runs the nightly regression tests on -next alone,
without -fixes automatically merged in). There's still one issue open on
snb, it looks like occlusion query writes are not quite as cache coherent
as we've expected. With some of the pwrite adjustements we can now
reliably hit this. Kernel workaround for it is in the works."

* 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits)
  drm/i915: VCS is not the last ring
  drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2
  drm/i915: make quirks more verbose
  drm/i915: dump the DMA fetch addr register on pre-gen6
  drm/i915/sdvo: Include YRPB as an additional TV output type
  drm/i915: disallow gem init ioctl on ilk
  drm/i915: refuse to load on gen6+ without kms
  drm/i915: extract gt interrupt handler
  drm/i915: use render gen to switch ring irq functions
  drm/i915: rip out old HWSTAM missed irq WA for vlv
  drm/i915: open code gen6+ ring irqs
  drm/i915: ring irq cleanups
  drm/i915: add SFUSE_STRAP registers for digital port detection
  drm/i915: add WM_LINETIME registers
  drm/i915: add WRPLL clocks
  drm/i915: add LCPLL control registers
  drm/i915: add SSC offsets for SBI access
  drm/i915: add port clock selection support for HSW
  drm/i915: add S PLL control
  drm/i915: add PIXCLK_GATE register
  ...

Conflicts:
	drivers/char/agp/intel-agp.h
	drivers/char/agp/intel-gtt.c
	drivers/gpu/drm/i915/i915_debugfs.c
2012-04-12 10:27:01 +01:00
Daniel Vetter
55a254ac63 drm/i915: properly restore the ppgtt page directory on resume
The ppgtt page directory lives in a snatched part of the gtt pte
range. Which naturally gets cleared on hibernate when we pull the
power. Suspend to ram (which is what I've tested) works because
despite the fact that this is a mmio region, it is actually back by
system ram.

Fix this by moving the page directory setup code to the ppgtt init
code (which gets called on resume).

This fixes hibernate on my ivb and snb.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-01 12:25:29 +02:00
Daniel Vetter
d1dd20a965 drm/i915: clear the entire gtt when using gem
We've lost our guard page somewhere in the gtt rewrite, this patch
here will restore it.

Exercised by i-g-t/tests/gem_cs_prefetch.

v2: Substract the guard page from the range we're supposed to manage
with gem. Suggested by Chris Wilson to increase the odds of old ums +
gem userspace not blowing up. To compensate for the loss of a page,
don't substract the guard page in the modeset init code any longer.

Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44748
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-27 13:15:24 +02:00
Daniel Vetter
644ec02b5d drm/i915: s/i915_gem_do_init/i915_gem_init_global_gtt
... because this is what it actually doesn now that we have the global
gtt vs. ppgtt split.

Also move it to the other global gtt functions in i915_gem_gtt.c

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-27 13:14:59 +02:00
Daniel Vetter
74898d7edc drm/i915: bind objects to the global gtt only when needed
And track the existence of such a binding similar to the aliasing
ppgtt case. Speeds up binding/unbinding in the common case where we
only need a ppgtt binding (which is accessed in a cpu coherent fashion
by the gpu) and no gloabl gtt binding (which needs uc writes for the
ptes).

This patch just puts the required tracking in place.

v2: Check that global gtt mappings exist in the error_state capture
code (with Chris Wilson's llc reloc patches batchbuffers are no longer
relocated as mappable in all situations, so this matters). Suggested
by Chris Wilson.

v3: Adapted to Chris' latest llc-reloc patches.

v4: Fix a bug in the i915 error state capture code noticed by Chris
Wilson.

Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-20 21:52:01 +01:00
Daniel Vetter
741639079c drm/i915: split out dma mapping from global gtt bind/unbind functions
Note that there's a functional change buried in this patch wrt the ilk
dmar workaround: We now only idle the gpu while tearing down the dmar
mappings, not while clearing the gtt. Keeping the current semantics
would have made for some really ugly code and afaik the issue is only
with the dmar unmapping that needs a fully idle gpu.

Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-20 21:51:41 +01:00
Daniel Vetter
7bddb01fb9 drm/i915: ppgtt binding/unbinding support
This adds support to bind/unbind objects and wires it up. Objects are
only put into the ppgtt when necessary, i.e. at execbuf time.

Objects are still unconditionally put into the global gtt.

v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind
like for the global gtt function. Noticed by Chris Wilson.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-09 21:25:23 +01:00
Daniel Vetter
1d2a314c97 drm/i915: initialization/teardown for the aliasing ppgtt
This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.

v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.

v3: Clean up the aperture stealing code as noted by Ben Widawsky.

v4: Paint the init code in a more pleasing colour as suggest by Chris
Wilson.

v5: Explain the magic numbers noticed by Ben Widawsky.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-09 21:25:11 +01:00
Ben Widawsky
8436473a4b drm/i915: drm/i915: Fix recursive calls to unmap
After the ILK vt-d workaround patches it became clear that we had
introduced a bug.  Chris Wilson tracked down the issue to recursive
calls to unmap. This happens because we try to optimize waiting on
requests by calling retire requests after the wait, which may drop the
last reference on an object and end up freeing the object (and then
unmap the object from the gtt).

After the last patch we can now choose to defer processing the retire
list.

Kudos to Chris Wilson for tracking this one down.

This patch fixes gem_unref_active_buffers from i-g-t. It was tested by
forcing do_idle_maps to true.

This also fixes tests/gem_linear_blits in intel-gpu-tools.

Reported-by: guang.a.yang@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-26 11:25:51 +01:00
Ben Widawsky
b93f9cf14e drm/i915: argument to control retiring behavior
Sometimes it may be the case when we idle the gpu or wait on something
we don't actually want to process the retiring list. This patch allows
callers to choose the behavior.

Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-26 11:19:19 +01:00
Ben Widawsky
5c0422878f drm/i915: ILK + VT-d workaround
Idle the GPU before doing any unmaps. We know if VT-d is in use through
an exported variable from iommu code.

This should avoid a known HW issue.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:39 -07:00
Chris Wilson
e4ffd173a1 drm/i915: Add an interface to dynamically change the cache level
[anholt v2: Don't forget that when going from cached to uncached, we
haven't been tracking the write domain from the CPU perspective, since
we haven't needed it for GPU coherency.]

[ickle v3: We also need to make sure we relinquish any fences on older
chipsets and clear the GTT for sane domain tracking.]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-06-09 21:51:16 -07:00
Chris Wilson
d5bd144959 drm/i915/gtt: Split out i915_gem_gtt_rebind_object()
... in preparation for changing the cache level (and thus the flags upon
the PTEs) dynamically.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-06-09 21:51:15 -07:00
Chris Wilson
93dfb40cd8 drm/i915: Rename agp_type to cache_level
... to clarify just how we use it inside the driver and remove the
confusion of the poorly matching agp_type names. We still need to
translate through agp_type for interface into the fake AGP driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-10 13:56:43 -07:00
Chris Wilson
bee4a186c1 drm/i915,agp/intel: Do not clear stolen entries
We can only utilize the stolen portion of the GTT if we are in sole
charge of the hardware. This is only true if using GEM and KMS,
otherwise VESA continues to access stolen memory.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-24 18:26:25 +00:00
Chris Wilson
d912640058 drm/i915/gtt: Unmap the PCI pages after unbinding them from the GTT
Dave Airlie spotted that his ILK laptop with DMAR enabled was generating
the occasional DMAR warning.

"The ordering in the previous code was to rewrite the GTT table before
unmapping the pages and that makes sense to me."

This is his stable patch ported to d-i-n.

Reported-by: Dave Airlie <airlied@redhat.com>
Original-patch-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:44:56 +00:00
Chris Wilson
a8e93126a6 drm/i915/gtt: Clear the cachelines upon resume
Required for my pineview system to not barf after resuming.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-09 19:46:23 +00:00
Chris Wilson
05394f3975 drm/i915: Use drm_i915_gem_object as the preferred type
A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and
many characters!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:19:10 +00:00
Daniel Vetter
185cbcb304 drm/i915: no more agp for gem
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:48 +00:00
Daniel Vetter
7c2e6fdf45 drm/i915: move gtt handling to i915_gem_gtt.c
No more drm_*_agp in i915_gem.c!

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:47 +00:00
Daniel Vetter
76aaf22016 drm/i915: restore gtt on resume in the drm instead of in intel-gtt.ko
This still uses the agp functions to actually reinstate the mappings
(with a gross hack to make agp cooperate), but it wires everything
up correctly for the switchover.

The call to agp_rebind_memory can be dropped because all non-kms drivers
do all their rebinding on EnterVT.

v2: Be more paranoid and flush the chipset cache after restoring gtt
mappings.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23 20:14:46 +00:00