Commit Graph

244 Commits

Author SHA1 Message Date
Ben Widawsky
5abbcca30d drm/i915/bdw: Kill ppgtt->num_pt_pages
With the original PPGTT implementation if the number of PDPs was not a
power of two, the number of pages for the page tables would end up being
rounded up. The code actually had a bug here afaict, but this is a
theoretical bug as I don't believe this can actually occur with the
current code/HW..

With the rework of the page table allocations, there is no longer a
distinction between number of page table pages, and number of page
directory entries. To avoid confusion, kill the redundant (and newer)
struct member.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:00 +01:00
Ben Widawsky
b146520ff9 drm/i915: Split GEN6 PPGTT initialization up
Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:00 +01:00
Ben Widawsky
a00d825de9 drm/i915: Split GEN6 PPGTT cleanup
This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:59 +01:00
Ben Widawsky
c4ac524c15 drm/i915: Update i915_gem_gtt.c copyright
I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:58 +01:00
Ben Widawsky
7907f45bf9 Revert "drm/i915/bdw: Limit GTT to 2GB"
This reverts commit 3a2ffb65ee.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on BDW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:57 +01:00
Ben Widawsky
7ad47cf252 drm/i915/bdw: Reorganize PT allocations
The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predated commit:
6f1cc99351
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 31 15:50:31 2013 +0000

    drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)

v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)

v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:41 +01:00
Ben Widawsky
782f149523 drm/i915: Make clear/insert vfuncs args absolute
This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

v2: Replace size_t with uint64_t (Chris, Imre)

v3: Fix size in gen8_ppgtt_init (Ben)
Fix Size in i915_gem_suspend_gtt_mappings/restore (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:57:52 +01:00
Ben Widawsky
bf2b4ed291 drm/i915/bdw: Split ppgtt initialization up
Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

v3: Rebased on top of bugfixes found in the previous patch by Imre
Moved number of pd pages assertion to the proper place (Imre)

v4:
Allocate dma address space for num_pd_pages, not num_pd_entries (Ben)
Don't use gen8_pt_dma_addr after free on error path (Imre)
With new fix from v4 of the previous patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:54:59 +01:00
Ben Widawsky
f3a964b96d drm/i915/bdw: Reorganize PPGTT init
Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

v3: Fixes addressing Imre's comments from:
<1392821989.19792.13.camel@intelbox>

Don't do dynamic allocation for the page table DMA addresses. I can't
remember why I did it in the first place. This addresses one of Imre's
other issues.

Fix error path leak of page tables.

v4: Fix the fix of the error path leak. Original fix still leaked page
tables. (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:54:31 +01:00
Ben Widawsky
b18b6bde30 drm/i915/bdw: Free PPGTT struct
GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload. ie. I don't think this
needs to go to stable.

v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:53:58 +01:00
Daniel Vetter
d47c3ea2bd drm/i915: Allow blocking in the PDE alloc when running low on gtt space
There's no need not to, really.

Split out from Chris vma-bind rework.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14 14:18:15 +01:00
Daniel Vetter
1ec9e26dda drm/i915: Consolidate binding parameters into flags
Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.

Split out from Chris' vma-binding rework patch.

v2: Undo the behaviour change in object_pin that Chris spotted.

v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.

v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.

v5: Reorder the PIN_ flags for more natural patch splitup.

v6: Pull out the PIN_GLOBAL split-up again.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14 14:16:58 +01:00
Ben Widawsky
b45a67157a drm/i915/bdw: Split up PPGTT cleanup
This will make the code more readable, and extensible which is needed
for upcoming feature work. Eventually, we'll do the same for init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-13 11:51:21 +01:00
Linus Torvalds
9b0cd304f2 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "Been a bit busy, first week of kids school, and waiting on other trees
  to go in before I could send this, so its a bit later than I'd
  normally like.

  Highlights:
   - core:
      timestamp fixes, lots of misc cleanups
   - new drivers:
      bochs virtual vga
   - vmwgfx:
      major overhaul for their nextgen virt gpu.
   - i915:
      runtime D3 on HSW, watermark fixes, power well work, fbc fixes,
      bdw is no longer prelim.
   - nouveau:
      gk110/208 acceleration, more pm groundwork, old overlay support
   - radeon:
      dpm rework and clockgating for CIK, pci config reset, big endian
      fixes
   - tegra:
      panel support and DSI support, build as module, prime.
   - armada, omap, gma500, rcar, exynos, mgag200, cirrus, ast:
      fixes
   - msm:
      hdmi support for mdp5"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (595 commits)
  drm/nouveau: resume display if any later suspend bits fail
  drm/nouveau: fix lock unbalance in nouveau_crtc_page_flip
  drm/nouveau: implement hooks for needed for drm vblank timestamping support
  drm/nouveau/disp: add a method to fetch info needed by drm vblank timestamping
  drm/nv50: fill in crtc mode struct members from crtc_mode_fixup
  drm/radeon/dce8: workaround for atom BlankCrtc table
  drm/radeon/DCE4+: clear bios scratch dpms bit (v2)
  drm/radeon: set si_notify_smc_display_change properly
  drm/radeon: fix DAC interrupt handling on DCE5+
  drm/radeon: clean up active vram sizing
  drm/radeon: skip async dma init on r6xx
  drm/radeon/runpm: don't runtime suspend non-PX cards
  drm/radeon: add ring to fence trace functions
  drm/radeon: add missing trace point
  drm/radeon: fix VMID use tracking
  drm: ast,cirrus,mgag200: use drm_can_sleep
  drm/gma500: Lock struct_mutex around cursor updates
  drm/i915: Fix the offset issue for the stolen GEM objects
  DRM: armada: fix missing DRM_KMS_FB_HELPER select
  drm/i915: Decouple GPU error reporting from ring initialisation
  ...
2014-01-29 20:49:12 -08:00
Jani Nikula
d330a9530c drm/i915: move module parameters into a struct, in a new file
With 20+ module parameters, I think referring to them via a struct
improves clarity over just having a bunch of globals. While at it, move
the parameter initialization and definitions into a new file
i915_params.c to reduce clutter in i915_drv.c.

Apart from the ill-named i915_enable_rc6, i915_enable_fbc and
i915_enable_ppgtt parameters, for which we lose the "i915_" prefix
internally, the module parameters now look the same both on the kernel
command line and in code. For example, "i915.modeset".

The downsides of the change are losing static on a couple of variables
and not having the initialization and module_param_named() right next to
each other. On the other hand, all module parameters are now defined in
one place at i915_params.c. Plus you can do this to find all module
parameter references:

$ git grep "i915\." -- drivers/gpu/drm/i915

v2:
- move the definitions into a new file
- s/i915_params/i915/
- make i915_try_reset i915.reset, for consistency

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-27 17:16:45 +01:00
Daniel Vetter
0e5539b923 Merge branch 'topic/ppgtt' into drm-intel-next-queued
Because whatever.*

* This should contain a fairly long list of issues and still
unresolved resgressions, but I didn't really get a vote.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-25 21:14:57 +01:00
Linus Torvalds
e1ba84597c PCI changes for the v3.14 merge window:
Resource management
     - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas)
     - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu)
     - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas)
     - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas)
     - Enforce bus address limits in resource allocation (Yinghai Lu)
     - Allocate 64-bit BARs above 4G when possible (Yinghai Lu)
     - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu)
 
   PCI device hotplug
     - Major rescan/remove locking update (Rafael J. Wysocki)
     - Make ioapic builtin only (not modular) (Yinghai Lu)
     - Fix release/free issues (Yinghai Lu)
     - Clean up pciehp (Bjorn Helgaas)
     - Announce pciehp slot info during enumeration (Bjorn Helgaas)
 
   MSI
     - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev)
     - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev)
     - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev)
     - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman)
     - Drop "irq" param from *_restore_msi_irqs() (DuanZhenzhong)
 
   SR-IOV
     - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao)
 
   Virtualization
     - Add support for save/restore of extended capabilities (Alex Williamson)
     - Add Virtual Channel to save/restore support (Alex Williamson)
     - Never treat a VF as a multifunction device (Alex Williamson)
     - Add pci_try_reset_function(), et al (Alex Williamson)
 
   AER
     - Ignore non-PCIe error sources (Betty Dall)
     - Support ACPI HEST error sources for domains other than 0 (Betty Dall)
     - Consolidate HEST error source parsers (Bjorn Helgaas)
     - Add a TLP header print helper (Borislav Petkov)
 
   Freescale i.MX6
     - Remove unnecessary code (Fabio Estevam)
     - Make reset-gpio optional (Marek Vasut)
     - Report "link up" only after link training completes (Marek Vasut)
     - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut)
     - Fix PCIe startup code (Richard Zhu)
 
   Marvell MVEBU
     - Remove duplicate of_clk_get_by_name() call (Andrew Lunn)
     - Drop writes to bridge Secondary Status register (Jason Gunthorpe)
     - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe)
     - Support a bridge with no IO port window (Jason Gunthorpe)
     - Use max_t() instead of max(resource_size_t,) (Jingoo Han)
     - Remove redundant of_match_ptr (Sachin Kamat)
     - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni)
 
   NVIDIA Tegra
     - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower)
 
   Renesas R-Car
     - Add runtime PM support (Valentine Barshak)
     - Fix rcar_pci_probe() return value check (Wei Yongjun)
 
   Synopsys DesignWare
     - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen)
     - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen)
     - Fix missing MSI IRQs (Harro Haan)
     - Add dw_pcie prefix before cfg_read/write (Pratyush Anand)
     - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand)
     - Whitespace cleanup (Jingoo Han)
 
   EISA
     - Call put_device() if device_register() fails (Levente Kurusa)
     - Revert EISA initialization breakage ((Bjorn Helgaas)
 
   Miscellaneous
     - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger)
     - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas)
     - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas)
     - Use dev_is_pci() to identify PCI devices (Yijing Wang)
     - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches)
     - Update documentation 00-INDEX (Erik Ekman)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3ujEAAoJEFmIoMA60/r8A4EQAK9AZSUSVNWvlKdC1PrBfT3w
 7fVILx5A4KWsOU8eoFwCPQLrgvUtMltg16yN2tbCjqpKEdrVc36biMO9bwhnXSyZ
 KopHKMWnn0sza/z2H8mcGy+0azGdWcIjcErX/a8WeS6zyWBjm+yzckrHNVpPu4Ca
 SpCBhfgBMjKyIZyLtP6juFSH34S2DfQex4oUSyPC+gjqPy5wW/xw/kBxZfOXl+yU
 P9pQT+geMIc31pETMdG9wd/TT+47YAui4ieSggoVxfVrphCXv6S8mOMCMuQc2bAy
 MHy9uFm1jbvKZZIYrzJ+9HFiiU/6MNiOO3Ygua52xuSp1Zrcjwi2CLD9/QBXbDVs
 pTKU5JIO7q43llkQUpIXTwBvEApSZRhuqzXegsMAYIg4AWmbfm/2fXkfWlQThYMp
 J48blAJZ4t0vhMr9usgwbtdBe8F5euExOxpwH0QMCMABbuu8/B3TLm39+LTcIbsw
 Efgm3N9iUTyiV5fe9Rr62nflhyqXjTevPl4wbZZe4OOdm0MXZY+/BzuNJhg3wyY8
 QKz2J3FB6OR7BCLHCp80l50s5+Ih4F5kmOXwFKjT1D1MFRaNaPDmp9BY6TitU6hg
 zj55gP4c8x6n3alakbf972Yhgs/4oi1va8cZL+pCYWb8nPO5ldaMiT7QBBLUreQV
 BtDtC7u/AFWJ5e73+jVO
 =La1R
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v3.14 merge window:

  Resource management
    - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas)
    - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu)
    - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas)
    - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas)
    - Enforce bus address limits in resource allocation (Yinghai Lu)
    - Allocate 64-bit BARs above 4G when possible (Yinghai Lu)
    - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu)

  PCI device hotplug
    - Major rescan/remove locking update (Rafael J. Wysocki)
    - Make ioapic builtin only (not modular) (Yinghai Lu)
    - Fix release/free issues (Yinghai Lu)
    - Clean up pciehp (Bjorn Helgaas)
    - Announce pciehp slot info during enumeration (Bjorn Helgaas)

  MSI
    - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev)
    - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev)
    - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev)
    - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman)
    - Drop "irq" param from *_restore_msi_irqs() (DuanZhenzhong)

  SR-IOV
    - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao)

  Virtualization
    - Add support for save/restore of extended capabilities (Alex Williamson)
    - Add Virtual Channel to save/restore support (Alex Williamson)
    - Never treat a VF as a multifunction device (Alex Williamson)
    - Add pci_try_reset_function(), et al (Alex Williamson)

  AER
    - Ignore non-PCIe error sources (Betty Dall)
    - Support ACPI HEST error sources for domains other than 0 (Betty Dall)
    - Consolidate HEST error source parsers (Bjorn Helgaas)
    - Add a TLP header print helper (Borislav Petkov)

  Freescale i.MX6
    - Remove unnecessary code (Fabio Estevam)
    - Make reset-gpio optional (Marek Vasut)
    - Report "link up" only after link training completes (Marek Vasut)
    - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut)
    - Fix PCIe startup code (Richard Zhu)

  Marvell MVEBU
    - Remove duplicate of_clk_get_by_name() call (Andrew Lunn)
    - Drop writes to bridge Secondary Status register (Jason Gunthorpe)
    - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe)
    - Support a bridge with no IO port window (Jason Gunthorpe)
    - Use max_t() instead of max(resource_size_t,) (Jingoo Han)
    - Remove redundant of_match_ptr (Sachin Kamat)
    - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni)

  NVIDIA Tegra
    - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower)

  Renesas R-Car
    - Add runtime PM support (Valentine Barshak)
    - Fix rcar_pci_probe() return value check (Wei Yongjun)

  Synopsys DesignWare
    - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen)
    - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen)
    - Fix missing MSI IRQs (Harro Haan)
    - Add dw_pcie prefix before cfg_read/write (Pratyush Anand)
    - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand)
    - Whitespace cleanup (Jingoo Han)

  EISA
    - Call put_device() if device_register() fails (Levente Kurusa)
    - Revert EISA initialization breakage ((Bjorn Helgaas)

  Miscellaneous
    - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger)
    - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas)
    - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas)
    - Use dev_is_pci() to identify PCI devices (Yijing Wang)
    - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches)
    - Update documentation 00-INDEX (Erik Ekman)"

* tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (119 commits)
  Revert "EISA: Initialize device before its resources"
  Revert "EISA: Log device resources in dmesg"
  vfio-pci: Use pci "try" reset interface
  PCI: Check parent kobject in pci_destroy_dev()
  xen/pcifront: Use global PCI rescan-remove locking
  powerpc/eeh: Use global PCI rescan-remove locking
  PCI: Fix pci_check_and_unmask_intx() comment typos
  PCI: Add pci_try_reset_function(), pci_try_reset_slot(), pci_try_reset_bus()
  MPT / PCI: Use pci_stop_and_remove_bus_device_locked()
  platform / x86: Use global PCI rescan-remove locking
  PCI: hotplug: Use global PCI rescan-remove locking
  pcmcia: Use global PCI rescan-remove locking
  ACPI / hotplug / PCI: Use global PCI rescan-remove locking
  ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug
  PCI: Add global pci_lock_rescan_remove()
  PCI: Cleanup pci.h whitespace
  PCI: Reorder so actual code comes before stubs
  PCI/AER: Support ACPI HEST AER error sources for PCI domains other than 0
  ACPICA: Add helper macros to extract bus/segment numbers from HEST table.
  PCI: Make local functions static
  ...
2014-01-22 16:39:28 -08:00
Daniel Vetter
0d9d349d87 Merge commit origin/master into drm-intel-next
Conflicts are getting out of hand, and now we have to shuffle even
more in -next which was also shuffled in -fixes (the call for
drm_mode_config_reset needs to move yet again).

So do a proper backmerge. I wanted to wait with this for the 3.13
relaese, but alas let's just do this now.

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_pm.c

Besides the conflict around the forcewake get/put (where we chaged the
called function in -fixes and added a new parameter in -next) code all
the current conflicts are of the adjacent lines changed type.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-16 22:06:30 +01:00
Daniel Vetter
0e46ce2e7a drm/i915: fix ppgtt dump code for DEBUG_FS=n
A regression in the topic/ppgtt branch introduce in

commit 87d60b63e0
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:29 2013 -0800

    drm/i915: Add PPGTT dumper

The issue is that we're missing the definitions for the seq_file
functions and hence compilation fails.

v2: Just include the right header instead of splattering #ifdefs all
over the place (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reported-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-10 08:21:37 +01:00
Bjorn Helgaas
21c346075c drm/i915: Rename gtt_bus_addr to gtt_phys_addr
We're dealing with CPU physical addresses here, which may be different from
bus addresses, so rename gtt_bus_addr to gtt_phys_addr to avoid confusion.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 11:36:43 -07:00
Chris Wilson
6f1cc99351 drm/i915: Avoid dereference past end of page array in gen8_ppgtt_insert_entries()
The bug from gen6_ppgtt_insert_entries() was replicated into
gen8_ppgtt_insert_entries(). This applies the fix for the OOPS from the
previous patch to the gen8 routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 08:24:16 +01:00
Chris Wilson
cc79714fbc drm/i915: Avoid dereference past end of page array in gen6_ppgtt_insert_entries()
[   89.237347] BUG: unable to handle kernel paging request at ffff880096326000
[   89.237369] IP: [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237382] PGD 2272067 PUD 25df0e067 PMD 25de5c067 PTE 8000000096326060
[   89.237394] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[   89.237404] CPU: 1 PID: 1981 Comm: gem_concurrent_ Not tainted 3.13.0-rc4+ #639
[   89.237411] Hardware name: Intel Corporation 2012 Client Platform/Emerald Lake 2, BIOS ACRVMBY1.86C.0078.P00.1201161002 01/16/2012
[   89.237420] task: ffff88024c038030 ti: ffff88024b130000 task.ti: ffff88024b130000
[   89.237425] RIP: 0010:[<ffffffff81347227>]  [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237435] RSP: 0018:ffff88024b131ae0  EFLAGS: 00010286
[   89.237440] RAX: ffff880096325000 RBX: 0000000000000400 RCX: 0000000000001000
[   89.237445] RDX: 0000000000000200 RSI: 0000000000000001 RDI: 0000000000000010
[   89.237451] RBP: ffff88024b131b30 R08: ffff88024cc3aef0 R09: 0000000000000000
[   89.237456] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88024cc3ae00
[   89.237462] R13: ffff88024a578000 R14: 0000000000000001 R15: ffff88024a578ffc
[   89.237469] FS:  00007ff5475d8900(0000) GS:ffff88025d020000(0000) knlGS:0000000000000000
[   89.237475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.237480] CR2: ffff880096326000 CR3: 000000024d531000 CR4: 00000000001407e0
[   89.237485] Stack:
[   89.237488]  ffff880000000000 0000020000000000 ffff88024b23f2c0 0000000100000000
[   89.237499]  0000000000000001 000000000007ffff ffff8801e7bf5ac0 ffff8801e7bf5ac0
[   89.237510]  ffff88024cc3ae00 ffff880248a2ee40 ffff88024b131b58 ffffffff813455ed
[   89.237521] Call Trace:
[   89.237528]  [<ffffffff813455ed>] ppgtt_bind_vma+0x3d/0x60
[   89.237534]  [<ffffffff8133d8dc>] i915_gem_object_pin+0x55c/0x6a0
[   89.237541]  [<ffffffff8134275b>] i915_gem_execbuffer_reserve_vma.isra.14+0x5b/0x110
[   89.237548]  [<ffffffff81342a88>] i915_gem_execbuffer_reserve+0x278/0x2c0
[   89.237555]  [<ffffffff81343d29>] i915_gem_do_execbuffer.isra.22+0x699/0x1250
[   89.237562]  [<ffffffff81344d91>] ? i915_gem_execbuffer2+0x51/0x290
[   89.237569]  [<ffffffff81344de6>] i915_gem_execbuffer2+0xa6/0x290
[   89.237575]  [<ffffffff813014f2>] drm_ioctl+0x4d2/0x610
[   89.237582]  [<ffffffff81080bf1>] ? cpuacct_account_field+0xa1/0xc0
[   89.237588]  [<ffffffff81080b55>] ? cpuacct_account_field+0x5/0xc0
[   89.237597]  [<ffffffff811371c0>] do_vfs_ioctl+0x300/0x520
[   89.237603]  [<ffffffff810757a1>] ? vtime_account_user+0x91/0xa0
[   89.237610]  [<ffffffff810e40eb>] ?  context_tracking_user_exit+0x9b/0xe0
[   89.237617]  [<ffffffff81083d7d>] ? trace_hardirqs_on+0xd/0x10
[   89.237623]  [<ffffffff81137425>] SyS_ioctl+0x45/0x80
[   89.237630]  [<ffffffff815afffa>] tracesys+0xd4/0xd9
[   89.237634] Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 83 45 bc 01 49 8b 84 24 78 01 00 00 65 ff 0c 25 e0 b8 00 00 8b 55 bc <4c> 8b 2c d0 65 ff 04 25 e0 b8 00 00 49 8b 45 00 48 c1 e8 2d 48
[   89.237741] RIP  [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237749]  RSP <ffff88024b131ae0>
[   89.237753] CR2: ffff880096326000
[   89.237758] ---[ end trace 27416ba8b18d496c ]---

This bug dates back to the original introduction of the
gen6_ppgtt_insert_entries()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Dropped cc: stable since without full ppgtt there's no way
we'll access the last page directory with this function since that
range is occupied (only in the allocator) with the ppgtt pdes. Without
aliasing we can start to use that range and blow up.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 08:22:42 +01:00
Chris Wilson
c0a7f81899 drm/i915: Mention when we enable the Ironlake iommu workarounds
The iommu and gfx on Ironlake do not like each other and require a
big hammer to prevent hard machine hangs. In

commit 5c0422878f
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Mon Oct 17 15:51:55 2011 -0700

    drm/i915: ILK + VT-d workaround

we added the workaround, but never emitted any debug message that it was
active. Doing so should help identify known performance regressions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 08:10:54 +01:00
Ben Widawsky
058840c7a0 drm/i915/bdw: Flush system agent on gen8 also
gem_gtt_cpu_tlb seems to indicate that it is needed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72869

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-06 11:11:04 +01:00
Ben Widawsky
87d60b63e0 drm/i915: Add PPGTT dumper
Dump the aliasing PPGTT with it. The aliasing PPGTT should actually
always be empty.

TODO: Broadwell. Since we don't yet use full PPGTT on Broadwell, not
having the dumper is okay.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:26:16 +01:00
Ben Widawsky
d2ff7192f3 drm/i915: Remove extraneous mm_switch in ppgtt enable
Originally this commit message said:
Now that do_switch does the mm switch, and we always enable the aliasing
PPGTT, and contexts at the same time, there is no need to continue doing
this during PPGTT enabling.

Since originally writing the patch however, I introduced the concept of
synchronous mm switching (using MMIO). Since this is generally not
recommended in the spec (for reasons unknown), I've isolated its usage
as much as possible. As such the "extraneous" switch only ever will
occur when we have full PPGTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:24:57 +01:00
Ben Widawsky
7e0d96bc03 drm/i915: Use multiple VMs -- the point of no return
As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).

For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.

One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).

v2: Disable BDW on rebase

NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:24:52 +01:00
Daniel Vetter
3d7f0f9dcc Merge commit drm-intel-fixes into topic/ppgtt
I need the tricky do_switch fix before I can merge the final piece of
the ppgtt enabling puzzle. Otherwise the conflict will be a real pain
to resolve since the do_switch hunk from -fixes must be placed at the
exact right place within a hunk in the next patch.

Conflicts:
	drivers/gpu/drm/i915/i915_gem_context.c
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:23:37 +01:00
Ben Widawsky
bdf4fd7ea0 drm/i915: Do aliasing PPGTT init with contexts
We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.

The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:32:14 +01:00
Ben Widawsky
80da216171 drm/i915: Restore PDEs for all VMs
In following with the old restore code, we must now restore ever PPGTT's
PDEs, since they aren't proper GEM ojbects.

v2: Rebased on BDW. Only do restore pdes for gen6 & 7

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:32 +01:00
Ben Widawsky
9f273d48aa drm/i915: Write PDEs at init instead of enable
We won't be calling enable() for all PPGTTs. We do need to write PDEs
for all PPGTTs however. By moving the writing to init (which is called
for all PPGTTs) we should accomplish this.

ADD NOTE ABOUT PDE restore

TODO: Eventually, we should allocate the page tables on demand.

v2: Rebased on BDW. Only do PDEs for pre-gen8

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:26 +01:00
Ben Widawsky
c7c48dfdff drm/i915: Add VM to context
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.

To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.

Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:20 +01:00
Ben Widawsky
246cbfb5fb drm/i915: Reorganize intel_enable_ppgtt
This patch consolidates the way in which we handle the various supported
PPGTT by module parameter in addition to what the hardware supports. It
strives to make doing the right thing in the code as simple as possible,
with the USES_ macros.

I've opted to add the full PPGTT argument simply so one can see how I
intend to use this function. It will not/cannot be used until later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:06 +01:00
Ben Widawsky
d6660add64 drm/i915: Generalize PPGTT init
Rearrange the initialization code to try to special case the aliasing
PPGTT less, and provide usable interfaces for the general case later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:29:24 +01:00
Ben Widawsky
90252e5c68 drm/i915: Flush TLBs after !RCS PP_DIR_BASE
I've found this by accident. The docs don't really come out and say you
need to do this. What the docs do tell you is you need to flush the TLBs
before you set the PP_DIR_BASE, and that the RCS will invalidate its
TLBs upon setting the new PP_DIR_BASE. It makes no such comment about
any of the other rings.

Empirically, this indeed fixes a really obvious bug whereby the batches
being sent to the blitter were not executing (we were executing the
HSWP somehow instead).

NOTE: This should make no difference with the current code. It only
applies when we start using multiple VMs.

NOTE2: HSW appears to be immune to this.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:29:13 +01:00
Ben Widawsky
48a10389c8 drm/i915: Use LRI for switching PP_DIR_BASE
The docs seem to suggest this is the appropriate method (though it
doesn't say so outright). In other words, we probably should have done
this before. We certainly must do this for switching VMs on the fly,
since synchronizing the rings to MMIO updates isn't acceptable.

v2:
Make the reset code actually work for all rings. Note that this was
fixed in subsequent commits, but was indeed broken for this commit.

Add a posting read to the reset case. It probably should have existed
before hand, but since we have no failures; there is no reason to make
it a separate commit.

Make IS_GEN6 not use the ring because I am seeing crashes when using it.
It is a bit of a hack in this patch, it will get fixed up in a couple of
patches.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:28:45 +01:00
Ben Widawsky
eeb9488e75 drm/i915: Extract mm switching to function
In order to do the full context switch with address space, it's
convenient to have a way to switch the address space. We already have
this in our code - just pull it out to be called by the context switch
code later.

v2: Rebased on BDW support. Required adding BDW.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:28:33 +01:00
Ben Widawsky
b4a74e3adf drm/i915: Use platform specific ppgtt enable
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:59 +01:00
Ben Widawsky
e3cc19957f drm/i915: One hopeful eviction on PPGTT alloc
The patch before this changed the way in which we allocate space for the
PPGTT PDEs. It began carving out the PPGTT PDEs (which live in the
Global GTT) from the GGTT's drm_mm. Prior to that patch, the PDEs were
hidden from the drm_mm, and therefore could never fail to be allocated.

In unfortunate cases, the drm_mm may be full when we want to allocate
the space. This can technically occur whenever we try to allocate, which
happens in two places currently. Practically, it can only really ever
happen at GPU reset.

Later, when we allocate more PDEs for multiple PPGTTs this will
potentially even more useful.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:58 +01:00
Ben Widawsky
c8d4c0d668 drm/i915: Use drm_mm for PPGTT PDEs
When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.

Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.

The first step here is to make the existing single PPGTT use the
allocator.

The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this current patch only the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space" The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.

The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.

v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)

v3: Use Chris' top down allocator

v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)

v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)

v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region

v7: Undo v3, so we can make the drm patch last in the series

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:57 +01:00
Ben Widawsky
a3d67d2396 drm/i915: PPGTT vfuncs should take a ppgtt argument
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:56 +01:00
Ben Widawsky
6f65e29aca drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.

What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.

drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage

drm/i915: Add bind/unbind object functions to VMA

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding.  This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)

v5: Update the comment to not suck (Chris)

v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use the new vm [un]bind functions

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

At this point the original mailing list thread diverges. ie.

v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound Bos
Properly handle secure dispatch
Rebased on vma bind/unbind conversion

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: reduce vm->insert_entries() usage

FKA: drm/i915: eliminate vm->insert_entries()

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.

v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:50 +01:00
Ben Widawsky
e178f7057b drm/i915: Provide PDP updates via MMIO
The initial implementation of this function used MMIO to write the PDPs.
Upon review it was determined (correctly) that the docs say to use LRI.
The issue is there are times where we want to do a synchronous write
(GPU reset).

I've tested this, and it works. I've verified with as many people as
possible that it should work.

This should fix the failing reset problems.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:43 +01:00
Dave Airlie
62a3a12667 Merge branch 'bdw-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
As promised bdw fixes come separate for now. Just a few minior things.

* 'bdw-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915/bdw: PIPE_[BC] I[ME]R moved to powerwell
  drm/i915/bdw: Limit GTT to 2GB
  drm/i915/bdw: Add comment about gen8 HWS PGA
  drm/i915/bdw: Free correct number of ppgtt pages
  drm/i915/bdw: Do gen6 style reset for gen8
  drm/i915/bdw: GEN8 backlight support
  drm/i915/bdw: Add BDW to ULT macro
2013-12-12 10:38:08 +10:00
Ben Widawsky
5ed1678206 drm/i915: Move the gtt mm takedown to cleanup
Our VM code already has a cleanup function, and this is a nice place to
put the drm_mm_takedown. This should have no functional impact, it just
leaves the unload function a bit cleaer, and is more logical IMO

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:21:27 +01:00
Ben Widawsky
686e1f6f87 drm/i915: Add a few missed bits to the mm
This should really have been added in BDW integration, as well as:

commit 93bd8649db
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Tue Jul 16 16:50:06 2013 -0700

    drm/i915: Put the mm in the parent address space

It didn't really matter before, but it will in the future.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:06:22 +01:00
Ben Widawsky
d595bd4bbd drm/i915: Fix BDW PPGTT error path
When we fail for some reason on loading the PDPs, it would be wise to
disable the PPGTT in the ring registers. If we do not do this, we have
undefined results.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 09:59:05 +01:00
Chris Wilson
c51e9701c4 drm/i915: Prefer setting PTE cache age to 3
We have conflicting benchmark data that suggest either age 0 or age 3 is
better. However, the earlier benchmark on which we based the switch to
age 0

(commit 0d8ff15e9a
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Jul 4 11:02:03 2013 -0700

    drm/i915/hsw: Set correct Haswell PTE encodings)

actually seems to prefer the default PTE encoding as age 3. Presumably,
this is in part due to the use of MOCS to override the PTE encodings
when appropriate.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69870
Tested-by: mengmeng.meng@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Eric Anholt <eric@anholt.net
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-25 09:50:15 +01:00
Daniel Vetter
c09cd6e969 Merge branch 'backlight-rework' into drm-intel-next-queued
Pull in Jani's backlight rework branch. This was merged through a
separate branch to be able to sort out the Broadwell conflicts
properly before pulling it into the main development branch.

Conflicts:
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-15 10:02:39 +01:00
Ben Widawsky
3a2ffb65ee drm/i915/bdw: Limit GTT to 2GB
Because of the way in which we're allocating the pages for the Aliasing
PPGTT, we cannot actually successfully alloc enough space for anything
greater than 2GB.

Instead of a quick hack to fix this, we should defer until we have the
real solution in place (allocating much less contiguous space).

This wasn't found sooner because we didn't not have any systems
supporting more than a 2GB GTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:11 +01:00
Ben Widawsky
230f955f73 drm/i915/bdw: Free correct number of ppgtt pages
I am unclear how this got messed up in the shuffle, but it did.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:10 +01:00
Jesse Barnes
b53c8c3577 drm/i915: drop duplicate ggtt vma list add in setup_global_gtt
Preallocated objects will already have been added to the vma_list when
creating their ggtt vma entry, and coincidentally also marked as holding
a ggtt mapping. Repeating the vma_list manipulation when setting up the
ggtt after preallocation is a recipe for an unhappy kernel.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Use the improve commit message suggest by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-13 01:05:01 +01:00
Ville Syrjälä
b42218c19f drm/i915/bdw: Don't muck with gtt_size on Gen8 when PPGTT setup fails
v2: Resolve rebase conflicts and switch to gen < 8 color for GenX
checking.

v3: Rebase on top of the address space refactoring.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:49 +01:00
Ben Widawsky
28cf541543 drm/i915/bdw: unleash PPGTT
v2: Squash in fix from Ben: Set PPGTT batches as necessary

This fixes the regression in the last couple of days when we enabled
PPGTT.

v3: Squash in fixup to still use GTT for secure batches from Ville:

BDW doesn't have a separate secure vs. non-secure bit in
MI_BATCH_BUFFER_START. So for secure batches we have to simply
leave the PPGTT bit unset. Fortunately older generations (except
HSW) had similar limitations so execbuffer already creates a GTT
mapping for all secure batches.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:48 +01:00
Ben Widawsky
94e409c144 drm/i915/bdw: Implement PPGTT enable
Legacy PPGTT on GEN8 requires programming 4 PDP registers per ring.
Since all rings are using the same address space with the current code
the logic is simply to program all the tables we've setup for the PPGTT.

v2: Turn on PPGTT in GFX_MODE

v3: v2 was the wrong patch

v4: Resolve conflicts due to patch series reordering.

v5: Squash in fixup from Ben: Use LRI to write PDPs

The docs (and simulator seems to back up) suggest that we can only
program legacy PPGTT PDPs with LRI commands.

v6: Rebase around context differences conflicts.

v7: Use #defines for per ring PDPs. (Damien)

v8: Don't use typede'f private_t.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (up to v3 and v7)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky
9df15b499b drm/i915/bdw: Implement PPGTT insert
GEN8 insertion is very similar to GEN6.

v2: Rebase on top of Imre's for_each_sg_page helpers.

v3: Fixup my conversion (spotted by Ville).

v4: Rebase on top of the address space refactoring.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky
459108b8cc drm/i915/bdw: Implement PPGTT clear range
GEN8 PPGTT range clearing is very similar to GEN6 if we assume that our
PDEs are all valid, which they should be.

v2: Rebase on top of the address space refactoring.

v3: Rebase on top of the bool use_scratch addition to the clear_range interface.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky
b1fe667329 drm/i915/bdw: Initialize the PDEs
The upcoming clear and insert routines will expect that PDEs all point
to valid Page Directories. Doing that lazily doesn't really buy us
anything.

The page allocation is done regardless earlier in init so it shouldn't
hurt set the PDEs.

v2: Squash in patches to implement fixed PDE write function:

- If I had done this in the first place, the bug that's going to be
  fixed in an upcoming patch would have been much easier to find.

- Use WB for PDEs.

  The PAT bit is used for page size. 2ME PDEs aren't even supported in
  BDW, so this was completely invalid. The solution is to make our
  PDEs WB+LLC instead of the pervious WB+eLLC. As far as I can guess,
  this change won't matter for performance.

  Thanks to Ville for the quick correction when discussing on IRC.

v3: Return the pde type for pde encoding (Damien)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky
37aca44ad5 drm/i915/bdw: PPGTT init & cleanup
Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.

Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.

v2: Move vtable initialization

v3: Resolve conflicts due to patch series reordering.

v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.

v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky
fbe5d36e77 drm/i915/bdw: Support BDW caching
BDW caching works differently than the previous generations. Instead of
having bits in the PTE which directly control how the page is cached,
the 3 PTE bits PWT PCD and PAT provide an index into a PAT defined by
register 0x40e0. This style of caching is functionally equivalent to how
it works on HSW and before.

v2: Tiny bikeshed as discussed on internal irc.

v3: Squash in patch from Ville to mirror the x86 PAT setup more like
in arch/x86/mm/pat.c. Primarily, the 0th index will be WB, and not
uncached.

v4: Comment for reason to not use a 64b write on the PPAT.

v5: Add a FIXME comment that the caching bits in the PAT registers
might be wrong due to doc confusion.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky
94ec8f6130 drm/i915/bdw: Add GTT functions
With the PTE clarifications, the bind and clear functions can now be
added for gen8.

v2: Use for_each_sg_pages in gen8_ggtt_insert_entries.

v3: Drop dev argument to pte encode functions, upstream lost it. Also
rebase on top of the scratch page movement.

v4: Rebase on top of the new address space vfuncs.

v5: Add the bool use_scratch argument to clear_range and the bool valid argument
to the PTE encode function to follow upstream changes.

v6: Add a FIXME(BDW) about the size mismatch of the readback check
that Jon Bloomfield spotted.

v7: Squash in fixup patch from Ben for the posting read to match the
64bit ptes and so shut up the WARN.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky
d31eb10e6c drm/i915/bdw: Create gen8_gtt_pte_t
With gen6 PTE type in place, pave the way for the new gen8 type.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky
63340133f3 drm/i915/bdw: Make gen8_gmch_probe
Probing gen8 is similar to gen6. To make the code cleaner and more
maintainable however we can use the probe functions to split it out.

v2: Rebased on top of update gtt probe infrastructure.

v3: Rebased on top of Kenneth' Graunke's ->pte_encode refactoring.

V4: Resolve conflicts with Ben's latest ppgtt patches, also switch to
gen < 8 testing instead of gen <= 7.

v5: Resolve conflicts with address space vfunc changes in upstream.

v6: Use 39b DMA mask. At least, for this mode, it is the correct mask.
(Imre)

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:43 +01:00
Ben Widawsky
9459d25237 drm/i915/bdw: support GMS and GGMS changes
All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:39 +01:00
Daniel Vetter
8fe6bd239a drm/i915/bdw: Disable PPGTT for now
This will be changed once the gen8 code is fully implemented.

v2: Use ENOSYS instead of ENXIO as suggested by Chris.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:35 +01:00
Daniel Vetter
7f16e5c141 Merge tag 'v3.12' into drm-intel-next
I want to merge in the new Broadwell support as a late hw enabling
pull request. But since the internal branch was based upon our
drm-intel-nightly integration branch I need to resolve all the
oustanding conflicts in drm/i915 with a backmerge to make the 60+
patches apply properly.

We'll propably have some fun because Linus will come up with a
slightly different merge solution.

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_drv.c
	drivers/gpu/drm/i915/intel_crt.c
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_dp.c
	drivers/gpu/drm/i915/intel_drv.h

All rather simple adjacent lines changed or partial backports from
-next to -fixes, with the exception of the thaw code in i915_dma.c.
That one needed a bit of shuffling to restore the intent.

Oh and the massive header file reordering in intel_drv.h is a bit
trouble. But not much.

v2: Also don't forget the fixup for the silent conflict that results
in compile fail ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-04 16:28:52 +01:00
Ben Widawsky
828c79087c drm/i915: Disable GGTT PTEs on GEN6+ suspend
Once the machine gets to a certain point in the suspend process, we
expect the GPU to be idle. If it is not, we might corrupt memory.
Empirically (with an early version of this patch) we have seen this is
not the case. We cannot currently explain why the latent GPU writes
occur.

In the technical sense, this patch is a workaround in that we have an
issue we can't explain, and the patch indirectly solves the issue.
However, it's really better than a workaround because we understand why
it works, and it really should be a safe thing to do in all cases.

The noticeable effect other than the debug messages would be an increase
in the suspend time. I have not measure how expensive it actually is.

I think it would be good to spend further time to root cause why we're
seeing these latent writes, but it shouldn't preclude preventing the
fallout.

NOTE: It should be safe (and makes some sense IMO) to also keep the
VALID bit unset on resume when we clear_range(). I've opted not to do
this as properly clearing those bits at some later point would be extra
work.

v2: Fix bugzilla link

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321
Tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-By: Todd Previte <tprevite@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:44:47 +02:00
Ben Widawsky
b35b380ed4 drm/i915: Make PTE valid encoding optional
We need this to work around a corruption when the boot kernel image
loads the hibernated kernel image from swap on Haswell systems -
somehow not everything is properly shut off.

This is just the prep work, the next patch will implement the actual
workaround.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a commit message suitable for -fixes and add cc: stable]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:40:21 +02:00
Daniel Vetter
a1e2265332 drm/i915: Use kcalloc more
No buffer overflows here, but better safe than sorry.

v2:
- Fixup the sizeof conversion, I've missed the pointer deref (Jani).
- Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani).
- Use kmalloc_array for the execbuf fastpath to avoid the memset
  (Chris). I've opted to leave all other conversions as-is since they
  aren't in a fastpath and dealing with cleared memory instead of
  random garbage is just generally nicer.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the contentious kmalloc_array hunk in execbuf.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-01 07:45:01 +02:00
Chris Wilson
651d794fae drm/i915: Use Write-Through cacheing for the display plane on Iris
Haswell GT3e has the unique feature of supporting Write-Through cacheing
of objects within the eLLC/LLC. The purpose of this is to enable the display
plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
that we, in theory, get the best of both worlds, perfect display and fast
access.

However, we still need to be careful as the CPU does not see the WT when
accessing the cache. In particular, this means that we need to flush the
cache lines after writing to an object through the CPU, and on
transitioning from a cached state to WT.

v2: Actually do the clflush on transition to WT, nagging by Ville.
v3: Flush the CPU cache after writes into WT objects.
v4: Rease onto LLC updates and report WT as "uncached" for
get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:38 +02:00
Chris Wilson
2c22569bba drm/i915: Update rules for writing through the LLC with the cpu
As mentioned in the previous commit, reads and writes from both the CPU
and GPU go through the LLC. This gives us coherency between the CPU and
GPU irrespective of the attribute settings either device sets. We can
use to avoid having to clflush even uncached memory.

Except for the scanout.

The scanout resides within another functional block that does not use
the LLC but reads directly from main memory. So in order to maintain
coherency with the scanout, writes to uncached memory must be flushed.
In order to optimize writes elsewhere, we start tracking whether an
framebuffer is attached to an object.

v2: Use pin_display tracking rather than fb_count (to ensure we flush
cursors as well etc) and only force the clflush along explicit writes to
the scanout paths (i.e. pin_to_display_plane and pwrite into scanout).

v3: Force the flush after hitting the slowpath in pwrite, as after
dropping the lock the object's cache domain may be invalidated. (Ville)

Based on a patch by Ville Syrjälä.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-10 11:20:49 +02:00
Chris Wilson
350ec881d9 drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge
MLC_LLC was never validated for Sandybridge and was superseded by a new
level of cacheing for the GPU in Ivybridge. Update our names to be
consistent with usage, and in the process stop setting the unwanted bit
on Sandybridge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/BUG/WARN_ON(1) bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-06 16:35:30 +02:00
Ben Widawsky
40d74980d3 drm/i915: Use ggtt_vm to save some typing
Just some small cleanups, and a rename of vm->ggtt_vm requested by
Daniel.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:10 +02:00
Ben Widawsky
a70a3148b0 drm/i915: Make proper functions for VMs
Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debug and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:08 +02:00
Ben Widawsky
87a6b688cc drm/i915/hsw: Change default LLC age to 3
The default LLC age was changed:
commit 0d8ff15e9a
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Jul 4 11:02:03 2013 -0700

drm/i915/hsw: Set correct Haswell PTE encodings.

On the surface it would seem setting a default age wouldn't matter
because all GEM BOs are aged similarly, so the order in which objects
are evicted would not be subject to aging. The current working theory as
to why this caused a regression though is that LLC is a bit special in
that it is shared with the CPU. Presumably (not verified) the CPU
fetches cachelines with age 3, and therefore recently cached GPU objects
would be evicted before similar CPU object first when the LLC is full.
It stands to reason therefore that this would negatively impact CPU
bound benchmarks - but those seem to be low on the priority list.

eLLC OTOH does not have this same property as LLC. It should be used
entirely for the GPU, and so the age really shouldn't matter.
Furthermore, we have no evidence to suggest one is better than another
on eLLC. Since we've never properly supported eLLC before no, there
should be no regression. If the GPU client really wants "younger"
objects, they should use MOCS.

v2: Drop the extra #define (Chad)

v3: Actually git add

v4: Pimped commit message

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:06 +02:00
Chris Wilson
08c45263a6 drm/i915: Use the same pte_encoding for ppgtt as for gtt
The PTE layouts are the same for both ppgtt and gtt, so we can simplify
the setup for ppgtt by copying the encoding function pointer from gtt.
This prevents bugs where we update one function pointer, but forget the
other.

For instance,

commit 4d15c145a6
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jul 4 11:02:06 2013 -0700

    drm/i915: Use eLLC/LLC by default when available

only extends the gtt to use eLLC/LLC cacheing and forgets to also update
the ppgtt function pointer.

v2: Actually mention the bug being fixed (Kenneth)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-04 21:29:57 +02:00
Ben Widawsky
2f63315692 drm/i915: Create VMAs
Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
an area of part of the VM address space. A VMA is similar to the concept
in the linux mm. However, instead of representing regular memory, a VMA
is backed by a GEM BO. There may be many VMAs for a given object, one
for each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

v5: Free vma on error path (Imre)

v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
(Imre)
Fixed vma freeing in stolen preallocation (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Squash in fixup from Ben to not deref a non-existing vma in
set_cache_level, reported by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-18 08:46:13 +02:00
Ben Widawsky
93bd8649db drm/i915: Put the mm in the parent address space
Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction; but we'd hope one day this becomes
almost irrelvant.

v2: Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:23:43 +02:00
Ben Widawsky
853ba5d223 drm/i915: Move gtt and ppgtt under address space umbrella
The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
many of their characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up, it just fits in with the grand scheme of abstracting the GPU
VM operations. GGTT will usually be a special case where we either know
an object must be in the GGTT (dislay engine, workarounds, etc.).

The scratch page is left as part of the VM (even though it's currently
shared with the ppgtt code) because in the future when we have Full
PPGTT, I intend to create a separate scratch page for each.

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

v3: Properly share scratch page (Imre)
Finish commit message (Daniel, Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:21:47 +02:00
Ben Widawsky
4d15c145a6 drm/i915: Use eLLC/LLC by default when available
DRI clients really should be using MOCS to get fine grained streaming
cache controls. With that note, I *hope* that this patch doesn't improve
performance overwhelmingly, because if it does - it means there is a
problem elsewhere.

In any case, the kernel, and old userspace should get some benefit from
this, so let's do it. eLLC is always a good default, and really not
using it is the special case for MOCS.

References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 08:08:39 +02:00
Ben Widawsky
0d8ff15e9a drm/i915/hsw: Set correct Haswell PTE encodings.
The cacheability controls have changed, and the bits have been
rearranged in general.

Note that age 0 is the oldest (most likely to get evicted) and age 3
is the youngest (most likely to stick around for a bit). We've picked
0 for no reason, but atm it shouldn't matter anyway (since we don't
yet try to differentiate between different objects).

v2: Remove comments for snb/ivb cache leves, that's a separate change.

v3: Resolve conflicts due to patch series reordering.

v4: Rebased on top of Kenneth Graunke's ->pte_encode refactoring.

v5: Removed eLLC bits for separate patch.

In the internal repository this was:
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add comment about cache ages as requested by Ben provoked due
to a question from Damien.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 07:57:42 +02:00
Ben Widawsky
c6cfb32567 drm/i915: Embed drm_mm_node in i915 gem obj
Embedding the node in the obj is more natural in the transition to VMAs
which will also have embedded nodes. This change also helps transition
away from put_block to remove node.

Though it's quite an uncommon occurrence, it's somewhat convenient to not
fail at bind time because we cannot allocate the node. Though in
practice there are other allocations (like the request structure) which
would probably make this point not terribly useful.

Quoting Daniel:
Note that the only difference between put_block and remove_node is
that the former fills up the preallocation cache. Which we don't need
anyway and hence is just wasted space.

v2: Clean up the stolen preallocation code.
Rebased on the reserve_node patches
renames ggtt_ stuff to gtt_ stuff
WARN_ON if the object is already bound (which doesn't mean it's in the
bound list, tricky)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:36 +02:00
Ben Widawsky
edd41a870f drm/i915: Kill obj->gtt_offset
With the getters in place from the previous patch this members serves no
purpose other than saving one spare pointer chase, which will be killed
in the next patch anyway.

Moving to VMAs, this members adds unnecessary confusion since an object
may exist at different offsets in different VMs.

v2: Properly preserve the stolen offset. This code is a bit hacky but it
all goes away when we embed the drm_mm_node and removes the need for the
incorrect patch I submitted previously: "Use gtt_space->start for stolen
reservation"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:35 +02:00
Ben Widawsky
f343c5f647 drm/i915: Getter/setter for object attributes
Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:34 +02:00
Ben Widawsky
338710e7af drm: Change create block to reserve node
With the previous patch we no longer actually create a node, we simply
find the correct hole and occupy it. This very well could have been
squashed with the last patch, but since I already had David's review, I
figured it's easiest to keep it distinct.

Also update the users in i915. Conveniently this is the only user of the
interface.

CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:33 +02:00
Ben Widawsky
b3a070cccb drm: pre allocate node for create_block
For an upcoming patch where we introduce the i915 VMA, it's ideal to
have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
will break a bunch of code, but amongst them are 2 callers of
drm_mm_create_block(), both related to stolen memory.

It also allows us to embed the drm_mm_node into the object currently
which provides a nice transition over to the new code.

v2: Reordered to do before ripping out obj->gtt_offset.
Some minor cleanups made available because of reordering.

v3: s/continue/break on failed stolen node allocation (David)
Set obj->gtt_space on failed node allocation (David)
Only unref stolen (fix double free) on failed create_stolen (David)
Free node, and NULL it in failed create_stolen (David)
Add back accidentally removed newline (David)

CC: <dri-devel@lists.freedesktop.org>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:32 +02:00
Ben Widawsky
b2f21b4dfd drm/i915: Use gtt shortform where possible
Just for compactness.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:59 +02:00
Ben Widawsky
80a74f7f9c drm/i915: Drop dev from pte_encode
The original pte_encode function needed the dev argument so we could do
platform specific handling via IS_GENX, etc. With the merging of a pte
encoding function there should never been a need to quirk away gen
specific details.

The patch doesn't do much but makes the upcoming reworks in gtt/ppgtt/mm
slightly (albeit, ever so) easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:59 +02:00
Ben Widawsky
6716724006 drm/i915: Combine scratch members into a struct
There isn't any special reason to do this other than it makes it obvious
that the two members are connected.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:58 +02:00
Ben Widawsky
84f1356058 drm/i915: Really share scratch page
A previous patch had set up the ppgtt and ggtt to use the same scratch
page, but still kept around both pointers. Kill it, it's not needed and
gets in our way for upcoming cleanups.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:57 +02:00
Ben Widawsky
6670a5a5c7 drm/i915: make PDE|PTE platform specific
Nothing outside of i915_gem_gtt.c and more specifically, the relevant
gen specific init function should need to know about number of PDEs, or
PTEs per PD. Exposing this will only lead to circumventing using the
upcoming VM abstraction.

To accomplish this, move the defines into the .c file, rename the PDE
define to be GEN6, and make the PTE count less of a magic number.

The remaining code in the global gtt setup is a bit messy, but an
upcoming patch will clean that one up.

v2: Don't hardcode number of PDEs (Daniel + Jesse)
Reworded commit message to reflect change.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:57 +02:00
Ben Widawsky
35c20a60c7 drm/i915: Rename the gtt_list to global_list
Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 10:51:14 +02:00
Daniel Vetter
e1b73cba13 Linux 3.10-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRmpexAAoJEHm+PkMAQRiGrRIH/1uWFW38RvaCV/PXm/ia6Z+x
 BfBJfBIvPxGwb4n7aQNQlhU25xkfrPZ6szO4WiBH5/KPH3xYi2I2OZ1AzffkYqMF
 BWkPmsPK6EsTdp16zsi6JtH2aXArG4SpYA7ZamPvDkmfigHuiZg7GlL/9eHTRPNV
 P7Q8JToOrcnP8RoGgNj0uFiQeQbc62Kmoq7WuPtUhVlpQCCCknXgOJiYgz9w6Xe9
 /i79YFS8WRrzAquExT1NbIOh4ZMqB9MvuroaVWy8JDDLUyz7QUvOCe3tCDNguwgi
 FdWvU6nfkdQq5SLaWCWXDE9Rp/pL1MvfBn9vCOwFcp42aw0aQ0PgJVIXvsqufd0=
 =jgDI
 -----END PGP SIGNATURE-----

Merge tag 'v3.10-rc2' into drm-intel-next-queued

Backmerge Linux 3.10-rc2 since the various (rather trivial) conflicts
grew a bit out of hand. intel_dp.c has the only real functional
conflict since the logic changed while dev_priv->edp.bpp was moved
around.

Also squash in a whitespace fixup from Ben Widawsky for
i915_gem_gtt.c, git seems to do something pretty strange in there
(which I don't fully understand tbh).

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_dp.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-21 09:52:16 +02:00
Ben Widawsky
c4ae25ecdf Revert "drm/i915: Calculate correct stolen size for GEN7+"
This reverts commit 03752f5b7b.

This revert requires a bit of explanation on how I understand things
work. Internally the architects/designers decide how the stolen encoding
works. We put it in a doc. BIOS writers take these docs and implement
it. Driver writers read the doc too, and read the value left by the BIOS
writers, and then we make magic.

The failing here is that in the docs we had[1] contained two different
definitions for this register for Gen7. (We have both a PCI register,
and an MMIO, and each of these were different). At the time [2] of
03752f5, we asked the architects what the correct value should be; but
that doesn't match the reality (BIOS) unfortunately.

So on all machines I can get my hands on, this revert is the right thing
to do. I've also worked with the product group to confirm that they
agree this revert is what we should do. People using HW made my "people"
who both write their own BIOS, and have access to our docs (Apple?).
Investigations are still ongoing about whether we need to add a list
of machines needing special handling, but this patch should be the
right thing for pretty much everyone.

[1] The docs are still wrong on this one. Now instead of two registers with
two definitions, we have one register with BOTH definitions, progress?
[2] The open source PRMs have the "wrong" definitions in chapter Volume
1 part6, section 1.1.12.

This digging was inspired by Paulo.

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Augment the patch saying that it's still a bit unclear
whether there are any machines out there with "wrong" firmware and
whether we need to add a list to handle them specially.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-07 18:59:09 +02:00
Ben Widawsky
3e30254205 drm/i915: Extract PDE writes
It also makes some sense IMO to have these two functions separate
irrespective of the number of callers.

Only the single caller for now, but that will change as we add more
PPGTTs.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:49:27 +02:00
Ben Widawsky
0a73287060 drm/i915: BUG_ON bad PPGTT offset
Because PPGTT PDEs within the GTT are calculated in cachelines
(HW guys consistency ftw) we do a divide which will wreak havoc if this
is wrong, and I know that from experience).

If/when we move to multiple PPGTTs this will have to become a WARN, and
return an error. For now however it should always be considered fatal,
and only a developer could hit it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: s/BUG/WARN]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:40:47 +02:00
Zhang, Xiong Y
43b27290dd drm/i915: correct the calculation of first_pd_entry_in_global_pt
When ppgtt is enabled, dev_priv->gtt.total has excluded the gtt space
occupied by ppgtt table in i915_gem_init_global_gtt() function. So the
calculation of first_pd_entry_in_global_pt doesn't need to subtract
I915_PPGTT_PD_ENTRIES again. Or else PPGTT directory table will be
destroyed by global gtt allocation.

This regression has been introduced in

commit a54c0c279f
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jan 24 14:45:00 2013 -0800

    drm/i915: remove intel_gtt structure

The breakage is pretty subtile since the old gtt_total_entries
included the pde range, whereas the new on did not.

Cc: stable@vger.kernel.org
Signed-off-by: Xiong Zhang<xiong.y.zhang@intel.com>
[danvet: Add regression citation and cc: stable. Thanks to Chris for
correcting my wrong guess about which commit broke things.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-27 14:07:16 +02:00
Kenneth Graunke
9119708cd4 drm/i915: Split out Haswell code from gen6_pte_encode.
Now that we have function pointers, it's cleaner to just create a new
per-platform PTE encoding function.

This should be identical in behavior to the previous code.

v2: Drop accidental inline keyword on hsw_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:44:21 +02:00
Kenneth Graunke
93c34e70eb drm/i915: Fix page table entries for Bay Trail.
On Bay Trail, bit 1 means "writeable by the GPU."  Failing to set that
means basically anything using the GPU will cause hangs.

v2: Drop accidental inline keyword on byt_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:44:11 +02:00
Kenneth Graunke
2d04befb94 drm/i915: Add PTE encoding function to the gtt/ppgtt vtables.
Sandybridge/Ivybridge, Bay Trail, and Haswell all have slightly
different page table entry formats.  Rather than polluting one function
with generation checks, simply use a function pointer and set up the
correct PTE encoding function at startup.

v2: Move the gen6_gtt_pte_t typedef to i915_drv.h so that the function
    pointers and implementations have identical signatures.  Also remove
    inline keyword on gen6_pte_encode.  Both suggested by Jani Nikula.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:20:15 +02:00