linux

Author	SHA1	Message	Date
Ville Syrjälä	c0b4c66031	drm/i915: Move VLV/CHV prepare_pll later With DPIO powergating active on CHV, we can't even access the DPIO PLL registers until the lane power state overrides have been enabled. That will happen from the encoder .pre_pll_enable() hook, so move chv_prepare_pll() to happen after that point, which puts it just before chv_enable_pll() actually. Do the same for VLV to avoid accumulating weird differences between the platforms. Both platforms seem happy with the new arrangement. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 10:22:51 +02:00
Ville Syrjälä	770effb19f	drm/i915: Add locking around chv_phy_control_init() dev_priv->chv_phy_control is protected by the power_domains->lock elsewhere, so also grab it when initializing chv_phy_control. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 10:22:37 +02:00
Ville Syrjälä	e27f299ec3	drm/i915: Move DPIO port init earlier To implement DPIO lane power gating on CHV we're going to need to access DPIO registers from the cmn power well enable hook. That gets called rather early, so we need to move the DPIO port IOSF sideband port assignment earlier as well. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 10:22:29 +02:00
Ville Syrjälä	d6db995fe3	drm/i915: Add encoder->post_pll_disable() hooks and move CHV clock buffer disables there Move the CHV clock buffer disable from chv_disable_pll() to the new encoder .post_pll_disable() hook. This is more symmetric since the clock buffer enable happens from the .pre_pll_enable() hook. We'll have more use for the new hook soon. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 10:22:19 +02:00
Ville Syrjälä	67fa24b404	drm/i915: Always program unique transition scale for CHV The docs give you the impression that the unique transition scale value shouldn't matter when unique transition scale is enabled. But as Imre found on BXT (and I verfied also on BSW) the value does matter. So from now on just program the same value 0x9a always. Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 10:18:41 +02:00
Ville Syrjälä	25a25dfce4	drm/i915: Always program m2 fractional value on CHV When fractional m2 divider isn't used on CHV the fractional part is ignore by the hardware. Despite that, program the fractional value (0 in this case) to the hardware register just to keep things a bit more consistent. Might at least make register dumps a bit less confusing when there isn't some stale fractional part hanging around. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 10:18:33 +02:00
Dave Gordon	4eee4920f0	drm/i915: fix driver's versions of WARN_ON & WARN_ON_ONCE The current versions of these two macros don't work correctly if the argument expression happens to contain a modulo operator (%) -- when stringified, it gets interpreted as a printf formatting character! With a specifically crafted parameter, this could probably cause a kernel OOPS; consider WARN_ON(p%s) or WARN_ON(f %*pEp). Instead, we should use an explicit "%s" format, with the stringified expression as the coresponding literal-string argument. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 09:59:28 +02:00
Ville Syrjälä	901c2daf05	drm/i915: Put back lane_count into intel_dp and add link_rate too With MST there won't be a crtc assigned to the main link encoder, so trying to dig up the pipe_config from there is a recipe for an oops. Instead store the parameters (lane_count and link_rate) in the encoder, and use those values during link training etc. Since those parameters are now assigned only when the link is actually enabled, .compute_config() won't clobber them as it did before. Hardware state readout is still bonkers though as we don't transfer the link parameters from pipe_config intel_dp. We should do that during encoder sanitation. But since we don't even do a proper job of reading out the main link encoder state for MST there's littel point in worrying about this now. Fixes a regression with MST caused by: commit `90a6b7b052` Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Mon Jul 6 16:39:15 2015 +0300 drm/i915: Move intel_dp->lane_count into pipe_config v2: Different apporoach that should keep intel_dp_check_mst_status() somewhat less oopsy Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 09:58:19 +02:00
Imre Deak	e5756c10d8	drm/i915/bxt: don't allow cached GEM mappings on A stepping Due to a coherency issue on BXT A steppings we can't guarantee a coherent view of cached (CPU snooped) GPU mappings, so fail such requests. User space is supposed to fall back to uncached mappings in this case. v2: - limit the WA to A steppings, on later stepping this HW issue is fixed v3: - return error instead of trying to work around the issue in kernel, since that could confuse user space (Chris) Testcast: igt/gem_store_dword_batches_loop/cached-mapping Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 09:39:14 +02:00
Imre Deak	319404df2f	drm/i915/bxt: work around HW coherency issue when accessing GPU seqno By running igt/store_dword_loop_render on BXT we can hit a coherency problem where the seqno written at GPU command completion time is not seen by the CPU. This results in __i915_wait_request seeing the stale seqno and not completing the request (not considering the lost interrupt/GPU reset mechanism). I also verified that this isn't a case of a lost interrupt, or that the command didn't complete somehow: when the coherency issue occured I read the seqno via an uncached GTT mapping too. While the cached version of the seqno still showed the stale value the one read via the uncached mapping was the correct one. Work around this issue by clflushing the corresponding CPU cacheline following any store of the seqno and preceding any reading of it. When reading it do this only when the caller expects a coherent view. v2: - fix using the proper logical && instead of a bitwise & (Jani, Mika) - limit the workaround to A stepping, on later steppings this HW issue is fixed v3: - use a separate get_seqno/set_seqno vfunc (Chris) Testcase: igt/store_dword_loop_render Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 09:39:13 +02:00
Rodrigo Vivi	8be6ca8537	drm/i915: Also call frontbuffer flip when disabling planes. We also need to call the frontbuffer flip to trigger proper invalidations when disabling planes. Otherwise we will miss screen updates when disabling sprites or cursor. On core platforms where HW tracking also works, this issue is totally masked because HW tracking triggers PSR exit however on VLV/CHV that has only SW tracking we miss screen updates when disabling planes. It was caught with kms_psr_sink_crc sprite_plane_onoff and cursor_plane_onoff subtests running on VLV/CHV. This is probably a regression since I can also get this with the manual test case, but with so many changes on atomic modeset I couldn't track exactly when this was introduced. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 08:46:41 +02:00
Arun Siluvery	f1afe24f0e	drm/i915: Change SRM, LRM instructions to use correct length MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects (reg, addr) pairs so use fixed length for these instructions. v2: rebase Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems terminally unhappy about i915_cmd_parser.c so that would be a separate patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-26 08:44:41 +02:00
Graham Whaley	cff4f55bf4	doc: drm: Fix mis-spelling of i915_guc_submission includes In commit `d1675198e`: drm/i915: Integrate GuC-based command submission the drm.tmpl include lines reference the intel_guc_submission.c but the patch adds the file i915_guc_submission.c. drm.tmpl fails to build with: docproc: .//drivers/gpu/drm/i915/intel_guc_submission.c: No such file or directory Change the file reference to the actual file. Signed-off-by: Graham Whaley <graham.whaley@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-25 11:23:55 +02:00
Jani Nikula	66e2806656	drm/i915: remove excessive scaler debugging messages There's so much scaler debugging messages that it makes other debugging hard. Remove them. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:45 +02:00
Dave Gordon	8b417c266b	drm/i915: Debugfs interface for GuC submission statistics This provides a means of reading status and counts relating to GuC actions and submissions. v2: Remove surplus blank line in output [Chris Wilson] v5: Added GuC per-engine submission & seqno statistics v6: Add per-ring statistics to client, refactor client-dumper. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Alex Dai <yu.dai@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:45 +02:00
Alex Dai	d1675198ed	drm/i915: Integrate GuC-based command submission GuC-based submission is mostly the same as execlist mode, up to intel_logical_ring_advance_and_submit(), where the context being dispatched would be added to the execlist queue; at this point we submit the context to the GuC backend instead. There are, however, a few other changes also required, notably: 1. Contexts must be pinned at GGTT addresses accessible by the GuC i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. 2. The GuC's TLB must be invalidated after a context is pinned at a new GGTT address. 3. GuC firmware uses the one page before Ring Context as shared data. Therefore, whenever driver wants to get base address of LRC, we will offset one page for it. LRC_PPHWSP_PN is defined as the page number of LRCA. 4. In the work queue used to pass requests to the GuC, the GuC firmware requires the ring-tail-offset to be represented as an 11-bit value, expressed in QWords. Therefore, the ringbuffer size must be reduced to the representable range (4 pages). v2: Defer adding #defines until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v4: Squashed kerneldoc patch into here [Daniel Vetter] v5: Update request->tail in code common to both GuC and execlist modes. Add a private version of lr_context_update(), as sharing the execlist version leads to race conditions when the CPU and the GuC both update TAIL in the context image. Conversion of error-captured HWS page to string must account for offset from start of object to actual HWS (LRC_PPHWSP_PN). Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:44 +02:00
Dave Gordon	4df001d398	drm/i915: Interrupt routing for GuC submission Turn on interrupt steering to route necessary interrupts to GuC. v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:43 +02:00
Dave Gordon	44a28b1d36	drm/i915: Implementation of GuC submission client A GuC client has its own doorbell and workqueue. It maintains the doorbell cache line, process description object and work queue item. A default guc_client is created for the i915 driver to use for normal-priority in-order submission. Note that the created client is not yet ready for use; doorbell allocation will fail as we haven't yet linked the GuC's context descriptor to the default contexts for each ring (see later patch). v2: Defer adding structure members until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v5: Add GuC per-engine submission & seqno statistics. Move wq locking to encompass both get_space() and add_item(). Take forcewake lock in host2guc_action() [Tom O'Rourke] v6: Fix GuC doorbell cacheline selection code (the cacheline-within-page calculation was wrong). Rename GuC priorities to make them closer to the names used in the GuC firmware source, matching what the autogenerated versions will (probably) be. Add per-ring statistics to client. Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:42 +02:00
Alex Dai	4c7e77fc10	drm/i915: Enable GuC firmware log Allocate a GEM object to hold GuC log data. A debugfs interface (i915_guc_log_dump) is provided to print out the log content. v2: Add struct members at point of use [Chris Wilson] v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:42 +02:00
Alex Dai	bac427f8ab	drm/i915: Prepare for GuC-based command submission This adds the first of the data structures used to communicate with the GuC (the pool of guc_context structures). We create a GuC-specific wrapper round the GEM object allocator as all GEM objects shared with the GuC must be pinned into GGTT space at an address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT addresses is not accessible to the GuC (from the GuC's point of view, it's permanently reserved for other objects such as the BootROM & SRAM). Later, we will need to allocate additional GuC-sharable objects for the submission client(s) and the GuC's debug log. v2: Remove redundant initialisation [Chris Wilson] Defer adding struct members until needed [Chris Wilson] Local functions should pass dev_priv rather than dev [Chris Wilson] v5: Invalidate GuC TLB after allocating and pinning a new object v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:41 +02:00
Dave Gordon	919f1f55d9	drm/i915: Expose one LRC function for GuC submission mode GuC submission is basically execlist submission, but with the GuC handling the actual writes to the ELSP and the resulting context switch interrupts. So to describe a context for submission via the GuC, we need one of the same functions used in execlist mode. This commit exposes one such function, changing its name to better describe what it does (it's related to logical ring contexts rather than to execlists per se). v2: Replaces previous "drm/i915: Move execlists defines from .c to .h" v3: Incorporates a change to one of the functions exposed here that was previously part of an internal patch, but which was omitted from the version recently committed to drm-intel-nightly: `7a01a0a` drm/i915/lrc: Update PDPx registers with lri commands So we reinstate this change here. v4: Drop v3 change, update function parameters due to collision with `8ee3615` drm/i915: Convert execlists_ctx_descriptor() for requests v5: Don't expose execlists_update_context() after all. The current version is no longer compatible with GuC submission; trying to share the execlist version of this function results in both GuC and CPU updating TAIL in the context image, with bad results when they get out of step. The GuC submission path now has its own private version that just updates the ringbuffer start address, and not TAIL or PDPx. v6: Rebased Issue: VIZ-4884 Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:40 +02:00
Alex Dai	fdf5d3572f	drm/i915: Debugfs interface to read GuC load status The new node provides access to the status of the GuC-specific loader; also the scratch registers used for communication between the i915 driver and the GuC firmware. v2: Changes to output formats per Chris Wilson's suggestions v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:40 +02:00
Alex Dai	33a732f407	drm/i915: GuC-specific firmware loader This fetches the required firmware image from the filesystem, then loads it into the GuC's memory via a dedicated DMA engine. This patch is derived from GuC loading work originally done by Vinit Azad and Ben Widawsky. v2: Various improvements per review comments by Chris Wilson v3: Removed 'wait' parameter to intel_guc_ucode_load() as firmware prefetch is no longer supported in the common firmware loader, per Daniel Vetter's request. Firmware checker callback fn now returns errno rather than bool. v4: Squash uC-independent code into GuC-specifc loader [Daniel Vetter] Don't keep the driver working (by falling back to execlist mode) if GuC firmware loading fails [Daniel Vetter] v5: Clarify WOPCM-related #defines [Tom O'Rourke] Delete obsolete code no longer required with current h/w & f/w [Tom O'Rourke] Move the call to intel_guc_ucode_init() later, so that it can allocate GEM objects, and have it fetch the firmware; then intel_guc_ucode_load() doesn't need to fetch it later. [Daniel Vetter]. v6: Update comment describing intel_guc_ucode_load() [Tom O'Rourke] Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:39 +02:00
Ville Syrjälä	04a60f9ffa	drm/i915: Kill intel_dp->{link_bw, rate_select} We only need the link_bw/rate_select parameters when starting link training, and they should be computed based on the currently active config, so throw them out from intel_dp and just compute on demand. Toss in an extra debug print to see rate_select in addition to link_bw, as the latter may be 0 for eDP 1.4. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:38 +02:00
Ville Syrjälä	a79b8165be	drm/i915: Don't use link_bw to select between TP1 and TP3 intel_dp->link_bw is going away, so consul the port_clock instead when choosing between TP1 and TP3. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:38 +02:00
Ville Syrjälä	90a6b7b052	drm/i915: Move intel_dp->lane_count into pipe_config Currently we clobber intel_dp->lane_count in compute config, which means after a rejected modeset we may no longer be able to retrain the current link. Move lane_count into pipe_config to avoid that. v2: Add missing ':' to the pipe config debug dump Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:37 +02:00
Ville Syrjälä	b81e34c29e	drm/i915: Avoid confusion between DP and TRANS_DP_CTL in DP .get_config() Use a separate variable for the TRANS_DP_CTL value instead of reusing 'tmp' that otherwise contains the DP port register value. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:36 +02:00
Ville Syrjälä	96f3f1f905	drm/i915: Don't pass clock to DDI PLL select functions All the *_ddi_pll_select() functions get passed the port_clock and pipe config as parameters. We only need to pass the pipe config, and the functions can dig up the port_clock themselves. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:36 +02:00
Ville Syrjälä	840b32b7ed	drm/i915: Don't use link_bw for PLL setup Use port_clock instead of link_bw when picking the PLL parameters for DP. link_bw may be zero with an eDP 1.4 sink that supports DP_LINK_RATE_SET so we shouldn't use it for anything other than feed it to the sink appropriately. v2: Fix typo in commit message (Sivakumar) Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:35 +02:00
Ville Syrjälä	0f2a2a756e	drm/i915: Clean up DP/HDMI limited color range handling Currently we treat intel_{dp,hdmi}->color_range as partly user controller value (via the property) but we also change it during .compute_config() when using the "Automatic" mode. That is a bit confusing, so let's just change things so that we store the user property values in intel_dp, and only change what's stored in pipe_config during .compute_config(). There should be no functional change. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:34 +02:00
Chris Wilson	908565c208	drm/i915: Do not check or a stalled pageflip prior to it being queued When we queue the command or operation to change the scanout address, we mark the flip as in progress. We can use this flag to prevent us from checking for a stalled flip prior to its existence! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:34 +02:00
Ville Syrjälä	ed75a55bb3	drm/i915: clflush on pin_to_display after pwrite to UC bo in LLC Currently we don't clflush on pin_to_display if the bo is already UC/WT and is not in the CPU write domain. This causes problems with pwrite since pwrite doesn't change the write domain, and it avoids clflushing on UC/WT buffers on LLC platforms unless the buffer is currently being scanned out. Fix the problem by marking the cache dirty and adjusting i915_gem_object_set_cache_level() to clflush when the cache is dirty even if the cache_level doesn't change. My last attempt [1] at fixing this via write domain frobbing was shot down, but now with the cache_dirty flag we can do things in a nicer way. [1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html v2: Drop the I915_CACHE_NONE/WT checks from pwrite Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422 Testcase: igt/kms_pwrite_crc Testcase: igt/gem_pwrite_snooped Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:33 +02:00
Sonika Jindal	cf1d58833f	drm/i915/bxt: WA for swapped HPD pins in A stepping WA for BXT A0/A1, where DDIB's HPD pin is swapped to DDIA, so enabling DDIA HPD pin in place of DDIB. v2: For DP, irq_port is used to determine the encoder instead of hpd_pin and removing the edp HPD logic because port A HPD is not present(Imre) v3: Rebased on top of Imre's patchset for enabling HPD on PORT A. Added hpd_pin swapping for intel_dp_init_connector, setting encoder for PORT_A as per the WA in irq_port (Imre) v4: Dont enable interrupt for edp, also reframe the description (Siva) v5: Don’t check for PORT_A in intel_ddi_init to update dig_port, instead avoid setting hpd_pin itself (Imre) Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:32 +02:00
Sonika Jindal	7f3561bec7	drm/i915/bxt: Add HPD support for DDIA Also remove redundant comments. Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:32 +02:00
Michel Thierry	25f5033761	drm/i915: Always pass dev pointer in pdp_init And fix 0-DAY kernel test infrastructure warning. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:31 +02:00
Michel Thierry	f365f91181	drm/i915: Use complete virtual address range on 32-bit platforms With the offset length being taken care of in ("drm/i915/gtt: Allow >= 4GB offsets in X86_32"), the code should be finally safe in 32-bit kernels. This reverts commit `501fd70fca` Author: Michel Thierry <michel.thierry@intel.com> Date: Fri May 29 14:15:05 2015 +0100 drm/i915: limit PPGTT size to 2GB in 32-bit platforms Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:30 +02:00
Michel Thierry	088e0df402	drm/i915/gtt: Allow >= 4GB offsets in X86_32 Similar to commit `c44ef60e43` ("drm/i915/gtt: Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset return an unsigned long, which in only 4-bytes long in 32-bit kernels. Change return type (and other related offset variables) to u64. Since Global GTT is always limited to 4GB, this change would not be required in i915_gem_obj_ggtt_offset, but this is done for consistency. v2: Remove unnecessary offset variable in do_pin, as we already have vma->node.start (Chris). Update GGTT offset too (Tvrtko). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:30 +02:00
Rodrigo Vivi	aabc95dcf2	drm/i915: Dont -ETIMEDOUT on identical new and previous (count, crc). By Vesa DP 1.2 spec TEST_CRC_COUNT is a "4 bit wrap counter which increments each time the TEST_CRC_x_x are updated." However if we are trying to verify the screen hasn't changed we get same (count, crc) pair twice. Without this patch we would return -ETIMEOUT in this case. So, if in 6 vblanks the pair (count, crc) hasn't changed we return it anyway instead of returning error and let test case decide if it was right or not. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:29 +02:00
Rodrigo Vivi	621d4c76fd	drm/i915: Save latest known sink CRC to compensate delayed counter reset. By Vesa DP 1.2 Spec TEST_CRC_COUNT should be "reset to 0 when TEST_SINK bit 0 = 0." However for some strange reason when PSR is enabled in certain platforms this is not true. At least not immediatelly. So we face cases like this: first get_sink_crc operation: count: 0, crc: 000000000000 count: 1, crc: c101c101c101 returned expected crc: c101c101c101 secont get_sink_crc operation: count: 1, crc: c101c101c101 count: 0, crc: 000000000000 count: 1, crc: 0000c1010000 should return expected crc: 0000c1010000 But also the reset to 0 should be faster resulting into: get_sink_crc operation: count: 1, crc: c101c101c101 count: 1, crc: 0000c1010000 should return expected crc: 0000c1010000 So in order to know that the second one is valid one we need to compare the pair (count, crc) with latest (count, crc). If the pair changed you have your valid CRC. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:28 +02:00
Rodrigo Vivi	e5a1cab5e5	drm/i915: Force sink crc stop before start. By Vesa DP spec, test counter at DP_TEST_SINK_MISC just reset to 0 when unsetting DP_TEST_SINK_START, so let's force this stop here. But let's minimize the aux transactions and just do it when we know it hasn't been properly stoped. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:28 +02:00
Michel Thierry	c6d576cc57	drm/i915/userptr: Kill user_size limit check GTT was only 32b and its max value is 4GB. In order to allow objects bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check against max 48b range (1ULL << 48). But since the check no longer applies, just kill the limit. v2: Use the default ctx to infer the ppgtt max size (Akash). v3: Just kill the limit, it was only there for early detection of an error when used for execbuffer (Chris). Cc: Akash Goel <akash.goel@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:27 +02:00
Michel Thierry	af98714e5d	drm/i915: batch_obj vm offset must be u64 Otherwise it can overflow in 48-bit mode, and cause an incorrect exec_start. Before commit `5f19e2bffa` ("drm/i915: Merged the many do_execbuf() parameters into a structure"), it was already an u64. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:26 +02:00
Michel Thierry	65bd342ff2	drm/i915: object size needs to be u64 In a 48b world, users can try to allocate buffers bigger than 4GB; in these cases it is important that size is a 64b variable. v2: Drop the warning about bind with size 0, it shouldn't happen anyway. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:26 +02:00
Michel Thierry	ea91e40150	drm/i915/gen8: Add ppgtt info and debug_dump v2: Clean up patch after rebases. v3: gen8_dump_ppgtt for 32b and 48b PPGTT. v4: Use used_pml4es/pdpes (Akash). v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v6: Rely on used_px bits instead of null checking (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:25 +02:00
Michel Thierry	e1f123257a	drm/i915: Expand error state's address width to 64b v2: For semaphore errors, object is mapped to GGTT and offset will not be > 4GB, print only lower 32-bits (Akash) v3: Print gtt_offset in groups of 32-bit (Chris) Cc: Akash Goel <akash.goel@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:24 +02:00
Michel Thierry	69ab76fd3d	drm/i915/gen8: Initialize PDPs and PML4 Similar to PDs, while setting up a page directory pointer, make all entries of the pdp point to the scratch pd before mapping (and make all its entries point to the scratch page); this is to be safe in case of out of bound access or proactive prefetch. Also add a scratch pdp, which the PML4 entries point to. v2: Handle scratch_pdp allocation failure correctly, and keep initialize_px functions together (Akash) v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on the added macros to initialize the pdps. v4: Rebase after final merged version of Mika's ppgtt/scratch patches (and removed commit message part related to v3). v5: Update commit message to also mention PML4 table initialization and the new scratch pdp (Akash). Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:23 +02:00
Michel Thierry	de5ba8eb9c	drm/i915/gen8: Add 4 level support in insert_entries and clear_range When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map Level 4 (PML4), before it selects which Page Directory Pointer (PDP) it will write to. Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range. This patch was inspired by Ben's "Depend exclusively on map and unmap_vma". v2: Rebase after s/page_tables/page_table/. v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use clamp_pdp in gen8_ppgtt_insert_entries (Akash). v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to maintain symmetry with gen8_ppgtt_insert_entries (Akash). v5: Do not mix pages and bytes in insert_entries (Akash). v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at once. v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Use gen8_px_index functions, and remove unnecessary number of pages parameter in insert_pte_entries. v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of adding and extra clamp function; remove unnecessary pdp_start/pdp_len variables (Akash). v9: pages->orig_nents instead of sg_nents(pages->sgl) to get the length (Akash). v10: Remove pdp warning check ingen8_ppgtt_insert_pte_entries until this commit (Akash). Reviewed-by: Akash Goel <akash.goel@intel.com> (v9) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:23 +02:00
Michel Thierry	3387d433b0	drm/i915/gen8: Pass sg_iter through pte inserts As a step towards implementing 4 levels, while not discarding the existing pte insert functions, we need to pass the sg_iter through. The current function understands to the page directory granularity. An object's pages may span the page directory, and so using the iter directly as we write the PTEs allows the iterator to stay coherent through a VMA insert operation spanning multiple page table levels. v2: Rebase after s/page_tables/page_table/. v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series; updated commit message (s/map/insert). v4: Rebase. Reviewed-by: Akash Goel <akash.goel@intel.com> (v3) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:22 +02:00
Michel Thierry	2dba3239f5	drm/i915/gen8: Add 4 level switching infrastructure and lrc support In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains the base address to PML4, while the other PDP registers are ignored. In LRC, the addressing mode must be specified in every context descriptor, and the base address to PML4 is stored in the reg state. v2: PML4 update in legacy context switch is left for historic reasons, the preferred mode of operation is with lrc context based submission. v3: s/gen8_map_page_directory/gen8_setup_page_directory and s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer. Also, clflush will be needed for bxt. (Akash) v4: Squashed lrc-specific code and use a macro to set PML4 register. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. PDP update in bb_start is only for legacy 32b mode. v6: Rebase after final merged version of Mika's ppgtt/scratch patches. v7: There is no need to update the pml4 register value in execlists_update_context. (Akash) v8: Move pd and pdp setup functions to a previous patch, they do not belong here. (Akash) v9: Check USES_FULL_48BIT_PPGTT instead of GEN8_CTX_ADDRESSING_MODE in gen8_emit_bb_start to check if emit pdps is needed. (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:21 +02:00
Michel Thierry	762d99363d	drm/i915/gen8: implement alloc/free for 4lvl PML4 has no special attributes, and there will always be a PML4. So simply initialize it at creation, and destroy it at the end. The code for 4lvl is able to call into the existing 3lvl page table code to handle all of the lower levels. v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the compiler happy. And define ret only in one place. Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl. v3: Use i915_dma_unmap_single instead of pci API. Fix a couple of incorrect checks when unmapping pdp and pd pages (Akash). v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list. v5: Prevent (harmless) out of range access in gen8_for_each_pml4e. v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error paths. (Akash) v7: Rebase, s/gen8_ppgtt_free_/gen8_ppgtt_cleanup_/. v8: Change location of pml4_init/fini. It will make next patches cleaner. v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while trying to reuse as much as possible for pdp alloc. pml4_init/fini replaced by setup/cleanup_px macros. v10: Rebase after Mika's merged ppgtt cleanup patch series. v11: Rebase after final merged version of Mika's ppgtt/scratch patches. v12: Fix pdpe start value in trace (Akash) v13: Define all 4lvl functions in this patch directly, instead of previous patches, add i915_page_directory_pointer_entry_alloc here, use test_bit to detect when pdp is already allocated (Akash). v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers funtion, as we do for pds and pts; move pd and pdp setup functions to this patch (Akash). v15: Added kfree(pdp) from previous patch to this (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-08-14 18:16:21 +02:00

1 2 3 4 5 ...

535205 Commits