The bo creation placement is where the bo will be. Instead of trying
to move bo at each command stream let this work to another worker
thread that will use more advance heuristic.
agd5f: remove leftover unused variable
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
For GMCH platforms we set up the hpd irq registers in the irq
postinstall hook. But since we only enable the irq sources we actually
need in PORT_HOTPLUG_EN/STATUS, taking dev_priv->hotplug_supported_mask
into account, no hpd interrupt sources is enabled since
commit 52d7ecedac
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sat Dec 1 21:03:22 2012 +0100
drm/i915: reorder setup sequence to have irqs for output setup
Wrongly set-up interrupts also lead to broken hw-based load-detection
on at least GM45, resulting in ghost VGA/TV-out outputs.
To fix this, delay the hotplug register setup until after all outputs
are set up, by moving it into a new dev_priv->display.hpd_irq_callback.
We might also move the PCH_SPLIT platforms to such a setup eventually.
Another funny part is that we need to delay the fbdev initial config
probing until after the hpd regs are setup, for otherwise it'll detect
ghost outputs. But we can only enable the hpd interrupt handling
itself (and the output polling) _after_ that initial scan, due to
massive locking brain-damage in the fbdev setup code. Add a big
comment to explain this cute little dragon lair.
v2: Encapsulate all the fbdev handling by wrapping the move call into
intel_fbdev_initial_config in intel_fb.c. Requested by Chris Wilson.
v3: Applied bikeshed from Jesse Barnes.
v4: Imre Deak noticed that we also need to call intel_hpd_init after
the drm_irqinstall calls in the gpu reset and resume paths - otherwise
hotplug will be broken. Also improve the comment a bit about why
hpd_init needs to be called before we set up the initial fbdev config.
Bugzilla: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54943
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v3)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To gain confidence in the wrap handling, make it happen quite
soon after the boot.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The complication is that during seqno wrapping we must be extremely
careful not to write to any ring as that will require a new seqno, and
so would recurse back into the seqno wrap handler. So we cannot call
i915_gpu_idle() as that does additional work beyond simply retiring the
current set of requests, and instead must do the minimal work ourselves
during seqno wrapping.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If wrap just happened we need to prevent emitting waits for
pre wrap values. Detect this and emit no-ops instead.
v2: Use olr > seqno to detect wrap instead of *seqno == 0
as suggested by Chris Wilson.
v3: Use last used seqno to detect the wraparound. From
Chris Wilson
v4: Fixed unnecessary last_seqno assigment
References: https://bugs.freedesktop.org/show_bug.cgi?id=57967
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The commit [23670b322: drm/i915: CPT+ pch transcoder workaround]
caused a regression on some HP laptops with IvyBridge. The whole
laptop screen is shifted downward for a few pixels constantly.
The problem appears only on LVDS while DP and VGA seem unaffected.
Also, the problem disappears once when go and back from S3.
(S4 resume still shows the same problem.)
This patch revives the minimum part the commit above dropped.
For fixing this regression, only the setup of CHICKEN2 bit in
cpt_init_clock_gating() is needed.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Alex writes:
"adds support for the
asynchronous DMA engines on r6xx-SI. These engines are used
for ttm bo moves and VM page table updates currently. They
could also be exposed via the CS ioctl for userspace use,
but I haven't had a chance to add proper CS checker patches
for them yet. These patches have been tested extensively
internally for months, so they should be pretty solid."
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: use DMA engine for VM page table updates on SI
drm/radeon: add dma engine support for vm pt updates on si (v2)
drm/radeon: use DMA engine for VM page table updates on cayman/TN
drm/radeon: add dma engine support for vm pt updates on ni (v5)
drm/radeon: use async dma for ttm buffer moves on 6xx-SI
drm/radeon/kms: add support for dma rings to radeon_test_moves()
drm/radeon/kms: Add initial support for async DMA on SI
drm/radeon/kms: Add initial support for async DMA on cayman/TN
drm/radeon/kms: Add initial support for async DMA on evergreen
drm/radeon/kms: Add initial support for async DMA on r6xx/r7xx
DMA engine has special packets to facilitate this and it also keeps
the 3D engine free for other things.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
DMA engine has special packets to facilitate this and it also keeps
the 3D engine free for other things.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Async DMA has a special packet for contiguous pt updates
which saves overhead.
v2: leave the CP method enabled for now as doing the updates
in the DMA rings is not working properly yet.
v3: update for 2 level pts
v4: rebase
v5: drop pte/pde packet. doesn't seem to work on NI.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are 2 async DMA engines on cayman, one at 0xd000 and
one at 0xd800. The programming interface is the same as
evergreen however there are some changes to the commands
for using vmids.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pretty similar to 6xx/7xx except the count field increased in the
packet header and the max IB size increased.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We've originally added this in
commit 291427f5fd
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri Jul 29 12:42:37 2011 -0700
drm/i915: apply phase pointer override on SNB+ too
and then copy-pasted it over to ivb/ppt. The w/a was originally added
for ilk/ibx in
commit 5b2adf8971
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Thu Oct 7 16:01:15 2010 -0700
drm/i915: add Ironlake clock gating workaround for FDI link training
and fixed up a bit in
commit 6f06ce184c
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Tue Jan 4 15:09:38 2011 -0800
drm/i915: set phase sync pointer override enable before setting phase sync pointer
It turns out that this w/a isn't actually required on cpt/ppt and
positively harmful on ivb/ppt when using fdi B/C links - it results in
a black screen occasionally, with seemingfully everything working as
it should. The only failure indication I've found in the hw is that
eventually (but not right after the modeset completes) a pipe underrun
is signalled.
Big thanks to Arthur Runyan for all the ideas for registers to check
and changes to test, otherwise I couldn't ever have tracked this down!
Cc: "Runyan, Arthur J" <arthur.j.runyan@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts commits a50915394f and
d7c3b937bd.
This is a revert of a revert of a revert. In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.
It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all. We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.
When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation. That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.
So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake. Let's hope we never revisit
this mistake again - and certainly not this many times ;)
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All items on the lru list are always reservable, so this is a stupid
thing to keep. Not only that, it is used in a way which would
guarantee deadlocks if it were ever to be set to block on reserve.
This is a lot of churn, but mostly because of the removal of the
argument which can be nested arbitrarily deeply in many places.
No change of code in this patch except removal of the no_wait_reserve
argument, the previous patch removed the use of no_wait_reserve.
v2:
- Warn if -EBUSY is returned on reservation, all objects on the list
should be reservable. Adjusted patch slightly due to conflicts.
v3:
- Focus on no_wait_reserve removal only.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace the goto loop with a simple for each loop, and only run the
delayed destroy cleanup if we can reserve the buffer first.
No race occurs, since lru lock is never dropped any more. An empty list
and a list full of unreservable buffers both cause -EBUSY to be returned,
which is identical to the previous situation, because previously buffers
on the lru list were always guaranteed to be reservable.
This should work since currently ttm guarantees items on the lru are
always reservable, and reserving items blockingly with some bo held
are enough to cause you to run into a deadlock.
Currently this is not a concern since removal off the lru list and
reservations are always done with atomically, but when this guarantee
no longer holds, we have to handle this situation or end up with
possible deadlocks.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace the while loop with a simple for each loop, and only run the
delayed destroy cleanup if we can reserve the buffer first.
No race occurs, since lru lock is never dropped any more. An empty list
and a list full of unreservable buffers both cause -EBUSY to be returned,
which is identical to the previous situation.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
By removing the unlocking of lru and retaking it immediately, a race is
removed where the bo is taken off the swap list or the lru list between
the unlock and relock. As such the cleanup_refs code can be simplified,
it will attempt to call ttm_bo_wait non-blockingly, and if it fails
it will drop the locks and perform a blocking wait, or return an error
if no_wait_gpu was set.
The need for looping is also eliminated, since swapout and evict_mem_first
will always follow the destruction path, no new fence is allowed
to be attached. As far as I can see this may already have been the case,
but the unlocking / relocking required a complicated loop to deal with
re-reservation.
Changes since v1:
- Simplify no_wait_gpu case by folding it in with empty ddestroy.
- Hold a reservation while calling ttm_bo_cleanup_memtype_use again.
Changes since v2:
- Do not remove bo from lru list while waiting
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we fail to set the bit when needed we get some nice FDI link
training failures (AKA "black screen on VGA output").
While we don't really know how to properly choose whether we need to
set the bit or not (VBT?), just read the initial value set by the BIOS
and store it for later usage.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need this code to init the PCH SSC refclk and the FDI registers.
The BIOS does this too and that's why VGA worked before this patch,
until you tried to suspend the machine...
This patch implements the "Sequence to enable CLKOUT_DP for FDI usage
and configure PCH FDI/IO" from our documentation.
v2:
- Squash Damien Lespiau's reset spelling fix on top.
- Add a comment that we don't need to bother about the ULT special
case Damien noticed, since ULT won't have VGA.
- Add a comment to rip out the SDV codepaths once haswell ships for
real.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The few places that care should have those checks instead.
This allows destruction of bo backed memory without a reservation.
It's required for being able to rework the delayed destroy path,
as it is no longer guaranteed to hold a reservation before unlocking.
However any previous wait is still guaranteed to complete, and it's
one of the last things to be done before the buffer object is freed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This requires changing the order in ttm_bo_cleanup_refs_or_queue to
take the reservation first, as there is otherwise no race free way to
take lru lock before fence_lock.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex writes:
Pretty minor -next pull request. We some additional new bits waiting
internally for release. Hopefully Monday we can get at least some of
them out. The others will probably take a few more weeks.
Highlights of the current request:
- ELD registers for passing audio information to the sound hardware
- Handle GPUVM page faults more gracefully
- Misc fixes
Merge radeon test
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux: (483 commits)
drm/radeon: bump driver version for new info ioctl requests
drm/radeon: fix eDP clk and lane setup for scaled modes
drm/radeon: add new INFO ioctl requests
drm/radeon/dce32+: use fractional fb dividers for high clocks
drm/radeon: use cached memory when evicting for vram on non agp
drm/radeon: add a CS flag END_OF_FRAME
drm/radeon: stop page faults from hanging the system (v2)
drm/radeon/dce4/5: add registers for ELD handling
drm/radeon/dce3.2: add registers for ELD handling
radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
Linux 3.7-rc7
powerpc/eeh: Do not invalidate PE properly
Revert "drm/i915: enable rc6 on ilk again"
ALSA: hda - Fix build without CONFIG_PM
of/address: sparc: Declare of_iomap as an extern function for sparc again
PM / QoS: fix wrong error-checking condition
bnx2x: remove redundant warning log
vxlan: fix command usage in its doc
8139cp: revert "set ring address before enabling receiver"
MPI: Fix compilation on MIPS with GCC 4.4 and newer
...
Conflicts:
drivers/gpu/drm/exynos/exynos_drm_encoder.c
drivers/gpu/drm/exynos/exynos_drm_fbdev.c
drivers/gpu/drm/nouveau/core/engine/disp/nv50.c
This way we should be able to write mPHY registers using the Sideband
Interface in the next commit. Also fixed some syntax oddities in the
related code.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
smatch warning:
drivers/gpu/drm/i915/intel_display.c:7019 intel_set_mode() warn: function puts
500 bytes on stack
Refactor so that saved_mode and saved_hwmode are dynamically allocated as opposed
to being automatic variables. 500 bytes seems like it could run the potential for blowing
the kernel stack.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Commit 9ee32fea5f unconditionally prevents the CPU from entering idle states
until intel_dp_aux_ch completes for the first time, which never happens on my
DisplayPort-less intel gfx, causing the CPU to get rather hot.
Signed-off-by: Tomas Janousek <tomi@nomi.cz>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
More specifically, the LPT FDI RX only supports 8bpc and a maximum of
2 lanes, so anything above that won't work and should be rejected.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We were previously doing exactly what the "mode set sequence for CRT"
document mandates, but whenever we failed to train the link in the
first tentative, all the other subsequent retries always failed. In
one of my monitors that has 47 modes, I was usually getting around 3
failures when running "testdisplay -a".
After this patch, even if we fail in the first tentative, we can
succeed in the next ones. So now when running "testdisplay -a" I see
around 3 times the message "FDI link training done on step 1" and no
failures.
Notice that now the "retry" code looks a lot like the DP retry code.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add requests to get the number of shader engines (SE) and
the number of SH per SE. These are needed for geometry
and tesselation shaders in the 3D driver as well as setting
up PA_SC_RASTER_CONFIG on SI asics.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Force the use of cached memory when evicting from vram on non agp
hardware. Also force write combine on agp hw. This is to insure
the minimum cache type change when allocating memory and improving
memory eviction especialy on pci/pcie hw.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Redirect invalid memory accesses to the default page
instead of locking up the memory controller. Also
enable the invalid memory access interrupts and
start spamming system log with it.
v2 (agd5f): fix up against 2 level PT changes
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
On a machine with bit17 swizzling, we need to store the bit17 of the
physical page address in put-pages. This requires a memory allocation,
on average less than a page, which may be difficult to satisfy is the
request to put-pages is on behalf of the shrinker. We could allow that
allocation to pull from the reserved memory pools, but it seems much
safer to preallocate the array for tiled objects on affected machines.
v2: Export i915_gem_object_needs_bit17_swizzle() for reuse.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Both the dp and fdi code use the exact same computations (ignore minor
differences in conversion between bits and bytes).
This makes it even more apparent that we have a _massive_ mess between
cpu transcoder/fdi link/pch transcoder and pch link settings. And also
that we have hilarious amounts of confusion between edp and dp
(despite that they're identical at a link level).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This has originally been added in
commit 8db9d77b1b
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date: Wed Apr 7 16:15:54 2010 +0800
drm/i915: Support for Cougarpoint PCH display pipeline
probably to combat issues with hw state left behind by the BIOS. And
indeed, I've checked out that specific revision, and there is no DP
support yet. So the pch dp transcoder won't be correctly disabled, and
that's important since it requires a rether special disable dance:
Just writing 0 to TRANS_DP_CTL won't cut it, since we need to select
the NONE port when disabling, too.
And indeed, things seem to still work, so let's just remove this.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This could have happened with the old crtc helper based modeset code,
but can't happen any longer with the new code.
Hence put in a WARN and adjust the comment. If no one hits this, we
can eventually remove it (like a few other such cases across our
code).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
17 ms is eerily close to 60 Hz ^-1
Unfortunately this goes back to the original DP enabling for ilk, and
unfortunately does not come with a reason for it's existance attached.
Some closer inspection of the code and DP specs shows that we set the
idle link pattern before we disable the port. And it seems like that
the DP spec (or at least our hw) only switch to the idle pattern on
the next vblank. Hence a vblank wait at this spot makes _much_ more
sense than a really long wait.
v2: Rebase fixup.
v3: Add comment requested by Paulo Zanoni saying that we don't really
know what this wait is for.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
While reading docs I've noticed that this special workaround to select
the 1.6 GHz DP clock only applies to pre-production ilk machines.
Since the registers we're touching here are rather undocumented and
might be harmful on later chips, rip it out.
For the Bspec reference of this w/a look in "vol4g CPU Display
Registers [DevILK]", Section 4.1.7.1 "DP_A—DisplayPort A
Control Register", "DP_PLL_Frequency_Select".
v2: Keep a debug message as a hint in case something regresses.
Requested by Chris Wilson.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we enable the cpu edp pll in intel_dp->pre_enable and no
longer in crtc_mode_set, we can also move the modeset part to the
intel_dp->mode_set callback. Previously this was not possible because
the encoder ->mode_set callbacks are called after the crtc mode set
callback.
v2: Rebase on top of copy&pasted hsw crtc_mode_set.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Especially getting rid of all things lvds is ... great!
v2: Drop the two additional pre-hsw hunks noticed by Paulo Zanoni.
v3:
- handle DP ports correctly (spoted by Paulo)
- don't leave {} behind for a single-line block (again spotted by
Paulo)
- kill another if (IBX || CPT) block
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Before queuing the flip but crucially after attaching the unpin-work to
the crtc, we continue to setup the unpin-work. However, should the
hardware fire early, we see the connected unpin-work and queue the task.
The task then promptly runs and unpins the fb before we finish taking
the required references or even pinning it... Havoc.
To close the race, we use the flip-pending atomic to indicate when the
flip is finally setup and enqueued. So during the flip-done processing,
we can check more accurately whether the flip was expected.
v2: Add the appropriate mb() to ensure that the writes to the page-flip
worker are complete prior to marking it active and emitting the MI_FLIP.
On the read side, the mb should be enforced by the spinlocks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
[danvet: Review the barriers a bit, we need a write barrier both
before and after updating ->pending. Similarly we need a read barrier
in the interrupt handler both before and after reading ->pending. With
well-ordered irqs only one barrier in each place should be required,
but since this patch explicitly sets out to combat spurious interrupts
with is staged activation of the unpin work we need to go full-bore on
the barriers, too. Discussed with Chris Wilson on irc and changes
acked by him.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>