* 'nouveau/drm-nouveau-next' of ../drm-nouveau-next:
drm/nouveau: fix __nouveau_fence_wait performance
drm/nv40: attempt to reserve just enough vram for all 32 channels
drm/nv50: check for vm traps on every gr irq
drm/nv50: decode vm faults some more
drm/nouveau: add nouveau_enum_find() util function
drm/nouveau: properly handle pushbuffer check failures
drm/nvc0: remove vm hack forcing large/small pages to not share a PDE
Commit 21e86c1c8a ("drm/nouveau: remove
cpu_writers lock") turned on lazy waits. Unfortunately
__nouveau_fence_wait was not optimized for this case and on HZ=100
kernel wasted up to 10 ms per call.
Depending on application, it led to 10-30% FPS regression.
Fix it.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
This also makes the fact we're giving 512MiB of GART space to all PCIE
boards explicit, although the vast majority (if not all) of them will
now have a ramin_rsvd_vram larger than 2MiB anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Appears to be fixed with commit:
"drm/nv50-nvc0: make sure vma is definitely unmapped when destroying bo"
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
At least on my HP 2540p this is wrong at bootup, fine
at any other time once a lid event has occured. This is due to
_REG vs _INI ordering in the ACPI tables.
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'intel/drm-intel-next' of ../drm-next: (755 commits)
drm/i915: Only wait on a pending flip if we intend to write to the buffer
drm/i915/dp: Sanity check eDP existence
drm/i915: Rebind the buffer if its alignment constraints changes with tiling
drm/i915: Disable GPU semaphores by default
drm/i915: Do not overflow the MMADDR write FIFO
Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing"
drm/i915: Don't save/restore hardware status page address register
drm/i915: don't store the reg value for HWS_PGA
drm/i915: fix memory corruption with GM965 and >4GB RAM
Linux 2.6.38-rc7
Revert "TPM: Long default timeout fix"
drm/i915: Re-enable GPU semaphores for SandyBridge mobile
drm/i915: Replace vblank PM QoS with "Interrupt-Based AGPBUSY#"
Revert "drm/i915: Use PM QoS to prevent C-State starvation of gen3 GPU"
drm/i915: Allow relocation deltas outside of target bo
drm/i915: Silence an innocuous compiler warning for an unused variable
fs/block_dev.c: fix new kernel-doc warning
ACPI: Fix build for CONFIG_NET unset
mm: <asm-generic/pgtable.h> must include <linux/mm_types.h>
x86: Use u32 instead of long to set reset vector back to 0
...
Conflicts:
drivers/gpu/drm/i915/i915_gem.c
So we used to use lpfn directly to restrict VRAM when we couldn't
access the unmappable area, however this was removed in
93225b0d7b as it also restricted
the gtt placements. However it was only later noticed that this
broke on some hw.
This removes the active_vram_size, and just explicitly sets it
when it changes, TTM/drm_mm will always use the real_vram_size,
and the active vram size will change the TTM size used for lpfn
setting.
We should re-work the fpfn/lpfn to per-placement at some point
I suspect, but that is too late for this kernel.
Hopefully this addresses:
https://bugs.freedesktop.org/show_bug.cgi?id=35254
v2: fix reported useful VRAM size to userspace to be correct.
Signed-off-by: Dave Airlie <airlied@redhat.com>
We've been getting reports of complete system lockups with rv3xx hw on
AGP and PCIE when running gnome-shell or kwin with compositing.
It appears the hw really doesn't like setting these registers while
stuff is running, this moves the setting of the registers into the modeset
since they aren't required to be changed anywhere else.
fixes: https://bugs.freedesktop.org/show_bug.cgi?id=35183
Reported-and-tested-by: Álmos <aaalmosss@gmail.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Looks like these got passed over with both being merged at the same
time but not quite meeting in the middle.
should fix: https://bugs.freedesktop.org/show_bug.cgi?id=34137
along with Michael's phoronix article.
Reported-by: Chi-Thanh Christopher Nguyen
Article-written-by: Michael Larabel @ phoronix
Signed-off-by: Dave Airlie <airlied@redhat.com>
This reverts commit 951f3512db
drm/i915: Do not handle backlight combination mode specially
since this commit introduced other regressions due to untouched LBPC
register, e.g. the backlight dimmed after resume.
In addition to the revert, this patch includes a fix for the original
issue (weird backlight levels) by removing the wrong bit shift for
computing the current backlight level.
Also, including typo fixes (lpbc -> lbpc).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34524
Acked-by: Indan Zupancic <indan@nul.nu>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ickle/drm-intel-fixes:
drm/i915: Rebind the buffer if its alignment constraints changes with tiling
drm/i915: Disable GPU semaphores by default
drm/i915: Do not overflow the MMADDR write FIFO
Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing"
The per-vm mutex doesn't prevent this completely, a flush coming from the
BAR VM could potentially happen at the same time as one for the channel
VM. Not to mention that if/when we get per-client/channel VM, this will
happen far more frequently.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
TTM assumes an error condition from man->func->get_node() means that
something went horribly wrong, and causes it to bail.
The driver is supposed to return 0, and leave mm_node == NULL to
signal that it couldn't allocate any memory.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some hardware claims to have both an LVDS panel and an eDP output.
Whilst this may be true in a rare case, more often it is just broken
hardware. If we see an eDP device we know that it must be connected and
so we can confirm its existence with a simple probe.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34165
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=24822
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Early gen3 and gen2 chipset do not have the relaxed per-surface tiling
constraints of the later chipsets, so we need to check that the GTT
alignment is correct for the new tiling. If it is not, we need to
rebind.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Andi Kleen narrowed his GPU hangs on his Sugar Bay (SNB desktop) rev 09
down to the use of GPU semaphores, and we already know that they appear
broken up to Huron River (mobile) rev 08. (I'm optimistic that disabling
GPU semaphores is simply hiding another bug by the latency and
side-effects of the additional device interaction it introduces...)
However, use of semaphores is a massive performance improvement... Only
as long as the system remains stable. Enable at your peril.
Reported-by: Andi Kleen <andi-fd@firstfloor.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33921
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Whilst the GT is powered down (rc6), writes to MMADDR are placed in a
FIFO by the System Agent. This is a limited resource, only 64 entries, of
which 20 are reserved for Display and PCH writes, and so we must take
care not to queue up too many writes. To avoid this, there is counter
which we can poll to ensure there are sufficient free entries in the
fifo.
"Issuing a write to a full FIFO is not supported; at worst it could
result in corruption or a system hang."
Reported-and-Tested-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34056
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This reverts commit c2e0eb1670.
As it turns out, userspace already depends upon being able to enable
tiling on existing bo which it promises to be large enough for its
purposes i.e. it will not access beyond the end of the last full-tile
row.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35016
Reported-and-tested-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We're coming to see a need to have a set of generic capability checks in
the core DRM, in addition to the driver-specific ioctls that already
exist.
This patch defines an ioctl to do as such, but does not yet define any
capabilities.
[airlied: drop the driver callback for now.]
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The nv30/nv40 3d driver is about to start using DMA_FENCE from the 3D
object which, it turns out, doesn't like its DMA object to not be
aligned to a 4KiB boundary.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
hdmi 1.3 raises the max clock from 165 Mhz to 340 Mhz.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
These should be handled by the clear_state setup, but set them
directly as well just to be sure.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cayman is different enough from evergreen to warrant it's own functions.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cayman asics have 3 ring buffers:
ring 0 supports both gfx and compute
rings 1 and 2 are compute only
At the moment we only support ring 0.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch sets up the gart in legacy mode. We
probably want to switch to full VM mode at some point.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The MC ucode is no longer loaded by the vbios
tables as on previous asics. It now must be loaded
by the driver.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cayman is DCE5 display plus a new 4-way shader block.
3D state programming is similar to evergreen.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It's cleaned before saving and re-initialized after restoring.
So don't need to save/restore it. And also new chip has new address
for hardware status page register, don't write to old address.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On a Thinkpad x61s, I noticed some memory corruption when
plugging/unplugging the external VGA connection. The symptoms are that
4 bytes at the beginning of a page get overwritten by zeroes.
The address of the corruption varies when rebooting the machine, but
stays constant while it's running (so it's possible to repeatedly write
some data and then corrupt it again by plugging the cable).
Further investigation revealed that the corrupted address is
(dev_priv->status_page_dmah->busaddr & 0xffffffff), ie. the beginning of
the hardware status page of the i965 graphics card, cut to 32 bits.
So it seems that for some memory access, the hardware uses only 32 bit
addressing. If the hardware status page is located >4GB, this
corrupts unrelated memory.
Signed-off-by: Jan Niehusmann <jan@gondor.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
This seems to be running stably on my test laptop, so hopefully the
reported hangs where just symptoms of other bugs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I stumbled over this magic bit in the gen3 INSTPM:
Bit11 Interrupt-Based AGPBUSY# Enable:
‘0’ = Pending GMCH interrupts will not cause AGPBUSY# assertion.
‘1’ = Pending GMCH interrupts will cause AGPBUSY# assertion and hence
can cause the CPU to exit C3. There is no suppression of cacheable
writes.
Note that in either case in C3 the interrupts are not lost. They will be
forwarded to the ICH when the GMCH is out of C3.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@kernel.org
Using PM latency request turns out to be very fragile and only works for
some systems, depending upon the ACPI implementation. However, I've
stumbled across a promising bit in INSTPM: "Interrupt-Based AGPBUSY#".
This reverts commit b0b544cd37.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Userspace has a legitimate requirement to use a delta that points to
outside of the target bo, and so we need to enable this. (As this is an
abi break, albeit a relaxation of the current restrictions, mark the change
with a new flag.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
drivers/gpu/drm/i915/i915_irq.c: In function ‘ironlake_irq_postinstall’:
drivers/gpu/drm/i915/i915_irq.c:1618: warning: unused variable ‘pipe’
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Somehow fixes a misrendering + hang at GDM startup on my NVA8...
My first guess would have been stale TLB entries laying around that a new
bo then accidentally inherits. That doesn't make a great deal of sense
however, as when we mapped the pages for the new bo the TLBs would've
gotten flushed anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
have to read values from the IB in order as we could cross
a page boundary at any time and won't be able to go backwards.
Signed-off-by: Dave Airlie <airlied@redhat.com>
There are a bunch of off by one errors in the sanity checks here.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The immediate benefit of doing this is that on NV50 and up, the GPU
virtual address of any buffer is now constant, regardless of what
memtype they're placed in.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This adds a table of known nvc0 memtypes, and modifies the validity check
to allow any non-compressed type. Support for Z compression will come at
a later point.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Upcoming patches are going to enable full support for buffers that keep
a constant GPU virtual address whenever they're validated for use by
the GPU.
In order for this to work properly while keeping support for large pages,
we need to know if it's ever going to be possible for a buffer to end
up in GART, and if so, disable large pages for the buffer's VMA.
This is a new restriction that's not present in earlier kernel's, but
should not break userspace as the current code never attempts to validate
buffers into a memtype other than it was created with.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
'mappable' isn't really used at all, nor is it necessary anymore as the
bo code is capable of moving buffers to mappable vram as required.
'no_vm' isn't necessary anymore either, any places that don't want to be
mapped into a GPU address space should allocate the VRAM directly instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The code was supposed to print registers around 0x405018 (which is read
earlier), not 0x405818.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The hw doesn't really appear to be designed to be used the way we have to
use it due to DRI2's design. This leads us to having to keep the flipped
fb support active at all times.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Also imports a couple of helper functions that'll be used to implement
page flipping in the following commits..
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This should prevent a number of races from occuring, the most obvious of
which will be exposed when we start making use of the "display sync" evo
channel for page flipping. The DS channel will reject any command stream
that doesn't completely agree with the current "master" state.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We need to be able to have the bh run while possibly spinning waiting for
the EVO notifier to signal. This apparently happens in some circumstances
with preempt disabled, so our workqueue was never being run.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The nv50 display isr bh needs to be converted to a tasklet, which means
we can't sleep anymore. The places we execute vbios init tables are
rare, and not in any way performance critical, so this isn't a huge
problem.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
With cmwq, there's no reason for nouveau to use a dedicated workqueue.
Drop dev_priv->wq and use system_wq instead. Each work item is sync
flushed when the containing structure is unregistered/destroyed.
Note that this change also makes sure that nv50_gpio_handler is not
freed while the contained work item is still running.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This gives a small, but noticeable performance gain at lower performance
levels, and unchanged at the higher ones.
With this commit, we're now using the same timeslice size as the NVIDIA
binary driver currently does, and dropping an unknown bit that NVIDIA
no longer appear to set.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We may well be making more use of semaphores in the future, having the
entire VM available makes requiring DMA objects for each and every
semaphore block unnecessary.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>