linux/drivers/gpu/drm
Daniel Vetter 7abb690a0e drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus
Chris Wilson noticed that since

commit 1f83fee08d [v3.9]
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Nov 15 17:17:22 2012 +0100

    drm/i915: clear up wedged transitions

X can again get -EIO when it does not expect it. And even worse score
a SIGBUS when accessing gtt mmaps. The established ABI is that we
_only_ return an -EIO from execbuf - all other ioctls should just
work. And since the reset code moves all bos out of gpu domains and
clears out all the last_seqno/ring tracking there really shouldn't be
any reason for non-execbuf code to ever touch the hw and see an -EIO.

After some extensive discussions we've noticed that these spurios -EIO
are caused by i915_gem_wait_for_error:

http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg20540.html

That is easy to fix by returning 0 instead of -EIO, since grabbing the
dev->struct_mutex does not yet mean that we actually want to touch the
hw. And so there is no reason at all to fail with -EIO.

But that's not the entire since, since often (at least it's easily
googleable) dmesg indicates that the reset fails and we declare the
gpu wedged. Then, quite a bit later X wakes up with the "Timed out
waiting for the gpu reset to complete" DRM_ERROR message in
wait_for_errror and brings down the desktop with an -EIO/SIGBUS.

So clearly we're missing a wakeup somewhere, since the gpu reset just
doesn't take 10 seconds to complete. And indeed we're do handle the
terminally wedged state wrong.

Fix this all up.

References: https://bugs.freedesktop.org/show_bug.cgi?id=63921
References: https://bugs.freedesktop.org/show_bug.cgi?id=64073
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 14:35:18 +02:00
..
ast drm/ast: deal with bo reserve fail in dirty update path 2013-05-02 12:46:47 +10:00
cirrus drm/cirrus: deal with bo reserve fail in dirty update path 2013-05-02 12:46:56 +10:00
exynos Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next 2013-05-24 10:14:57 +10:00
gma500 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-05-02 19:40:34 -07:00
i2c drm/i2c: nxp-tda998x (v3) 2013-02-19 17:57:44 -05:00
i810
i915 drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus 2013-06-03 14:35:18 +02:00
mga
mgag200 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-05-13 07:59:59 -07:00
nouveau Merge remote-tracking branch 'pfdo/drm-fixes' into drm-next 2013-05-24 10:12:22 +10:00
omapdrm Merge branch 'fbdev-3.10-fixes' of git://gitorious.org/linux-omap-dss2/linux into linux-fbdev/for-3.10-fixes 2013-05-29 17:00:34 +08:00
qxl drm/qxl: fix build warnings on 32-bit 2013-05-31 12:45:09 +10:00
r128
radeon radeon: use max_bus_speed to activate gen2 speeds 2013-05-29 12:36:12 -04:00
savage
shmobile drm/shmob: use drm_send_vblank_event() helper 2013-05-22 09:13:41 +10:00
sis drm/sis: convert to idr_alloc() 2013-02-27 19:10:16 -08:00
tdfx
tilcdc Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-05-02 19:40:34 -07:00
ttm drm: use vma_pages() to replace (vm_end - vm_start) >> PAGE_SHIFT 2013-04-16 13:14:00 +10:00
udl Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-05-02 19:40:34 -07:00
via drm/via: convert to idr_alloc() 2013-02-27 19:10:16 -08:00
vmwgfx drm/vmwgfx: convert to idr_alloc() 2013-02-27 19:10:16 -08:00
ati_pcigart.c
drm_agpsupport.c
drm_auth.c
drm_buffer.c
drm_bufs.c
drm_cache.c lib/scatterlist: sg_page_iter: support sg lists w/o backing pages 2013-03-27 17:13:44 +01:00
drm_context.c drm: convert to idr_alloc() 2013-02-27 19:10:15 -08:00
drm_crtc_helper.c drm: Only print a debug message when the polled connector has changed 2013-05-13 12:13:06 +10:00
drm_crtc.c drm: Make the HPD status updates debug logs more readable 2013-05-13 12:12:57 +10:00
drm_debugfs.c
drm_dma.c
drm_dp_helper.c
drm_drv.c drm: Use names of ioctls in debug traces 2013-05-10 14:46:50 +10:00
drm_edid_load.c drm: Add 1600x1200 (UXGA) screen resolution to the built-in EDIDs 2013-04-12 14:06:16 +10:00
drm_edid.c drm/edid: Check both 60Hz and 59.94Hz when looking for a CEA mode 2013-04-26 10:25:54 +10:00
drm_encoder_slave.c drm: refactor call to request_module 2013-05-10 14:46:03 +10:00
drm_fb_cma_helper.c Merge branch 'tilcdc-next' of git://people.freedesktop.org/~robclark/linux into drm-next 2013-02-21 09:31:47 +10:00
drm_fb_helper.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-05-02 19:40:34 -07:00
drm_fops.c drm: correctly restore mappings if drm_open fails 2013-04-03 06:44:38 +10:00
drm_gem_cma_helper.c drm/cma: add debugfs helpers 2013-02-17 17:55:42 -05:00
drm_gem.c drm/prime: keep a reference from the handle to exported dma-buf (v6) 2013-05-01 09:30:15 +10:00
drm_global.c
drm_hashtab.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
drm_info.c
drm_ioc32.c
drm_ioctl.c
drm_irq.c drm: small fix in drm_send_vblank_event() 2013-02-17 17:55:42 -05:00
drm_lock.c
drm_memory.c
drm_mm.c drm/mm: fix dump table BUG 2013-04-30 15:15:58 +02:00
drm_modes.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-05-13 07:59:59 -07:00
drm_pci.c drm: Silence some sparse warnings 2013-04-30 10:02:25 +10:00
drm_platform.c
drm_prime.c drm/prime: warn for non-empty handle lookup list during drm file release 2013-05-01 16:08:18 +10:00
drm_proc.c drm: proc: Use remove_proc_subtree() 2013-05-01 17:29:44 -04:00
drm_scatter.c
drm_stub.c drm: proc: Use minor->index to label things, not PDE->name 2013-05-01 17:29:44 -04:00
drm_sysfs.c drm: remove legacy drm_connector_property fxns 2012-11-30 10:30:48 -06:00
drm_trace_points.c
drm_trace.h
drm_usb.c drm/usb: bind driver to correct device 2013-02-07 12:37:41 +10:00
drm_vm.c drm: export drm_vm_open_locked 2013-04-26 10:20:00 +10:00
Kconfig drm/tegra: Move drm to live under host1x 2013-04-22 12:39:11 +02:00
Makefile drm/tegra: Move drm to live under host1x 2013-04-22 12:39:11 +02:00
README.drm

************************************************************
* For the very latest on DRI development, please see:      *
*     http://dri.freedesktop.org/                          *
************************************************************

The Direct Rendering Manager (drm) is a device-independent kernel-level
device driver that provides support for the XFree86 Direct Rendering
Infrastructure (DRI).

The DRM supports the Direct Rendering Infrastructure (DRI) in four major
ways:

    1. The DRM provides synchronized access to the graphics hardware via
       the use of an optimized two-tiered lock.

    2. The DRM enforces the DRI security policy for access to the graphics
       hardware by only allowing authenticated X11 clients access to
       restricted regions of memory.

    3. The DRM provides a generic DMA engine, complete with multiple
       queues and the ability to detect the need for an OpenGL context
       switch.

    4. The DRM is extensible via the use of small device-specific modules
       that rely extensively on the API exported by the DRM module.


Documentation on the DRI is available from:
    http://dri.freedesktop.org/wiki/Documentation
    http://sourceforge.net/project/showfiles.php?group_id=387
    http://dri.sourceforge.net/doc/

For specific information about kernel-level support, see:

    The Direct Rendering Manager, Kernel Support for the Direct Rendering
    Infrastructure
    http://dri.sourceforge.net/doc/drm_low_level.html

    Hardware Locking for the Direct Rendering Infrastructure
    http://dri.sourceforge.net/doc/hardware_locking_low_level.html

    A Security Analysis of the Direct Rendering Infrastructure
    http://dri.sourceforge.net/doc/security_low_level.html