Commit Graph

4834 Commits

Author SHA1 Message Date
Maarten Lankhorst
0ea0397a3a Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
drm-next is forwarded to v4.20-rc1, and we need this to make
a patch series apply.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2018-11-13 10:59:10 +01:00
Linus Torvalds
bc6080ae38 drm, i915, amdgpu, bridge + core quirk
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJb2+C2AAoJEAx081l5xIa+tUYP/060xa9VeKe0M11fMCCkchNV
 TQfPQ8avy2/Tt9Mlly/XPMuBZ6FCTVBsseCxIdzl4rZAfm0SDVaTB4Z7XBBplXkv
 R50Je95Fyv0/lnUbagI5qRtc+O6xdkuNIMHWlRRWGlCEbFE2atey9R2yiq/1q2J9
 4NcGjykJ7d1vCgMm/9/iV4pLrxrRSxvizO4mS+SQpJ/0fEwJSZSvKdKp9LU/LTgx
 qZGSBegPkR+j6znsMkuPYxQtMrhz+Pr49qp7Qm6wQV/K2Y0z7CoHZFikvpMp38i5
 MQfYE0yDN3pKm5THt+RjV8laPIpnh2z8bcotUCGcjFWtGHQBfyQ4IaXkDPujcD6m
 vRm4hhxP9DvNCq5eWFz3vU2n8NWf54BYFF8dPxcoghRkE6VB2CGpr+o9dzxX16QJ
 oEi2Ak3G7RXa8Hn/C9EOSs7yijjKD8GZ8gR5osmI10EitIH7mrG9Xghts5XlcIJj
 Ls30t0jPavO+t8V2/1HYP2vfSNvi/X953+pQIdN8Rk2vp0ziFdstzLg6ai8E1Hvl
 jyvOF3/xstwIP8nFvYaNHkly2iVRC3rgj4GDncSe2U6kt8N4/aA+FemUwXJKBbFN
 hVwPZuSlrCw5TE7tO3ulJon4uAWH5WO/gIOMn0+e+pBLJtUXQtWAR+JC2D+RDZO9
 V+V7GuHIbIV6xarCwLjx
 =5lmL
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Pretty much a normal fixes pull pre-rc1, mostly amdgpu fixes, one i915
  link training regression fix, and a couple of minor panel/bridge fixes
  and a panel quirk"

* tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm: (37 commits)
  drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"
  drm/amd/pp: Print warning if od_sclk/mclk out of range
  drm/amd/pp: Fix pp_sclk/mclk_od not work on Vega10
  drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7
  drm/amd/powerplay: no MGPU fan boost enablement on DPM disabled
  drm/amdgpu: Fix skipping hangged job reset during gpu recover.
  drm/amd/powerplay: revise Vega20 pptable version check
  drm/amd/display: set backlight level limit to 1
  drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1
  dt-bindings: drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1
  drm/bridge: ti-sn65dsi86: Remove the mystery delay
  drm/panel: simple: Add "no-hpd" delay for Innolux TV123WAM
  drm/panel: simple: Support panels with HPD where HPD isn't connected
  dt-bindings: drm/panel: simple: Add no-hpd property
  drm/edid: Add 6 bpc quirk for BOE panel.
  drm/amdgpu: fix reporting of failed msg sent to SMU (v2)
  drm/amdgpu: Fix compute ring 1.0.0 failure after reset
  drm/amdgpu: fix VM leaf walking
  drm/amdgpu: fix amdgpu_vm_fini
  drm/amd/powerplay: commonize the API for retrieving current clocks
  ...
2018-11-02 10:58:20 -07:00
Linus Torvalds
53b3b6bbfd drm pull for 4.20-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbz8FfAAoJEAx081l5xIa+t8AQAJ6KaMM7rYlRDzIr1Vuh0++2
 kFmfEQnSZnOWxO+zpyQpCJQr+/1aAHQ6+QzWq+/fX/bwH01H1Q4+z6t+4QoPqYlw
 rUYXZYRxnqGVPYx8hwNLmRbJcsXMVKly1SemBXIoabKkEtPNX5AZ4FXCR2WEbsV3
 /YLxsYQv2KMd8aoeC9Hupa07Jj9GfOtEo0a9B1hKmo+XiF9HqPadxaofqOpQ6MCh
 54itBgP7Kj4mwwr8KxG2JCNJagG5aG8q3yiEwaU5b0KzWya4o1wrOAJcBaEAIxpj
 JAgPde+f2L6w9dQbBBWeVFYKNn0jJqJdmbPs5Ek3i/NNFyx01Mn/3vlTZoRUqJN7
 TGXwOI/BWz1iTaHyFPqVH6RPQAoUUDeCwgHkXonogFxvQLpiFG+dRNqxue0XVUMX
 9tDSdZefWPoH3n9J/gDhwbV2Qbw/2n6yzCRYCb8HkqX1Y1JTmdYVgKvcnOwyYwsJ
 QzcVkWUJ31UAaZcTLCEW6SVqcUR0mso3LJAPSKp2NJiVLL8mSd/ViUTUbxRNkkXf
 H0abVGDjWAAZaT5uqNVqg4kV1Vc4Kj+/9QtspW4ktGezOz9DsctwJtfhTgOmT8Fx
 zlEwWmAbf1iJP9UgqI7r4+Nq24saqUYmIX0bowEasLIRO+l14Pf9mQJjgKRcMs/j
 SK4W5EreSFosKsxtQU4H
 =3yxi
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-10-24' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This is going to rebuild more than drm as it adds a new helper to
  list.h for doing bulk updates. Seemed like a reasonable addition to
  me.

  Otherwise the usual merge window stuff lots of i915 and amdgpu, not so
  much nouveau, and piles of everything else.

  Core:
   - Adds a new list.h helper for doing bulk list updates for TTM.
   - Don't leak fb address in smem_start to userspace (comes with EXPORT
     workaround for people using mali out of tree hacks)
   - udmabuf device to turn memfd regions into dma-buf
   - Per-plane blend mode property
   - ref/unref replacements with get/put
   - fbdev conflicting framebuffers code cleaned up
   - host-endian format variants
   - panel orientation quirk for Acer One 10

  bridge:
   - TI SN65DSI86 chip support

  vkms:
   - GEM support.
   - Cursor support

  amdgpu:
   - Merge amdkfd and amdgpu into one module
   - CEC over DP AUX support
   - Picasso APU support + VCN dynamic powergating
   - Raven2 APU support
   - Vega20 enablement + kfd support
   - ACP powergating improvements
   - ABGR/XBGR display support
   - VCN jpeg support
   - xGMI support
   - DC i2c/aux cleanup
   - Ycbcr 4:2:0 support
   - GPUVM improvements
   - Powerplay and powerplay endian fixes
   - Display underflow fixes

  vmwgfx:
   - Move vmwgfx specific TTM code to vmwgfx
   - Split out vmwgfx buffer/resource validation code
   - Atomic operation rework

  bochs:
   - use more helpers
   - format/byteorder improvements

  qxl:
   - use more helpers

  i915:
   - GGTT coherency getparam
   - Turn off resource streamer API
   - More Icelake enablement + DMC firmware
   - Full PPGTT for Ivybridge, Haswell and Valleyview
   - DDB distribution based on resolution
   - Limited range DP display support

  nouveau:
   - CEC over DP AUX support
   - Initial HDMI 2.0 support

  virtio-gpu:
   - vmap support for PRIME objects

  tegra:
   - Initial Tegra194 support
   - DMA/IOMMU integration fixes

  msm:
   - a6xx perf improvements + clock prefix
   - GPU preemption optimisations
   - a6xx devfreq support
   - cursor support

  rockchip:
   - PX30 support
   - rgb output interface support

  mediatek:
   - HDMI output support on mt2701 and mt7623

  rcar-du:
   - Interlaced modes on Gen3
   - LVDS on R8A77980
   - D3 and E3 SoC support

  hisilicon:
   - misc fixes

  mxsfb:
   - runtime pm support

  sun4i:
   - R40 TCON support
   - Allwinner A64 support
   - R40 HDMI support

  omapdrm:
   - Driver rework changing display pipeline ordering to use common code
   - DMM memory barrier and irq fixes
   - Errata workarounds

  exynos:
   - out-bridge support for LVDS bridge driver
   - Samsung 16x16 tiled format support
   - Plane alpha and pixel blend mode support

  tilcdc:
   - suspend/resume update

  mali-dp:
   - misc updates"

* tag 'drm-next-2018-10-24' of git://anongit.freedesktop.org/drm/drm: (1382 commits)
  firmware/dmc/icl: Add missing MODULE_FIRMWARE() for Icelake.
  drm/i915/icl: Fix signal_levels
  drm/i915/icl: Fix DDI/TC port clk_off bits
  drm/i915/icl: create function to identify combophy port
  drm/i915/gen9+: Fix initial readout for Y tiled framebuffers
  drm/i915: Large page offsets for pread/pwrite
  drm/i915/selftests: Disable shrinker across mmap-exhaustion
  drm/i915/dp: Link train Fallback on eDP only if fallback link BW can fit panel's native mode
  drm/i915: Fix intel_dp_mst_best_encoder()
  drm/i915: Skip vcpi allocation for MSTB ports that are gone
  drm/i915: Don't unset intel_connector->mst_port
  drm/i915: Only reset seqno if actually idle
  drm/i915: Use the correct crtc when sanitizing plane mapping
  drm/i915: Restore vblank interrupts earlier
  drm/i915: Check fb stride against plane max stride
  drm/amdgpu/vcn:Fix uninitialized symbol error
  drm: panel-orientation-quirks: Add quirk for Acer One 10 (S1003)
  drm/amd/amdgpu: Fix debugfs error handling
  drm/amdgpu: Update gc_9_0 golden settings.
  drm/amd/powerplay: update PPtable with DC BTC and Tvr SocLimit fields
  ...
2018-10-28 17:49:53 -07:00
Christian König
ca05359f1e dma-buf: allow reserving more than one shared fence slot
Let's support simultaneous submissions to multiple engines.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.kernel.org/patch/10626149/
2018-10-25 13:45:07 +02:00
Lyude Paul
7b0f61e91b drm/nouveau: Fix nv50_mstc->best_encoder()
As mentioned in the previous commit, we currently prevent new modesets
on recently-removed MST connectors by returning no encoder from our
->best_encoder() callback once the MST port has disappeared. This is
wrong however, because it prevents legacy modesetting users from being
able to disable CRTCs on MST connectors after the connector's respective
topology has disappeared.

So, fix this by instead by just always returning a valid encoder.

Changes since v2:
- Remove usage of atomic MST helper for now, since that got replaced
  with a much simpler solution

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-3-lyude@redhat.com
(cherry picked from commit e87b0bbc9f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-10-19 11:46:46 +03:00
Gustavo A. R. Silva
74a07c0a59 drm/nouveau/secboot/acr: fix memory leak
In case memory resources for *bl_desc* were allocated, release
them before return.

Addresses-Coverity-ID: 1472021 ("Resource leak")
Fixes: 0d46690155 ("drm/nouveau/secboot/acr: Remove VLA usage")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Ilia Mirkin
9340d77f53 drm/nouveau/disp: take sink support into account for exposing 594mhz
Scrambling is required for supporting any mode over 340MHz. If it's not
supported, reject any modes that would require it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Ilia Mirkin
7a406f8a62 drm/nouveau/disp: add support for setting scdc parameters for high modes
When SCDC is supported, make sure that we configure the GPU and monitor
to the same parameters.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Ilia Mirkin
a971558c29 drm/nouveau/disp: keep track of high-speed state, program into clock
The register programmed by the clock method needs to contain a different
setting for the link speed as well as special divider settings.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Ilia Mirkin
4834e05049 drm/nouveau/disp/gm200-: add scdc parameter setter
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Ilia Mirkin
4126b99e74 drm/nouveau/disp: add a way to configure scrambling/tmds for hdmi 2.0
High pixel clocks are required to use a 40 TMDS divider instead of 10,
and even low ones may optionally use scrambling depending on device
support.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Lyude Paul
cfea88a4d8 drm/nouveau: Start using new drm_dev initialization helpers
Per the documentation in drm_get_pci_dev(), this function is deprecated
and shouldn't be used anymore. As it turns out, we're going to need to
stop using drm_get_pci_dev() anyway in order to allow us to turn off the
card before full system shutdowns, otherwise we'll hit race conditions
with userspace while trying to tear down the card on shutdown.

So, start using drm_dev_get() and drm_dev_put(), and just turn our
load/unload callbacks into open coded init/fini() functions.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Lyude Paul
c4cee69a44 drm/nouveau: Fix potential memory leak in nouveau_drm_load()
We forget to free drm in all instances of failure, and additionally also
forget to destroy the master client if the other client fails
initialization.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Lyude Paul
e15e4c13e5 drm/nouveau: Refactor nvXX_backlight_init()
There's literally no difference between any of the backlight init
functions besides the backlight properties they set and the backlight
callbacks that they set, so move all of the duplicated backlight init
code out of there and into nouveau_backlight_init().

This gets rid of a lot of copy pasta!

Changes since v1:
- Some of the pre-refactor callbacks were storing nv_encoder in callback
  data for the backlight devices that they registered, as opposed to
  nouveau_drm. This got missed and caused some bugs that didn't
  originally appear on my setup (NULL kernel derefs) for some reason.
  So, fix this by finding the nouveau_encoder in
  nouveau_backlight_init(), and using that as the callback data for all
  gens instead even if they don't care about the encoder.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Jeffery Miller <jmiller@neverware.com>
Cc: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Lyude Paul
f76e174bd3 drm/nouveau: Cleanup indenting in nouveau_backlight.c
Still no functional changes.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Lyude Paul
a4e05f415e drm/nouveau/drm/nouveau: s/nouveau_backlight_exit/nouveau_backlight_fini/
More consistent with the rest of the codebase, no functional changes
here.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:10 +10:00
Lyude Paul
6d757753ce drm/nouveau: Move backlight device into nouveau_connector
Currently module unloading is broken in nouveau due to a rather annoying
race condition resulting from nouveau_backlight.c having gone a bit
stale over time:

[ 1960.791143] ==================================================================
[ 1960.791394] BUG: KASAN: use-after-free in nouveau_backlight_exit+0x112/0x150 [nouveau]
[ 1960.791460] Read of size 4 at addr ffff88075accf350 by task zsh/11185
[ 1960.791521]
[ 1960.791545] CPU: 7 PID: 11185 Comm: zsh Kdump: loaded Tainted: G           O      4.18.0Lyude-Test+ #4
[ 1960.791580] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET79W (1.52 ) 07/13/2018
[ 1960.791628] Call Trace:
[ 1960.791680]  dump_stack+0xa4/0xfd
[ 1960.791721]  print_address_description+0x71/0x239
[ 1960.791833]  ? nouveau_backlight_exit+0x112/0x150 [nouveau]
[ 1960.791877]  kasan_report.cold.6+0x242/0x2fe
[ 1960.791919]  __asan_report_load4_noabort+0x19/0x20
[ 1960.792012]  nouveau_backlight_exit+0x112/0x150 [nouveau]
[ 1960.792081]  nouveau_display_destroy+0x76/0x150 [nouveau]
[ 1960.792150]  nouveau_drm_device_fini+0xb7/0x190 [nouveau]
[ 1960.792265]  nouveau_drm_device_remove+0x14b/0x1d0 [nouveau]
[ 1960.792347]  ? nouveau_cli_work_queue+0x2e0/0x2e0 [nouveau]
[ 1960.792378]  ? trace_hardirqs_on_caller+0x38b/0x570
[ 1960.792406]  ? trace_hardirqs_on+0xd/0x10
[ 1960.792472]  nouveau_drm_remove+0x37/0x50 [nouveau]
[ 1960.792502]  pci_device_remove+0x112/0x2d0
[ 1960.792530]  ? pcibios_free_irq+0x10/0x10
[ 1960.792558]  ? kasan_check_write+0x14/0x20
[ 1960.792587]  device_release_driver_internal+0x35c/0x650
[ 1960.792617]  device_release_driver+0x12/0x20
[ 1960.792643]  pci_stop_bus_device+0x172/0x1e0
[ 1960.792671]  pci_stop_and_remove_bus_device_locked+0x1a/0x30
[ 1960.792715]  remove_store+0xcb/0xe0
[ 1960.792753]  ? sriov_numvfs_store+0x2e0/0x2e0
[ 1960.792779]  ? __lock_is_held+0xb5/0x140
[ 1960.792808]  ? component_add+0x530/0x530
[ 1960.792834]  dev_attr_store+0x3f/0x70
[ 1960.792859]  ? sysfs_file_ops+0x11d/0x170
[ 1960.792885]  sysfs_kf_write+0x104/0x150
[ 1960.792915]  ? sysfs_file_ops+0x170/0x170
[ 1960.792940]  kernfs_fop_write+0x24f/0x400
[ 1960.792978]  ? __lock_acquire+0x6ea/0x47f0
[ 1960.793021]  __vfs_write+0xeb/0x760
[ 1960.793048]  ? kernel_read+0x130/0x130
[ 1960.793076]  ? __lock_is_held+0xb5/0x140
[ 1960.793107]  ? rcu_read_lock_sched_held+0xdd/0x110
[ 1960.793135]  ? rcu_sync_lockdep_assert+0x78/0xb0
[ 1960.793162]  ? __sb_start_write+0x183/0x220
[ 1960.793189]  vfs_write+0x14d/0x4a0
[ 1960.793229]  ksys_write+0xd2/0x1b0
[ 1960.793255]  ? __ia32_sys_read+0xb0/0xb0
[ 1960.793298]  ? fput+0x1d/0x120
[ 1960.793324]  ? filp_close+0xf3/0x130
[ 1960.793349]  ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
[ 1960.793380]  __x64_sys_write+0x73/0xb0
[ 1960.793407]  do_syscall_64+0xaa/0x400
[ 1960.793433]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1960.793460] RIP: 0033:0x7f59df433164
[ 1960.793486] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 81 38 2d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[ 1960.793541] RSP: 002b:00007ffd70ee2fb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1960.793576] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f59df433164
[ 1960.793620] RDX: 0000000000000002 RSI: 00005578088640c0 RDI: 0000000000000001
[ 1960.793665] RBP: 00005578088640c0 R08: 00007f59df7038c0 R09: 00007f59e0995b80
[ 1960.793696] R10: 000000000000000a R11: 0000000000000246 R12: 00007f59df702760
[ 1960.793730] R13: 0000000000000002 R14: 00007f59df6fd760 R15: 0000000000000002
[ 1960.793768]
[ 1960.793790] Allocated by task 11167:
[ 1960.793816]  save_stack+0x43/0xd0
[ 1960.793841]  kasan_kmalloc+0xc4/0xe0
[ 1960.793880]  kasan_slab_alloc+0x11/0x20
[ 1960.793905]  kmem_cache_alloc+0xd7/0x270
[ 1960.793944]  getname_flags+0xbd/0x520
[ 1960.793969]  user_path_at_empty+0x23/0x50
[ 1960.793994]  do_faccessat+0x1fc/0x5d0
[ 1960.794018]  __x64_sys_access+0x59/0x80
[ 1960.794043]  do_syscall_64+0xaa/0x400
[ 1960.794067]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1960.794093]
[ 1960.794127] Freed by task 11167:
[ 1960.794152]  save_stack+0x43/0xd0
[ 1960.794190]  __kasan_slab_free+0x139/0x190
[ 1960.794215]  kasan_slab_free+0xe/0x10
[ 1960.794239]  kmem_cache_free+0xcb/0x2c0
[ 1960.794264]  putname+0xad/0xe0
[ 1960.794287]  filename_lookup.part.59+0x1f1/0x360
[ 1960.794313]  user_path_at_empty+0x3e/0x50
[ 1960.794338]  do_faccessat+0x1fc/0x5d0
[ 1960.794362]  __x64_sys_access+0x59/0x80
[ 1960.794393]  do_syscall_64+0xaa/0x400
[ 1960.794421]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1960.794461]
[ 1960.794483] The buggy address belongs to the object at ffff88075acceac0
[ 1960.794483]  which belongs to the cache names_cache of size 4096
[ 1960.794540] The buggy address is located 2192 bytes inside of
[ 1960.794540]  4096-byte region [ffff88075acceac0, ffff88075accfac0)
[ 1960.794581] The buggy address belongs to the page:
[ 1960.794609] page:ffffea001d6b3200 count:1 mapcount:0 mapping:ffff880778e4b1c0 index:0x0 compound_mapcount: 0
[ 1960.794651] flags: 0x8000000000008100(slab|head)
[ 1960.794679] raw: 8000000000008100 ffffea001d39e808 ffffea001d39ea08 ffff880778e4b1c0
[ 1960.794739] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[ 1960.794785] page dumped because: kasan: bad access detected
[ 1960.794813]
[ 1960.794834] Memory state around the buggy address:
[ 1960.794861]  ffff88075accf200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1960.794894]  ffff88075accf280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1960.794925] >ffff88075accf300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1960.794956]                                                  ^
[ 1960.794985]  ffff88075accf380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1960.795017]  ffff88075accf400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1960.795061] ==================================================================
[ 1960.795106] Disabling lock debugging due to kernel taint
[ 1960.795131] ------------[ cut here ]------------
[ 1960.795148] ida_remove called for id=1802201963 which is not allocated.
[ 1960.795193] WARNING: CPU: 7 PID: 11185 at lib/idr.c:521 ida_remove+0x184/0x210
[ 1960.795213] Modules linked in: nouveau(O) mxm_wmi ttm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm joydev vfat fat intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul iTCO_wdt psmouse wmi_bmof mei_me tpm_tis mei tpm_tis_core tpm i2c_i801 thinkpad_acpi pcc_cpufreq crc32c_intel serio_raw xhci_pci xhci_hcd wmi video i2c_dev i2c_core
[ 1960.795305] CPU: 7 PID: 11185 Comm: zsh Kdump: loaded Tainted: G    B      O      4.18.0Lyude-Test+ #4
[ 1960.795330] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET79W (1.52 ) 07/13/2018
[ 1960.795352] RIP: 0010:ida_remove+0x184/0x210
[ 1960.795370] Code: 4c 89 f7 e8 ae c8 00 00 eb 22 41 83 c4 02 4c 89 e8 41 83 fc 3f 0f 86 64 ff ff ff 44 89 fe 48 c7 c7 20 94 1e 83 e8 54 ed 81 fe <0f> 0b 48 b8 00 00 00 00 00 fc ff df 48 01 c3 c7 03 00 00 00 00 c7
[ 1960.795402] RSP: 0018:ffff88074d4df7b8 EFLAGS: 00010082
[ 1960.795421] RAX: 0000000000000000 RBX: 1ffff100e9a9befa RCX: ffffffff81479975
[ 1960.795440] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88077c1de690
[ 1960.795460] RBP: ffff88074d4df878 R08: ffffed00ef83bcd3 R09: ffffed00ef83bcd2
[ 1960.795479] R10: ffffed00ef83bcd2 R11: ffff88077c1de697 R12: 000000000000036b
[ 1960.795498] R13: 0000000000000202 R14: ffffffffa0aa7fa0 R15: 000000006b6b6b6b
[ 1960.795518] FS:  00007f59e0995b80(0000) GS:ffff88077c1c0000(0000) knlGS:0000000000000000
[ 1960.795553] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1960.795571] CR2: 00007f59e09a2010 CR3: 00000004a1a70005 CR4: 00000000003606e0
[ 1960.795596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1960.795629] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1960.795649] Call Trace:
[ 1960.795667]  ? ida_destroy+0x1d0/0x1d0
[ 1960.795686]  ? kasan_check_write+0x14/0x20
[ 1960.795704]  ? do_raw_spin_lock+0xc2/0x1c0
[ 1960.795724]  ida_simple_remove+0x26/0x40
[ 1960.795794]  nouveau_backlight_exit+0x9d/0x150 [nouveau]
[ 1960.795867]  nouveau_display_destroy+0x76/0x150 [nouveau]
[ 1960.795930]  nouveau_drm_device_fini+0xb7/0x190 [nouveau]
[ 1960.795989]  nouveau_drm_device_remove+0x14b/0x1d0 [nouveau]
[ 1960.796047]  ? nouveau_cli_work_queue+0x2e0/0x2e0 [nouveau]
[ 1960.796067]  ? trace_hardirqs_on_caller+0x38b/0x570
[ 1960.796089]  ? trace_hardirqs_on+0xd/0x10
[ 1960.796146]  nouveau_drm_remove+0x37/0x50 [nouveau]
[ 1960.796167]  pci_device_remove+0x112/0x2d0
[ 1960.796186]  ? pcibios_free_irq+0x10/0x10
[ 1960.796218]  ? kasan_check_write+0x14/0x20
[ 1960.796237]  device_release_driver_internal+0x35c/0x650
[ 1960.796257]  device_release_driver+0x12/0x20
[ 1960.796289]  pci_stop_bus_device+0x172/0x1e0
[ 1960.796308]  pci_stop_and_remove_bus_device_locked+0x1a/0x30
[ 1960.796328]  remove_store+0xcb/0xe0
[ 1960.796345]  ? sriov_numvfs_store+0x2e0/0x2e0
[ 1960.796364]  ? __lock_is_held+0xb5/0x140
[ 1960.796383]  ? component_add+0x530/0x530
[ 1960.796401]  dev_attr_store+0x3f/0x70
[ 1960.796419]  ? sysfs_file_ops+0x11d/0x170
[ 1960.796436]  sysfs_kf_write+0x104/0x150
[ 1960.796454]  ? sysfs_file_ops+0x170/0x170
[ 1960.796471]  kernfs_fop_write+0x24f/0x400
[ 1960.796488]  ? __lock_acquire+0x6ea/0x47f0
[ 1960.796520]  __vfs_write+0xeb/0x760
[ 1960.796538]  ? kernel_read+0x130/0x130
[ 1960.796556]  ? __lock_is_held+0xb5/0x140
[ 1960.796590]  ? rcu_read_lock_sched_held+0xdd/0x110
[ 1960.796608]  ? rcu_sync_lockdep_assert+0x78/0xb0
[ 1960.796626]  ? __sb_start_write+0x183/0x220
[ 1960.796648]  vfs_write+0x14d/0x4a0
[ 1960.796666]  ksys_write+0xd2/0x1b0
[ 1960.796684]  ? __ia32_sys_read+0xb0/0xb0
[ 1960.796701]  ? fput+0x1d/0x120
[ 1960.796732]  ? filp_close+0xf3/0x130
[ 1960.796749]  ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
[ 1960.796768]  __x64_sys_write+0x73/0xb0
[ 1960.796800]  do_syscall_64+0xaa/0x400
[ 1960.796818]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1960.796836] RIP: 0033:0x7f59df433164
[ 1960.796854] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 81 38 2d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[ 1960.796884] RSP: 002b:00007ffd70ee2fb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1960.796906] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f59df433164
[ 1960.796926] RDX: 0000000000000002 RSI: 00005578088640c0 RDI: 0000000000000001
[ 1960.796946] RBP: 00005578088640c0 R08: 00007f59df7038c0 R09: 00007f59e0995b80
[ 1960.796966] R10: 000000000000000a R11: 0000000000000246 R12: 00007f59df702760
[ 1960.796985] R13: 0000000000000002 R14: 00007f59df6fd760 R15: 0000000000000002
[ 1960.797008] irq event stamp: 509990
[ 1960.797026] hardirqs last  enabled at (509989): [<ffffffff8119ff78>] flush_work+0x4b8/0x6d0
[ 1960.797063] hardirqs last disabled at (509990): [<ffffffff8297c395>] _raw_spin_lock_irqsave+0x25/0x60
[ 1960.797085] softirqs last  enabled at (509744): [<ffffffff82c005ad>] __do_softirq+0x5ad/0x8c0
[ 1960.797121] softirqs last disabled at (509735): [<ffffffff8115aa15>] irq_exit+0x1a5/0x1e0
[ 1960.797142] ---[ end trace fb1342325f1846b8 ]---

While I haven't actually gone into the details of what's causing this to
happen (maybe the kernel removes the backlight device in the device core
before we get to it?), it doesn't really matter anyway because the way
nouveau handles backlights has long since been deprecated.

According to the documentation on the drm_connector->late_register()
hook, the ->late_register() hook should be used for adding extra
connector-related devices. Vice versa, the ->early_unregister() hook is
meant to be used for removing those devices.

So: gut nouveau_drm->bl_list and nouveau_drm->backlight, and replace
them with per-connector backlight structures. Additionally, move
backlight registration/teardown into the ->late_register() and
->early_unregister() hooks so that DRM can give us a chance to remove
the backlight before the connector is even removed. This appears to fix
the problem once and for all.

Changes since v2:
- Use NV_INFO_ONCE for printing GMUX information, since otherwise this
  will end up printing that message for as many times as we have
  connectors

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:09 +10:00
Lyude Paul
4c49707504 drm/nouveau: Add NV_PRINTK_ONCE and variants
Since we're about to use this in nouveau_backlight.c. Same thing as
DRM_WARN_ONCE, DRM_INFO_ONCE, etc...

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:09 +10:00
Lyude Paul
dc85491499 drm/nouveau: Check backlight IDs are >= 0, not > 0
Remember, ida IDs start at 0, not 1!

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-11 09:54:09 +10:00
Lyude Paul
e46368cf77 drm/nouveau/drm/nouveau: Grab runtime PM ref in nv50_mstc_detect()
While we currently grab a runtime PM ref in nouveau's normal connector
detection code, we apparently don't do this for MST. This means if we're
in a scenario where the GPU is suspended and userspace attempts to do a
connector probe on an MSTC connector, the probe will fail entirely due
to the DP aux channel and GPU not being woken up:

[  316.633489] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
[  316.635713] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
[  316.637785] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
...

So, grab a runtime PM ref here.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-10-05 16:43:15 +10:00
Dave Airlie
bf78296ab1 This is the 4.19-rc5 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlunyjMACgkQONu9yGCS
 aT52HhAA0JU7E88QPZ1gSxc1ifTaIlHXhLQSvQKAXOhIvHDwj4tEKDqPhpCN/dWX
 /o/xaUf36gU0VzUD/1IyEiMFmJEeFKnfvN5SZYZLk8uSrd4swqaY8mSueZxNEDz4
 YNK9ugI/tPztuuz7I6KrO1iVquY1WlnECxc9FH76wvHsit8Sr3PvzhR+CVrOi+8p
 k3cpWlhHiOzT/3K3Wv2Et+oh+U+myKtQTlJDSe3fMx5chksJpBmsV/IDEtsLNZfz
 3v25fHz5a3DOYqKkGJaDrbLyPNC85249B+CiXqbXvfOAHDVkMwYqcxYUG+YZ5cpm
 U0OShLXm67dz8vT9cxqOSguCliPRlM9W5+EKzmVT7l8+ycds3BuEEHg1xWPrJWgG
 7XO10HkhZl+VvnJCj54KaszMUOdpvdEQSUs82gAFxjPbQIx5gosN9O0H+DnirMhS
 6VtzS20ZoIzjd4YVkRoLNcobHB4bZVTNXZ1Zi3C/neP9pxUjhOk0y+Vr/crC5Xph
 3TykIMgiVa+CdvQ/f4LOSiCgTFhF0tLGtfDQTG7f+9+W5pMc4NKSLi8EOMlJtYEy
 wsCYZ7/T9ElgrEzFvlxSvDwiPUhcldNao/EGdRYvMxXtgj0Ctw8LhR/2YKkqo6LK
 oMoKKWkj0o7uKSHKq+dakS0FprKnBnvE2Y+XA4SO/saPGFlDAVc=
 =OFJh
 -----END PGP SIGNATURE-----

BackMerge v4.19-rc5 into drm-next

Sean Paul requested an -rc5 backmerge from some sun4i fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-09-27 11:06:46 +10:00
Ben Skeggs
3483f08106 drm/nouveau/devinit: fix warning when PMU/PRE_OS is missing
Messed up when sending pull request and sent an outdated version of
previous patch, this fixes it up to remove warnings.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-13 10:56:58 +10:00
Daniel Vetter
04cfcc7ab3 fbdev: Drop FBINFO_CAN_FORCE_OUTPUT flag
This was only added for the drm's fbdev emulation support, so that it
would try harder to show the Oops.

Unfortunately this never really worked reliably, and in practice ended
up pushing the real Oops off the screen due to plentyfull locking,
sleep-while-atomic and other issues. So we removed all that support
from the fbdev emulation a while back. Aside: We've also removed the
kgdb support, for similar reasons.

Since it's such a small patch I figured I don't split this up into the
usual 3-phase removal.

Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Alexander Kapshuk <alexander.kapshuk@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: David Lechner <david@lechnology.com>
Cc: nouveau@lists.freedesktop.org
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180822085405.10787-1-daniel.vetter@ffwll.ch
2018-09-11 14:11:01 +02:00
Daniel Vetter
d78aa65067 drm: Add drm/drm_util.h header file
We have a bunch of neat little macros all over the place which should
move to kernel.h. But some of them died in bikesheds on lkml, and we
need a decent home for them.

Start out by moving the for_each_if macro there.

v2: Rename to drm_util.h instead (Dave&Sean)

Cc: Sean Paul <seanpaul@chromium.org>
Acked-by: Sean Paul <seanpaul@chromium.org>
Cc: Dave Airlie <airlied@gmail.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180905135711.28370-1-daniel.vetter@ffwll.ch
2018-09-09 14:18:11 +02:00
Ben Skeggs
53b0cc46f2 drm/nouveau/disp/gm200-: enforce identity-mapped SOR assignment for LVDS/eDP panels
Fixes eDP backlight issues on more recent laptops.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:28 +10:00
Ben Skeggs
e04cfdc9b7 drm/nouveau/disp: fix DP disable race
If a HPD pulse signalling the need to retrain the link occurs between
the KMS driver releasing the output and the supervisor interrupt that
finishes the teardown, it was possible get a NULL-ptr deref.

Avoid this by marking the link as inactive earlier.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:28 +10:00
Ben Skeggs
f6d52b2172 drm/nouveau/disp: move eDP panel power handling
We need to do this earlier to prevent aux channel timeouts in resume
paths on certain systems.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:28 +10:00
Ben Skeggs
606557708f drm/nouveau/disp: remove unused struct member
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:28 +10:00
Ben Skeggs
0a6986c659 drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOS
This Falcon application doesn't appear to be present on some newer
systems, so let's not fail init if we can't find it.

TBD: is there a way to determine whether it *should* be there?

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:28 +10:00
Ben Skeggs
51ed833c88 drm/nouveau/mmu: don't attempt to dereference vmm without valid instance pointer
Fixes oopses in certain failure paths.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:28 +10:00
Ben Skeggs
a43b16dda2 drm/nouveau: fix oops in client init failure path
The NV_ERROR macro requires drm->client to be initialised, which it may not
be at this stage of the init process.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:27 +10:00
Lyude Paul
d5986a1c4d drm/nouveau: Fix nouveau_connector_ddc_detect()
It looks like that when we moved over to using
drm_connector_for_each_possible_encoder() in nouveau, that one rather
important part of this function got dropped by accident:

	/*          Right   v   here */
	for (i = 0; nv_encoder = NULL, i < DRM_CONNECTOR_MAX_ENCODER; i++) {
		int id = connector->encoder_ids[i];
		if (id == 0)
			break;

Since it's rather difficult to notice: the conditional in this loop is
actually:

	nv_encoder = NULL, i < DRM_CONNECTOR_MAX_ENCODER

Meaning that all early breaks result in nv_encoder keeping it's value,
otherwise nv_encoder = NULL. Ugh.

Since this got dropped, nouveau_connector_ddc_detect() now returns an
encoder for every single connector, regardless of whether or not it's
detected:

    [ 1780.056185] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for DP-2

So: fix this to ensure we only return an encoder if we actually found
one, and clean up the rest of the function while we're at it since it's
nearly impossible to read properly.

Changes since v1:
- Don't skip ddc probing for LVDS if we can't switch DDC through
  vga-switcheroo, just do the DDC probing without calling
  vga_switcheroo_lock_ddc() - skeggsb

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: ddba766dd0 ("drm/nouveau: Use drm_connector_for_each_possible_encoder()")
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:27 +10:00
Lyude Paul
2f7ca781fd drm/nouveau/drm/nouveau: Don't forget to cancel hpd_work on suspend/unload
Currently, there's nothing in nouveau that actually cancels this work
struct. So, cancel it on suspend/unload. Otherwise, if we're unlucky
enough hpd_work might try to keep running up until the system is
suspended.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:27 +10:00
Lyude Paul
79e765ad66 drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early
On most systems with ACPI hotplugging support, it seems that we always
receive a hotplug event once we re-enable EC interrupts even if the GPU
hasn't even been resumed yet.

This can cause problems since even though we schedule hpd_work to handle
connector reprobing for us, hpd_work synchronizes on
pm_runtime_get_sync() to wait until the device is ready to perform
reprobing. Since runtime suspend/resume callbacks are disabled before
the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
this period will grab a runtime PM ref and return immediately with
-EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
a connector reprobe immediately even if the GPU isn't actually resumed
just yet. This causes various warnings in dmesg and occasionally, also
prevents some displays connected to the dedicated GPU from coming back
up after suspend. Example:

usb 1-4: USB disconnect, device number 14
usb 1-4.1: USB disconnect, device number 15
WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
Workqueue: events nouveau_display_hpd_work [nouveau]
RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
FS:  0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? _cond_resched+0x15/0x30
 nouveau_connector_detect+0x2ce/0x520 [nouveau]
 ? _cond_resched+0x15/0x30
 ? ww_mutex_lock+0x12/0x40
 drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
 drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
 nouveau_display_hpd_work+0x2a/0x60 [nouveau]
 process_one_work+0x187/0x340
 worker_thread+0x2e/0x380
 ? pwq_unbound_release_workfn+0xd0/0xd0
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x35/0x40
Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
---[ end trace 55d811b38fc8e71a ]---

So, to fix this we attempt to grab a runtime PM reference in the ACPI
handler itself asynchronously. If the GPU is already awake (it will have
normal hotplugging at this point) or runtime PM callbacks are currently
disabled on the device, we drop our reference without updating the
autosuspend delay. We only schedule connector reprobes when we
successfully managed to queue up a resume request with our asynchronous
PM ref.

This also has the added benefit of preventing redundant connector
reprobes from ACPI while the GPU is runtime resumed!

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <kherbst@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:27 +10:00
Lyude Paul
fa3cdf8d0b drm/nouveau: Reset MST branching unit before enabling
When probing a new MST device, it's not safe to make any assumptions
about it's current state. While most well mannered MST hubs will just
disable the branching unit on hotplug disconnects, this isn't enough to
save us from various other scenarios that might have resulted in
something writing to the MST branching unit before we got control of it.
This could happen if a previous probe we tried failed, if we're booting
in kexec context and the hub is still in the state the last kernel put
it in, etc.

Luckily; there is no reason we can't just reset the branching unit
every time we enable a new topology. So, fix this by resetting it on
enabling new topologies to ensure that we always start off with a clean,
unmodified topology state on MST sinks.

This fixes occasional hard-lockups on my P50's laptop dock (e.g. AUX
times out all DPCD trasactions) observed after multiple docks, undocks,
and module reloads.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:27 +10:00
Lyude Paul
b26b4590dd drm/nouveau: Only write DP_MSTM_CTRL when needed
Currently, nouveau will re-write the DP_MSTM_CTRL register for an MST
hub every time it receives a long HPD pulse on DP. This isn't actually
necessary and additionally, has some unintended side effects.

With the P50 I've got here, rewriting DP_MSTM_CTRL constantly seems to
make it rather likely (1 out of 5 times usually) that bringing up MST
with it's ThinkPad dock will fail and result in sideband messages timing
out in the middle. Afterwards, successive probes don't manage to get the
dock to communicate properly over MST sideband properly.

Many times sideband message timeouts from MST hubs are indicative of
either the source or the sink dropping an ESI event, which can cause
DRM's perspective of the topology's current state to go out of sync with
reality. While it's tough to really know for sure what's happening to
the dock, using userspace tools to write to DP_MSTM_CTRL in the middle
of the MST link probing process does appear to make things flaky. It's
possible that when we write to DP_MSTM_CTRL, the function that gets
triggered to respond in the dock's firmware temporarily puts it in a
state where it might end up not reporting an ESI to the source, or ends
up dropping a sideband message we sent it.

So, to fix this we make it so that when probing an MST topology, we
respect it's current state. If the dock's already enabled, we simply
read DP_MSTM_CTRL and disable the topology if it's value is not what we
expected. Otherwise, we perform the normal MST probing dance. We avoid
taking any action except if the state of the MST topology actually
changes.

This fixes MST sideband message timeouts and detection failures on my
P50 with its ThinkPad dock.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
7326ead982 drm/nouveau: Remove useless poll_enable() call in drm_load()
Again, this doesn't do anything. drm_kms_helper_poll_enable() will have
already been called in nouveau_display_init()

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
0d7b2d4def drm/nouveau: Remove useless poll_disable() call in switcheroo_set_state()
This won't do anything but potentially make us miss hotplugs. We already
call drm_kms_helper_poll_disable() in
nouveau_pmops_suspend()->nouveau_display_suspend()->nouveau_display_fini()

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
0445f7537d drm/nouveau: Remove useless poll_enable() call in switcheroo_set_state()
This doesn't do anything, drm_kms_helper_poll_enable() gets called in
nouveau_pmops_resume()->nouveau_display_resume()->nouveau_display_init()
already.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
3e1a12754d drm/nouveau: Fix deadlocks in nouveau_connector_detect()
When we disable hotplugging on the GPU, we need to be able to
synchronize with each connector's hotplug interrupt handler before the
interrupt is finally disabled. This can be a problem however, since
nouveau_connector_detect() currently grabs a runtime power reference
when handling connector probing. This will deadlock the runtime suspend
handler like so:

[  861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds.
[  861.483290]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.486332] kworker/0:2     D    0    61      2 0x80000000
[  861.487044] Workqueue: events nouveau_display_hpd_work [nouveau]
[  861.487737] Call Trace:
[  861.488394]  __schedule+0x322/0xaf0
[  861.489070]  schedule+0x33/0x90
[  861.489744]  rpm_resume+0x19c/0x850
[  861.490392]  ? finish_wait+0x90/0x90
[  861.491068]  __pm_runtime_resume+0x4e/0x90
[  861.491753]  nouveau_display_hpd_work+0x22/0x60 [nouveau]
[  861.492416]  process_one_work+0x231/0x620
[  861.493068]  worker_thread+0x44/0x3a0
[  861.493722]  kthread+0x12b/0x150
[  861.494342]  ? wq_pool_ids_show+0x140/0x140
[  861.494991]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.495648]  ret_from_fork+0x3a/0x50
[  861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds.
[  861.496968]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.498341] kworker/6:2     D    0   320      2 0x80000080
[  861.499045] Workqueue: pm pm_runtime_work
[  861.499739] Call Trace:
[  861.500428]  __schedule+0x322/0xaf0
[  861.501134]  ? wait_for_completion+0x104/0x190
[  861.501851]  schedule+0x33/0x90
[  861.502564]  schedule_timeout+0x3a5/0x590
[  861.503284]  ? mark_held_locks+0x58/0x80
[  861.503988]  ? _raw_spin_unlock_irq+0x2c/0x40
[  861.504710]  ? wait_for_completion+0x104/0x190
[  861.505417]  ? trace_hardirqs_on_caller+0xf4/0x190
[  861.506136]  ? wait_for_completion+0x104/0x190
[  861.506845]  wait_for_completion+0x12c/0x190
[  861.507555]  ? wake_up_q+0x80/0x80
[  861.508268]  flush_work+0x1c9/0x280
[  861.508990]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[  861.509735]  nvif_notify_put+0xb1/0xc0 [nouveau]
[  861.510482]  nouveau_display_fini+0xbd/0x170 [nouveau]
[  861.511241]  nouveau_display_suspend+0x67/0x120 [nouveau]
[  861.511969]  nouveau_do_suspend+0x5e/0x2d0 [nouveau]
[  861.512715]  nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau]
[  861.513435]  pci_pm_runtime_suspend+0x6b/0x180
[  861.514165]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.514897]  __rpm_callback+0x7a/0x1d0
[  861.515618]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.516313]  rpm_callback+0x24/0x80
[  861.517027]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.517741]  rpm_suspend+0x142/0x6b0
[  861.518449]  pm_runtime_work+0x97/0xc0
[  861.519144]  process_one_work+0x231/0x620
[  861.519831]  worker_thread+0x44/0x3a0
[  861.520522]  kthread+0x12b/0x150
[  861.521220]  ? wq_pool_ids_show+0x140/0x140
[  861.521925]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.522622]  ret_from_fork+0x3a/0x50
[  861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds.
[  861.523977]       Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.525349] kworker/6:0     D    0  1329      2 0x80000000
[  861.526073] Workqueue: events nvif_notify_work [nouveau]
[  861.526751] Call Trace:
[  861.527411]  __schedule+0x322/0xaf0
[  861.528089]  schedule+0x33/0x90
[  861.528758]  rpm_resume+0x19c/0x850
[  861.529399]  ? finish_wait+0x90/0x90
[  861.530073]  __pm_runtime_resume+0x4e/0x90
[  861.530798]  nouveau_connector_detect+0x7e/0x510 [nouveau]
[  861.531459]  ? ww_mutex_lock+0x47/0x80
[  861.532097]  ? ww_mutex_lock+0x47/0x80
[  861.532819]  ? drm_modeset_lock+0x88/0x130 [drm]
[  861.533481]  drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper]
[  861.534127]  drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper]
[  861.534940]  nouveau_connector_hotplug+0x98/0x120 [nouveau]
[  861.535556]  nvif_notify_work+0x2d/0xb0 [nouveau]
[  861.536221]  process_one_work+0x231/0x620
[  861.536994]  worker_thread+0x44/0x3a0
[  861.537757]  kthread+0x12b/0x150
[  861.538463]  ? wq_pool_ids_show+0x140/0x140
[  861.539102]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.539815]  ret_from_fork+0x3a/0x50
[  861.540521]
               Showing all locks held in the system:
[  861.541696] 2 locks held by kworker/0:2/61:
[  861.542406]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.543071]  #1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620
[  861.543814] 1 lock held by khungtaskd/64:
[  861.544535]  #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[  861.545160] 3 locks held by kworker/6:2/320:
[  861.545896]  #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.546702]  #1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[  861.547443]  #2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau]
[  861.548146] 1 lock held by dmesg/983:
[  861.548889] 2 locks held by zsh/1250:
[  861.549605]  #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[  861.550393]  #1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[  861.551122] 6 locks held by kworker/6:0/1329:
[  861.551957]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.552765]  #1: 00000000ddb499ad ((work_completion)(&notify->work)#2){+.+.}, at: process_one_work+0x1b3/0x620
[  861.553582]  #2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper]
[  861.554357]  #3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper]
[  861.555227]  #4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper]
[  861.556133]  #5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm]

[  861.557864] =============================================

[  861.559507] NMI backtrace for cpu 2
[  861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G           O      4.18.0-rc6Lyude-Test+ #1
[  861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[  861.561948] Call Trace:
[  861.562757]  dump_stack+0x8e/0xd3
[  861.563516]  nmi_cpu_backtrace.cold.3+0x14/0x5a
[  861.564269]  ? lapic_can_unplug_cpu.cold.27+0x42/0x42
[  861.565029]  nmi_trigger_cpumask_backtrace+0xa1/0xae
[  861.565789]  arch_trigger_cpumask_backtrace+0x19/0x20
[  861.566558]  watchdog+0x316/0x580
[  861.567355]  kthread+0x12b/0x150
[  861.568114]  ? reset_hung_task_detector+0x20/0x20
[  861.568863]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.569598]  ret_from_fork+0x3a/0x50
[  861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7:
[  861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
[  861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
[  861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
[  861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
[  861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
[  861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
[  861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120
[  861.572428] Kernel panic - not syncing: hung_task: blocked tasks

So: fix this by making it so that normal hotplug handling /only/ happens
so long as the GPU is currently awake without any pending runtime PM
requests. In the event that a hotplug occurs while the device is
suspending or resuming, we can simply defer our response until the GPU
is fully runtime resumed again.

Changes since v4:
- Use a new trick I came up with using pm_runtime_get() instead of the
  hackish junk we had before

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
6833fb1ec1 drm/nouveau/drm/nouveau: Use pm_runtime_get_noresume() in connector_detect()
It's true we can't resume the device from poll workers in
nouveau_connector_detect(). We can however, prevent the autosuspend
timer from elapsing immediately if it hasn't already without risking any
sort of deadlock with the runtime suspend/resume operations. So do that
instead of entirely avoiding grabbing a power reference.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
7fec8f5379 drm/nouveau/drm/nouveau: Fix deadlock with fb_helper with async RPM requests
Currently, nouveau uses the generic drm_fb_helper_output_poll_changed()
function provided by DRM as it's output_poll_changed callback.
Unfortunately however, this function doesn't grab runtime PM references
early enough and even if it did-we can't block waiting for the device to
resume in output_poll_changed() since it's very likely that we'll need
to grab the fb_helper lock at some point during the runtime resume
process. This currently results in deadlocking like so:

[  246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
[  246.673398]       Not tainted 4.18.0-rc5Lyude-Test+ #2
[  246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.676527] kworker/4:0     D    0    37      2 0x80000000
[  246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
[  246.678704] Call Trace:
[  246.679753]  __schedule+0x322/0xaf0
[  246.680916]  schedule+0x33/0x90
[  246.681924]  schedule_preempt_disabled+0x15/0x20
[  246.683023]  __mutex_lock+0x569/0x9a0
[  246.684035]  ? kobject_uevent_env+0x117/0x7b0
[  246.685132]  ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[  246.686179]  mutex_lock_nested+0x1b/0x20
[  246.687278]  ? mutex_lock_nested+0x1b/0x20
[  246.688307]  drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[  246.689420]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[  246.690462]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[  246.691570]  output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[  246.692611]  process_one_work+0x231/0x620
[  246.693725]  worker_thread+0x214/0x3a0
[  246.694756]  kthread+0x12b/0x150
[  246.695856]  ? wq_pool_ids_show+0x140/0x140
[  246.696888]  ? kthread_create_worker_on_cpu+0x70/0x70
[  246.697998]  ret_from_fork+0x3a/0x50
[  246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[  246.700153]       Not tainted 4.18.0-rc5Lyude-Test+ #2
[  246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.702278] kworker/0:1     D    0    60      2 0x80000000
[  246.703293] Workqueue: pm pm_runtime_work
[  246.704393] Call Trace:
[  246.705403]  __schedule+0x322/0xaf0
[  246.706439]  ? wait_for_completion+0x104/0x190
[  246.707393]  schedule+0x33/0x90
[  246.708375]  schedule_timeout+0x3a5/0x590
[  246.709289]  ? mark_held_locks+0x58/0x80
[  246.710208]  ? _raw_spin_unlock_irq+0x2c/0x40
[  246.711222]  ? wait_for_completion+0x104/0x190
[  246.712134]  ? trace_hardirqs_on_caller+0xf4/0x190
[  246.713094]  ? wait_for_completion+0x104/0x190
[  246.713964]  wait_for_completion+0x12c/0x190
[  246.714895]  ? wake_up_q+0x80/0x80
[  246.715727]  ? get_work_pool+0x90/0x90
[  246.716649]  flush_work+0x1c9/0x280
[  246.717483]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[  246.718442]  __cancel_work_timer+0x146/0x1d0
[  246.719247]  cancel_delayed_work_sync+0x13/0x20
[  246.720043]  drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
[  246.721123]  nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
[  246.721897]  pci_pm_runtime_suspend+0x6b/0x190
[  246.722825]  ? pci_has_legacy_pm_support+0x70/0x70
[  246.723737]  __rpm_callback+0x7a/0x1d0
[  246.724721]  ? pci_has_legacy_pm_support+0x70/0x70
[  246.725607]  rpm_callback+0x24/0x80
[  246.726553]  ? pci_has_legacy_pm_support+0x70/0x70
[  246.727376]  rpm_suspend+0x142/0x6b0
[  246.728185]  pm_runtime_work+0x97/0xc0
[  246.728938]  process_one_work+0x231/0x620
[  246.729796]  worker_thread+0x44/0x3a0
[  246.730614]  kthread+0x12b/0x150
[  246.731395]  ? wq_pool_ids_show+0x140/0x140
[  246.732202]  ? kthread_create_worker_on_cpu+0x70/0x70
[  246.732878]  ret_from_fork+0x3a/0x50
[  246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
[  246.734587]       Not tainted 4.18.0-rc5Lyude-Test+ #2
[  246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.736113] kworker/4:2     D    0   422      2 0x80000080
[  246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[  246.737665] Call Trace:
[  246.738490]  __schedule+0x322/0xaf0
[  246.739250]  schedule+0x33/0x90
[  246.739908]  rpm_resume+0x19c/0x850
[  246.740750]  ? finish_wait+0x90/0x90
[  246.741541]  __pm_runtime_resume+0x4e/0x90
[  246.742370]  nv50_disp_atomic_commit+0x31/0x210 [nouveau]
[  246.743124]  drm_atomic_commit+0x4a/0x50 [drm]
[  246.743775]  restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
[  246.744603]  restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
[  246.745373]  drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[  246.746220]  drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[  246.746884]  drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
[  246.747675]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[  246.748544]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[  246.749439]  nv50_mstm_hotplug+0x15/0x20 [nouveau]
[  246.750111]  drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
[  246.750764]  drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
[  246.751602]  drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
[  246.752314]  process_one_work+0x231/0x620
[  246.752979]  worker_thread+0x44/0x3a0
[  246.753838]  kthread+0x12b/0x150
[  246.754619]  ? wq_pool_ids_show+0x140/0x140
[  246.755386]  ? kthread_create_worker_on_cpu+0x70/0x70
[  246.756162]  ret_from_fork+0x3a/0x50
[  246.756847]
           Showing all locks held in the system:
[  246.758261] 3 locks held by kworker/4:0/37:
[  246.759016]  #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  246.759856]  #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
[  246.760670]  #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[  246.761516] 2 locks held by kworker/0:1/60:
[  246.762274]  #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[  246.762982]  #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[  246.763890] 1 lock held by khungtaskd/64:
[  246.764664]  #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[  246.765588] 5 locks held by kworker/4:2/422:
[  246.766440]  #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
[  246.767390]  #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
[  246.768154]  #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
[  246.768966]  #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
[  246.769921]  #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
[  246.770839] 1 lock held by dmesg/1038:
[  246.771739] 2 locks held by zsh/1172:
[  246.772650]  #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[  246.773680]  #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870

[  246.775522] =============================================

After trying dozens of different solutions, I found one very simple one
that should also have the benefit of preventing us from having to fight
locking for the rest of our lives. So, we work around these deadlocks by
deferring all fbcon hotplug events that happen after the runtime suspend
process starts until after the device is resumed again.

Changes since v7:
 - Fixup commit message - Daniel Vetter

Changes since v6:
 - Remove unused nouveau_fbcon_hotplugged_in_suspend() - Ilia

Changes since v5:
 - Come up with the (hopefully final) solution for solving this dumb
   problem, one that is a lot less likely to cause issues with locking in
   the future. This should work around all deadlock conditions with fbcon
   brought up thus far.

Changes since v4:
 - Add nouveau_fbcon_hotplugged_in_suspend() to workaround deadlock
   condition that Lukas described
 - Just move all of this out of drm_fb_helper. It seems that other DRM
   drivers have already figured out other workarounds for this. If other
   drivers do end up needing this in the future, we can just move this
   back into drm_fb_helper again.

Changes since v3:
- Actually check if fb_helper is NULL in both new helpers
- Actually check drm_fbdev_emulation in both new helpers
- Don't fire off a fb_helper hotplug unconditionally; only do it if
  the following conditions are true (as otherwise, calling this in the
  wrong spot will cause Bad Things to happen):
  - fb_helper hotplug handling was actually inhibited previously
  - fb_helper actually has a delayed hotplug pending
  - fb_helper is actually bound
  - fb_helper is actually initialized
- Add __must_check to drm_fb_helper_suspend_hotplug(). There's no
  situation where a driver would actually want to use this without
  checking the return value, so enforce that
- Rewrite and clarify the documentation for both helpers.
- Make sure to return true in the drm_fb_helper_suspend_hotplug() stub
  that's provided in drm_fb_helper.h when CONFIG_DRM_FBDEV_EMULATION
  isn't enabled
- Actually grab the toplevel fb_helper lock in
  drm_fb_helper_resume_hotplug(), since it's possible other activity
  (such as a hotplug) could be going on at the same time the driver
  calls drm_fb_helper_resume_hotplug(). We need this to check whether or
  not drm_fb_helper_hotplug_event() needs to be called anyway

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
611ce85542 drm/nouveau: Remove duplicate poll_enable() in pmops_runtime_suspend()
Since actual hotplug notifications don't get disabled until
nouveau_display_fini() is called, all this will do is cause any hotplugs
that happen between this drm_kms_helper_poll_disable() call and the
actual hotplug disablement to potentially be dropped if ACPI isn't
around to help us.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Lyude Paul
d77ef138ff drm/nouveau/drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement
Turns out this part is my fault for not noticing when reviewing
9a2eba337c ("drm/nouveau: Fix drm poll_helper handling"). Currently
we call drm_kms_helper_poll_enable() from nouveau_display_hpd_work().
This makes basically no sense however, because that means we're calling
drm_kms_helper_poll_enable() every time we schedule the hotplug
detection work. This is also against the advice mentioned in
drm_kms_helper_poll_enable()'s documentation:

 Note that calls to enable and disable polling must be strictly ordered,
 which is automatically the case when they're only call from
 suspend/resume callbacks.

Of course, hotplugs can't really be ordered. They could even happen
immediately after we called drm_kms_helper_poll_disable() in
nouveau_display_fini(), which can lead to all sorts of issues.

Additionally; enabling polling /after/ we call
drm_helper_hpd_irq_event() could also mean that we'd miss a hotplug
event anyway, since drm_helper_hpd_irq_event() wouldn't bother trying to
probe connectors so long as polling is disabled.

So; simply move this back into nouveau_display_init() again. The race
condition that both of these patches attempted to work around has
already been fixed properly in

  d61a5c1063 ("drm/nouveau: Fix deadlock on runtime suspend")

Fixes: 9a2eba337c ("drm/nouveau: Fix drm poll_helper handling")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-07 06:54:26 +10:00
Gerd Hoffmann
0e94043ee1 drm: replace DRIVER_PREFER_XBGR_30BPP driver flag with mode_config quirk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20180905060445.15008-2-kraxel@redhat.com
2018-09-06 08:40:17 +02:00
Hans Verkuil
46094b2bae drm/nouveau: add DisplayPort CEC-Tunneling-over-AUX support
Add DisplayPort CEC-Tunneling-over-AUX support to nouveau.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/5c0b907d-0bf2-7b80-b4b6-cbde78b03f0d@xs4all.nl
2018-08-31 10:20:39 +02:00
Sean Paul
bc537a9cc4 Merge drm/drm-next into drm-misc-next
Now that 4.19-rc1 is cut, backmerge it into -misc-next.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
2018-08-27 10:00:03 -04:00
Daniel Vetter
409254281f drm/nouveau: Remove unecessary dma_fence_ops
dma_fence_default_wait is the default now.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: nouveau@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20180704092909.6599-4-daniel.vetter@ffwll.ch
2018-08-17 11:22:35 +02:00
Dave Airlie
3fce461827 BackMerge v4.18-rc7 into drm-next
rmk requested this for armada and I think we've had a few
conflicts build up.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-07-30 10:39:22 +10:00
Dave Airlie
294f96ae8a drm-misc-next for 4.19:
Core Changes:
 - add support for DisplayPort CEC-Tunneling-over-AUX (Hans Verkuil)
 - more doc updates (Daniel Vetter)
 - fourcc: Add is_yuv field to drm_format_info (Ayan Kumar Halder)
 - dma-buf: correctly place BUG_ON (Michel Dänzer)
 
 Driver Changes:
 - more vkms support(Rodrigo Siqueira)
 - many fixes and small improments to all drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbT52JAAoJEEN0HIUfOBk06UsQAIy5YwUQ9l+8GdS5bKU299KW
 ZMMi0pTgB/bg0uuqGqN1zf23kpyRTNBGu2UMZgHWTcM4gjTP9qxb5GPFyOhr5he4
 pkp0p13fcn85Mkpt6ZQQD4ErMnhJSodzPRRT+ypnM+HzcWWehQOnSbLWCTOpaCeg
 5SsSFT7RfpDcICXzZZKAHFwHAp1y1y6V027RWu0/amUTwoZPn+ktU/s0thGIdqFk
 EGb/dP4K0PAHE4ZnhZOHPFlYbVQWp0J8X7+NmkXvPgwVPahLvKbNMBfG9M3mGcku
 cMwW8phngd0ih9gd1rblG3J8pdISArg6EgqAwwUV6p8tHUBQff5mL/RTh5zrUs6D
 seLqzRM4V74WDp2meMSYogISo2b+39RiL1IhayTytdW/oaterXloSChAwKUz4pi/
 Nj3/Kn59m9DH9NoAh3DYvDg+e06U9csR6TUJZ0B6BlXIwju9/QLybsDbUdmjtSW+
 yqttEs8m4k2gB2ZRo9y2RVi/XCNv0t+GYa2HQcTGrYVZpIxKioT6WdnlobQZ6L2E
 9CClacN6v2e27cQUbZEFuU7phUkM/nw18dROFrIwJ0OxsA5nElO1DTiOy+KDwzAU
 E+l4DqZZknyxEfTxUq79+9J2HmhqA7ikQbgNJMQyQ25iRFrkvYsI7XfF4ix5z+a5
 I0/CkPP3UsTibnVhM7wn
 =HyBh
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2018-07-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 4.19:

Core Changes:
- add support for DisplayPort CEC-Tunneling-over-AUX (Hans Verkuil)
- more doc updates (Daniel Vetter)
- fourcc: Add is_yuv field to drm_format_info (Ayan Kumar Halder)
- dma-buf: correctly place BUG_ON (Michel Dänzer)

Driver Changes:
- more vkms support(Rodrigo Siqueira)
- many fixes and small improments to all drivers

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180718200826.GA20165@juma
2018-07-20 10:46:49 +10:00