linux/drivers/gpu/drm/ttm
Anthony DeRossi ca63d76fd2 drm/ttm: Fix TTM page pool accounting
Freed pages are not subtracted from the allocated_pages counter in
ttm_pool_type_fini(), causing a leak in the count on device removal.
The next shrinker invocation loops forever trying to free pages that are
no longer in the pool:

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu:  3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237
    (t=10001 jiffies g=2194533 q=49211)
  NMI backtrace for cpu 3
  CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P           O      5.11.0-com #1
  Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
  Call Trace:
   <IRQ>
   ...
   </IRQ>
   sysvec_apic_timer_interrupt+0x77/0x80
   asm_sysvec_apic_timer_interrupt+0x12/0x20
  RIP: 0010:mutex_unlock+0x16/0x20
  Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10
  RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246
  RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000
  RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0
  RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80
  R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60
   ttm_pool_shrink+0x7d/0x90 [ttm]
   ttm_pool_shrinker_scan+0x5/0x20 [ttm]
   do_shrink_slab+0x13a/0x1a0
...

debugfs shows the incorrect total:

  $ cat /sys/kernel/debug/dri/0/ttm_page_pool
            --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10---
  wc      :        0        0        0        0        0        0        0        0        0        0        0
  uc      :        0        0        0        0        0        0        0        0        0        0        0
  wc 32   :        0        0        0        0        0        0        0        0        0        0        0
  uc 32   :        0        0        0        0        0        0        0        0        0        0        0
  DMA uc  :        0        0        0        0        0        0        0        0        0        0        0
  DMA wc  :        0        0        0        0        0        0        0        0        0        0        0
  DMA     :        0        0        0        0        0        0        0        0        0        0        0

  total   :     3029 of  8244261

Using ttm_pool_type_take() to remove pages from the pool before freeing
them correctly accounts for the freed pages.

Fixes: d099fc8f54 ("drm/ttm: new TT backend allocation pool v3")
Signed-off-by: Anthony DeRossi <ajderossi@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderossi@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-03-11 11:11:33 +01:00
..
Makefile drm/ttm: nuke old page allocator 2020-10-29 15:57:57 +01:00
ttm_agp_backend.c drm/ttm/drivers: remove unecessary ttm_module.h include v2 2020-12-01 17:43:46 +01:00
ttm_bo_util.c drm/ttm: cleanup BO size handling v3 2020-12-14 14:20:46 +01:00
ttm_bo_vm.c drm/ttm: cleanup BO size handling v3 2020-12-14 14:20:46 +01:00
ttm_bo.c drm/ttm: soften TTM warnings 2021-03-11 11:11:33 +01:00
ttm_execbuf_util.c drm/ttm: cleanup LRU handling further 2020-12-15 17:01:55 +01:00
ttm_memory.c drm/ttm/drivers: remove unecessary ttm_module.h include v2 2020-12-01 17:43:46 +01:00
ttm_module.c drm/ttm/drivers: remove unecessary ttm_module.h include v2 2020-12-01 17:43:46 +01:00
ttm_module.h drm/ttm/drivers: remove unecessary ttm_module.h include v2 2020-12-01 17:43:46 +01:00
ttm_pool.c drm/ttm: Fix TTM page pool accounting 2021-03-11 11:11:33 +01:00
ttm_range_manager.c drm/ttm/drivers: remove unecessary ttm_module.h include v2 2020-12-01 17:43:46 +01:00
ttm_resource.c drm/ttm: replace context flags with bools v2 2020-11-04 11:23:25 +01:00
ttm_tt.c drm/ttm: cleanup BO size handling v3 2020-12-14 14:20:46 +01:00