linux/drivers/staging
Thomas Zimmermann 8c30e2d81b fbdev: Don't sort deferred-I/O pages by default
Fbdev's deferred I/O sorts all dirty pages by default, which incurs a
significant overhead. Make the sorting step optional and update the few
drivers that require it. Use a FIFO list by default.

Most fbdev drivers with deferred I/O build a bounding rectangle around
the dirty pages or simply flush the whole screen. The only two affected
DRM drivers, generic fbdev and vmwgfx, both use a bounding rectangle.
In those cases, the exact order of the pages doesn't matter. The other
drivers look at the page index or handle pages one-by-one. The patch
sets the sort_pagelist flag for those, even though some of them would
probably work correctly without sorting. Driver maintainers should update
their drivers accordingly.
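
To illustrate why page order is irrelevant for the bounding-rectangle
drivers, here is a minimal user-space sketch. It is not taken from any
driver: struct damage_rect, bounding_rect() and PAGE_SZ are hypothetical
names, and the sketch assumes 4096-byte pages and damage that spans full
scanlines.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define PAGE_SZ 4096UL /* assumed page size in bytes */

/* Hypothetical damage rectangle covering full-width scanlines. */
struct damage_rect {
	unsigned int y1; /* first dirty scanline (inclusive) */
	unsigned int y2; /* one past the last dirty scanline */
};

/*
 * Build a bounding rectangle around a set of dirty pages, given as byte
 * offsets into the framebuffer. Only the smallest start offset and the
 * largest end offset matter, so the pages may arrive in any order.
 */
static struct damage_rect bounding_rect(const unsigned long *page_offsets,
					size_t npages, unsigned int pitch)
{
	unsigned long min_off = ULONG_MAX;
	unsigned long max_off = 0;
	struct damage_rect r = { 0, 0 };
	size_t i;

	assert(pitch > 0);
	if (npages == 0)
		return r; /* nothing dirty, empty rectangle */

	for (i = 0; i < npages; i++) {
		unsigned long start = page_offsets[i];
		unsigned long end = start + PAGE_SZ;

		if (start < min_off)
			min_off = start;
		if (end > max_off)
			max_off = end;
	}

	r.y1 = (unsigned int)(min_off / pitch);
	r.y2 = (unsigned int)((max_off + pitch - 1) / pitch); /* round up */
	return r;
}
```

Calling bounding_rect() with the same page offsets in a different order
yields the same rectangle, which is why these drivers do not need a sorted
list.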

Sorting pages by memory offset for deferred I/O performs an implicit
insertion-sort step on the list of dirty pages. The algorithm goes through
the list of dirty pages and inserts each new page according to its
index field. Even worse, list traversal always starts at the first
entry. As video memory is most likely updated scanline by scanline, the
algorithm traverses through the complete list for each updated page.

For example, with 1024x768x32bpp each page covers exactly one scanline.
Writing a single screen update from top to bottom requires updating
768 pages. With an average list length of 384 entries, a screen update
creates (768 * 384 =) 294912 compare operations.
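
The estimate can be checked with a small user-space model. This is a
sketch under simplifying assumptions, not the kernel code: the list is an
array of page indices, and struct page_list, insert_sorted() and
count_compares() are hypothetical names. In this model, a top-to-bottom
update of 768 pages performs 0 + 1 + ... + 767 = 294528 compares, which
matches the 768 * 384 estimate up to rounding of the average list length.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES 1024

/* Simplified stand-in for the dirty-page list: page indices in list order. */
struct page_list {
	unsigned long index[MAX_PAGES];
	size_t len;
};

/*
 * Insert a page index in ascending order, always scanning from the first
 * entry. Returns the number of compare operations performed.
 */
static unsigned long insert_sorted(struct page_list *l, unsigned long idx)
{
	unsigned long compares = 0;
	size_t pos = 0, i;

	assert(l->len < MAX_PAGES);

	while (pos < l->len) {
		compares++;
		if (l->index[pos] > idx)
			break; /* found the first entry with a larger index */
		pos++;
	}

	/* Shift the tail to make room; a linked list would relink instead. */
	for (i = l->len; i > pos; i--)
		l->index[i] = l->index[i - 1];
	l->index[pos] = idx;
	l->len++;

	return compares;
}

/* Count the compares for one top-to-bottom update of npages pages. */
static unsigned long count_compares(size_t npages)
{
	static struct page_list l; /* static: too large for a small stack */
	unsigned long total = 0;
	size_t i;

	l.len = 0;
	for (i = 0; i < npages; i++)
		total += insert_sorted(&l, (unsigned long)i);
	return total;
}
```

count_compares(768) models the 1024x768x32bpp case, where each new page
has a larger index than every page already on the list and therefore
walks the whole list.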

Fix this by making the sorting step opt-in and updating the few drivers
that require it. All other drivers work with unsorted page lists. Pages
are appended to the list. Therefore, in the common case of writing the
framebuffer top to bottom, pages are still sorted by offset, which may
have a positive effect on performance.
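
The opt-in behaviour can be sketched as follows. Only the sort_pagelist
flag name comes from this patch; struct defio_model, mark_dirty() and
is_sorted() are hypothetical, and the array stands in for the kernel's
linked list. The sketch is meant to show the control flow, not the actual
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DIRTY 1024

/* Hypothetical, simplified model of a driver's deferred-I/O state. */
struct defio_model {
	bool sort_pagelist;            /* opt-in: keep pages sorted by index */
	unsigned long pages[MAX_DIRTY]; /* dirty page indices in list order */
	size_t npages;
};

/* Record a dirty page: sorted insert if requested, FIFO append otherwise. */
static void mark_dirty(struct defio_model *d, unsigned long idx)
{
	size_t pos = d->npages; /* default: append at the tail */
	size_t i;

	assert(d->npages < MAX_DIRTY);

	if (d->sort_pagelist) {
		/* Opt-in path: scan from the head for the insert position. */
		for (pos = 0; pos < d->npages; pos++)
			if (d->pages[pos] > idx)
				break;
	}

	/* No-op for the append case, since pos == npages. */
	for (i = d->npages; i > pos; i--)
		d->pages[i] = d->pages[i - 1];
	d->pages[pos] = idx;
	d->npages++;
}

/* Report whether the list happens to be in ascending index order. */
static bool is_sorted(const struct defio_model *d)
{
	size_t i;

	for (i = 1; i < d->npages; i++)
		if (d->pages[i - 1] > d->pages[i])
			return false;
	return true;
}
```

With sort_pagelist left false, marking pages 0, 1, 2, ... in order still
produces a sorted list without any compares, which is the common
top-to-bottom case described above. Out-of-order writes stay in arrival
order unless the driver sets the flag.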

Playing a video [1] in mplayer's benchmark mode shows the difference
(i7-4790, FullHD, simpledrm, kernel with debugging).

  mplayer -benchmark -nosound -vo fbdev ./big_buck_bunny_720p_stereo.ogg

With sorted page lists:

  BENCHMARKs: VC:  32.960s VO:  73.068s A:   0.000s Sys:   2.413s =  108.441s
  BENCHMARK%: VC: 30.3947% VO: 67.3802% A:  0.0000% Sys:  2.2251% = 100.0000%

With unsorted page lists:

  BENCHMARKs: VC:  31.005s VO:  42.889s A:   0.000s Sys:   2.256s =   76.150s
  BENCHMARK%: VC: 40.7156% VO: 56.3219% A:  0.0000% Sys:  2.9625% = 100.0000%

VC shows the overhead of video decoding, VO shows the overhead of the
video output. Using unsorted page lists reduces the benchmark's run time
by ~32s/~30%.

v2:
	* Make sorted pagelists the special case (Sam)
	* Comment on drivers' use of pagelist (Sam)
	* Warn about the overhead in comment

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_720p_stereo.ogg # [1]
Link: https://patchwork.freedesktop.org/patch/msgid/20220211094640.21632-3-tzimmermann@suse.de
2022-02-16 16:41:45 +01:00
..
android
axis-fifo staging: axis-fifo: Use platform_get_irq() to get the interrupt 2021-12-30 11:54:56 +01:00
board
clocking-wizard
emxx_udc
fbtft fbdev: Don't sort deferred-I/O pages by default 2022-02-16 16:41:45 +01:00
fieldbus staging: fieldbus: anybus: reframe comment to avoid warning 2021-10-30 11:12:17 +02:00
fwserial
gdm724x staging: use eth_hw_addr_set() in orphan drivers 2021-10-20 19:33:59 +02:00
greybus staging: greybus: audio: Check null pointer 2022-01-06 14:46:11 +01:00
gs_fpgaboot
iio staging: iio: ad9832: convert probe to device-managed 2021-10-17 11:05:54 +01:00
ks7010 staging: use eth_hw_addr_set() for dev->addr_len cases 2021-10-20 19:33:58 +02:00
media media updates for v5.17-rc1 2022-01-12 10:43:08 -08:00
most staging: most: dim2: use consistent routine naming 2021-12-28 17:10:13 +01:00
mt7621-dts staging: mt7621-dts: remove 'gdma' and 'hsdma' nodes 2021-11-15 10:29:12 +01:00
nvec
octeon Staging driver update for 5.16-rc1 2021-11-04 07:56:22 -07:00
octeon-usb
olpc_dcon
pi433 staging: pi433: add comment to rx_lock mutex definition 2022-01-06 15:00:29 +01:00
qlge Staging driver update for 5.16-rc1 2021-11-04 07:56:22 -07:00
r8188eu staging: r8188eu: rename camelcase variable uintPeerChannel 2022-01-06 15:15:04 +01:00
rtl8192e Staging drivers update for 5.17-rc1 2022-01-12 11:18:49 -08:00
rtl8192u staging: rtl8192u: remove some repeated words in some comments 2021-12-20 17:47:22 +01:00
rtl8712 staging: rtl8712: Fix alignment checks with flipped condition 2021-12-09 08:57:22 +01:00
rtl8723bs staging: rtl8723bs: removed unused if blocks 2021-12-28 17:08:07 +01:00
rts5208 exit: Rename complete_and_exit to kthread_complete_and_exit 2021-12-13 12:04:45 -06:00
sm750fb
unisys staging: unisys: visornic: removed a blank line at the end of function 2021-11-25 17:38:53 +01:00
vc04_services staging: vc04_services: rename BM2835 to BCM2835 in headers comments 2022-01-06 14:49:34 +01:00
vme
vt6655 staging: vt6655: drop off byRxMode var in device.h 2021-12-28 17:10:47 +01:00
vt6656 staging: vt6656: Remove filenames in files 2021-08-28 08:33:33 +02:00
wfx staging: wfx: sta: Fix 'else' coding style warning 2021-09-17 16:23:42 +02:00
wlan-ng staging: wlan-ng: Removed unused comments 2021-11-15 10:02:05 +01:00
Kconfig Merge 5.16-rc3 into staging-next 2021-11-29 08:03:05 +01:00
Makefile Merge 5.16-rc3 into staging-next 2021-11-29 08:03:05 +01:00