linux/include
Shaohua Li 2a8f944934 swap: change block allocation algorithm for SSD
I'm using a fast SSD to do swap.  scan_swap_map() sometimes uses up to
20~30% CPU time (when cluster is hard to find, the CPU time can be up to
80%), which becomes a bottleneck.  scan_swap_map() scans a byte array to
search a 256 page cluster, which is very slow.

Here I introduced a simple algorithm to search cluster.  Since we only
care about 256 pages cluster, we can just use a counter to track if a
cluster is free.  Every 256 pages use one int to store the counter.  If
the counter of a cluster is 0, the cluster is free.  All free clusters
will be added to a list, so searching cluster is very efficient.  With
this, scap_swap_map() overhead disappears.

This might help low end SD card swap too.  Because if the cluster is
aligned, SD firmware can do flash erase more efficiently.

We only enable the algorithm for SSD.  Hard disk swap isn't fast enough
and has downside with the algorithm which might introduce regression (see
below).

The patch slightly changes which cluster is choosen.  It always adds free
cluster to list tail.  This can help wear leveling for low end SSD too.
And if no cluster found, the scan_swap_map() will do search from the end
of last cluster.  So if no cluster found, the scan_swap_map() will do
search from the end of last free cluster, which is random.  For SSD, this
isn't a problem at all.

Another downside is the cluster must be aligned to 256 pages, which will
reduce the chance to find a cluster.  I would expect this isn't a big
problem for SSD because of the non-seek penality.  (And this is the reason
I only enable the algorithm for SSD).

Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:15 -07:00
..
acpi Merge branch 'acpi-hotplug' 2013-08-30 14:14:25 +02:00
asm-generic Merge branch 'for-v3.12' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping 2013-09-09 10:26:33 -07:00
clocksource ARM: SoC cleanups for 3.12 2013-09-06 13:21:16 -07:00
crypto crypto: scatterwalk - Add support for calculating number of SG elements 2013-08-21 21:27:58 +10:00
drm Merge tag 'drm-intel-fixes-2013-09-06' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes 2013-09-10 12:36:55 +10:00
dt-bindings Device tree core updates for v3.12 2013-09-10 13:53:52 -07:00
keys
kvm ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256 2013-08-30 16:12:39 +03:00
linux swap: change block allocation algorithm for SSD 2013-09-11 15:57:15 -07:00
math-emu
media Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-09-05 14:54:29 -07:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-09-05 14:54:29 -07:00
pcmcia
ras
rdma Merge branches 'cxgb4', 'flowsteer', 'ipoib', 'iser', 'mlx4', 'ocrdma' and 'qib' into for-next 2013-09-03 09:01:08 -07:00
rxrpc
scsi [SCSI] Generate uevents on certain unit attention codes 2013-08-26 18:52:27 +04:00
sound Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2013-09-05 11:55:59 -07:00
target
trace NFS client updates for Linux 3.12 2013-09-09 09:19:15 -07:00
uapi Add the ability to collect I/O statistics on user-defined regions of a 2013-09-10 13:06:15 -07:00
video fbdev changes for 3.12: 2013-09-05 09:49:32 -07:00
xen Features: 2013-09-04 17:45:39 -07:00
Kbuild