linux/Documentation
Aaron Lu a2468cc9bf swap: choose swap device according to numa node
If the system has more than one swap device and swap device has the node
information, we can make use of this information to decide which swap
device to use in get_swap_pages() to get better performance.

The current code uses a priority based list, swap_avail_list, to decide
which swap device to use and if multiple swap devices share the same
priority, they are used round robin.  This patch changes the previous
single global swap_avail_list into a per-numa-node list, i.e.  for each
numa node, it sees its own priority based list of available swap
devices.  Swap device's priority can be promoted on its matching node's
swap_avail_list.

The current swap device's priority is set as: user can set a >=0 value,
or the system will pick one starting from -1 then downwards.  The
priority value in the swap_avail_list is the negated value of the swap
device's due to plist being sorted from low to high.  The new policy
doesn't change the semantics for priority >=0 cases, the previous
starting from -1 then downwards now becomes starting from -2 then
downwards and -1 is reserved as the promoted value.

Take 4-node EX machine as an example, suppose 4 swap devices are
available, each sit on a different node:
swapA on node 0
swapB on node 1
swapC on node 2
swapD on node 3

After they are all swapped on in the sequence of ABCD.

Current behaviour:
their priorities will be:
swapA: -1
swapB: -2
swapC: -3
swapD: -4
And their position in the global swap_avail_list will be:
swapA   -> swapB   -> swapC   -> swapD
prio:1     prio:2     prio:3     prio:4

New behaviour:
their priorities will be(note that -1 is skipped):
swapA: -2
swapB: -3
swapC: -4
swapD: -5
And their positions in the 4 swap_avail_lists[nid] will be:
swap_avail_lists[0]: /* node 0's available swap device list */
swapA   -> swapB   -> swapC   -> swapD
prio:1     prio:3     prio:4     prio:5
swap_avali_lists[1]: /* node 1's available swap device list */
swapB   -> swapA   -> swapC   -> swapD
prio:1     prio:2     prio:4     prio:5
swap_avail_lists[2]: /* node 2's available swap device list */
swapC   -> swapA   -> swapB   -> swapD
prio:1     prio:2     prio:3     prio:5
swap_avail_lists[3]: /* node 3's available swap device list */
swapD   -> swapA   -> swapB   -> swapC
prio:1     prio:2     prio:3     prio:4

To see the effect of the patch, a test that starts N process, each mmap
a region of anonymous memory and then continually write to it at random
position to trigger both swap in and out is used.

On a 2 node Skylake EP machine with 64GiB memory, two 170GB SSD drives
are used as swap devices with each attached to a different node, the
result is:

runtime=30m/processes=32/total test size=128G/each process mmap region=4G
kernel         throughput
vanilla        13306
auto-binding   15169 +14%

runtime=30m/processes=64/total test size=128G/each process mmap region=2G
kernel         throughput
vanilla        11885
auto-binding   14879 +25%

[aaron.lu@intel.com: v2]
  Link: http://lkml.kernel.org/r/20170814053130.GD2369@aaronlu.sh.intel.com
  Link: http://lkml.kernel.org/r/20170816024439.GA10925@aaronlu.sh.intel.com
[akpm@linux-foundation.org: use kmalloc_array()]
Link: http://lkml.kernel.org/r/20170814053130.GD2369@aaronlu.sh.intel.com
Link: http://lkml.kernel.org/r/20170816024439.GA10925@aaronlu.sh.intel.com
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:30 -07:00
..
ABI mm, swap: add sysfs interface for VMA based swap readahead 2017-09-06 17:27:29 -07:00
accounting
acpi This is the bulk of GPIO changes for the v4.13 series: 2017-07-07 12:40:27 -07:00
admin-guide mm, page_alloc: rip out ZONELIST_ORDER_ZONE 2017-09-06 17:27:25 -07:00
aoe
arm Documentation: arm: Replace use of virt_to_phys with __pa_symbol 2017-07-17 13:43:58 -06:00
arm64 arm64: Expose DC CVAP to userspace 2017-08-09 11:00:35 +01:00
auxdisplay
backlight
blackfin
block bio-integrity: fold bio_integrity_enabled to bio_integrity_prep 2017-07-03 16:56:24 -06:00
blockdev zram: add config and doc file for writeback feature 2017-09-06 17:27:25 -07:00
bus-devices
cdrom cdrom: Make device operations read-only 2017-02-14 08:29:56 -07:00
cgroup-v1 mm, vmpressure: pass-through notification support 2017-07-10 16:32:31 -07:00
cma
connector
console
core-api Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 08:13:52 -07:00
cpu-freq cpufreq: intel_pstate: Document the current behavior and user interface 2017-05-14 02:06:03 +02:00
cpuidle
cris
crypto KEYS: Add documentation for asymmetric keyring restrictions 2017-07-14 11:01:38 +10:00
dev-tools docs: disable KASLR when debugging kernel 2017-07-17 14:49:01 -06:00
device-mapper dm raid: bump target version 2017-07-25 14:54:20 -04:00
devicetree Power management updates for v4.14-rc1 2017-09-05 12:19:08 -07:00
dmaengine
doc-guide sphinx.rst: Allow Sphinx version 1.6 at the docs 2017-08-26 15:50:27 -06:00
driver-api This is the bulk of the GPIO changes for the v4.14 cycle: 2017-09-05 11:49:48 -07:00
driver-model genirq/irq_sim: Add a devres variant of irq_sim_init() 2017-08-16 16:40:02 +02:00
early-userspace Documentation: Fix dead URLs to ftp.kernel.org 2017-03-29 15:46:06 -06:00
EDID drm: use .hword to represent 16-bit numbers 2017-03-30 10:15:19 +02:00
extcon extcon: Remove porting compatibility of swich class 2017-04-06 10:55:24 +09:00
fault-injection fault-inject: add /proc/<pid>/fail-nth 2017-07-14 15:05:13 -07:00
fb efifb: allow user to disable write combined mapping. 2017-07-31 18:45:41 +02:00
features docs/features: parisc implements tracehook 2017-08-07 14:18:40 -06:00
filesystems fscache: remove unused ->now_uncached callback 2017-09-06 17:27:26 -07:00
firmware_class firmware: revamp firmware documentation 2017-01-11 09:42:59 +01:00
fmc
fpga fpga: Add scatterlist based programming 2017-02-10 15:20:44 +01:00
frv
gpio pinctrl: generic: update references to Documentation/pinctrl.txt 2017-08-07 15:26:34 +02:00
gpu Merge tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel into drm-next 2017-08-22 10:03:07 +10:00
hid Documentation: hid: fix path to input bus definitions 2017-03-13 17:15:19 -06:00
hwmon hwmon: (pmbus/lm25066) Add support for TI LM5066I 2017-08-30 06:31:13 -07:00
i2c i2c: i801: Add support for Intel Cannon Lake 2017-06-19 16:17:41 +02:00
ia64
ide
iio iio: adc: New driver for Cirrus Logic EP93xx ADC 2017-07-25 19:56:23 +01:00
infiniband Documentation: Hardware tag matching 2017-08-29 08:30:21 -04:00
input Documentation:input: fix typo 2017-08-30 15:18:24 -06:00
ioctl scsi: cxlflash: Introduce host ioctl support 2017-06-26 15:01:11 -04:00
isdn
kbuild kbuild: Update example for ccflags-y usage 2017-08-07 14:24:28 -06:00
kdump kexec/kdump: minor Documentation updates for arm64 and Image 2017-07-12 16:26:00 -07:00
kernel-hacking There has been a fair amount of activity in the docs tree this time 2017-07-03 21:13:25 -07:00
laptops
leds Documentaion: leds: leds-lp55xx.txt: Fix typos 2017-03-17 13:06:14 -06:00
lightnvm lightnvm: physical block device (pblk) target 2017-04-16 10:06:33 -06:00
livepatch livepatch: allow removal of a disabled patch 2017-03-08 09:38:43 +01:00
locking Merge branch 'linus' into locking/core, to fix up conflicts 2017-09-04 11:01:18 +02:00
m68k
md Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-05-03 10:05:38 -07:00
media Merge branch 'docs-next' of git://git.lwn.net/linux 2017-09-03 21:07:29 -07:00
memory-devices
metag
mic
mips
misc-devices Documentation: misc-devices: Add Documentation for pci-endpoint-test driver 2017-04-28 10:23:19 -05:00
mmc MMC core: 2017-05-02 17:34:32 -07:00
mn10300
mtd spi-nor: Add support for Intel SPI serial flash controller 2017-01-03 17:33:36 +00:00
namespaces
netlabel
networking Merge branch 'docs-next' of git://git.lwn.net/linux 2017-09-03 21:07:29 -07:00
nfc
nios2
nvdimm
nvmem NVMEM documentation fix: A minor typo 2017-08-24 13:31:58 -06:00
parisc
PCI docs: update old references for DocBook from the documentation 2017-05-16 08:44:19 -03:00
pcmcia
perf perf: qcom: Add L3 cache PMU driver 2017-04-03 18:53:50 +01:00
phy
platform
power Merge branch 'pm-sleep' 2017-09-04 00:06:02 +02:00
powerpc powerpc updates for 4.13 2017-07-07 13:55:45 -07:00
pps Doc: clarify source of jitter in USB1.1, and USB2.0 2017-01-04 14:40:52 -07:00
process docs: process: drop git snapshots from applying-patches.rst 2017-08-30 15:25:30 -06:00
pti
ptp
rapidio
RCU doc: Set down RCU's scheduling-clock-interrupt needs 2017-08-17 07:31:14 -07:00
s390 docs: add documentation for vfio-ccw 2017-03-31 12:55:11 +02:00
scheduler sched/deadline: Add documentation about GRUB reclaiming 2017-06-08 10:31:56 +02:00
scsi scsi: make asynchronous aborts mandatory 2017-04-06 13:07:33 -04:00
security docs: ReSTify table of contents in core.rst 2017-08-30 15:27:58 -06:00
serial tty: n_gsm: do not send/receive in ldisc close path 2017-06-03 18:48:52 +09:00
sh docs-rst: convert sh book to ReST 2017-05-16 08:44:18 -03:00
sound sound updates for 4.13-rc1 2017-07-06 10:56:51 -07:00
sparc Documentation/sparc: Steps for sending break on sunhv console 2017-02-23 08:27:25 -08:00
sphinx Documentation/sphinx: fix kernel-doc decode for non-utf-8 locale 2017-08-31 13:36:28 -06:00
sphinx-static docs RTD theme: code-block with line nos - lines and line numbers don't line up. 2017-07-17 13:48:45 -06:00
spi spi: Document SPI slave controller support 2017-05-26 13:11:00 +01:00
sysctl mm, page_alloc: rip out ZONELIST_ORDER_ZONE 2017-09-06 17:27:25 -07:00
target Documentation/target: add an example script to configure an iSCSI target 2017-05-01 22:21:35 -07:00
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2017-05-12 11:58:45 -07:00
timers rcu: Eliminate NOCBs CPU-state Kconfig options 2017-06-08 18:52:43 -07:00
trace stm class: Document the stm_ftrace 2017-08-25 17:58:34 +03:00
translations Merge branch 'linus' into locking/core, to fix up conflicts 2017-09-04 11:01:18 +02:00
usb usb: gadget: add f_uac1 variant based on a new u_audio api 2017-06-19 09:22:47 +03:00
userspace-api doc: ReSTify no_new_privs.txt 2017-05-18 10:30:09 -06:00
virtual kvm: x86: hyperv: make VP_INDEX managed by userspace 2017-07-14 16:28:18 +02:00
vm swap: choose swap device according to numa node 2017-09-06 17:27:30 -07:00
w1 w1: add documentation for w1_ds2438 2017-03-17 15:10:49 +09:00
watchdog watchdog: uniphier: add UniPhier watchdog driver 2017-07-03 13:58:55 +02:00
wimax
x86 Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 13:56:37 -07:00
xtensa of: update ePAPR references to point to Devicetree Specification 2017-06-22 11:22:06 -05:00
.gitignore
00-INDEX linux-kselftest-4.13-rc1-update 2017-07-07 14:04:47 -07:00
atomic_bitops.txt Documentation/locking/atomic: Add documents for new atomic_t APIs 2017-08-10 12:29:00 +02:00
atomic_t.txt Documentation/locking/atomic: Finish the document... 2017-08-25 11:06:33 +02:00
bcache.txt bcache.txt: standardize document format 2017-07-14 13:51:27 -06:00
bt8xxgpio.txt bt8xxgpio.txt: standardize document format 2017-07-14 13:51:27 -06:00
btmrvl.txt btmrvl.txt: standardize document format 2017-07-14 13:51:27 -06:00
bus-virt-phys-mapping.txt bus-virt-phys-mapping.txt: standardize document format 2017-07-14 13:51:28 -06:00
cachetlb.txt cachetlb.txt: standardize document format 2017-07-14 13:51:28 -06:00
cgroup-v2.txt cgroup-v2.txt: standardize document format 2017-07-14 13:58:13 -06:00
Changes docs: add back 'Documentation/Changes' file (as symlink) 2016-12-14 16:30:12 -08:00
circular-buffers.txt circular-buffers.txt: standardize document format 2017-07-14 13:51:29 -06:00
clk.txt clk.txt: standardize document format 2017-07-14 13:51:29 -06:00
CodingStyle
conf.py docs-rst: fix verbatim font size on tables 2017-08-26 15:50:20 -06:00
cpu-load.txt cpu-load: standardize document format 2017-07-14 13:51:30 -06:00
cputopology.txt cputopology.txt: standardize document format 2017-07-14 13:51:30 -06:00
crc32.txt crc32.txt: standardize document format 2017-07-14 13:51:30 -06:00
dcdbas.txt dcdbas.txt: standardize document format 2017-07-14 13:51:31 -06:00
debugging-modules.txt
debugging-via-ohci1394.txt debugging-via-ohci1394.txt: standardize document format 2017-07-14 13:51:34 -06:00
dell_rbu.txt dell_rbu.txt: standardize document format 2017-07-14 13:58:12 -06:00
digsig.txt digsig.txt: standardize document format 2017-07-14 13:51:31 -06:00
DMA-API-HOWTO.txt DMA-API-HOWTO.txt: standardize document format 2017-07-14 13:51:32 -06:00
DMA-API.txt DMA-API.txt: standardize document format 2017-07-14 13:51:32 -06:00
DMA-attributes.txt DMA-attributes.txt: standardize document format 2017-07-14 13:51:33 -06:00
DMA-ISA-LPC.txt DMA-ISA-LPC.txt: standardize document format 2017-07-14 13:51:33 -06:00
docutils.conf
dontdiff GCC plugin updates: 2017-07-05 11:46:59 -07:00
efi-stub.txt efi-stub.txt: standardize document format 2017-07-14 13:51:34 -06:00
eisa.txt eisa.txt: standardize document format 2017-07-14 13:51:34 -06:00
flexible-arrays.txt flexible-arrays.txt: standardize document format 2017-07-14 13:51:35 -06:00
futex-requeue-pi.txt futex-requeue-pi.txt: standardize document format 2017-07-14 13:51:35 -06:00
gcc-plugins.txt gcc-plugins.txt: standardize document format 2017-07-14 13:51:36 -06:00
highuid.txt highuid.txt: standardize document format 2017-07-14 13:51:36 -06:00
hw_random.txt hw_random.txt: standardize document format 2017-07-14 13:51:37 -06:00
hwspinlock.txt hwspinlock.txt: standardize document format 2017-07-14 13:51:37 -06:00
index.rst Make the main documentation title less Geocities 2017-06-23 14:02:27 -06:00
intel_txt.txt intel_txt.txt: standardize document format 2017-07-14 13:51:38 -06:00
Intel-IOMMU.txt Intel-IOMMU.txt: standardize document format 2017-07-14 13:51:38 -06:00
io_ordering.txt io_ordering.txt: standardize document format 2017-07-14 13:51:39 -06:00
io-mapping.txt io-mapping.txt: standardize document format 2017-07-14 13:51:38 -06:00
iostats.txt iostats.txt: update it to cover recent Kernels 2017-07-14 13:51:40 -06:00
IPMI.txt IPMI.txt: standardize document format 2017-07-14 13:51:40 -06:00
IRQ-affinity.txt IRQ-affinity.txt: standardize document format 2017-07-14 13:51:41 -06:00
IRQ-domain.txt IRQ-domain.txt: standardize document format 2017-07-14 13:51:41 -06:00
IRQ.txt IRQ.txt: add a markup for its title 2017-07-14 13:51:42 -06:00
irqflags-tracing.txt irqflags-tracing.txt: standardize document format 2017-07-14 13:51:42 -06:00
isa.txt isa.txt: standardize document format 2017-07-14 13:51:43 -06:00
isapnp.txt isapnp.txt: promote title level 2017-07-14 13:51:43 -06:00
kernel-doc-nano-HOWTO.txt docs: update old references for DocBook from the documentation 2017-05-16 08:44:19 -03:00
kernel-per-CPU-kthreads.txt kernel-per-CPU-kthreads.txt: standardize document format 2017-07-14 13:51:43 -06:00
kobject.txt kobject.txt: standardize document format 2017-07-14 13:51:44 -06:00
kprobes.txt docs: kprobes.txt: Fix whitespacing 2017-07-14 13:58:14 -06:00
kref.txt kref.txt: standardize document format 2017-07-14 13:51:45 -06:00
ldm.txt ldm.txt: standardize document format 2017-07-14 13:51:45 -06:00
lockup-watchdogs.txt lockup-watchdogs.txt: standardize document format 2017-07-14 13:51:46 -06:00
logo.gif
logo.txt
lsm.txt docs-rst: convert lsm from DocBook to ReST 2017-05-16 08:44:19 -03:00
lzo.txt lzo.txt: standardize document format 2017-07-14 13:51:46 -06:00
mailbox.txt mailbox.txt: standardize document format 2017-07-14 13:51:47 -06:00
Makefile doc: Makefile: if sphinx is not found, run a check script 2017-08-24 13:18:30 -06:00
memory-barriers.txt Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 11:52:29 -07:00
memory-hotplug.txt memory-hotplug.txt: standardize document format 2017-07-14 13:57:53 -06:00
men-chameleon-bus.txt men-chameleon-bus.txt: standardize document format 2017-07-14 13:57:54 -06:00
nommu-mmap.txt nommu-mmap.txt: don't use all upper case on titles 2017-07-14 13:57:55 -06:00
ntb.txt This series converts a number of top-level documents to the RST format 2017-07-15 12:58:58 -07:00
numastat.txt numastat.txt: standardize document format 2017-07-14 13:57:56 -06:00
padata.txt padata.txt: standardize document format 2017-07-14 13:57:56 -06:00
parport-lowlevel.txt parport-lowlevel.txt: standardize document format 2017-07-14 13:57:57 -06:00
percpu-rw-semaphore.txt percpu-rw-semaphore.txt: standardize document format 2017-07-14 13:57:58 -06:00
phy.txt phy.txt: standardize document format 2017-07-14 13:57:58 -06:00
pi-futex.txt pi-futex.txt: standardize document format 2017-07-14 13:57:59 -06:00
pnp.txt pnp.txt: standardize document format 2017-07-14 13:57:59 -06:00
preempt-locking.txt preempt-locking.txt: standardize document format 2017-07-14 13:58:00 -06:00
printk-formats.txt printk-formats.txt: Add examples for %pF and %pS usage 2017-08-24 18:49:52 +02:00
pwm.txt pwm: Standardize document format 2017-07-06 08:23:30 +02:00
rbtree.txt rbtree.txt: standardize document format 2017-07-14 13:58:01 -06:00
remoteproc.txt remoteproc.txt: standardize document format 2017-07-14 13:58:02 -06:00
rfkill.txt rfkill.txt: standardize document format 2017-07-14 13:58:02 -06:00
robust-futex-ABI.txt robust-futex-ABI.txt: standardize document format 2017-07-14 13:58:03 -06:00
robust-futexes.txt robust-futexes.txt: standardize document format 2017-07-14 13:58:03 -06:00
rpmsg.txt rpmsg.txt: standardize document format 2017-07-14 13:58:04 -06:00
rtc.txt rtc: add generic nvmem support 2017-07-07 13:14:14 +02:00
SAK.txt SAK.txt: standardize document format 2017-07-14 13:58:04 -06:00
sgi-ioc4.txt sgi-ioc4.txt: standardize document format 2017-07-14 13:58:05 -06:00
siphash.txt siphash.txt: standardize document format 2017-07-14 13:58:06 -06:00
SM501.txt SM501.txt: standardize document format 2017-07-14 13:58:06 -06:00
smsc_ece1099.txt smsc_ece1099.txt: standardize document format 2017-07-14 13:58:07 -06:00
static-keys.txt jump_label: Provide hotplug context variants 2017-08-10 12:28:59 +02:00
SubmittingPatches
svga.txt svga.txt: standardize document format 2017-07-14 13:58:08 -06:00
switchtec.txt switchtec: Add IOCTLs to the Switchtec driver 2017-04-12 12:23:37 -05:00
sync_file.txt sync_file.txt: standardize document format 2017-05-24 13:01:27 -03:00
tee.txt tee.txt: standardize document format 2017-07-14 13:58:14 -06:00
this_cpu_ops.txt this_cpu_ops.txt: standardize document format 2017-07-14 13:58:08 -06:00
unaligned-memory-access.txt unaligned-memory-access.txt: standardize document format 2017-07-14 13:58:09 -06:00
vfio-mediated-device.txt vfio-mediated-device.txt: standardize document format 2017-07-14 13:58:10 -06:00
vfio.txt vfio.txt: standardize document format 2017-07-14 13:58:10 -06:00
video-output.txt
xillybus.txt xillybus.txt: standardize document format 2017-07-14 13:58:11 -06:00
xz.txt xz.txt: standardize document format 2017-07-14 13:58:11 -06:00
zorro.txt zorro.txt: standardize document format 2017-07-14 13:58:12 -06:00