linux/Documentation
Roman Gushchin cf11e85fc0 mm: hugetlb: optionally allocate gigantic hugepages using cma
Commit 944d9fec8d ("hugetlb: add support for gigantic page allocation
at runtime") has added the run-time allocation of gigantic pages.

However it actually works only at early stages of the system loading,
when the majority of memory is free.  After some time the memory gets
fragmented by non-movable pages, so the chances to find a contiguous 1GB
block are getting close to zero.  Even dropping caches manually doesn't
help a lot.

At large scale rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex.  At the same time keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.

The following solution can solve the problem:
1) On boot time a dedicated cma area* is reserved. The size is passed
   as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
   cma allocator and the dedicated cma area

In this case gigantic hugepages can be allocated successfully with a
high probability, however the memory isn't completely wasted if nobody
is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.

* On a multi-node machine a per-node cma area is allocated on each node.
  Following gigantic hugetlb allocation are using the first available
  numa node if the mask isn't specified by a user.

Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
   pass hugetlb_cma=10G as a kernel argument

2) allocate hugetlb pages as usual, e.g.
   echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.

x86 and arm-64 are covered by this patch, other architectures can be
trivially added later.

The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap.  It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
..
ABI f2fs-for-5.7-rc1 2020-04-07 13:48:26 -07:00
accounting doc: cgroup: improve formatting of references 2020-03-02 12:57:03 -07:00
admin-guide mm: hugetlb: optionally allocate gigantic hugepages using cma 2020-04-10 15:36:21 -07:00
arm docs: arm: tcm: Fix a few typos 2020-02-19 02:42:21 -07:00
arm64 arm64 updates for 5.7: 2020-03-31 10:05:01 -07:00
block block: Document genhd capability flags 2020-03-12 07:47:22 -06:00
bpf bpf: lsm: Add Documentation 2020-03-30 01:35:12 +02:00
cdrom
core-api mm: add pagemap.h to the fine documentation 2020-04-02 09:35:29 -07:00
cpu-freq docs: cpu-freq: convert cpufreq-stats.txt to ReST 2020-03-06 00:01:02 +01:00
crypto crypto: algapi - make unregistration functions return void 2019-12-20 14:58:35 +08:00
dev-tools linux-kselftest-kunit-5.7-rc1 2020-04-01 16:11:40 -07:00
devicetree linux-watchdog 5.7-rc1 tag 2020-04-08 21:29:10 -07:00
doc-guide Documentation: build warnings related to missing blank lines after explicit markups has been fixed 2020-02-05 10:30:03 -07:00
driver-api - Convert tsens configuration DT binding to yaml (Rajeshwari) 2020-04-07 20:00:16 -07:00
fault-injection
fb fbdev: fbmem: allow overriding the number of bootup logos 2020-01-03 14:27:40 +01:00
features documentation: vm: Advertise support for pte_special in riscv 2020-02-19 02:41:01 -07:00
filesystems 9p pull request for inclusion in 5.7 (take 2) 2020-04-08 21:51:14 -07:00
firmware_class
firmware-guide docs: firmware-guide: ACPI: Replace dma_request_slave_channel() with dma_request_chan() 2019-12-19 22:58:12 +01:00
fpga Documentation: fpga: dfl: add descriptions for thermal/power management interfaces 2019-10-16 19:18:26 -07:00
gpu UAPI Changes: 2020-03-19 10:40:27 +10:00
hid
hwmon docs: hwmon: Update documentation for isl68137 pmbus driver 2020-03-22 16:42:54 -07:00
i2c i2c: convert SMBus alert setup function to return an ERRPTR 2020-03-10 12:19:52 +01:00
ia64
ide
iio
infiniband Documentation/infiniband: update name of some functions 2019-09-13 16:55:55 -03:00
input
isdn isdn: capi: dead code removal 2019-12-11 09:12:38 +01:00
kbuild Kbuild updates for v5.7 2020-03-31 16:03:39 -07:00
kernel-hacking docs: locking: Drop :c:func: throughout 2020-03-20 17:16:24 -06:00
leds
livepatch livepatch: Documentation of the new API for tracking system state changes 2019-11-01 13:08:24 +01:00
locking Documentation/locking/locktypes: Minor copy editor fixes 2020-03-28 12:47:34 +01:00
m68k
maintainer Add a maintainer entry profile for documentation 2020-01-24 09:48:39 -07:00
media media updates for v5.7-rc1 2020-03-30 13:42:05 -07:00
mhi docs: Add documentation for MHI bus 2020-03-19 07:41:04 +01:00
mips docs: mips: remove no longer needed au1xxx_ide.rst documentation 2020-03-24 15:53:48 +01:00
misc-devices Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-04-01 14:47:40 -07:00
netlabel
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-03-31 17:29:33 -07:00
nios2
nvdimm docs: nvdimm: use ReST notation for subsection 2020-01-24 09:54:42 -07:00
openrisc openrisc: configs: Cleanup CONFIG_CROSS_COMPILE 2020-01-31 22:10:58 +09:00
parisc
PCI pci-v5.7-changes 2020-04-03 14:25:02 -07:00
pcmcia
power Merge branches 'pm-core', 'pm-sleep', 'pm-acpi' and 'pm-domains' 2020-03-30 14:46:58 +02:00
powerpc powerpc updates for 5.7 2020-04-05 11:12:59 -07:00
process Char/Misc driver patches for 5.7-rc1 2020-04-03 13:22:40 -07:00
RCU doc: Add rcutorture scripting to torture.txt 2020-02-27 07:03:14 -08:00
riscv It has been a relatively quiet cycle for documentation, but there's still a 2020-01-29 15:27:31 -08:00
s390
scheduler Documentation/scheduler: fix links in sched-stats 2019-10-29 04:35:41 -06:00
scsi SCSI misc on 20200402 2020-04-02 17:03:53 -07:00
security docs: prevent warnings due to autosectionlabel 2020-03-20 17:01:29 -06:00
sh
sound ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups 2020-03-31 10:54:06 +02:00
sparc
sphinx docs: Fix empty parallelism argument 2020-02-25 03:11:04 -07:00
sphinx-static doc-rst: Reduce CSS padding around Field 2019-10-02 10:03:06 -06:00
spi
target docs: prevent warnings due to autosectionlabel 2020-03-20 17:01:29 -06:00
timers
trace New tracing features: 2020-04-05 10:36:18 -07:00
translations media updates for v5.7-rc1 2020-03-30 13:42:05 -07:00
usb usb: gadget: add raw-gadget interface 2020-03-15 11:34:48 +02:00
userspace-api docs: userspace: ioctl-number: remove mc146818rtc conflict 2020-02-13 11:42:02 -07:00
virt ARM: 2020-04-02 15:13:15 -07:00
vm mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00
w1 docs: w1: Fix a typo in omap-hdq.rst 2019-12-30 11:58:02 -07:00
watchdog linux-watchdog 5.4-rc1 tag 2019-09-27 11:17:38 -07:00
x86 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-03-31 11:04:05 -07:00
xtensa
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
asm-annotations.rst Documentation: Call out example SYM_FUNC_* usage as x86-specific 2020-01-16 12:53:16 -07:00
atomic_bitops.txt
atomic_t.txt
bus-virt-phys-mapping.txt
Changes
CodingStyle
conf.py docs: conf.py: avoid thousands of duplicate label warning on Sphinx 2020-03-20 17:01:34 -06:00
COPYING-logo
crc32.txt
debugging-via-ohci1394.txt
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt dma-mapping: remove dma_release_declared_memory 2019-09-04 11:13:19 +02:00
DMA-attributes.txt dma-mapping: remove the DMA_ATTR_WRITE_BARRIER flag 2019-11-14 12:01:54 -04:00
DMA-ISA-LPC.txt
docutils.conf
dontdiff modpost: dump missing namespaces into a single modules.nsdeps file 2019-11-11 20:10:01 +09:00
futex-requeue-pi.txt
hwspinlock.txt
index.rst Char/Misc driver patches for 5.7-rc1 2020-04-03 13:22:40 -07:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
Kconfig
kprobes.txt
kref.txt docs: kref: Clarify the use of two kref_put() in example code 2020-02-25 03:39:10 -07:00
logo.gif
lzo.txt
mailbox.txt
Makefile Kbuild updates for v5.7 2020-03-31 16:03:39 -07:00
memory-barriers.txt Documentation/memory-barriers: Fix typos 2020-02-27 07:03:14 -08:00
nommu-mmap.txt
percpu-rw-semaphore.txt
pi-futex.txt
preempt-locking.txt
rbtree.txt
remoteproc.txt remoteproc: Add elf64 support in elf loader 2020-03-25 22:29:40 -07:00
robust-futex-ABI.txt threads: Update PID limit comment according to futex UAPI change 2020-03-21 17:48:13 +01:00
robust-futexes.txt
rpmsg.txt
speculation.txt
static-keys.txt
SubmittingPatches
tee.txt Documentation: tee: add AMD-TEE driver details 2020-01-04 13:49:51 +08:00
this_cpu_ops.txt
unaligned-memory-access.txt
xz.txt