linux/kernel/bpf
Martynas Pumputis f01a7dbe98 bpf: Try harder when allocating memory for large maps
It has been observed that sometimes a higher order memory allocation
for BPF maps fails when there is no obvious memory pressure in a system.
For example, the map (BPF_MAP_TYPE_LRU_HASH, key=38, value=56,
max_elems=524288) could not be created because vmalloc was unable to
allocate 75497472 bytes, when the system's memory consumption (in MB)
was the following:

    Total: 3942 Used: 837 (21.24%) Free: 138 Buffers: 239 Cached: 2727
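
The failed request size can be reproduced with a little arithmetic. The sketch below is illustrative only and rests on assumptions that are not stated in the commit message: that key and value sizes are each rounded up to 8 bytes, and that every element carries 48 bytes of bookkeeping on a 64-bit kernel. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of align (align must be a power of two). */
uint64_t round_up_pow2(uint64_t n, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/*
 * Estimate the element storage of a preallocated hash map.
 * elem_header is an ASSUMPTION (per-element bookkeeping in bytes);
 * 48 is the value that reproduces the figure quoted above.
 */
uint64_t htab_elems_bytes(uint32_t key_size, uint32_t value_size,
			  uint32_t max_elems, uint32_t elem_header)
{
	uint64_t elem_size = (uint64_t)elem_header +
			     round_up_pow2(key_size, 8) +
			     round_up_pow2(value_size, 8);

	return elem_size * max_elems;
}
```

Under these assumptions, htab_elems_bytes(38, 56, 524288, 48) gives 144 bytes per element and 75497472 bytes in total, which is exactly 72 MiB and far more than the 138 MB reported as free.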

Later analysis [1] by Michal Hocko showed that vmalloc was not trying
to reclaim memory from the page cache and was failing prematurely due to
__GFP_NORETRY.

Considering dcda9b0471 ("mm, tree wide: replace __GFP_REPEAT by
__GFP_RETRY_MAYFAIL with more useful semantic") and [1], we can replace
__GFP_NORETRY with __GFP_RETRY_MAYFAIL, as it won't invoke the OOM killer
and will try harder to fulfil allocation requests.

Unfortunately, replacing the body of the BPF map memory allocation
function with the kvmalloc_node helper function is not an option at
this point in time, given that 1) with kvmalloc_node the kmalloc attempt
is non-optional, even for higher order allocations, and 2) passing
__GFP_RETRY_MAYFAIL to kmalloc would stress the slab allocator too much
for large requests.
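
The resulting policy can be modelled in user space as follows. This is a sketch of the decision logic only, not the kernel code: the MODEL_* flag values are stand-ins rather than real gfp_t bits, and the size threshold (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER with 4 KiB pages and order 3) is an assumption about how the allocation function in syscall.c decides whether to try kmalloc at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins, NOT the kernel's real gfp_t bit values. */
enum model_gfp {
	MODEL_NORETRY       = 1 << 0, /* give up early, never OOM-kill */
	MODEL_RETRY_MAYFAIL = 1 << 1, /* try harder, may still fail, no OOM kill */
};

#define MODEL_PAGE_SIZE    4096UL /* assumed 4 KiB pages */
#define MODEL_COSTLY_ORDER 3      /* assumed PAGE_ALLOC_COSTLY_ORDER */

struct alloc_plan {
	bool try_kmalloc;   /* attempt the slab allocator first? */
	int kmalloc_flags;  /* retry policy for that attempt, 0 if skipped */
	int vmalloc_flags;  /* retry policy for vmalloc (fallback or only path) */
};

/* Decide which allocators a map of the given size would go through. */
struct alloc_plan plan_map_alloc(size_t size)
{
	struct alloc_plan plan;

	assert(size > 0);
	/* Small requests try kmalloc first and keep the cheap NORETRY policy. */
	plan.try_kmalloc = size <= (MODEL_PAGE_SIZE << MODEL_COSTLY_ORDER);
	plan.kmalloc_flags = plan.try_kmalloc ? MODEL_NORETRY : 0;
	/* vmalloc is where the retry policy changes to RETRY_MAYFAIL. */
	plan.vmalloc_flags = MODEL_RETRY_MAYFAIL;
	return plan;
}
```

In this model a 75497472-byte request skips kmalloc entirely and reaches vmalloc with the retry-may-fail policy, while a one-page request still tries the slab allocator first without retrying.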

The change has been tested with the workload mentioned above and by
observing the oom_kill value in /proc/vmstat.
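
One way to watch that counter programmatically is sketched below. It assumes /proc/vmstat uses one "name value" pair per line and that the running kernel exposes an oom_kill field; the helper names are hypothetical and are not part of the change.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find the "oom_kill <count>" line in vmstat-formatted text.
 * Returns 0 and stores the count on success, -1 if the field is absent.
 */
int parse_oom_kill(const char *vmstat_text, unsigned long long *count)
{
	const char *line = vmstat_text;

	assert(vmstat_text != NULL && count != NULL);
	while (*line != '\0') {
		if (sscanf(line, "oom_kill %llu", count) == 1)
			return 0;
		line = strchr(line, '\n');
		if (line == NULL)
			break;
		line++; /* step past the newline to the next record */
	}
	return -1;
}

/* Read /proc/vmstat and extract oom_kill; returns -1 on any failure. */
int read_oom_kill(unsigned long long *count)
{
	char buf[16384]; /* assumed large enough for the whole file */
	size_t n;
	FILE *f = fopen("/proc/vmstat", "r");

	if (f == NULL)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_oom_kill(buf, count);
}
```

Calling read_oom_kill before and after creating a large map and comparing the two counts shows whether the allocation triggered the OOM killer.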

[1]: https://lore.kernel.org/bpf/20190310071318.GW5232@dhcp22.suse.cz/

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20190318153940.GL8924@dhcp22.suse.cz/
2019-03-18 16:48:25 +01:00
arraymap.c bpf: introduce BPF_F_LOCK flag 2019-02-01 20:55:39 +01:00
bpf_lru_list.c bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
bpf_lru_list.h bpf: Only set node->ref = 1 if it has not been set 2017-09-01 09:57:39 -07:00
btf.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-02-08 15:00:17 -08:00
cgroup.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-02-08 15:00:17 -08:00
core.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-06 07:59:36 -08:00
cpumap.c bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't 2018-12-20 23:19:12 +01:00
devmap.c bpf: devmap: fix wrong interface selection in notifier_call 2018-10-26 00:32:21 +02:00
disasm.c bpf: disassembler support JMP32 2019-01-26 13:33:01 -08:00
disasm.h bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
hashtab.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-02-08 15:00:17 -08:00
helpers.c bpf: introduce BPF_F_LOCK flag 2019-02-01 20:55:39 +01:00
inode.c bpf: decouple btf from seq bpf fs dump and enable more maps 2018-08-13 00:52:45 +02:00
local_storage.c bpf: introduce BPF_F_LOCK flag 2019-02-01 20:55:39 +01:00
lpm_trie.c bpf, lpm: fix lookup bug in map_delete_elem 2019-02-22 16:17:53 +01:00
Makefile bpf: add queue and stack maps 2018-10-19 13:24:31 -07:00
map_in_map.c bpf: set inner_map_meta->spin_lock_off correctly 2019-02-27 17:03:13 -08:00
map_in_map.h bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
offload.c bpf: offload: add priv field for drivers 2019-02-12 17:07:09 +01:00
percpu_freelist.c bpf: fix lockdep false positive in percpu_freelist 2019-01-31 23:18:21 +01:00
percpu_freelist.h bpf: fix lockdep false positive in percpu_freelist 2019-01-31 23:18:21 +01:00
queue_stack_maps.c bpf: fix integer overflow in queue_stack_map 2018-11-22 21:29:40 +01:00
reuseport_array.c bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY 2018-08-11 01:58:46 +02:00
stackmap.c bpf: fix lockdep false positive in stackmap 2019-02-11 16:36:24 +01:00
syscall.c bpf: Try harder when allocating memory for large maps 2019-03-18 16:48:25 +01:00
tnum.c bpf/verifier: improve register value range tracking with ARSH 2018-04-29 08:45:53 -07:00
verifier.c bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release 2019-03-13 12:04:35 -07:00
xskmap.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-19 11:03:06 -07:00