linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 12:42:02 +00:00

Alexei Starovoitov d83525ca62 bpf: introduce bpf_spin_lock

Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let
bpf program serialize access to other variables.

Example:
    struct hash_elem {
        int cnt;
        struct bpf_spin_lock lock;
    };
    struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
    if (val) {
        bpf_spin_lock(&val->lock);
        val->cnt++;
        bpf_spin_unlock(&val->lock);
    }

Restrictions and safety checks:
- bpf_spin_lock is only allowed inside HASH and ARRAY maps.
- BTF description of the map is mandatory for safety analysis.
- bpf program can take one bpf_spin_lock at a time, since two or more can
  cause deadlocks.
- only one 'struct bpf_spin_lock' is allowed per map element.
  It drastically simplifies implementation yet allows bpf program to use
  any number of bpf_spin_locks.
- when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not
  allowed.
- bpf program must bpf_spin_unlock() before return.
- bpf program can access 'struct bpf_spin_lock' only via
  bpf_spin_lock()/bpf_spin_unlock() helpers.
- load/store into 'struct bpf_spin_lock lock;' field is not allowed.
- to use bpf_spin_lock() helper the BTF description of map value must be
  a struct and have 'struct bpf_spin_lock anyname;' field at the top level.
  Nested lock inside another struct is not allowed.
- syscall map_lookup doesn't copy bpf_spin_lock field to user space.
- syscall map_update and program map_update do not update bpf_spin_lock
  field.
- bpf_spin_lock cannot be on the stack or inside networking packet.
  bpf_spin_lock can only be inside HASH or ARRAY map value.
- bpf_spin_lock is available to root only and to all program types.
- bpf_spin_lock is not allowed in inner maps of map-in-map.
- ld_abs is not allowed inside spin_lock-ed region.
- tracing progs and socket filter progs cannot use bpf_spin_lock due to
  insufficient preemption checks

Implementation details:
- cgroup-bpf class of programs can nest with xdp/tc programs.
  Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
  Other solutions to avoid nested bpf_spin_lock are possible.
  Like making sure that all networking progs run with softirq disabled.
  spin_lock_irqsave is the simplest and doesn't add overhead to the
  programs that don't use it.
- arch_spinlock_t is used when it is implemented as queued_spin_lock
- archs can force their own arch_spinlock_t
- on architectures where queued_spin_lock is not available and
  sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used.
- presence of bpf_spin_lock inside map value could have been indicated via
  extra flag during map_create, but specifying it via BTF is cleaner.
  It provides introspection for map key/value and reduces user mistakes.

Next steps:
- allow bpf_spin_lock in other map types (like cgroup local storage)
- introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
  to request kernel to grab bpf_spin_lock before rewriting the value.
  That will serialize access to map elements.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:38 +01:00
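
Two points from the commit message above can be illustrated in plain user-space C11: the "trivial lock" fallback, and the rule that map lookups and updates leave the lock field out of the copy. The following is only a rough sketch under stated assumptions, not the kernel's code. The `sketch_*` function names are hypothetical, the `struct bpf_spin_lock` here is a user-space stand-in built on `<stdatomic.h>`, and nothing in it models the `spin_lock_irqsave` behaviour or the verifier checks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-in (assumption) for the kernel's 'struct bpf_spin_lock'. */
struct bpf_spin_lock {
    atomic_uint val; /* 0 = unlocked, 1 = locked */
};

/* Map value layout taken from the example in the commit message. */
struct hash_elem {
    int cnt;
    struct bpf_spin_lock lock;
};

/* Hypothetical trivial lock: test-and-set with a read-only inner spin. */
static void sketch_spin_lock(struct bpf_spin_lock *l)
{
    /* Try to swap in 1; a previous value of 0 means we now own the lock. */
    while (atomic_exchange_explicit(&l->val, 1, memory_order_acquire)) {
        /* Wait with plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(&l->val, memory_order_relaxed))
            ;
    }
}

static void sketch_spin_unlock(struct bpf_spin_lock *l)
{
    /* Release ordering publishes the writes made while holding the lock. */
    atomic_store_explicit(&l->val, 0, memory_order_release);
}

/*
 * Hypothetical value copy that skips the lock field, in the spirit of
 * "map_lookup doesn't copy bpf_spin_lock field" and "map_update does not
 * update bpf_spin_lock field". lock_off is the byte offset of the lock
 * inside the value; the destination's lock bytes are left untouched.
 */
static void sketch_copy_map_value(void *dst, const void *src,
                                  size_t size, size_t lock_off)
{
    size_t after = lock_off + sizeof(struct bpf_spin_lock);

    assert(after <= size);
    /* Bytes before the lock field. */
    memcpy(dst, src, lock_off);
    /* Bytes after the lock field (including any trailing padding). */
    memcpy((char *)dst + after, (const char *)src + after, size - after);
}
```

A caller would take the lock with `sketch_spin_lock(&val->lock)`, update `val->cnt`, and release it with `sketch_spin_unlock(&val->lock)`, mirroring the BPF example; a copy made with `sketch_copy_map_value(&out, val, sizeof(*val), offsetof(struct hash_elem, lock))` carries `cnt` across but never exposes or overwrites the lock word.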
..
arraymap.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
bpf_lru_list.c	bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4	2017-04-17 13:55:52 -04:00
bpf_lru_list.h	bpf: Only set node->ref = 1 if it has not been set	2017-09-01 09:57:39 -07:00
btf.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
cgroup.c	bpf, cgroups: clean up kerneldoc warnings	2019-01-31 10:32:01 +01:00
core.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
cpumap.c	bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't	2018-12-20 23:19:12 +01:00
devmap.c	bpf: devmap: fix wrong interface selection in notifier_call	2018-10-26 00:32:21 +02:00
disasm.c	bpf: disassembler support JMP32	2019-01-26 13:33:01 -08:00
disasm.h	bpf: Remove struct bpf_verifier_env argument from print_bpf_insn	2018-03-23 17:38:57 +01:00
hashtab.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
helpers.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
inode.c	bpf: decouple btf from seq bpf fs dump and enable more maps	2018-08-13 00:52:45 +02:00
local_storage.c	bpf: enable cgroup local storage map pretty print with kind_flag	2018-12-18 01:11:59 +01:00
lpm_trie.c	bpf: pass struct btf pointer to the map_check_btf() callback	2018-12-12 15:33:33 -08:00
Makefile	bpf: add queue and stack maps	2018-10-19 13:24:31 -07:00
map_in_map.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
map_in_map.h	bpf: Add syscall lookup support for fd array and htab	2017-06-29 13:13:25 -04:00
offload.c	bpf: notify offload JITs about optimizations	2019-01-23 17:35:32 -08:00
percpu_freelist.c	bpf: fix lockdep splat	2017-11-15 19:46:32 +09:00
percpu_freelist.h	bpf: introduce percpu_freelist	2016-03-08 15:28:31 -05:00
queue_stack_maps.c	bpf: fix integer overflow in queue_stack_map	2018-11-22 21:29:40 +01:00
reuseport_array.c	bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY	2018-08-11 01:58:46 +02:00
stackmap.c	bpf: zero out build_id for BPF_STACK_BUILD_ID_IP	2019-01-17 16:42:35 +01:00
syscall.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
tnum.c	bpf/verifier: improve register value range tracking with ARSH	2018-04-29 08:45:53 -07:00
verifier.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
xskmap.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-10-19 11:03:06 -07:00