linux/kernel/bpf
Andrey Ignatov 7dd68b3279 bpf: Support replacing cgroup-bpf program in MULTI mode
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.

Order of programs usually matters, for example imagine two egress
programs: the first one drops packets and the second one counts packets.
If they're swapped the result of counting program will be different.

It brings operational challenges with updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI since there is no way to replace a
program:

* One way to update is to detach all programs first and then attach the
  new version(s) again in the right order. This introduces an
  interruption in the work a program is doing and may not be acceptable
  (e.g. if it's egress firewall);

* Another way is attach the new version of a program first and only then
  detach the old version. This introduces the time interval when two
  versions of same program are working, what may not be acceptable if a
  program is not idempotent. It also imposes additional burden on
  program developers to make sure that two versions of their program can
  co-exist.

Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
replace

That way user can replace any program among those attached with
BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
  descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
  error (EINVAL or EBADF).

* If replace_bpf_fd has valid descriptor of BPF program but such a
  program is not attached to specified cgroup, BPF_PROG_ATTACH will
  return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Narkyiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
2019-12-19 21:22:25 -08:00
..
arraymap.c bpf: Simplify __bpf_arch_text_poke poke type handling 2019-11-24 17:12:11 -08:00
bpf_lru_list.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
bpf_lru_list.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
btf.c bpf: Fix build in minimal configurations 2019-11-29 01:03:42 +01:00
cgroup.c bpf: Support replacing cgroup-bpf program in MULTI mode 2019-12-19 21:22:25 -08:00
core.c bpf: Remove unnecessary assertion on fp_old 2019-12-19 22:24:15 +01:00
cpumap.c xdp: Make cpumap flush_list common for all map instances 2019-12-19 21:09:43 -08:00
devmap.c xdp: Make devmap flush_list common for all map instances 2019-12-19 21:09:43 -08:00
disasm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
disasm.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
dispatcher.c bpf: Introduce BPF dispatcher 2019-12-13 13:09:32 -08:00
hashtab.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 20:20:36 -07:00
helpers.c cgroup: use cgrp->kn->id as the cgroup ID 2019-11-12 08:18:04 -08:00
inode.c bpf: Convert bpf_prog refcnt to atomic64_t 2019-11-18 11:41:59 +01:00
local_storage.c cgroup: use cgrp->kn->id as the cgroup ID 2019-11-12 08:18:04 -08:00
lpm_trie.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-22 08:59:24 -04:00
Makefile bpf: Introduce BPF dispatcher 2019-12-13 13:09:32 -08:00
map_in_map.c bpf: Move owner type, jited info into array auxiliary data 2019-11-24 17:04:11 -08:00
map_in_map.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
offload.c bpf, offload: Unlock on error in bpf_offload_dev_create() 2019-11-07 00:20:27 +01:00
percpu_freelist.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
percpu_freelist.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
queue_stack_maps.c bpf: move memory size checks to bpf_map_charge_init() 2019-05-31 16:52:56 -07:00
reuseport_array.c bpf: move memory size checks to bpf_map_charge_init() 2019-05-31 16:52:56 -07:00
stackmap.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-12-03 09:29:50 -08:00
syscall.c bpf: Support replacing cgroup-bpf program in MULTI mode 2019-12-19 21:22:25 -08:00
sysfs_btf.c btf: fix return value check in btf_vmlinux_init() 2019-08-15 22:18:17 -07:00
tnum.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
trampoline.c bpf: Move trampoline JIT image allocation to a function 2019-12-13 13:09:32 -08:00
verifier.c bpf: Fix a bug when getting subprog 0 jited image in check_attach_btf_id 2019-12-04 21:20:07 -08:00
xskmap.c xsk: Make xskmap flush_list common for all map instances 2019-12-19 21:09:43 -08:00