linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-29 23:51:37 +00:00


Martin KaFai Lau 46f8bc9275 bpf: Add a bpf_sock pointer to __sk_buff and a bpf_sk_fullsock helper

In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)"
before accessing the fields in sock. For example, in __netdev_pick_tx:

	static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
				    struct net_device *sb_dev)
	{
		/* ... */

		struct sock *sk = skb->sk;

		if (queue_index != new_index && sk &&
		    sk_fullsock(sk) &&
		    rcu_access_pointer(sk->sk_dst_cache))
			sk_tx_queue_set(sk, new_index);

		/* ... */

		return queue_index;
	}

This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff"
where a few of the convert_ctx_access() in filter.c have already been
accessing the skb->sk sock_common's fields,
e.g. sock_ops_convert_ctx_access().

"__sk_buff->sk" is a PTR_TO_SOCK_COMMON_OR_NULL in the verifier.
Some of the fields in "bpf_sock" will not be directly
accessible through the "__sk_buff->sk" pointer. It is limited
by the new "bpf_sock_common_is_valid_access()".
e.g. The existing "type", "protocol", "mark" and "priority" in bpf_sock
     are not allowed.

The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)"
can be used to get a sk with all accessible fields in "bpf_sock".
This helper is added to both cg_skb and sched_(cls|act).

	int cg_skb_foo(struct __sk_buff *skb) {
		struct bpf_sock *sk;

		sk = skb->sk;
		if (!sk)
			return 1;

		sk = bpf_sk_fullsock(sk);
		if (!sk)
			return 1;

		if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP)
			return 1;

		/* some_traffic_shaping(); */

		return 1;
	}

(1) The sk is read only.

(2) There is no new "struct bpf_sock_common" introduced.

(3) Future kernel sock's members could be added to bpf_sock only
    instead of repeatedly adding at multiple places like currently
    in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc.

(4) After "sk = skb->sk", the reg holding sk is in type
    PTR_TO_SOCK_COMMON_OR_NULL.

(5) After bpf_sk_fullsock(), the return type will be in type
    PTR_TO_SOCKET_OR_NULL which is the same as the return type of
    bpf_sk_lookup_xxx().

    However, bpf_sk_fullsock() does not take refcnt. The
    acquire_reference_state() is only depending on the return type now.
    To avoid it, a new is_acquire_function() is checked before calling
    acquire_reference_state().

(6) The WARN_ON in "release_reference_state()" is no longer an
    internal verifier bug.

    When reg->id is not found in state->refs[], it means the
    bpf_prog does something wrong like
    "bpf_sk_release(bpf_sk_fullsock(skb->sk))" where reference has
    never been acquired by calling "bpf_sk_fullsock(skb->sk)".

    A -EINVAL and a verbose are done instead of WARN_ON. A test is
    added to the test_verifier in a later patch.

    Since the WARN_ON in "release_reference_state()" is no longer
    needed, "__release_reference_state()" is folded into
    "release_reference_state()" also.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

2019-02-10 19:46:17 -08:00
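Points (5) and (6) of the commit message can be illustrated with a small userspace model. This is only a sketch, not the verifier's code: the function names is_acquire_function(), acquire_reference_state() and release_reference_state() are borrowed from the commit message, while struct ref_state, enum helper_id, MAX_REFS and call_socket_helper() are hypothetical names invented for the illustration. The model assumes that a helper which takes no refcount yields an untracked id of 0, so releasing it fails with -EINVAL instead of tripping a warning.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical, not kernel definitions. */
enum helper_id { HELPER_SK_LOOKUP_TCP, HELPER_SK_FULLSOCK };

#define MAX_REFS 8

struct ref_state {
	int ids[MAX_REFS];	/* ids of references currently held */
	size_t n;		/* number of valid entries in ids[] */
	int next_id;		/* last id handed out; 0 means "none yet" */
};

/* Only helpers that really take a refcount create a tracked reference. */
static bool is_acquire_function(enum helper_id h)
{
	return h == HELPER_SK_LOOKUP_TCP;
}

/* Record a new reference; returns its id (> 0) or -ENOMEM if full. */
static int acquire_reference_state(struct ref_state *st)
{
	if (st->n == MAX_REFS)
		return -ENOMEM;
	st->ids[st->n++] = ++st->next_id;
	return st->next_id;
}

/*
 * Drop the reference with the given id. An unknown id is treated as a
 * program error (-EINVAL), not as an internal bug.
 */
static int release_reference_state(struct ref_state *st, int id)
{
	for (size_t i = 0; i < st->n; i++) {
		if (st->ids[i] == id) {
			st->ids[i] = st->ids[--st->n];
			return 0;
		}
	}
	return -EINVAL;
}

/*
 * Model of a helper call that returns a socket pointer: the result is
 * tracked only when the helper is an acquire function. Returns the
 * reference id, or 0 for an untracked pointer.
 */
static int call_socket_helper(struct ref_state *st, enum helper_id h)
{
	if (is_acquire_function(h))
		return acquire_reference_state(st);
	return 0;
}
```

In this model, calling call_socket_helper() with HELPER_SK_FULLSOCK returns 0, and passing that id to release_reference_state() returns -EINVAL, which mirrors the rejected "bpf_sk_release(bpf_sk_fullsock(skb->sk))" pattern. An id obtained through HELPER_SK_LOOKUP_TCP can be released exactly once.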
arraymap.c	bpf: introduce BPF_F_LOCK flag	2019-02-01 20:55:39 +01:00
bpf_lru_list.c	bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4	2017-04-17 13:55:52 -04:00
bpf_lru_list.h	bpf: Only set node->ref = 1 if it has not been set	2017-09-01 09:57:39 -07:00
btf.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
cgroup.c	bpf, cgroups: clean up kerneldoc warnings	2019-01-31 10:32:01 +01:00
core.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
cpumap.c	bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't	2018-12-20 23:19:12 +01:00
devmap.c	bpf: devmap: fix wrong interface selection in notifier_call	2018-10-26 00:32:21 +02:00
disasm.c	bpf: disassembler support JMP32	2019-01-26 13:33:01 -08:00
disasm.h	bpf: Remove struct bpf_verifier_env argument from print_bpf_insn	2018-03-23 17:38:57 +01:00
hashtab.c	bpf: introduce BPF_F_LOCK flag	2019-02-01 20:55:39 +01:00
helpers.c	bpf: introduce BPF_F_LOCK flag	2019-02-01 20:55:39 +01:00
inode.c	bpf: decouple btf from seq bpf fs dump and enable more maps	2018-08-13 00:52:45 +02:00
local_storage.c	bpf: introduce BPF_F_LOCK flag	2019-02-01 20:55:39 +01:00
lpm_trie.c	bpf: pass struct btf pointer to the map_check_btf() callback	2018-12-12 15:33:33 -08:00
Makefile	bpf: add queue and stack maps	2018-10-19 13:24:31 -07:00
map_in_map.c	bpf: introduce bpf_spin_lock	2019-02-01 20:55:38 +01:00
map_in_map.h	bpf: Add syscall lookup support for fd array and htab	2017-06-29 13:13:25 -04:00
offload.c	bpf: notify offload JITs about optimizations	2019-01-23 17:35:32 -08:00
percpu_freelist.c	bpf: fix lockdep splat	2017-11-15 19:46:32 +09:00
percpu_freelist.h	bpf: introduce percpu_freelist	2016-03-08 15:28:31 -05:00
queue_stack_maps.c	bpf: fix integer overflow in queue_stack_map	2018-11-22 21:29:40 +01:00
reuseport_array.c	bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY	2018-08-11 01:58:46 +02:00
stackmap.c	bpf: zero out build_id for BPF_STACK_BUILD_ID_IP	2019-01-17 16:42:35 +01:00
syscall.c	bpf: introduce BPF_F_LOCK flag	2019-02-01 20:55:39 +01:00
tnum.c	bpf/verifier: improve register value range tracking with ARSH	2018-04-29 08:45:53 -07:00
verifier.c	bpf: Add a bpf_sock pointer to __sk_buff and a bpf_sk_fullsock helper	2019-02-10 19:46:17 -08:00
xskmap.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-10-19 11:03:06 -07:00