linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-22 10:56:40 +00:00

Author	SHA1	Message	Date
Yonghong Song	534e0e52bc	samples/bpf: fix a compilation failure samples/bpf build failed with the following errors: $ make samples/bpf/ ... HOSTCC samples/bpf/sockex3_user.o /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:16:8: error: redefinition of ‘struct bpf_flow_keys’ struct bpf_flow_keys { ^ In file included from /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:4:0: ./usr/include/linux/bpf.h:2338:9: note: originally defined here struct bpf_flow_keys flow_keys; ^ make[3]: ** [samples/bpf/sockex3_user.o] Error 1 Commit `d58e468b11` ("flow_dissector: implements flow dissector BPF hook") introduced struct bpf_flow_keys in include/uapi/linux/bpf.h and hence caused the naming conflict with samples/bpf/sockex3_user.c. The fix is to rename struct bpf_flow_keys in samples/bpf/sockex3_user.c to flow_keys to avoid the conflict. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-18 17:50:02 +02:00
YueHaibing	664e787845	samples/bpf: remove duplicated includes Remove duplicated includes. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-18 17:49:33 +02:00
Yonghong Song	7900efc192	tools/bpf: bpftool: improve output format for bpftool net This is a followup patch for Commit `f6f3bac08f` ("tools/bpf: bpftool: add net support"). Some improvements are made for the bpftool net output. Specially, plain output is more concise such that per attachment should nicely fit in one line. Compared to previous output, the prog tag is removed since it can be easily obtained with program id. Similar to xdp attachments, the device name is added to tc attachments. The bpf program attached through shared block mechanism is supported as well. $ ip link add dev v1 type veth peer name v2 $ tc qdisc add dev v1 ingress_block 10 egress_block 20 clsact $ tc qdisc add dev v2 ingress_block 10 egress_block 20 clsact $ tc filter add block 10 protocol ip prio 25 bpf obj bpf_shared.o sec ingress flowid 1:1 $ tc filter add block 20 protocol ip prio 30 bpf obj bpf_cyclic.o sec classifier flowid 1:1 $ bpftool net xdp: tc: v2(7) clsact/ingress bpf_shared.o:[ingress] id 23 v2(7) clsact/egress bpf_cyclic.o:[classifier] id 24 v1(8) clsact/ingress bpf_shared.o:[ingress] id 23 v1(8) clsact/egress bpf_cyclic.o:[classifier] id 24 The documentation and "bpftool net help" are updated to make it clear that current implementation only supports xdp and tc attachments. For programs attached to cgroups, "bpftool cgroup" can be used to dump attachments. For other programs e.g. sk_{filter,skb,msg,reuseport} and lwt/seg6, iproute2 tools should be used. The new output: $ bpftool net xdp: eth0(2) driver id 198 tc: eth0(2) clsact/ingress fbflow_icmp id 335 act [{icmp_action id 336}] eth0(2) clsact/egress fbflow_egress id 334 $ bpftool -jp net [{ "xdp": [{ "devname": "eth0", "ifindex": 2, "mode": "driver", "id": 198 } ], "tc": [{ "devname": "eth0", "ifindex": 2, "kind": "clsact/ingress", "name": "fbflow_icmp", "id": 335, "act": [{ "name": "icmp_action", "id": 336 } ] },{ "devname": "eth0", "ifindex": 2, "kind": "clsact/egress", "name": "fbflow_egress", "id": 334 } ] } ] Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-18 17:42:31 +02:00
Alexei Starovoitov	70e88c758a	selftests/bpf: fix bpf_flow.c build fix the following build error: clang -I. -I./include/uapi -I../../../include/uapi -idirafter /usr/local/include -idirafter /data/users/ast/llvm/bld/lib/clang/7.0.0/include -idirafter /usr/include -Wno-compare-distinct-pointer-types \ -O2 -target bpf -emit-llvm -c bpf_flow.c -o - \| \ llc -march=bpf -mcpu=generic -filetype=obj -o /data/users/ast/bpf-next/tools/testing/selftests/bpf/bpf_flow.o LLVM ERROR: 'dissect' label emitted multiple times to assembly file make: *** [/data/users/ast/bpf-next/tools/testing/selftests/bpf/bpf_flow.o] Error 1 Fixes: `9c98b13cc3` ("flow_dissector: implements eBPF parser") Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:09:05 -07:00
Alexei Starovoitov	4a9f42c9dc	Merge branch 'bpf-flow-dissector' Petar Penkov says: ==================== This patch series hardens the RX stack by allowing flow dissection in BPF, as previously discussed [1]. Because of the rigorous checks of the BPF verifier, this provides significant security guarantees. In particular, the BPF flow dissector cannot get inside of an infinite loop, as with CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot read outside of packet bounds, because all memory accesses are checked. Also, with BPF the administrator can decide which protocols to support, reducing potential attack surface. Rarely encountered protocols can be excluded from dissection and the program can be updated without kernel recompile or reboot if a bug is discovered. Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect. This includes a new BPF program and attach type. Patch 2 adds the new BPF flow dissector definitions to tools/uapi. Patch 3 adds support for the new BPF program type to libbpf and bpftool. Patch 4 adds a flow dissector program in BPF. This parses most protocols in __skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports, and address types). Patch 5 adds a selftest that attaches the BPF program to the flow dissector and sends traffic with different levels of encapsulation. Performance Evaluation: The in-kernel implementation was compared against the demo program from patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds. $perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \ -t 10 In-kernel Dissector: __skb_flow_dissect overhead: 2.12% Total Packets: 3,272,597 (from output of ./test_flow_dissector) BPF Dissector: __skb_flow_dissect overhead: 1.63% Total Packets: 3,232,356 (from output of ./test_flow_dissector) No-op BPF Dissector: __skb_flow_dissect overhead: 1.52% Total Packets: 3,330,635 (from output of ./test_flow_dissector) Changes since v3: 1/ struct bpf_flow_keys reorganized to remove holes in patch 1 and patch 2. Changes since v2: 1/ Changes to tools/include/uapi pulled into a separate patch 2 2/ Changes to tools/lib and tools/bpftool pulled into a separate patch 3 3/ Changed flow_keys in __sk_buff from __u32 to struct bpf_flow_keys * 4/ Added nhoff field in struct bpf_flow_keys to pass initial offset 5/ Saving all of the modified control block, rather than just the qdisc 6/ Sample BPF program in patch 4 modified to use the changes above Changes since v1: 1/ LD_ABS instructions now disallowed for the new BPF prog type 2/ now checks if skb is NULL in __skb_flow_dissect() 3/ fixed incorrect accesses in flow_dissector_is_valid_access() - writes to the flow_keys field now disallowed - reads/writes to tc_classid and data_meta now disallowed 4/ headers pulled with bpf_skb_load_data if direct access fails Changes since RFC: 1/ Flow dissector hook changed from global to per-netns 2/ Defined struct bpf_flow_keys to be used in BPF flow dissector programs instead of exposing the internal flow keys layout. Added a function to translate from bpf_flow_keys to the internal layout after BPF dissection is complete. The pointer to this struct is stored in qdisc_skb_cb rather than inside of the 20 byte control block which simplifies verification and allows access to all 20 bytes of the cb. 3/ Removed GUE parsing as it relied on a hardcoded port 4/ MPLS parsing now stops at the first label which is consistent with the in-kernel flow dissector 5/ Refactored to use direct packet access and to write out to struct bpf_flow_keys [1] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:04:34 -07:00
Petar Penkov	50b3ed57de	selftests/bpf: test bpf flow dissection Adds a test that sends different types of packets over multiple tunnels and verifies that valid packets are dissected correctly. To do so, a tc-flower rule is added to drop packets on UDP src port 9, and packets are sent from ports 8, 9, and 10. Only the packets on port 9 should be dropped. Because tc-flower relies on the flow dissector to match flows, correct classification demonstrates correct dissection. Also add support logic to load the BPF program and to inject the test packets. Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:04:33 -07:00
Petar Penkov	9c98b13cc3	flow_dissector: implements eBPF parser This eBPF program extracts basic/control/ip address/ports keys from incoming packets. It supports recursive parsing for IP encapsulation, and VLAN, along with IPv4/IPv6 and extension headers. This program is meant to show how flow dissection and key extraction can be done in eBPF. Link: http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:04:33 -07:00
Petar Penkov	c22fbae76c	bpf: support flow dissector in libbpf and bpftool This patch extends libbpf and bpftool to work with programs of type BPF_PROG_TYPE_FLOW_DISSECTOR. Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:04:33 -07:00
Petar Penkov	2f965e3fcd	bpf: sync bpf.h uapi with tools/ This patch syncs tools/include/uapi/linux/bpf.h with the flow dissector definitions from include/uapi/linux/bpf.h Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:04:33 -07:00
Petar Penkov	d58e468b11	flow_dissector: implements flow dissector BPF hook Adds a hook for programs of type BPF_PROG_TYPE_FLOW_DISSECTOR and attach type BPF_FLOW_DISSECTOR that is executed in the flow dissector path. The BPF program is per-network namespace. Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-14 12:04:33 -07:00
Anders Roxell	1edb6e035e	net/core/filter: fix unused-variable warning Building with CONFIG_INET=n will show the warning below: net/core/filter.c: In function ‘____bpf_getsockopt’: net/core/filter.c:4048:19: warning: unused variable ‘tp’ [-Wunused-variable] struct tcp_sock tp; ^~ net/core/filter.c:4046:31: warning: unused variable ‘icsk’ [-Wunused-variable] struct inet_connection_sock icsk; ^~~~ Move the variable declarations inside the {} block where they are used. Fixes: `1e215300f1` ("bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN options for bpf_(set\|get)sockopt") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-11 14:54:52 -07:00
Yonghong Song	9d0b3c1f14	tools/bpf: fix a netlink recv issue Commit `f7010770fb` ("tools/bpf: move bpf/lib netlink related functions into a new file") introduced a while loop for the netlink recv path. This while loop is needed since the buffer in recv syscall may not be enough to hold all the information and in such cases multiple recv calls are needed. There is a bug introduced by the above commit as the while loop may block on recv syscall if there is no more messages are expected. The netlink message header flag NLM_F_MULTI is used to indicate that more messages are expected and this patch fixed the bug by doing further recv syscall only if multipart message is expected. The patch added another fix regarding to message length of 0. When netlink recv returns message length of 0, there will be no more messages for returning data so the while loop can end. Fixes: `f7010770fb` ("tools/bpf: move bpf/lib netlink related functions into a new file") Reported-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-11 14:26:30 -07:00
Alexei Starovoitov	2e2a0c961a	Merge branch 'progarray_mapinmap_dump' Yonghong Song says: ==================== The support to dump program array and map_in_map maps for bpffs and bpftool is added. Patch #1 added bpffs support and Patch #2 added bpftool support. Please see individual patches for example output. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-11 14:17:25 -07:00
Yonghong Song	ad3338d250	tools/bpf: bpftool: support prog array map and map of maps Currently, prog array map and map of maps are not supported in bpftool. This patch added the support. Different from other map types, for prog array map and map of maps, the key returned bpf_get_next_key() may not point to a valid value. So for these two map types, no error will be printed out when such a scenario happens. The following is the plain and json dump if btf is not available: $ ./bpftool map dump id 10 key: 08 00 00 00 value: 5c 01 00 00 Found 1 element $ ./bpftool -jp map dump id 10 [{ "key": ["0x08","0x00","0x00","0x00" ], "value": ["0x5c","0x01","0x00","0x00" ] }] If the BTF is available, the dump looks below: $ ./bpftool map dump id 2 [{ "key": 0, "value": 7 } ] $ ./bpftool -jp map dump id 2 [{ "key": ["0x00","0x00","0x00","0x00" ], "value": ["0x07","0x00","0x00","0x00" ], "formatted": { "key": 0, "value": 7 } }] Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-11 14:17:24 -07:00
Yonghong Song	a7c19db38d	bpf: add bpffs pretty print for program array map Added bpffs pretty print for program array map. For a particular array index, if the program array points to a valid program, the "<index>: <prog_id>" will be printed out like 0: 6 which means bpf program with id "6" is installed at index "0". Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-11 14:17:24 -07:00
Yonghong Song	f6f3bac08f	tools/bpf: bpftool: add net support Add "bpftool net" support. Networking devices are enumerated to dump device index/name associated with xdp progs. For each networking device, tc classes and qdiscs are enumerated in order to check their bpf filters. In addition, root handle and clsact ingress/egress are also checked for bpf filters. Not all filter information is printed out. Only ifindex, kind, filter name, prog_id and tag are printed out, which are good enough to show attachment information. If the filter action is a bpf action, its bpf program id, bpf name and tag will be printed out as well. For example, $ ./bpftool net xdp [ ifindex 2 devname eth0 prog_id 198 ] tc_filters [ ifindex 2 kind qdisc_htb name prefix_matcher.o:[cls_prefix_matcher_htb] prog_id 111727 tag d08fe3b4319bc2fd act [] ifindex 2 kind qdisc_clsact_ingress name fbflow_icmp prog_id 130246 tag 3f265c7f26db62c9 act [] ifindex 2 kind qdisc_clsact_egress name prefix_matcher.o:[cls_prefix_matcher_clsact] prog_id 111726 tag 99a197826974c876 ifindex 2 kind qdisc_clsact_egress name cls_fg_dscp prog_id 108619 tag dc4630674fd72dcc act [] ifindex 2 kind qdisc_clsact_egress name fbflow_egress prog_id 130245 tag 72d2d830d6888d2c ] $ ./bpftool -jp net [{ "xdp": [{ "ifindex": 2, "devname": "eth0", "prog_id": 198 } ], "tc_filters": [{ "ifindex": 2, "kind": "qdisc_htb", "name": "prefix_matcher.o:[cls_prefix_matcher_htb]", "prog_id": 111727, "tag": "d08fe3b4319bc2fd", "act": [] },{ "ifindex": 2, "kind": "qdisc_clsact_ingress", "name": "fbflow_icmp", "prog_id": 130246, "tag": "3f265c7f26db62c9", "act": [] },{ "ifindex": 2, "kind": "qdisc_clsact_egress", "name": "prefix_matcher.o:[cls_prefix_matcher_clsact]", "prog_id": 111726, "tag": "99a197826974c876" },{ "ifindex": 2, "kind": "qdisc_clsact_egress", "name": "cls_fg_dscp", "prog_id": 108619, "tag": "dc4630674fd72dcc", "act": [] },{ "ifindex": 2, "kind": "qdisc_clsact_egress", "name": "fbflow_egress", "prog_id": 130245, "tag": "72d2d830d6888d2c" } ] } ] Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Yonghong Song	36f1678d9e	tools/bpf: add more netlink functionalities in lib/bpf This patch added a few netlink attribute parsing functions and the netlink API functions to query networking links, tc classes, tc qdiscs and tc filters. For example, the following API is to get networking links: int nl_get_link(int sock, unsigned int nl_pid, dump_nlmsg_t dump_link_nlmsg, void cookie); Note that when the API is called, the user also provided a callback function with the following signature: int (dump_nlmsg_t)(void cookie, void msg, struct nlattr **tb); The "cookie" is the parameter the user passed to the API and will be available for the callback function. The "msg" is the information about the result, e.g., ifinfomsg or tcmsg. The "tb" is the parsed netlink attributes. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Yonghong Song	f7010770fb	tools/bpf: move bpf/lib netlink related functions into a new file There are no functionality change for this patch. In the subsequent patches, more netlink related library functions will be added and a separate file is better than cluttering bpf.c. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Yonghong Song	52b7b7843d	tools/bpf: sync kernel uapi header if_link.h to tools Among others, this header will be used later for bpftool net support. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Mauricio Vasquez B	f5bd3948eb	selftests/bpf/test_progs: do not check errno == 0 The errno man page states: "The value in errno is significant only when the return value of the call indicated an error..." then it is not correct to check it, it could be different than zero even if the function succeeded. It causes some false positives if errno is set by a previous function. Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Mauricio Vasquez B	ad1242d8a0	selftests/bpf: add missing executables to .gitignore Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Jesper Dangaard Brouer	47b123ed9e	xdp: split code for map vs non-map redirect The compiler does an efficient job of inlining static C functions. Perf top clearly shows that almost everything gets inlined into the function call xdp_do_redirect. The function xdp_do_redirect end-up containing and interleaving the map and non-map redirect code. This is sub-optimal, as it would be strange for an XDP program to use both types of redirect in the same program. The two use-cases are separate, and interleaving the code just cause more instruction-cache pressure. I would like to stress (again) that the non-map variant bpf_redirect is very slow compared to the bpf_redirect_map variant, approx half the speed. Measured with driver i40e the difference is: - map redirect: 13,250,350 pps - non-map redirect: 7,491,425 pps For this reason, the function name of the non-map variant of redirect have been called xdp_do_redirect_slow. This hopefully gives a hint when using perf, that this is not the optimal XDP redirect operating mode. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Jesper Dangaard Brouer	2a68d85fe1	xdp: explicit inline __xdp_map_lookup_elem The compiler chooses to not-inline the function __xdp_map_lookup_elem, because it can see that it is used by both Generic-XDP and native-XDP do redirect calls (xdp_do_generic_redirect_map and xdp_do_redirect_map). The compiler cannot know that this is a bad choice, as it cannot know that a net device cannot run both XDP modes (Generic or Native) at the same time. Thus, mark this function inline, even-though we normally leave this up-to the compiler. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Jesper Dangaard Brouer	e1302542e3	xdp: unlikely instrumentation for xdp map redirect Notice the compiler generated ASM code layout was suboptimal. It assumed map enqueue errors as the likely case, which is shouldn't. It assumed that xdp_do_flush_map() was a likely case, due to maps changing between packets, which should be very unlikely. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-06 22:34:08 -07:00
Alexei Starovoitov	a9c676bc8f	bpf/verifier: fix verifier instability Edward Cree says: In check_mem_access(), for the PTR_TO_CTX case, after check_ctx_access() has supplied a reg_type, the other members of the register state are set appropriately. Previously reg.range was set to 0, but as it is in a union with reg.map_ptr, which is larger, upper bytes of the latter were left in place. This then caused the memcmp() in regsafe() to fail, preventing some branches from being pruned (and occasionally causing the same program to take a varying number of processed insns on repeated verifier runs). Fix the instability by clearing bpf_reg_state in __mark_reg_[un]known() Fixes: `f1174f77b5` ("bpf/verifier: rework value tracking") Debugged-by: Edward Cree <ecree@solarflare.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-05 22:21:00 -07:00
Taeung Song	69495d2a52	libbpf: Remove the duplicate checking of function storage After the commit `eac7d84519` ("tools: libbpf: don't return '.text' as a program for multi-function programs"), bpf_program__next() in bpf_object__for_each_program skips the function storage such as .text, so eliminate the duplicate checking. Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-09-05 22:16:00 -07:00
Dmitry Safonov	428f944bd5	netlink: Make groups check less stupid in netlink_bind() As Linus noted, the test for 0 is needless, groups type can follow the usual kernel style and 8sizeof(unsigned long) is BITS_PER_LONG: > The code [..] isn't technically incorrect... > But it is stupid. > Why stupid? Because the test for 0 is pointless. > > Just doing > if (nlk->ngroups < 8sizeof(groups)) > groups &= (1UL << nlk->ngroups) - 1; > > would have been fine and more understandable, since the "mask by shift > count" already does the right thing for a ngroups value of 0. Now that > test for zero makes me go "what's special about zero?". It turns out > that the answer to that is "nothing". [..] > The type of "groups" is kind of silly too. > > Yeah, "long unsigned int" isn't _technically_ wrong. But we normally > call that type "unsigned long". Cleanup my piece of pointlessness. Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Fairly-blamed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 22:11:33 -07:00
Vincent Whitchurch	fa788d986a	packet: add sockopt to ignore outgoing packets Currently, the only way to ignore outgoing packets on a packet socket is via the BPF filter. With MSG_ZEROCOPY, packets that are looped into AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even if the filter run from packet_rcv() would reject them. So the presence of a packet socket on the interface takes away the benefits of MSG_ZEROCOPY, even if the packet socket is not interested in outgoing packets. (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily cloned, but the cost for that is much lower.) Add a socket option to allow AF_PACKET sockets to ignore outgoing packets to solve this. Note that the *BSDs already have something similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT. The first intended user is lldpd. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 22:09:37 -07:00
YueHaibing	05dcc71298	net: lan743x_ptp: make function lan743x_ptp_set_sync_ts_insert() static Fixes the following sparse warning: drivers/net/ethernet/microchip/lan743x_ptp.c:980:6: warning: symbol 'lan743x_ptp_set_sync_ts_insert' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 08:07:05 -07:00
Wei Yongjun	fbb66ad5dc	net/mlx5e: Make function mlx5i_grp_sw_update_stats() static Fixes the following sparse warning: drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c:119:6: warning: symbol 'mlx5i_grp_sw_update_stats' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 08:06:13 -07:00
David S. Miller	579d03fecb	This time, we have some pretty impactful work. Among the changes: * changes to make PTK rekeying work better, or actually better/safely if drivers get updated * VHT extended NSS support - some APs had capabilities that didn't fit into the VHT (11ac) spec, so the spec was updated and we follow that now * some TXQ and A-MSDU building work - will allow iwlwifi to use this soon * more HE work, including aligning to 802.11ax Draft 3.0 * L-SIG and 0-length-PSDU support in radiotap -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAluPm+MACgkQB8qZga/f l8T5QA/6A7zWg81fz1c4UDMtA9fNbv/IeGnqQocDRVHpSCV7PwwhwzzoRI5mRaqE vnP4JzD8Yt1VcYmE76K3WSoQ2UAxm/02sKah+o9CrTk7ar4hqvJkRC9/aBZPp75f Pf8T7WL0Mlem9JimBKr3VIj6Rcom49o2JdWy+3v0guoxNFGwA7xpo5OHoTSHGgfi Ziome5WZu1eL9uAi79TAbkYe6ru6f7J7rx+jjtHVvnDN/5ky+G2LtDTClS0juhJt Jvw3+4V76WmYahYme6yEs6yiiYTFA1I7fUOjuW8/oLW29fjuQu6//K/ZHYU5Vdw1 4ujpbKW1jv+ncpKnIKYJ598JGBURRu6Mc8WOOljJt5wvxkgZGqCjs+APdzf/YQ1B XrGQ9gr9HSYoQ5Gu/9vlPHH/KrXG8GvPW+4bHlkwrcd1IVrDcBDfRr/o8nAMo8j+ WOIf6Ck8zqdzyf2rfHa4D1WGC+c4hJnKgn6OpSvcuOJhghLvFeVmIJk4CSTAI+Ds 0BKKH4VHJDdibE7wslTb4coRU8wxcem6Pm9IhtqeKKYVgDfxjfG+sJjQnRICbRL6 SWfwtGUodA1hdp/3uzlV0X/W9flw5vJCcMFzg2W5qxBoCD7K1yz3LcQM1AkYyQr+ GuubYUyoQIEBn7z9GCsQplUpdrpAsvCpJQ17oka/gQt4wo/L5o0= =HYA6 -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2018-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== This time, we have some pretty impactful work. Among the changes: * changes to make PTK rekeying work better, or actually better/safely if drivers get updated * VHT extended NSS support - some APs had capabilities that didn't fit into the VHT (11ac) spec, so the spec was updated and we follow that now * some TXQ and A-MSDU building work - will allow iwlwifi to use this soon * more HE work, including aligning to 802.11ax Draft 3.0 * L-SIG and 0-length-PSDU support in radiotap ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-05 07:48:52 -07:00
Stanislaw Gruszka	014f5a250f	cfg80211: validate wmm rule when setting Add validation check for wmm rule when copy rules from fwdb and print error when rule is invalid. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:16:59 +02:00
Gustavo A. R. Silva	40b5a0f8c6	mac80211: remove unnecessary NULL check Both old and new cannot be NULL at the same time, hence checking new when old is not NULL is unnecessary. Also, notice that new is being dereferenced before it is checked: idx = new->conf.keyidx; The above triggers a static code analysis warning. Address this by removing the NULL check on new and adding a code comment based on the following piece of code: 387 /* caller must provide at least one old/new */ 388 if (WARN_ON(!new && !old)) 389 return 0; Addresses-Coverity-ID: 1473176 ("Dereference before null check") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:15:48 +02:00
Sara Sharon	9739fe29a2	mac80211: add an option for drivers to check if packets can be aggregated Some hardwares have limitations on the packets' type in AMSDU. Add an optional driver callback to determine if two skbs can be used in the same AMSDU or not. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:11:50 +02:00
Sara Sharon	edba6bdad6	mac80211: allow AMSDU size limitation per-TID Some drivers may have AMSDU size limitation per TID, due to HW constrains. Add an option to set this limit. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:10:26 +02:00
Sara Sharon	0eeb2b674f	mac80211: add an option for station management TXQ We have a TXQ abstraction for non-data packets that need powersave buffering. Since the AP cannot sleep, in case of station we can use this TXQ for all management frames, regardless if they are bufferable. Add HW flag to allow that. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:10:11 +02:00
Shaul Triebitz	add7453ad6	wireless: align to draft 11ax D3.0 Align to new 11ax draft D3.0. Change/add new MAC and PHY capabilities and update drivers' 11ax capabilities and mac80211's debugfs accordingly. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:09:50 +02:00
Naftali Goldstein	77cbbc35a4	mac80211: fix saving a few HE values After masking the he_oper_params, to get the requested values as integers one must rshift and not lshift. Fix that by using the le32_get_bits() macro. Fixes: `41cbb0f5a2` ("mac80211: add support for HE") Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com> [converted to use le32_get_bits()] Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:08:42 +02:00
Shaul Triebitz	c3d1f87528	mac80211: support reporting 0-length PSDU in radiotap For certain sounding frames, it may be useful to report them to userspace even though they don't have a PSDU in order to determine the PHY parameters (e.g. VHT rate/stream config.) Add support for this to mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:08:25 +02:00
Alexander Wetzel	62872a9b9a	mac80211: Fix PTK rekey freezes and clear text leak Rekeying PTK keys without "Extended Key ID for Individually Addressed Frames" did use a procedure not suitable to replace in-use keys and could caused the following issues: 1) Freeze caused by incoming frames: If the local STA installed the key prior to the remote STA we still had the old key active in the hardware when mac80211 switched over to the new key. Therefore there was a window where the card could hand over frames decoded with the old key to mac80211 and bump the new PN (IV) value to an incorrect high number. When it happened the local replay detection silently started to drop all frames sent with the new key. 2) Freeze caused by outgoing frames: If mac80211 was providing the PN (IV) and handed over a clear text frame for encryption to the hardware prior to a key change the driver/card could have processed the queued frame after switching to the new key. This bumped the PN value on the remote STA to an incorrect high number, tricking the remote STA to discard all frames we sent later. 3) Freeze caused by RX aggregation reorder buffer: An aggregation session started with the old key and ending after the switch to the new key also bumped the PN to an incorrect high number, freezing the connection quite similar to 1). 4) Freeze caused by repeating lost frames in an aggregation session: A driver could repeat a lost frame and encrypt it with the new key while in a TX aggregation session without updating the PN for the new key. This also could freeze connections similar to 2). 5) Clear text leak: Removing encryption offload from the card cleared the encryption offload flag only after the card had deleted the key and we did not stop TX during the rekey. The driver/card could therefore get unencrypted frames from mac80211 while no longer be instructed to encrypt them. To prevent those issues the key install logic has been changed: - Mac80211 divers known to be able to rekey PTK0 keys have to set @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0, - mac80211 stops queuing frames depending on the key during the replace - the key is first replaced in the hardware and after that in mac80211 - and mac80211 stops/blocks new aggregation sessions during the rekey. For drivers not setting @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 the user space must avoid PTK rekeys if "Extended Key ID for Individually Addressed Frames" is not being used. Rekeys for mac80211 drivers without this flag will generate a warning and use an extra call to ieee80211_flush_queues() to both highlight and try to prevent the issues with not updated drivers. The core of the fix changes the key install procedure from: - atomic switch over to the new key in mac80211 - remove the old key in the hardware (stops encryption offloading, fall back to software encryption with a potential clear text packet leak in between) - delete the inactive old key in mac80211 - enable hardware encryption offloading for the new key to: - if it's a PTK mark the old key as tainted to drop TX frames with the outgoing key - replace the key in hardware with the new one - atomic switch over to the new (not marked as tainted) key in mac80211 (which also resumes TX) - delete the inactive old key in mac80211 With the new sequence the hardware will be unable to decrypt frames encrypted with the old key prior to switching to the new key in mac80211 and thus prevent PNs from packets decrypted with the old key to be accounted against the new key. For that to work the drivers have to provide a clear boundary. Mac80211 drivers setting @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 confirm to provide it and mac80211 will then be able to correctly rekey in-use PTK keys with those drivers. The mac80211 requirements for drivers to set the flag have been added to the "Hardware crypto acceleration" documentation section. It drills down to: The drivers must not hand over frames decrypted with the old key to mac80211 once the call to set_key() with %DISABLE_KEY has been completed. It's allowed to either drop or continue to use the old key for any outgoing frames which are already in the queues, but it must not send out any of them unencrypted or encrypted with the new key. Even with the new boundary in place aggregation sessions with the reorder buffer are problematic: RX aggregation session started prior and completed after the rekey could still dump frames received with the old key at mac80211 after it switched over to the new key. This is side stepped by stopping all (RX and TX) aggregation sessions when replacing a PTK key and hardware key offloading. Stopping TX aggregation sessions avoids the need to get the PNs (IVs) updated in frames prepared for the old key and (re)transmitted after the switch to the new key. As a bonus it improves the compatibility when the remote STA is not handling rekeys as it should. When using software crypto aggregation sessions are not stopped. Mac80211 won't be able to decode the dangerous frames and discard them without special handling. Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> [trim overly long rekey warning] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:17 +02:00
Alexander Wetzel	2b815b04df	nl80211: Add CAN_REPLACE_PTK0 API Drivers able to correctly replace a in-use key should set @NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 to allow the user space (e.g. hostapd or wpa_supplicant) to rekey PTK keys. The user space must detect a PTK rekey attempt and only go ahead with it when the driver has set this flag. If the driver is not supporting the feature the user space either must not replace the PTK key or perform a full re-association instead. Ignoring this flag and continuing to rekey the connection can still work but has to be considered insecure and broken. Depending on the driver it can leak clear text packets or freeze the connection and is only supported to allow the user space to be updated. Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Reviewed-by: Denis Kenzior <denkenz@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:17 +02:00
Shaul Triebitz	d1332e7be2	mac80211: support radiotap L-SIG data As before with HE, the data needs to be provided by the driver in the skb head, since there's not enough space in the skb CB. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:15 +02:00
Wen Gong	70e53669c4	mac80211: Store sk_pacing_shift in ieee80211_hw Make it possibly for drivers to adjust the default skb_pacing_shift by storing it in the hardware struct. Signed-off-by: Wen Gong <wgong@codeaurora.org> [adjust commit log, move & adjust comment] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:15 +02:00
Johannes Berg	e80d642552	mac80211: copy VHT EXT NSS BW Support/Capable data to station When taking VHT capabilities for a station, copy the new fields if we support them as a transmitter. Also adjust the maximum bandwidth the station supports appropriately. Also, since it was missing, copy tx_highest and rx_highest. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:14 +02:00
Johannes Berg	7eb26df297	mac80211: add ability to parse CCFS2 With newer VHT implementations, it's necessary to look at the HT operation's CCFS2 field to identify the actual bandwidth used. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:14 +02:00
Johannes Berg	09b4a4faf9	mac80211: introduce capability flags for VHT EXT NSS support Depending on whether or not rate control supports selecting rates depending on the bandwidth, we can use VHT extended NSS support. In essence, this is dot11VHTExtendedNSSBWCapable from the spec, since depending on that we'll need to parse the bandwidth. If needed, also set/clear the VHT Capability Element bit for this capability so that we don't advertise it erroneously or don't advertise it when we actually use it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:14 +02:00
Johannes Berg	b0aa75f0b1	ieee80211: add new VHT capability fields/parsing IEEE 802.11-2016 extended the VHT capability fields to allow indicating the number of spatial streams depending on the actually used bandwidth, add support for decoding this. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:14 +02:00
Shaul Triebitz	34fb190ec0	mac80211: in AP mode, set bss_conf::he_supported In AP mode, If AP advertises HE capabilities, set to true bss_conf::he_supported so that the Driver knows about it. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:13 +02:00
Shaul Triebitz	244eb9ae79	cfg80211: add he_capabilities (ext) IE to AP settings Same as for HT and VHT. This helps the lower level to know whether the AP supports HE. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:13 +02:00
Sara Sharon	03512ceb60	ieee80211: remove redundant leading zeroes The defines of IEEE80211_HE_OPERATION_VHT_OPER_INFO and IEEE80211_HE_OPERATION_MULTI_BSSID_AP have leading zeroes that makes the number look like it is bigger than 32 bit. This is misleading, remove it. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-09-05 10:03:13 +02:00

1 2 3 4 5 ...

782249 Commits