linux/net
John Fastabend 174a79ff95 bpf: sockmap with sk redirect support
Recently we added a new map type called dev map used to forward XDP
packets between ports (6093ec2dc3). This patches introduces a
similar notion for sockets.

A sockmap allows users to add participating sockets to a map. When
sockets are added to the map enough context is stored with the
map entry to use the entry with a new helper

  bpf_sk_redirect_map(map, key, flags)

This helper (analogous to bpf_redirect_map in XDP) is given the map
and an entry in the map. When called from a sockmap program, discussed
below, the skb will be sent on the socket using skb_send_sock().

With the above we need a bpf program to call the helper from that will
then implement the send logic. The initial site implemented in this
series is the recv_sock hook. For this to work we implemented a map
attach command to add attributes to a map. In sockmap we add two
programs a parse program and a verdict program. The parse program
uses strparser to build messages and pass them to the verdict program.
The parse programs use the normal strparser semantics. The verdict
program is of type SK_SKB.

The verdict program returns a verdict SK_DROP, or  SK_REDIRECT for
now. Additional actions may be added later. When SK_REDIRECT is
returned, expected when bpf program uses bpf_sk_redirect_map(), the
sockmap logic will consult per cpu variables set by the helper routine
and pull the sock entry out of the sock map. This pattern follows the
existing redirect logic in cls and xdp programs.

This gives the flow,

 recv_sock -> str_parser (parse_prog) -> verdict_prog -> skb_send_sock
                                                     \
                                                      -> kfree_skb

As an example use case a message based load balancer may use specific
logic in the verdict program to select the sock to send on.

Sample programs are provided in future patches that hopefully illustrate
the user interfaces. Also selftests are in follow-on patches.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 11:27:53 -07:00
..
6lowpan
9p Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-15 12:00:42 -07:00
802 net: introduce __skb_put_[zero, data, u8] 2017-06-20 13:30:14 -04:00
8021q net: add netlink_ext_ack argument to rtnl_link_ops.validate 2017-06-26 23:13:22 -04:00
appletalk networking: make skb_push & __skb_push return void pointers 2017-06-16 11:48:40 -04:00
atm net: atm: make atmdev_ops const 2017-08-09 22:43:50 -07:00
ax25 net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t 2017-07-04 22:35:19 +01:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-09 16:28:45 -07:00
bluetooth 6lowpan: fix set not used warning 2017-07-25 12:31:37 -07:00
bpf
bridge rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
caif net: convert sock.sk_wmem_alloc from atomic_t to refcount_t 2017-07-01 07:39:08 -07:00
can rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
ceph libceph: make RECOVERY_DELETES feature create a new interval 2017-08-01 16:46:45 +02:00
core bpf: sockmap with sk redirect support 2017-08-16 11:27:53 -07:00
dcb rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
decnet rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
ethernet networking: make skb_push & __skb_push return void pointers 2017-06-16 11:48:40 -04:00
hsr net: add netlink_ext_ack argument to rtnl_link_ops.newlink 2017-06-26 23:13:21 -04:00
ieee802154 net: add netlink_ext_ack argument to rtnl_link_ops.validate 2017-06-26 23:13:22 -04:00
ife
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
ipv6 net: add sendmsg_locked and sendpage_locked to af_inet6 2017-08-16 11:27:52 -07:00
ipx net, ipx: convert ipx_route.refcnt from atomic_t to refcount_t 2017-07-04 22:35:17 +01:00
irda Merge branch 'work.memdup_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-05 16:05:24 -07:00
iucv iucv: Convert sk_wmem_alloc accesses to refcount_t. 2017-07-03 02:31:22 -07:00
kcm strparser: Generalize strparser 2017-08-01 15:26:19 -07:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
l2tp Revert "l2tp: constify inet6_protocol structures" 2017-08-01 10:03:17 -07:00
l3mdev
lapb net, lapb: convert lapb_cb.refcnt from atomic_t to refcount_t 2017-07-04 22:35:16 +01:00
llc net, llc: convert llc_sap.refcnt from atomic_t to refcount_t 2017-07-04 22:35:15 +01:00
mac80211 mac80211: add api to start ba session timer expired flow 2017-08-09 09:49:42 +03:00
mac802154 net: Fix inconsistent teardown and release of private netdev state. 2017-06-07 15:53:24 -04:00
mpls rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
ncsi networking: make skb_push & __skb_push return void pointers 2017-06-16 11:48:40 -04:00
netfilter net: ipv6: add second dif to inet6 socket lookups 2017-08-07 11:39:22 -07:00
netlabel
netlink net: convert sock.sk_refcnt from atomic_t to refcount_t 2017-07-01 07:39:08 -07:00
netrom net, netrom: convert nr_node.refcount from atomic_t to refcount_t 2017-07-04 22:35:17 +01:00
nfc NFC: Add sockaddr length checks before accessing sa_family in bind handlers 2017-06-23 00:38:31 +02:00
openvswitch openvswitch: Remove unnecessary newlines from OVS_NLERR uses 2017-08-11 14:52:47 -07:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-10 12:11:16 -07:00
phonet rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
psample networking: make skb_put & friends return void pointers 2017-06-16 11:48:39 -04:00
qrtr rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-09 16:28:45 -07:00
rfkill net: rfkill: gpio: Switch to devm_acpi_dev_add_driver_gpios() 2017-06-13 11:07:51 +02:00
rose
rxrpc rxrpc: Move the packet.h include file into net/rxrpc/ 2017-07-21 11:00:20 +01:00
sched net_sched/hfsc: opencode trivial set_active() and set_passive() 2017-08-16 10:55:34 -07:00
sctp sctp: fix some indents in sm_make_chunk.c 2017-08-11 10:02:44 -07:00
smc net/smc: synchronize buffer usage with device 2017-07-29 11:22:58 -07:00
strparser net: early init support for strparser 2017-08-16 11:27:52 -07:00
sunrpc NFS client bugfixes for 4.13 2017-07-21 16:26:01 -07:00
switchdev net: switchdev: Remove bridge bypass support from switchdev 2017-08-07 14:48:48 -07:00
tipc tipc: avoid inheriting msg_non_seq flag when message is returned 2017-08-14 11:20:36 -07:00
tls TLS: Fix length check in do_tls_getsockopt_tx() 2017-07-06 10:58:19 +01:00
unix net/unix: drop obsolete fd-recursion limits 2017-07-17 08:57:59 -07:00
vmw_vsock net: manual clean code which call skb_put_[data:zero] 2017-06-20 13:30:15 -04:00
wimax
wireless netlink validation fixes for nl80211 2017-07-07 11:35:55 +01:00
x25 X25: constify null_x25_address 2017-08-03 09:13:51 -07:00
xfrm xfrm: check that cached bundle is still valid 2017-08-07 14:25:39 -07:00
compat.c get_compat_bpf_fprog(): don't copyin field-by-field 2017-07-04 13:14:34 -04:00
Kconfig tls: kernel TLS support 2017-06-15 12:12:40 -04:00
Makefile tls: kernel TLS support 2017-06-15 12:12:40 -04:00
socket.c net: fixes for skb_send_sock 2017-08-16 11:27:52 -07:00
sysctl_net.c