linux/net/openvswitch
Mark Gray b83d23a2a3 openvswitch: Introduce per-cpu upcall dispatch
The Open vSwitch kernel module uses the upcall mechanism to send
packets from kernel space to user space when it misses in the kernel
space flow table. The upcall sends packets via a Netlink socket.
Currently, a Netlink socket is created for every vport. In this way,
there is a 1:1 mapping between a vport and a Netlink socket.
When a packet is received by a vport, if it needs to be sent to
user space, it is sent via the corresponding Netlink socket.

This mechanism, with various iterations of the corresponding user
space code, has seen some limitations and issues:

* On systems with a large number of vports, there is a correspondingly
large number of Netlink sockets which can limit scaling.
(https://bugzilla.redhat.com/show_bug.cgi?id=1526306)
* Packet reordering on upcalls.
(https://bugzilla.redhat.com/show_bug.cgi?id=1844576)
* A thundering herd issue.
(https://bugzilla.redhat.com/show_bug.cgi?id=1834444)

This patch introduces an alternative, feature-negotiated, upcall
mode using a per-cpu dispatch rather than a per-vport dispatch.

In this mode, the Netlink socket to be used for the upcall is
selected based on the CPU of the thread that is executing the upcall.
In this way, it resolves the issues above as:

a) The number of Netlink sockets scales with the number of CPUs
rather than the number of vports.
b) Ordering per-flow is maintained as packets are distributed to
CPUs based on mechanisms such as RSS and flows are distributed
to a single user space thread.
c) Packets from a flow can only wake up one user space thread.

The corresponding user space code can be found at:
https://mail.openvswitch.org/pipermail/ovs-dev/2021-July/385139.html

Bugzilla: https://bugzilla.redhat.com/1844576
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-16 11:06:33 -07:00
..
actions.c openvswitch: Introduce per-cpu upcall dispatch 2021-07-16 11:06:33 -07:00
conntrack.c net: openvswitch: Remove unnecessary skb_nfct() 2021-05-10 14:18:19 -07:00
conntrack.h net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct 2021-03-16 15:22:18 -07:00
datapath.c openvswitch: Introduce per-cpu upcall dispatch 2021-07-16 11:06:33 -07:00
datapath.h openvswitch: Introduce per-cpu upcall dispatch 2021-07-16 11:06:33 -07:00
dp_notify.c net: openvswitch: use netif_ovs_is_port() instead of opencode 2019-07-08 15:53:25 -07:00
flow_netlink.c net: openvswitch: add log message for error case 2021-01-14 16:32:14 -08:00
flow_netlink.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 269 2019-06-05 17:30:29 +02:00
flow_table.c openvswitch: Optimize operation for key comparison 2021-07-01 11:13:10 -07:00
flow_table.h net: openvswitch: fix to make sure flow_lookup() is not preempted 2020-10-18 12:29:36 -07:00
flow.c net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct 2021-03-16 15:22:18 -07:00
flow.h treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile openvswitch: add trace points 2021-06-22 10:47:32 -07:00
meter.c openvswitch: meter: fix race when getting now_ms. 2021-05-13 15:54:59 -07:00
meter.h net: openvswitch: use u64 for meter bucket 2020-04-23 18:26:11 -07:00
openvswitch_trace.c openvswitch: add trace points 2021-06-22 10:47:32 -07:00
openvswitch_trace.h openvswitch: add trace points 2021-06-22 10:47:32 -07:00
vport-geneve.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
vport-gre.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 269 2019-06-05 17:30:29 +02:00
vport-internal_dev.c net: openvswitch: use core API to update/provide stats 2020-11-14 16:59:32 -08:00
vport-internal_dev.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 269 2019-06-05 17:30:29 +02:00
vport-netdev.c net: openvswitch: Use 'skb_push_rcsum()' instead of hand coding it 2021-04-04 01:43:02 -07:00
vport-netdev.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 269 2019-06-05 17:30:29 +02:00
vport-vxlan.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 269 2019-06-05 17:30:29 +02:00
vport.c openvswitch: Warn over-mtu packets only if iface is UP. 2021-03-16 16:28:30 -07:00
vport.h openvswitch: Fix a typo 2021-03-22 12:59:46 -07:00