Documentation: networking: switchdev: clarify device driver behavior
This patch provides details on the expected behavior of switchdev enabled network devices when operating in a "stand alone" mode, as well as when being bridge members. This clarifies a number of things that recently came up during a bug fixing session on the b53 DSA switch driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 6e9530f4c0
commit 0f22ad45f4
@@ -385,3 +385,155 @@ The driver can monitor for updates to arp_tbl using the netevent notifier
NETEVENT_NEIGH_UPDATE. The device can be programmed with resolved nexthops
for the routes as arp_tbl updates. The driver implements ndo_neigh_destroy
to know when arp_tbl neighbor entries are purged from the port.

Device driver expected behavior
-------------------------------

Below is a set of defined behaviors that switchdev enabled network devices must
adhere to.

Configuration-less state
^^^^^^^^^^^^^^^^^^^^^^^^

Upon driver bring up, the network devices must be fully operational, and the
backing driver must configure each network device such that it can send and
receive traffic and is properly separated from other network devices/ports
(e.g. as is frequent with a switch ASIC). How this is achieved is heavily
hardware dependent, but a simple solution is to use per-port VLAN identifiers
unless a better mechanism is available (proprietary metadata for each network
port, for instance).

The network device must be capable of running a full IP protocol stack
including multicast, DHCP, IPv4/6, etc. If necessary, it should program the
appropriate filters for VLAN, multicast, unicast, etc. The underlying device
driver must effectively configure the hardware as it would when IGMP snooping
is enabled for IP multicast over these switchdev network devices, and
unsolicited multicast must be filtered as early as possible in the hardware.

When configuring VLANs on top of the network device, all VLANs must be working,
irrespective of the state of other network devices (e.g. other ports being part
of a VLAN-aware bridge doing ingress VID checking). See below for details.

If the device implements, for example, VLAN filtering, putting the interface in
promiscuous mode should allow the reception of all VLAN tags (including those
not present in the filter(s)).

Bridged switch ports
|
||||||
|
^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
When a switchdev enabled network device is added as a bridge member, it should
not disrupt any functionality of non-bridged network devices, and they
should continue to behave as normal network devices. The expected behavior for
each of the bridge configuration knobs is documented below.

Bridge VLAN filtering
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The Linux bridge allows the configuration of a VLAN filtering mode (statically,
at device creation time, and dynamically, during runtime) which must be
observed by the underlying switchdev network device/hardware:

- with VLAN filtering turned off: the bridge is strictly VLAN unaware and its
  data path will process all Ethernet frames as if they are VLAN-untagged.
  The bridge VLAN database can still be modified, but the modifications should
  have no effect while VLAN filtering is turned off. Frames ingressing the
  device with a VID that is not programmed into the bridge/switch's VLAN table
  must be forwarded and may be processed using a VLAN device (see below).

- with VLAN filtering turned on: the bridge is VLAN-aware and frames ingressing
  the device with a VID that is not programmed into the bridge/switch's VLAN
  table must be dropped (strict VID checking).

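The two modes above can be sketched as a single ingress admission check. This
is a minimal model, not kernel code: ``struct bridge_model`` and
``ingress_vid_admitted()`` are hypothetical names introduced for illustration,
and the sketch ignores VLAN upper interfaces, which are discussed further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VLAN_N_VID 4096 /* size of the 12-bit VID space */

/* Hypothetical model of one bridge's VLAN state; not a kernel structure. */
struct bridge_model {
	bool vlan_filtering;             /* state of the VLAN filtering knob */
	bool vid_programmed[VLAN_N_VID]; /* VIDs present in the VLAN table */
};

/* Returns true if a frame carrying 'vid' may enter the forwarding path. */
static bool ingress_vid_admitted(const struct bridge_model *br, uint16_t vid)
{
	assert(vid < VLAN_N_VID);

	if (!br->vlan_filtering)
		return true; /* VLAN-unaware: table contents have no effect */

	return br->vid_programmed[vid]; /* VLAN-aware: strict VID checking */
}
```

With filtering off, a frame tagged with an unprogrammed VID is admitted; with
filtering on, the same frame is dropped.
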
When there is a VLAN device (e.g. sw0p1.100) configured on top of a switchdev
network device which is a bridge port member, the behavior of the software
network stack must be preserved, or the configuration must be refused if that
is not possible.

- with VLAN filtering turned off, the bridge will process all ingress traffic
  for the port, except for the traffic tagged with a VLAN ID destined for a
  VLAN upper. The VLAN upper interface (which consumes the VLAN tag) can even
  be added to a second bridge, which includes other switch ports or software
  interfaces. Some approaches to ensure that the forwarding domain for traffic
  belonging to the VLAN upper interfaces is managed properly:

    * If forwarding destinations can be managed per VLAN, the hardware could be
      configured to map all traffic, except the packets tagged with a VID
      belonging to a VLAN upper interface, to an internal VID corresponding to
      untagged packets. This internal VID spans all ports of the VLAN-unaware
      bridge. The VID corresponding to the VLAN upper interface spans the
      physical port of that VLAN interface, as well as the other ports that
      might be bridged with it.
    * Treat bridge ports with VLAN upper interfaces as standalone, and let
      forwarding be handled in the software data path.

- with VLAN filtering turned on, these VLAN devices can be created as long as
  the bridge does not have an existing VLAN entry with the same VID on any
  bridge port. These VLAN devices cannot be enslaved into the bridge since they
  duplicate the functionality/use case of the bridge's VLAN data path
  processing.

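The VLAN-aware rule above amounts to a conflict check when a VLAN upper is
created. The following is a rough sketch under stated assumptions:
``struct vlan_bridge_model``, ``MAX_PORTS`` and ``check_vlan_upper()`` are
hypothetical, and the choice of ``-EBUSY`` as the refusal code is an assumption
rather than something this document mandates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PORTS  8    /* hypothetical port count for this sketch */
#define VLAN_N_VID 4096 /* size of the 12-bit VID space */

/* Hypothetical per-port view of the bridge VLAN database. */
struct vlan_bridge_model {
	bool vlan_filtering;
	bool port_vid[MAX_PORTS][VLAN_N_VID]; /* VID installed on a port? */
};

/*
 * Returns 0 if a VLAN upper with 'vid' may be created on a bridge port,
 * or a negative errno if the request must be refused.
 */
static int check_vlan_upper(const struct vlan_bridge_model *br, uint16_t vid)
{
	assert(vid < VLAN_N_VID);

	if (!br->vlan_filtering)
		return 0; /* VLAN-unaware bridge: no database conflict possible */

	for (int port = 0; port < MAX_PORTS; port++) {
		if (br->port_vid[port][vid])
			return -EBUSY; /* same VID already used by the bridge */
	}

	return 0;
}
```

A driver that cannot preserve the software behavior for other reasons would
still refuse the configuration, which this sketch does not model.
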
Non-bridged network ports of the same switch fabric must not be disturbed in any
way by the enabling of VLAN filtering on the bridge device(s). If the VLAN
filtering setting is global to the entire chip, then the standalone ports
should indicate to the network stack that VLAN filtering is required by setting
'rx-vlan-filter: on [fixed]' in the ethtool features.

Because VLAN filtering can be turned on/off at runtime, the switchdev driver
must be able to reconfigure the underlying hardware on the fly to honor the
toggling of that option and behave appropriately. If that is not possible, the
switchdev driver can also refuse to support dynamic toggling of the VLAN
filtering knob at runtime and require a destruction of the bridge device(s) and
creation of new bridge device(s) with a different VLAN filtering value to
ensure VLAN awareness is pushed down to the hardware.

Even when VLAN filtering in the bridge is turned off, the underlying switch
hardware and driver may still configure themselves in a VLAN-aware mode
provided that the behavior described above is observed.

The VLAN protocol of the bridge plays a role in deciding whether a packet is
treated as tagged or not: a bridge using the 802.1ad protocol must treat both
VLAN-untagged packets and packets tagged with 802.1Q headers as untagged.

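One way to picture the protocol rule is to compare the frame's outer TPID with
the bridge's VLAN protocol. The sketch below generalizes the 802.1ad case to
"tagged only if the outer TPID matches the bridge protocol"; that
generalization, the ``frame_is_tagged()`` helper, and the convention that an
``outer_tpid`` of 0 means "no VLAN header" are assumptions of this example. The
two TPID values are the standard EtherTypes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100 /* 802.1Q C-VLAN TPID */
#define ETH_P_8021AD 0x88A8 /* 802.1ad S-VLAN TPID */

/*
 * Returns true if the bridge should treat the frame as VLAN-tagged.
 * outer_tpid is 0 when the frame has no VLAN header (sketch convention).
 */
static bool frame_is_tagged(uint16_t bridge_vlan_proto, uint16_t outer_tpid)
{
	assert(bridge_vlan_proto == ETH_P_8021Q ||
	       bridge_vlan_proto == ETH_P_8021AD);

	/* A tag of a different protocol is opaque payload to this bridge. */
	return outer_tpid != 0 && outer_tpid == bridge_vlan_proto;
}
```

For an 802.1ad bridge, both an untagged frame and an 802.1Q-tagged frame come
out as untagged, matching the rule stated above.
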
The 802.1p (VID 0) tagged packets must be treated in the same way by the device
as untagged packets, since the bridge device does not allow the manipulation of
VID 0 in its database.

When the bridge has VLAN filtering enabled and a PVID is not configured on the
ingress port, untagged and 802.1p tagged packets must be dropped. When the
bridge has VLAN filtering enabled and a PVID exists on the ingress port,
untagged and priority-tagged packets must be accepted and forwarded according
to the bridge's port membership of the PVID VLAN. When the bridge has VLAN
filtering disabled, the presence/lack of a PVID should not influence the packet
forwarding decision.

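The three PVID cases above can be written as a small classifier for untagged
and priority-tagged (VID 0) frames. This is an illustrative sketch only:
``struct port_model``, ``classify_untagged()`` and the two ``CLASSIFY_*``
return codes are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CLASSIFY_DROP         (-1) /* frame must be dropped */
#define CLASSIFY_VLAN_UNAWARE 0    /* forward without regard to any VLAN */

/* Hypothetical per-port state; not a kernel structure. */
struct port_model {
	bool has_pvid;
	uint16_t pvid; /* only meaningful when has_pvid is true */
};

/*
 * Classifies an untagged or priority-tagged frame on ingress.
 * Returns the VID to forward in, or one of the CLASSIFY_* codes.
 */
static int classify_untagged(bool vlan_filtering, const struct port_model *port)
{
	if (!vlan_filtering)
		return CLASSIFY_VLAN_UNAWARE; /* PVID must not matter */

	if (!port->has_pvid)
		return CLASSIFY_DROP;

	assert(port->pvid >= 1 && port->pvid <= 4094);
	return port->pvid; /* forward per membership of the PVID VLAN */
}
```

Note that with filtering disabled the result is the same whether or not the
port has a PVID, which is the property the paragraph above requires.
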
Bridge IGMP snooping
|
||||||
|
^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The Linux bridge allows the configuration of IGMP snooping (statically, at
interface creation time, or dynamically, during runtime) which must be observed
by the underlying switchdev network device/hardware in the following way:

- when IGMP snooping is turned off, multicast traffic must be flooded to all
  ports within the same bridge that have mcast_flood=true. The CPU/management
  port should ideally not be flooded (unless the ingress interface has
  IFF_ALLMULTI or IFF_PROMISC) and continue to learn multicast traffic through
  the network stack notifications. If the hardware is not capable of doing that
  then the CPU/management port must also be flooded and multicast filtering
  happens in software.

- when IGMP snooping is turned on, multicast traffic must selectively flow
  to the appropriate network ports (including CPU/management port). Flooding of
  unknown multicast should only be towards the ports connected to a multicast
  router (the local device may also act as a multicast router).

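The flooding rules above can be approximated as an egress port mask
computation. In this sketch ports are bits in a 32-bit mask, and
``struct mcast_model`` and ``mcast_egress()`` are hypothetical. Forwarding
known groups to multicast router ports in addition to the group's member ports
is an assumption based on RFC 4541, and the special handling of the
CPU/management port is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical multicast state of one bridge; each bit is one port. */
struct mcast_model {
	bool snooping;        /* state of the IGMP snooping knob */
	uint32_t members;     /* ports that belong to this bridge */
	uint32_t mcast_flood; /* ports with mcast_flood=true */
	uint32_t mrouter;     /* ports connected to a multicast router */
};

/*
 * Returns the egress port mask for a multicast frame received on in_port.
 * group_ports is only consulted when group_known is true.
 */
static uint32_t mcast_egress(const struct mcast_model *br, unsigned int in_port,
			     bool group_known, uint32_t group_ports)
{
	uint32_t out;

	assert(in_port < 32);

	if (!br->snooping)
		out = br->mcast_flood;          /* flood per mcast_flood */
	else if (group_known)
		out = group_ports | br->mrouter; /* selective forwarding */
	else
		out = br->mrouter;              /* unknown: routers only */

	/* Stay inside the bridge and never reflect to the ingress port. */
	return out & br->members & ~(UINT32_C(1) << in_port);
}
```

Because the result is recomputed from the ``snooping`` flag on every call,
toggling the knob at runtime changes the forwarding behavior immediately, which
mirrors the reconfiguration a driver has to perform in hardware.
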
The switch must adhere to RFC 4541 and flood multicast traffic accordingly
since that is what the Linux bridge implementation does.

Because IGMP snooping can be turned on/off at runtime, the switchdev driver
must be able to reconfigure the underlying hardware on the fly to honor the
toggling of that option and behave appropriately.

A switchdev driver can also refuse to support dynamic toggling of the multicast
snooping knob at runtime and require the destruction of the bridge device(s)
and creation of new bridge device(s) with a different multicast snooping
value.