Documentation: networking: switchdev: clarify device driver behavior
This patch provides details on the expected behavior of switchdev enabled
network devices when operating in a "stand alone" mode, as well as when
being bridge members. This clarifies a number of things that recently came
up during a bug fixing session on the b53 DSA switch driver.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

parent 6e9530f4c0
commit 0f22ad45f4

@@ -385,3 +385,155 @@ The driver can monitor for updates to arp_tbl using the netevent notifier
NETEVENT_NEIGH_UPDATE. The device can be programmed with resolved nexthops
for the routes as arp_tbl updates. The driver implements ndo_neigh_destroy
to know when arp_tbl neighbor entries are purged from the port.

Device driver expected behavior
-------------------------------

Below is the set of behaviors that switchdev enabled network devices must
adhere to.

Configuration-less state
^^^^^^^^^^^^^^^^^^^^^^^^

Upon driver bring up, the network devices must be fully operational, and the
backing driver must configure each network device such that it is possible to
send and receive traffic through it and it is properly separated from other
network devices/ports (as is frequently needed with a switch ASIC). How this
is achieved is heavily hardware dependent, but a simple solution can be to use
per-port VLAN identifiers unless a better mechanism is available (proprietary
metadata for each network port, for instance).

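The per-port VLAN approach mentioned above can be pictured with a small model.
The following Python sketch is illustrative only and is not driver code: the
function names, the port names and the base VID of 4000 are hypothetical, and a
real driver would program equivalent state into the switch hardware.

```python
MAX_VID = 4094  # highest usable 802.1Q VLAN ID


def assign_standalone_vids(ports, base_vid=4000):
    """Give each standalone port its own private VID (base_vid is hypothetical)."""
    if base_vid < 1 or base_vid + len(ports) - 1 > MAX_VID:
        raise ValueError("not enough VIDs for all ports")
    return {port: base_vid + index for index, port in enumerate(ports)}


def hardware_may_forward(vids, src_port, dst_port):
    """Two ports can only exchange frames in hardware if they share a VID."""
    return vids[src_port] == vids[dst_port]


vids = assign_standalone_vids(["sw0p1", "sw0p2", "sw0p3"])
```

Because no two standalone ports share a VID in this model, the hardware never
forwards between them, and all traffic is delivered to the network stack.
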
The network device must be capable of running a full IP protocol stack
including multicast, DHCP, IPv4/6, etc. If necessary, it should program the
appropriate filters for VLAN, multicast, unicast, etc. The underlying device
driver must effectively be configured as it would be when IGMP snooping is
enabled for IP multicast over these switchdev network devices, and unsolicited
multicast must be filtered as early as possible in the hardware.

When configuring VLANs on top of the network device, all VLANs must be working,
irrespective of the state of other network devices (e.g. other ports being part
of a VLAN-aware bridge doing ingress VID checking). See below for details.

If the device implements, for example, VLAN filtering, putting the interface in
promiscuous mode should allow the reception of all VLAN tags (including those
not present in the filter(s)).

Bridged switch ports
^^^^^^^^^^^^^^^^^^^^

When a switchdev enabled network device is added as a bridge member, it should
not disrupt any functionality of non-bridged network devices, which should
continue to behave as normal network devices. The expected behavior for each of
the bridge configuration knobs is documented below.

Bridge VLAN filtering
^^^^^^^^^^^^^^^^^^^^^

The Linux bridge allows the configuration of a VLAN filtering mode (statically,
at device creation time, and dynamically, during run time) which must be
observed by the underlying switchdev network device/hardware:

- with VLAN filtering turned off: the bridge is strictly VLAN unaware and its
  data path will process all Ethernet frames as if they are VLAN-untagged.
  The bridge VLAN database can still be modified, but the modifications should
  have no effect while VLAN filtering is turned off. Frames ingressing the
  device with a VID that is not programmed into the bridge/switch's VLAN table
  must be forwarded and may be processed using a VLAN device (see below).

- with VLAN filtering turned on: the bridge is VLAN-aware and frames ingressing
  the device with a VID that is not programmed into the bridge/switch's VLAN
  table must be dropped (strict VID checking).

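The two filtering modes above reduce to a simple ingress decision. The Python
sketch below is a toy model of that decision, not kernel or driver code; the
names `Verdict` and `ingress_verdict` are made up for illustration.

```python
from enum import Enum


class Verdict(Enum):
    FORWARD = "forward"
    DROP = "drop"


def ingress_verdict(vlan_filtering, vid, vlan_table):
    """Decide the fate of a frame tagged with `vid` arriving on a bridged port.

    vlan_table is the set of VIDs programmed into the bridge/switch VLAN table.
    """
    if not vlan_filtering:
        # VLAN-unaware bridge: the VLAN table has no effect on forwarding.
        return Verdict.FORWARD
    # VLAN-aware bridge: strict VID checking against the programmed table.
    return Verdict.FORWARD if vid in vlan_table else Verdict.DROP
```

For example, a frame with VID 100 is forwarded when filtering is off, even with
an empty table, but dropped when filtering is on and only VID 10 is programmed.
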
When there is a VLAN device (e.g. sw0p1.100) configured on top of a switchdev
network device which is a bridge port member, the behavior of the software
network stack must be preserved, or the configuration must be refused if that
is not possible.

- with VLAN filtering turned off, the bridge will process all ingress traffic
  for the port, except for the traffic tagged with a VLAN ID destined for a
  VLAN upper. The VLAN upper interface (which consumes the VLAN tag) can even
  be added to a second bridge, which includes other switch ports or software
  interfaces. Some approaches to ensure that the forwarding domain for traffic
  belonging to the VLAN upper interfaces is managed properly:
    * If forwarding destinations can be managed per VLAN, the hardware could be
      configured to map all traffic, except the packets tagged with a VID
      belonging to a VLAN upper interface, to an internal VID corresponding to
      untagged packets. This internal VID spans all ports of the VLAN-unaware
      bridge. The VID corresponding to the VLAN upper interface spans the
      physical port of that VLAN interface, as well as the other ports that
      might be bridged with it.
    * Treat bridge ports with VLAN upper interfaces as standalone, and let
      forwarding be handled in the software data path.

- with VLAN filtering turned on, these VLAN devices can be created as long as
  the bridge does not have an existing VLAN entry with the same VID on any
  bridge port. These VLAN devices cannot be enslaved into the bridge since they
  duplicate functionality/use case with the bridge's VLAN data path processing.

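The rule for creating VLAN uppers while VLAN filtering is turned on can be
expressed as a membership check. This is a minimal Python sketch under stated
assumptions: `vlan_upper_allowed` and the example port/VID data are
hypothetical, and the function only models the filtering-on case.

```python
def vlan_upper_allowed(upper_vid, bridge_port_vids):
    """Return True if a VLAN upper using upper_vid may be created.

    bridge_port_vids maps each bridge port name to the set of VIDs present in
    the bridge VLAN database for that port (VLAN filtering assumed to be on).
    """
    return all(upper_vid not in vids for vids in bridge_port_vids.values())


# Hypothetical bridge with two ports and their configured VLANs.
bridge_port_vids = {"sw0p1": {1, 10}, "sw0p2": {1, 20}}
```

With this data, creating sw0p1.100 would be allowed, while sw0p1.20 would be
refused because VID 20 already exists on another port of the same bridge.
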
Non-bridged network ports of the same switch fabric must not be disturbed in any
way by the enabling of VLAN filtering on the bridge device(s). If the VLAN
filtering setting is global to the entire chip, then the standalone ports
should indicate to the network stack that VLAN filtering is required by setting
'rx-vlan-filter: on [fixed]' in the ethtool features.

Because VLAN filtering can be turned on/off at runtime, the switchdev driver
must be able to reconfigure the underlying hardware on the fly to honor the
toggling of that option and behave appropriately. If that is not possible, the
switchdev driver can also refuse to support dynamic toggling of the VLAN
filtering knob at runtime and require a destruction of the bridge device(s) and
creation of new bridge device(s) with a different VLAN filtering value to
ensure VLAN awareness is pushed down to the hardware.

Even when VLAN filtering in the bridge is turned off, the underlying switch
hardware and driver may still configure themselves in a VLAN-aware mode provided
that the behavior described above is observed.

The VLAN protocol of the bridge plays a role in deciding whether a packet is
treated as tagged or not: a bridge using the 802.1ad protocol must treat both
VLAN-untagged packets and packets tagged with 802.1Q headers as untagged.

The 802.1p (VID 0) tagged packets must be treated in the same way by the device
as untagged packets, since the bridge device does not allow the manipulation of
VID 0 in its database.

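The tagged/untagged classification described in the two paragraphs above can be
sketched as follows. This Python model is illustrative only. The function name
is hypothetical, and treating every TPID mismatch as untagged is an assumption
that generalizes the 802.1ad/802.1Q case stated in the text.

```python
ETH_P_8021Q = 0x8100   # TPID of an 802.1Q tag
ETH_P_8021AD = 0x88A8  # TPID of an 802.1ad tag


def treated_as_untagged(bridge_vlan_protocol, frame_tpid, frame_vid):
    """Return True if the bridge must handle the frame as untagged.

    frame_tpid is None when the frame carries no VLAN header at all.
    """
    if frame_tpid is None:
        return True
    if frame_tpid != bridge_vlan_protocol:
        # Assumption: any protocol mismatch counts as untagged, e.g. an
        # 802.1Q-tagged frame entering an 802.1ad bridge.
        return True
    # Priority-tagged (VID 0) frames are handled like untagged ones.
    return frame_vid == 0
```
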
When the bridge has VLAN filtering enabled and a PVID is not configured on the
ingress port, untagged and 802.1p tagged packets must be dropped. When the bridge
has VLAN filtering enabled and a PVID exists on the ingress port, untagged and
priority-tagged packets must be accepted and forwarded according to the
bridge's port membership of the PVID VLAN. When the bridge has VLAN filtering
disabled, the presence/lack of a PVID should not influence the packet
forwarding decision.

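The PVID rules above amount to a three-way case split for untagged and
priority-tagged frames. The following Python sketch models it; the function
name `accept_untagged` is hypothetical and the model ignores everything except
the filtering setting and the PVID.

```python
from typing import Optional


def accept_untagged(vlan_filtering: bool, pvid: Optional[int]) -> bool:
    """Return True if an untagged or priority-tagged frame must be accepted.

    pvid is None when no PVID is configured on the ingress port.
    """
    if not vlan_filtering:
        # With filtering disabled the PVID must not influence forwarding.
        return True
    # With filtering enabled the frame is classified to the PVID, so it is
    # accepted only when a PVID exists on the ingress port.
    return pvid is not None
```

When the frame is accepted with filtering enabled, forwarding then follows the
bridge's port membership of the PVID VLAN.
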
Bridge IGMP snooping
^^^^^^^^^^^^^^^^^^^^

The Linux bridge allows the configuration of IGMP snooping (statically, at
interface creation time, or dynamically, during runtime) which must be observed
by the underlying switchdev network device/hardware in the following way:

- when IGMP snooping is turned off, multicast traffic must be flooded to all
  ports within the same bridge that have mcast_flood=true. The CPU/management
  port should ideally not be flooded (unless the ingress interface has
  IFF_ALLMULTI or IFF_PROMISC) and should continue to learn multicast traffic
  through the network stack notifications. If the hardware is not capable of
  doing that, then the CPU/management port must also be flooded and multicast
  filtering happens in software.

- when IGMP snooping is turned on, multicast traffic must selectively flow
  to the appropriate network ports (including CPU/management port). Flooding of
  unknown multicast should be only towards the ports connected to a multicast
  router (the local device may also act as a multicast router).

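The flooding rules above can be summarized by computing the set of egress ports
for a multicast frame. The Python sketch below is a simplified model and not an
implementation of the bridge: the `Port` class and `egress_ports` function are
hypothetical, the CPU/management port is left out, the ingress port is assumed
to be excluded as in ordinary bridging, and forwarding known groups to router
ports as well as members is an assumption based on RFC 4541.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    name: str
    mcast_flood: bool = True
    mcast_router: bool = False


def egress_ports(snooping, ingress, ports, group_members=None):
    """Return the names of the ports a multicast frame is sent to.

    group_members is the set of port names that joined the destination group,
    or None when the group is unknown to the snooping logic.
    """
    candidates = [p for p in ports if p.name != ingress]
    if not snooping:
        # Snooping off: flood to every port that has mcast_flood enabled.
        return {p.name for p in candidates if p.mcast_flood}
    if group_members is None:
        # Snooping on, unknown group: only towards multicast router ports.
        return {p.name for p in candidates if p.mcast_router}
    # Snooping on, known group: members plus router ports (assumption).
    return {p.name for p in candidates
            if p.name in group_members or p.mcast_router}
```
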
The switch must adhere to RFC 4541 and flood multicast traffic accordingly
since that is what the Linux bridge implementation does.

Because IGMP snooping can be turned on/off at runtime, the switchdev driver
must be able to reconfigure the underlying hardware on the fly to honor the
toggling of that option and behave appropriately.

A switchdev driver can also refuse to support dynamic toggling of the multicast
snooping knob at runtime and require the destruction of the bridge device(s)
and creation of new bridge device(s) with a different multicast snooping
value.