Revert the use of structr_size() and stay with IP_MSFILTER_SIZE() for
now, as in this case, the size of struct ip_msfilter didn't change with
the addition of the flexible array imsf_slist_flex[]. So, if we use
struct_size() we will be allocating and calculating the size of
struct ip_msfilter with one too many items for imsf_slist_flex[].
We might use struct_size() in the future, but for now let's stay
with IP_MSFILTER_SIZE().
Fixes: 2d3e5caf96 ("net/ipv4: Replace one-element array with flexible-array member")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently xhci-mtk needs software-managed bandwidth allocation for
periodic endpoints, it allocates the microframe index for the first
start-split packet for each endpoint. As this index allocation logic
should avoid the conflicts with other full/low-speed periodic endpoints,
it uses the worst case byte budgets on high-speed bus bandwidth
For example, for an isochronos IN endpoint with 192 bytes budget,
it will consume the whole 4 u-frames(188 * 4) while the actual
full-speed bus budget should be just 192bytes.
This patch changes the low/full-speed bandwidth allocation logic
to use "approximate" best case budget for lower speed bandwidth
management. For the same endpoint from the above example, the
approximate best case budget is now reduced to (188 * 2) bytes.
Without this patch, many usb audio headsets with 3 interfaces
(audio input, audio output, and HID) cannot be configured
on xhci-mtk.
Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
Link: https://lore.kernel.org/r/20210805133937.1.Ia8174b875bc926c12ce427a5a1415dea31cc35ae@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xhci-mtk depends on xhci's internal virt_dev when it retrieves its
internal data from usb_host_endpoint both in add_endpoint and
drop_endpoint callbacks. But when setup packet was retired by
transaction errors in xhci_setup_device() path, a virt_dev for the slot
is newly created with real_port 0. This leads to xhci-mtks's NULL pointer
dereference from drop_endpoint callback as xhci-mtk assumes that virt_dev's
real_port is always started from one. The similar problems were addressed
by [1] but that can't cover the failure cases from setup_device.
This patch drops the usages of xhci's virt_dev in xhci-mtk's drop_endpoint
callback by adopting rhashtable for searching mtk's schedule entity
from a given usb_host_endpoint pointer instead of searching a linked list.
So mtk's drop_endpoint callback doesn't have to rely on virt_dev at all.
[1] https://lore.kernel.org/r/1617179142-2681-2-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
Link: https://lore.kernel.org/r/20210805133731.1.Icc0f080e75b1312692d4c7c7d25e7df9fe1a05c2@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e10da5385 ("skbuff: allow 'slow_gro' for skb carring sock
reference") introduces a serious regression at the GRO layer setting
the wrong truesize for stolen-head skbs.
Restore the correct truesize: SKB_DATA_ALIGN(...) instead of
SKB_TRUESIZE(...)
Reported-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Fixes: 5e10da5385 ("skbuff: allow 'slow_gro' for skb carring sock reference")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
M Chetan Kumar says:
====================
net: wwan: iosm: fixes
This patch series contains IOSM Driver fixes. Below is the patch
series breakdown.
PATCH1:
* Correct the td buffer type casting & format specifier to fix lkp buildbot
warning.
PATCH2:
* Endianness type correction for nr_of_bytes. This field is exchanged
as part of host-device protocol communication.
PATCH3:
* Correct ul/dl data protocol mask bit to know which protocol capability
does device implement.
PATCH4:
* Calling unregister_netdevice() inside wwan del link is trying to
acquire the held lock in ndo_stop_cb(). Instead, queue net dev to
be unregistered later.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling unregister_netdevice() inside wwan del link is trying to
acquire the held lock in ndo_stop_cb(). Instead, queue net dev to
be unregistered later.
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder says:
====================
net: ipa: more work toward runtime PM
The first two patches in this series are basically bug fixes, but in
practice I don't think we've seen the problems they might cause.
The third patch moves clock and interconnect related error messages
around a bit, reporting better information and doing so in the
functions where they are enabled or disabled (rather than those
functions' callers).
The last three patches move power-related code into "ipa_clock.c",
as a step toward generalizing the purpose of that source file.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ipa->flags field is only ever used in "ipa_clock.c", related to
suspend/resume activity.
Move the definition of the ipa_flag enumerated type to "ipa_clock.c".
And move the flags field from the ipa structure and to the ipa_clock
structure. Rename the type and its values to include "power" or
"POWER" in the name.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move ipa_suspend_handler() into "ipa_clock.c" from "ipa_main.c", to
group with the reset of the suspend/resume code. This IPA interrupt
is triggered if an IPA RX endpoint is suspended but has a packet to
be delivered.
Introduce ipa_power_setup() and ipa_power_teardown() to add and
remove the handler for the IPA SUSPEND interrupt at the same place
as before, while allowing the handler to remain private.
The "power" naming convention will be adopted elsewhere in this
file as well (soon).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move ipa_suspend() and ipa_resume(), as well as the definition of
the ipa_pm_ops structure into "ipa_clock.c". Make ipa_pm_ops public
and declare it as extern in "ipa_clock.h".
This is part of centralizing IPA power management functionality into
"ipa_clock.c" (the file will eventually get a name change).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rearrange messages reported when errors occur in the IPA clock code,
so that the specific interconnect is identified when an error occurs
enabling or disabling it, or the core clock is indicated when an
error occurs enabling it.
Have ipa_interconnect_disable() return zero or the negative error
value returned by the first interconnect that produced an error
when disabled. For now, the callers ignore the returned value.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Assign the ipa->modem_netdev and endpoint->netdev pointers *before*
registering the network device. As soon as the device is
registered it can be opened, and by that time we'll want those
pointers valid.
Similarly, don't make those pointers NULL until *after* the modem
network device is unregistered in ipa_modem_stop().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The modem network device is set up by ipa_modem_start(). But its
TX queue is not actually started and endpoints enabled until it is
opened.
So avoid stopping the modem network device TX queue and disabling
endpoints on suspend or stop unless the netdev is marked UP. And
skip attempting to resume unless it is UP.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean says:
====================
NXP SJA1105 driver support for "H" switch topologies
Changes in v3:
Preserve the behavior of dsa_tree_setup_default_cpu() which is to pick
the first CPU port and not the last.
Changes in v2:
Send as non-RFC, drop the patches for discarding DSA-tagged packets on
user ports and DSA-untagged packets on DSA and CPU ports for now.
NXP builds boards like the Bluebox 3 where there are multiple SJA1110
switches connected to an LX2160A, but they are also connected to each
other. I call this topology an "H" tree because of the lateral
connection between switches. A piece extracted from a non-upstream
device tree looks like this:
&spi_bridge {
/* SW1 */
ethernet-switch@0 {
compatible = "nxp,sja1110a";
reg = <0>;
dsa,member = <0 0>;
ethernet-ports {
#address-cells = <1>;
#size-cells = <0>;
/* SW1_P1 */
port@1 {
reg = <1>;
label = "con_2x20";
phy-mode = "sgmii";
fixed-link {
speed = <1000>;
full-duplex;
};
};
port@2 {
reg = <2>;
ethernet = <&dpmac17>;
phy-mode = "rgmii-id";
fixed-link {
speed = <1000>;
full-duplex;
};
};
port@3 {
reg = <3>;
label = "1ge_p1";
phy-mode = "rgmii-id";
phy-handle = <&sw1_mii3_phy>;
};
sw1p4: port@4 {
reg = <4>;
link = <&sw2p1>;
phy-mode = "sgmii";
fixed-link {
speed = <1000>;
full-duplex;
};
};
port@5 {
reg = <5>;
label = "trx1";
phy-mode = "internal";
phy-handle = <&sw1_port5_base_t1_phy>;
};
port@6 {
reg = <6>;
label = "trx2";
phy-mode = "internal";
phy-handle = <&sw1_port6_base_t1_phy>;
};
port@7 {
reg = <7>;
label = "trx3";
phy-mode = "internal";
phy-handle = <&sw1_port7_base_t1_phy>;
};
port@8 {
reg = <8>;
label = "trx4";
phy-mode = "internal";
phy-handle = <&sw1_port8_base_t1_phy>;
};
port@9 {
reg = <9>;
label = "trx5";
phy-mode = "internal";
phy-handle = <&sw1_port9_base_t1_phy>;
};
port@a {
reg = <10>;
label = "trx6";
phy-mode = "internal";
phy-handle = <&sw1_port10_base_t1_phy>;
};
};
};
/* SW2 */
ethernet-switch@2 {
compatible = "nxp,sja1110a";
reg = <2>;
dsa,member = <0 1>;
ethernet-ports {
#address-cells = <1>;
#size-cells = <0>;
sw2p1: port@1 {
reg = <1>;
link = <&sw1p4>;
phy-mode = "sgmii";
fixed-link {
speed = <1000>;
full-duplex;
};
};
port@2 {
reg = <2>;
ethernet = <&dpmac18>;
phy-mode = "rgmii-id";
fixed-link {
speed = <1000>;
full-duplex;
};
};
port@3 {
reg = <3>;
label = "1ge_p2";
phy-mode = "rgmii-id";
phy-handle = <&sw2_mii3_phy>;
};
port@4 {
reg = <4>;
label = "to_sw3";
phy-mode = "2500base-x";
fixed-link {
speed = <2500>;
full-duplex;
};
};
port@5 {
reg = <5>;
label = "trx7";
phy-mode = "internal";
phy-handle = <&sw2_port5_base_t1_phy>;
};
port@6 {
reg = <6>;
label = "trx8";
phy-mode = "internal";
phy-handle = <&sw2_port6_base_t1_phy>;
};
port@7 {
reg = <7>;
label = "trx9";
phy-mode = "internal";
phy-handle = <&sw2_port7_base_t1_phy>;
};
port@8 {
reg = <8>;
label = "trx10";
phy-mode = "internal";
phy-handle = <&sw2_port8_base_t1_phy>;
};
port@9 {
reg = <9>;
label = "trx11";
phy-mode = "internal";
phy-handle = <&sw2_port9_base_t1_phy>;
};
port@a {
reg = <10>;
label = "trx12";
phy-mode = "internal";
phy-handle = <&sw2_port10_base_t1_phy>;
};
};
};
};
Basically it is a single DSA tree with 2 "ethernet" properties, i.e. a
multi-CPU-port system. There is also a DSA link between the switches,
but it is not a daisy chain topology, i.e. there is no "upstream" and
"downstream" switch, the DSA link is only to be used for the bridge data
plane (autonomous forwarding between switches, between the RJ-45 ports
and the automotive Ethernet ports), otherwise all traffic that should
reach the host should do so through the dedicated CPU port of the switch.
Of course, plain forwarding in this topology is bound to create packet
loops. I have thought long and hard about strategies to cut forwarding
in such a way as to prevent loops but also not impede normal operation
of the network on such a system, and I believe I have found a solution
that does work as expected. This relies heavily on DSA's recent ability
to perform RX filtering towards the host by installing MAC addresses as
static FDB entries. Since we have 2 distinct DSA masters, we have 2
distinct MAC addresses, and if the bridge is configured to have its own
MAC address that makes it 3 distinct MAC addresses. The bridge core,
plus the switchdev_handle_fdb_add_to_device() extension, handle each MAC
address by replicating it to each port of the DSA switch tree. So the
end result is that both switch 1 and switch 2 will have static FDB
entries towards their respective CPU ports for the 3 MAC addresses
corresponding to the DSA masters and to the bridge net device (and of
course, towards any station learned on a foreign interface).
So I think the basic design works, and it is basically just as fragile
as any other multi-CPU-port system is bound to be in terms of reliance
on static FDB entries towards the host (if hardware address learning on
the CPU port is to be used, MAC addresses would randomly bounce between
one CPU port and the other otherwise). In fact, I think it is even
better to start DSA's support of multi-CPU-port systems with something
small like the NXP Bluebox 3, because we allow some time for the code
paths like dsa_switch_host_address_match(), which were specifically
designed for it, to break in, and this board needs no user space
configuration of CPU ports, like static assignments between user and CPU
ports, or bonding between the CPU ports/DSA masters.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now, address learning is disabled on DSA ports, which means that a
packet received over a DSA port from a cross-chip switch will be flooded
to unrelated ports.
It is desirable to eliminate that, but for that we need a breakdown of
the possibilities for the sja1105 driver. A DSA port can be:
- a downstream-facing cascade port. This is simple because it will
always receive packets from a downstream switch, and there should be
no other route to reach that downstream switch in the first place,
which means it should be safe to learn that MAC address towards that
switch.
- an upstream-facing cascade port. This receives packets either:
* autonomously forwarded by an upstream switch (and therefore these
packets belong to the data plane of a bridge, so address learning
should be ok), or
* injected from the CPU. This deserves further discussion, as normally,
an upstream-facing cascade port is no different than the CPU port
itself. But with "H" topologies (a DSA link towards a switch that
has its own CPU port), these are more "laterally-facing" cascade
ports than they are "upstream-facing". Here, there is a risk that
the port might learn the host addresses on the wrong port (on the
DSA port instead of on its own CPU port), but this is solved by
DSA's RX filtering infrastructure, which installs the host addresses
as static FDB entries on the CPU port of all switches in a "H" tree.
So even if there will be an attempt from the switch to migrate the
FDB entry from the CPU port to the laterally-facing cascade port, it
will fail to do that, because the FDB entry that already exists is
static and cannot migrate. So address learning should be safe for
this configuration too.
Ok, so what about other MAC addresses coming from the host, not
necessarily the bridge local FDB entries? What about MAC addresses
dynamically learned on foreign interfaces, isn't there a risk that
cascade ports will learn these entries dynamically when they are
supposed to be delivered towards the CPU port? Well, that is correct,
and this is why we also need to enable the assisted learning feature, to
snoop for these addresses and write them to hardware as static FDB
entries towards the CPU, to make the switch's learning process on the
cascade ports ineffective for them. With assisted learning enabled, the
hardware learning on the CPU port must be disabled.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
H topologies like this one have a problem:
eth0 eth1
| |
CPU port CPU port
| DSA link |
sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 -------- sw1p4 sw1p3 sw1p2 sw1p1 sw1p0
| | | | | |
user user user user user user
port port port port port port
Basically any packet sent by the eth0 DSA master can be flooded on the
interconnecting DSA link sw0p4 <-> sw1p4 and it will be received by the
eth1 DSA master too. Basically we are talking to ourselves.
In VLAN-unaware mode, these packets are encoded using a tag_8021q TX
VLAN, which dsa_8021q_rcv() rightfully cannot decode and complains.
Whereas in VLAN-aware mode, the packets are encoded with a bridge VLAN
which _can_ be decoded by the tagger running on eth1, so it will attempt
to reinject that packet into the network stack (the bridge, if there is
any port under eth1 that is under a bridge). In the case where the ports
under eth1 are under the same cross-chip bridge as the ports under eth0,
the TX packets will even be learned as RX packets. The only thing that
will prevent loops with the software bridging path, and therefore
disaster, is that the source port and the destination port are in the
same hardware domain, and the bridge will receive packets from the
driver with skb->offload_fwd_mark = true and will not forward between
the two.
The proper solution to this problem is to detect H topologies and
enforce that all packets are received through the local switch and we do
not attempt to receive packets on our CPU port from switches that have
their own. This is a viable solution which works thanks to the fact that
MAC addresses which should be filtered towards the host are installed by
DSA as static MAC addresses towards the CPU port of each switch.
TX from a CPU port towards the DSA port continues to be allowed, this is
because sja1105 supports bridge TX forwarding offload, and the skb->dev
used initially for xmit does not have any direct correlation with where
the station that will respond to that packet is connected. It may very
well happen that when we send a ping through a br0 interface that spans
all switch ports, the xmit packet will exit the system through a DSA
switch interface under eth1 (say sw1p2), but the destination station is
connected to a switch port under eth0, like sw0p0. So the switch under
eth1 needs to communicate on TX with the switch under eth0. The
response, however, will not follow the same path, but instead, this
patch enforces that the response is sent by the first switch directly to
its DSA master which is eth0.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since all packets are transmitted as VLAN-tagged over a DSA link (this
VLAN tag represents the tag_8021q header), we need to increase the MTU
of these interfaces to account for the possibility that we are already
transporting a user-visible VLAN header.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit ed040abca4 ("net: dsa: sja1105: use 4095 as the private
VLAN for untagged traffic"), this driver uses a reserved value as pvid
for the host port (DSA CPU port). Control packets which are sent as
untagged get classified to this VLAN, and all ports are members of it
(this is to be expected for control packets).
Manage all cascade ports in the same way and allow control packets to
egress everywhere.
Also, all VLANs need to be sent as egress-tagged on all cascade ports.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manage DSA links towards other switches, be they host ports or cascade
ports, the same as the CPU port, i.e. allow forwarding and flooding
unconditionally from all user ports.
We send packets as always VLAN-tagged on a DSA port, and we rely on the
cross-chip notifiers from tag_8021q to install the RX VLAN of a switch
port only on the proper remote ports of another switch (the ports that
are in the same bridging domain). So if there is no cross-chip bridging
in the system, the flooded packets will be sent on the DSA ports too,
but they will be dropped by the remote switches due to either
(a) a lack of the RX VLAN in the VLAN table of the ingress DSA port, or
(b) a lack of valid destinations for those packets, due to a lack of the
RX VLAN on the user ports of the switch
Note that switches which only transport packets in a cross-chip bridge,
but have no user ports of their own as part of that bridge, such as
switch 1 in this case:
DSA link DSA link
sw0p0 sw0p1 sw0p2 -------- sw1p0 sw1p2 sw1p3 -------- sw2p0 sw2p2 sw2p3
ip link set sw0p0 master br0
ip link set sw2p3 master br0
will still work, because the tag_8021q cross-chip notifiers keep the RX
VLANs installed on all DSA ports.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105 switch family has a feature called "cascade ports" which can
be used in topologies where multiple SJA1105/SJA1110 switches are daisy
chained. Upstream switches set this bit for the DSA link towards the
downstream switches. This is used when the upstream switch receives a
control packet (PTP, STP) from a downstream switch, because if the
source port for a control packet is marked as a cascade port, then the
source port, switch ID and RX timestamp will not be taken again on the
upstream switch, it is assumed that this has already been done by the
downstream switch (the leaf port in the tree) and that the CPU has
everything it needs to decode the information from this packet.
We need to distinguish between an upstream-facing DSA link and a
downstream-facing DSA link, because the upstream-facing DSA links are
"host ports" for the SJA1105/SJA1110 switches, and the downstream-facing
DSA links are "cascade ports".
Note that SJA1105 supports a single cascade port, so only daisy chain
topologies work. With SJA1110, there can be more complex topologies such
as:
eth0
|
host port
|
sw0p0 sw0p1 sw0p2 sw0p3 sw0p4
| | | |
cascade cascade user user
port port port port
| |
| |
| |
| host
| port
| |
| sw1p0 sw1p1 sw1p2 sw1p3 sw1p4
| | | | |
| user user user user
host port port port port
port
|
sw2p0 sw2p1 sw2p2 sw2p3 sw2p4
| | | |
user user user user
port port port port
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Be there an "H" switch topology, where there are 2 switches connected as
follows:
eth0 eth1
| |
CPU port CPU port
| DSA link |
sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 -------- sw1p4 sw1p3 sw1p2 sw1p1 sw1p0
| | | | | |
user user user user user user
port port port port port port
basically one where each switch has its own CPU port for termination,
but there is also a DSA link in case packets need to be forwarded in
hardware between one switch and another.
DSA insists to see this as a daisy chain topology, basically registering
all network interfaces as sw0p0@eth0, ... sw1p0@eth0 and disregarding
eth1 as a valid DSA master.
This is only half the story, since when asked using dsa_port_is_cpu(),
DSA will respond that sw1p1 is a CPU port, however one which has no
dp->cpu_dp pointing to it. So sw1p1 is enabled, but not used.
Furthermore, be there a driver for switches which support only one
upstream port. This driver iterates through its ports and checks using
dsa_is_upstream_port() whether the current port is an upstream one.
For switch 1, two ports pass the "is upstream port" checks:
- sw1p4 is an upstream port because it is a routing port towards the
dedicated CPU port assigned using dsa_tree_setup_default_cpu()
- sw1p1 is also an upstream port because it is a CPU port, albeit one
that is disabled. This is because dsa_upstream_port() returns:
if (!cpu_dp)
return port;
which means that if @dp does not have a ->cpu_dp pointer (which is a
characteristic of CPU ports themselves as well as unused ports), then
@dp is its own upstream port.
So the driver for switch 1 rightfully says: I have two upstream ports,
but I don't support multiple upstream ports! So let me error out, I
don't know which one to choose and what to do with the other one.
Generally I am against enforcing any default policy in the kernel in
terms of user to CPU port assignment (like round robin or such) but this
case is different. To solve the conundrum, one would have to:
- Disable sw1p1 in the device tree or mark it as "not a CPU port" in
order to comply with DSA's view of this topology as a daisy chain,
where the termination traffic from switch 1 must pass through switch 0.
This is counter-productive because it wastes 1Gbps of termination
throughput in switch 1.
- Disable the DSA link between sw0p4 and sw1p4 and do software
forwarding between switch 0 and 1, and basically treat the switches as
part of disjoint switch trees. This is counter-productive because it
wastes 1Gbps of autonomous forwarding throughput between switch 0 and 1.
- Treat sw0p4 and sw1p4 as user ports instead of DSA links. This could
work, but it makes cross-chip bridging impossible. In this setup we
would need to have 2 separate bridges, br0 spanning the ports of
switch 0, and br1 spanning the ports of switch 1, and the "DSA links
treated as user ports" sw0p4 (part of br0) and sw1p4 (part of br1) are
the gateway ports between one bridge and another. This is hard to
manage from a user's perspective, who wants to have a unified view of
the switching fabric and the ability to transparently add ports to the
same bridge. VLANs would also need to be explicitly managed by the
user on these gateway ports.
So it seems that the only reasonable thing to do is to make DSA prefer
CPU ports that are local to the switch. Meaning that by default, the
user and DSA ports of switch 0 will get assigned to the CPU port from
switch 0 (sw0p1) and the user and DSA ports of switch 1 will get
assigned to the CPU port from switch 1.
The way this solves the problem is that sw1p4 is no longer an upstream
port as far as switch 1 is concerned (it no longer views sw0p1 as its
dedicated CPU port).
So here we are, the first multi-CPU port that DSA supports is also
perhaps the most uneventful one: the individual switches don't support
multiple CPUs, however the DSA switch tree as a whole does have multiple
CPU ports. No user space assignment of user ports to CPU ports is
desirable, necessary, or possible.
Ports that do not have a local CPU port (say there was an extra switch
hanging off of sw0p0) default to the standard implementation of getting
assigned to the first CPU port of the DSA switch tree. Is that good
enough? Probably not (if the downstream switch was hanging off of switch
1, we would most certainly prefer its CPU port to be sw1p1), but in
order to support that use case too, we would need to traverse the
dst->rtable in search of an optimum dedicated CPU port, one that has the
smallest number of hops between dp->ds and dp->cpu_dp->ds. At the
moment, the DSA routing table structure does not keep the number of hops
between dl->dp and dl->link_dp, and while it is probably deducible,
there is zero justification to write that code now. Let's hope DSA will
never have to support that use case.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is nothing specific to having a default CPU port to what
dsa_tree_teardown_default_cpu() does. Even with multiple CPU ports,
it would do the same thing: iterate through the ports of this switch
tree and reset the ->cpu_dp pointer to NULL. So rename it accordingly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Three interconnects are defined for IPA version 4.9, but there
should only be two. They should also use names that match what's
used for other platforms (and specified in the Device Tree binding).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pointer hdr is being initialized and also re-assigned with the
same value from the call to function mctp_hdr. Static analysis reports
that the initializated value is unused. The second assignment is
duplicated and can be removed.
Addresses-Coverity: ("Unused value").
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set CRSVIS flag in emulated root PCI bridge to indicate support for
Completion Retry Status.
Add check for CRSSVE flag from root PCI brige when issuing Configuration
Read Request via PIO to correctly returns fabricated CRS value as it is
required by PCIe spec.
Link: https://lore.kernel.org/r/20210722144041.12661-5-pali@kernel.org
Fixes: 8a3ebd8de3 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # e0d9d30b73 ("PCI: pci-bridge-emul: Fix big-endian support")
Measurements in different conditions showed that aardvark hardware PIO
response can take up to 1.44s. Increase wait timeout from 1ms to 1.5s to
ensure that we do not miss responses from hardware. After 1.44s hardware
returns errors (e.g. Completer abort).
The previous two patches fixed checking for PIO status, so now we can use
it to also catch errors which are reported by hardware after 1.44s.
After applying this patch, kernel can detect and print PIO errors to dmesg:
[ 6.879999] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100004
[ 6.896436] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
[ 6.913049] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100010
[ 6.929663] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100010
[ 6.953558] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100014
[ 6.970170] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100014
[ 6.994328] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
Without this patch kernel prints only a generic error to dmesg:
[ 5.246847] advk-pcie d0070000.pcie: config read/write timed out
Link: https://lore.kernel.org/r/20210722144041.12661-3-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # 7fbcb5da81 ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
There is an issue that when PCIe switch is connected to an Armada 3700
board, there will be lots of warnings about PIO errors when reading the
config space. According to Aardvark PIO read and write sequence in HW
specification, the current way to check PIO status has the following
issues:
1) For PIO read operation, it reports the error message, which should be
avoided according to HW specification.
2) For PIO read and write operations, it only checks PIO operation complete
status, which is not enough, and error status should also be checked.
This patch aligns the code with Aardvark PIO read and write sequence in HW
specification on PIO status check and fix the warnings when reading config
space.
[pali: Fix CRS handling when CRSSVE is not enabled]
Link: https://lore.kernel.org/r/20210722144041.12661-2-pali@kernel.org
Tested-by: Victor Gu <xigu@marvell.com>
Signed-off-by: Evan Wang <xswang@marvell.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Victor Gu <xigu@marvell.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # b1bd571447 ("PCI: aardvark: Indicate error in 'val' when config read fails")
Jason Ekstrand requested a more efficient method than userptr+set-domain
to determine if the userptr object was backed by a complete set of pages
upon creation. To be more efficient than simply populating the userptr
using get_user_pages() (as done by the call to set-domain or execbuf),
we can walk the tree of vm_area_struct and check for gaps or vma not
backed by struct page (VM_PFNMAP). The question is how to handle
VM_MIXEDMAP which may be either struct page or pfn backed...
With discrete we are going to drop support for set_domain(), so offering
a way to probe the pages, without having to resort to dummy batches has
been requested.
v2:
- add new query param for the PROBE flag, so userspace can easily
check if the kernel supports it(Jason).
- use mmap_read_{lock, unlock}.
- add some kernel-doc.
v3:
- In the docs also mention that PROBE doesn't guarantee that the pages
will remain valid by the time they are actually used(Tvrtko).
- Add a small comment for the hole finding logic(Jason).
- Move the param next to all the other params which just return true.
Testcase: igt/gem_userptr_blits/probe
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723113405.427004-1-matthew.auld@intel.com
This transport enables communications with an SCMI platform through virtio;
the SCMI platform will be represented by a virtio device.
Implement an SCMI virtio driver according to the virtio SCMI device spec
[1]. Virtio device id 32 has been reserved for the SCMI device [2].
The virtio transport has one Tx channel (virtio cmdq, A2P channel) and
at most one Rx channel (virtio eventq, P2A channel).
The following feature bit defined in [1] is not implemented:
VIRTIO_SCMI_F_SHARED_MEMORY.
The number of messages which can be pending simultaneously is restricted
according to the virtqueue capacity negotiated at probing time.
As soon as Rx channel message buffers are allocated or have been read
out by the arm-scmi driver, feed them back to the virtio device.
Since some virtio devices may not have the short response time exhibited
by SCMI platforms using other transports, set a generous response
timeout.
SCMI polling mode is not supported by this virtio transport since deemed
meaningless: polling mode operation is offered by the SCMI core to those
transports that could not provide a completion interrupt on the TX path,
which is never the case for virtio whose core callbacks can easily call
into core scmi_rx_callback upon messages reception.
[1] https://github.com/oasis-tcs/virtio-spec/blob/master/virtio-scmi.tex
[2] https://www.oasis-open.org/committees/ballot.php?id=3496
Link: https://lore.kernel.org/r/20210803131024.40280-16-cristian.marussi@arm.com
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com>
Co-developed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com>
[ Peter: Adapted patch for submission to upstream. ]
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: simplified driver logic, changed link_supplier and channel
available/setup logic, removed dummy callbacks ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Add a new opaque void *priv parameter to scmi_rx_callback which can be
optionally provided by the transport layer when invoking scmi_rx_callback
and that will be passed back to the transport layer in xfer->priv.
This can be used by transports that needs to keep track of their specific
data structures together with the valid xfers.
Link: https://lore.kernel.org/r/20210803131024.40280-15-cristian.marussi@arm.com
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Some transports are also effectively registered with other kernel subsystem
in order to be properly probed and initialized; as a consequence such kind
of transports, and their related devices, might still not have been probed
and initialized at the time the main SCMI core driver is probed.
Add an optional .link_supplier() transport operation which can be used by
the core SCMI stack to dynamically check if the transport is ready and
dynamically link its device to the SCMI platform instance device.
Link: https://lore.kernel.org/r/20210803131024.40280-13-cristian.marussi@arm.com
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: reworded commit message ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Add abstractions for future transports using message passing, such as
virtio. Derive the abstractions from the shared memory abstractions.
Abstract the transport SDU through the opaque struct scmi_msg_payld.
Also enable the transport to determine all other required information
about the transport SDU.
Link: https://lore.kernel.org/r/20210803131024.40280-12-cristian.marussi@arm.com
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: Adapted to new SCMI Kconfig layout, updated Copyrights ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The maximum number of simultaneously pending messages is a transport
specific quantity that is usually described statically in struct scmi_desc.
Some transports, though, can calculate such number only at run-time after
some initial transport specific setup and probing is completed; moreover
the resulting max message numbers could also be different between rx and
tx channels.
Add an optional get_max_msg() operation so that a transport can report more
accurate max message numbers for each channel type.
The value in scmi_desc.max_msg is still used as default when transport does
not provide any get_max_msg() method.
Link: https://lore.kernel.org/r/20210803131024.40280-11-cristian.marussi@arm.com
Co-developed-by: Peter Hilber <peter.hilber@opensynergy.com>
Co-developed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Igor Skalkin <igor.skalkin@opensynergy.com>
[ Peter: Adapted patch for submission to upstream. ]
Signed-off-by: Peter Hilber <peter.hilber@opensynergy.com>
[ Cristian: refactored how get_max_msg() is used to minimize core changes ]
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>