In order to support the new speed of 2500Mbps, the SJA1110 has achieved
the great performance of changing the encoding in the MAC Configuration
Table for the port speeds of 10, 100, 1000 compared to SJA1105.
Because this is a common driver, we need a layer of indirection in order
to program the hardware with the right values irrespective of switch
generation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On the SJA1105, all ports support the parallel "xMII" protocols (MII,
RMII, RGMII) except for port 4 on SJA1105R/S which supports only SGMII.
This was relatively easy to model, by special-casing the SGMII port.
On the SJA1110, certain ports can be pinmuxed between SGMII and xMII, or
between SGMII and an internal 100base-TX PHY. This creates problems,
because the driver's assumption so far was that if a port supports
SGMII, it uses SGMII.
We allow the device tree to tell us how the port pinmuxing is done, and
check that against a PHY interface type compatibility matrix for
plausibility.
The other big change is that instead of doing SGMII configuration based
on what the port supports, we do it based on what is the configured
phy_mode of the port.
The 2500base-x support added in this patch is not complete.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
So far we've succeeded in operating without keeping a copy of the
phy-mode in the driver, since we already have the static config and we
can look at the xMII Mode Parameters Table which already holds that
information.
But with the SJA1110, we cannot make the distinction between sgmii and
2500base-x, because to the hardware's static config, it's all SGMII.
So add a phy_mode property per port inside struct sja1105_private.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Looking at the SGMII PCS from SJA1110, which is accessed indirectly
through a different base address as can be seen in the next patch, it
appears odd that the address accessed through indirection still
references the base address from the SJA1105S register map (first MDIO
register is at 0x1f0000), when it could index the SGMII registers
starting from zero.
Except that the 0x1f0000 is not a base address at all, it seems. It is
0x1f << 16 | 0x0000, and 0x1f is coding for the vendor-specific MMD2.
So, it turns out, the Synopsys PCS implements all its registers inside
the vendor-specific MMDs 1 and 2 (0x1e and 0x1f). This explains why the
PCS has no overlaps (for the other MMDs) with other register regions of
the switch (because no other MMDs are implemented).
Change the code to remove the SGMII "base address" and explicitly encode
the MMD for reads/writes. This will become necessary for SJA1110 support.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The SJA1105 R and S switches have 1 SGMII port (port 4). Because there
is only one such port, there is no "port" parameter in the configuration
code for the SGMII PCS.
However, the SJA1110 can have up to 4 SGMII ports, each with its own
SGMII register map. So we need to generalize the logic.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since commit f2f3e09396be ("net: dsa: sja1105: be compatible with
"ethernet-ports" OF node name"), DSA supports the "ethernet-ports" name
for the container node of the ports, but the sja1105 driver doesn't,
because it handles some device tree parsing of its own.
Add the second node name as a fallback.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Networking block comments don't use an empty /* line,
use /* Comment...
Block comments use * on subsequent lines.
Block comments use a trailing */ on a separate line.
This patch fixes the comments style issues.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This is a complement to commit aa6dd211e4 ("inet: use bigger hash
table for IP ID generation"), but focusing on some specific aspects
of IPv6.
Contary to IPv4, IPv6 only uses packet IDs with fragments, and with a
minimum MTU of 1280, it's much less easy to force a remote peer to
produce many fragments to explore its ID sequence. In addition packet
IDs are 32-bit in IPv6, which further complicates their analysis. On
the other hand, it is often easier to choose among plenty of possible
source addresses and partially work around the bigger hash table the
commit above permits, which leaves IPv6 partially exposed to some
possibilities of remote analysis at the risk of weakening some
protocols like DNS if some IDs can be predicted with a good enough
probability.
Given the wide range of permitted IDs, the risk of collision is extremely
low so there's no need to rely on the positive increment algorithm that
is shared with the IPv4 code via ip_idents_reserve(). We have a fast
PRNG, so let's simply call prandom_u32() and be done with it.
Performance measurements at 10 Gbps couldn't show any difference with
the previous code, even when using a single core, because due to the
large fragments, we're limited to only ~930 kpps at 10 Gbps and the cost
of the random generation is completely offset by other operations and by
the network transfer time. In addition, this change removes the need to
update a shared entry in the idents table so it may even end up being
slightly faster on large scale systems where this matters.
The risk of at least one collision here is about 1/80 million among
10 IDs, 1/850k among 100 IDs, and still only 1/8.5k among 1000 IDs,
which remains very low compared to IPv4 where all IDs are reused
every 4 to 80ms on a 10 Gbps flow depending on packet sizes.
Reported-by: Amit Klein <aksecurity@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210529110746.6796-1-w@1wt.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Peter Geis says:
====================
fixes for yt8511 phy driver
The Intel clang bot caught a few uninitialized variables in the new
Motorcomm driver. While investigating the issue, it was found that the
driver would have unintended effects when used in an unsupported mode.
Fixed the uninitialized ret variable and abort loading the driver in
unsupported modes.
====================
Link: https://lore.kernel.org/r/20210529110556.202531-1-pgwipeout@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While investigating the clang `ge` uninitialized variable report, it was
discovered the default switch would have unintended consequences. Due to
the switch to __phy_modify, the driver would modify the ID values in the
default scenario.
Fix this by promoting the interface mode switch and aborting when the
mode is not a supported RGMII mode.
This prevents the `ge` and `fe` variables from ever being used
uninitialized.
Fixes: 48e8c6f161 ("net: phy: add driver for Motorcomm yt8511 phy")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
clang doesn't preinitialize variables. If phy_select_page failed and
returned an error, phy_restore_page would be called with `ret` being
uninitialized.
Even though phy_restore_page won't use `ret` in this scenario,
initialize `ret` to silence the warning.
Fixes: 48e8c6f161 ("net: phy: add driver for Motorcomm yt8511 phy")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yang Yingliang says:
====================
net: dsa: qca8k: check return value of read functions correctly
patch #1 - Change return type and add output parameter to make check
return value of read functions correctly.
patch #2 - Add missing check return value in qca8k_phylink_mac_config().
====================
Link: https://lore.kernel.org/r/20210529030439.1723306-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now we can check qca8k_read() return value correctly, so if
it fails, we need return directly.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current return type of qca8k_mii_read32() and qca8k_read() are
unsigned, it can't be negative, so the return value check is
unuseful. For check the return value correctly, change return
type of the read functions and add a output parameter to store
the read value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
George Cherian says:
====================
NPC KPU updates
Add support for
- Loading Custom KPU profile entries
- Add NPC profile Load from System Firmware DB
- Add Support fo Coalescing KPU profiles
- General Updates/Fixes to default KPU profile
====================
Link: https://lore.kernel.org/r/20210527094439.1910013-1-george.cherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adding support to load a new type of KPU image, known as coalesced/
consolidated KPU image via firmware database. This image is a
consolidation of multiple KPU profiles into a single image.
During kernel bootup this coalesced image will be read via
firmware database and only the relevant KPU profile will be loaded.
Existing functionality of loading single KPU/MKEX profile
is intact as the images are differentiated based on the image signature.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
CN10k introduces following new LT DEF registers:
1. APAD (alignment padding) LT DEF registers are
enhancement to existing apad calculation algorithm
where not just ipv4 and ipv6 but also other protocols
can be matched and required alignment can be added by NIX.
2. ET LT DEF register defines layer information in NPC_RESULT_S
to identify the Ethertype location in L2 header. Used for
Ethertype overwriting in inline IPsec flow.
This patch adds required structures and some header changes. Also
strict version check (based on minor field) is imposed to highlight
version mismatch between the kernel headers and KPU profile.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Signed-off-by: Kiran Kumar Kokkilagadda <kirankumark@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently NPC profile (KPU + MKEX) can be loaded using firmware
binary in filesystem scheme. Enhancing the functionality to load
NPC profile image from system firmware database. It uses the same
technique as used for loading MKEX profile. Firstly firmware binary
in kernel is checked for a valid image else tries to load NPC profile
from firmware database and at last uses default profile if no proper
image found.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add ability to load a set of custom KPU entries. This
allows for flexible support for custom protocol parsing.
AF driver will attempt to load the profile and verify if it can fit
hardware capabilities. If not, it will revert to the built-in profile.
Next it will replace the first KPU_MAX_CST_LT (2) entries in each KPU
in default profile with entries read from the profile image.
The built-in profile should always contain KPU_MAX_CSR_LT first no-match
entries and AF driver will disable those in the KPU unless custom
profile is loaded.
Profile file contains also a list of default protocol overrides to
allow for custom protocols to be used there.
Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Init it on demand in the nft_compat expression. This reduces size
of nft_pktinfo from 48 to 24 bytes on x86_64.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Convert i40e to use the auxiliary bus infrastructure to export
the RDMA functionality of the device to the RDMA driver.
Register i40e client auxiliary RDMA device on the auxiliary bus per
PCIe device function for the new auxiliary rdma driver (irdma) to
attach to.
The global i40e_register_client and i40e_unregister_client symbols
will be obsoleted once irdma replaces i40iw in the kernel
for the X722 device.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add the definitions to the i40e client header file in
preparation to convert i40e to use the new auxiliary bus
infrastructure. This header is shared between the 'i40e'
Intel networking driver providing RDMA support and the
'irdma' driver.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Register ice client auxiliary RDMA device on the auxiliary bus per
PCIe device function for the auxiliary driver (irdma) to attach to.
It allows to realize a single RDMA driver (irdma) capable of working with
multiple netdev drivers over multi-generation Intel HW supporting RDMA.
There is no load ordering dependencies between ice and irdma.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>