Pass around priv, not ds. This will help with changing to an mdio
driver, and makes this driver more like mv88e6xxx.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an SPDX header, and remove the license text.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ibmvnic driver currently uses the same fixed name when using
request_irq, this makes it hard to parse when multiple VNIC devices are
available at the same time. This patch adds the unit_address as the device
identification along with an id for each queue.
The original idea was to use the interface name as an identifier, but it
is not feasible given these requests happen at adapter probe, and at this
point netdev is not yet registered so it doesn't have the proper name
assigned to it.
Signed-off-by: Murilo Fossa Vicentini <muvic@linux.ibm.com>
Reviewed-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com>
Reviewed-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a preparatory patch to add support for a switchdev based cpsw driver,
move common ethtool functions to separate cpsw-ethtool.c file so that they
can be used across both drivers. It will simplify CPSW driver code
maintenance also.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch CPSW driver to use the new MAC SL API.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MAC SL submodule has a lot of common functions between many of TI SoCs
AM335x/AM437x/DRA7(AM57xx), Keystone 2 66AK2HK/E/L/G and K3 AM654, but
there are also differences especially in registers offsets and sets of
supported functions.
This patch introduces the MAC SL submodule API which is intended to provide
a common way to access the MAC SL submodule and hide HW integrations
details.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
move common hw init code in separate function as preparation for adding new
switchdev driver.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_addr_t for desc_mem_phys and desc_hw_addr to avoid types
conversions.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a preparatory patch to add a switchdev based cpsw driver move the common
header definitions to cpsw_priv.h. The plan is to develop a new driver on
switchdev driver model and obsolete the current cpsw driver after all
required functions are added to the new driver. This patch allows the same
header file to be re-used on both drivers during the transition period.
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rework probe to group common hw initialization:
- group resources request at the beginning of the probe
- move net device initialization and registration at the end of the probe
- drop cpsw_slave_init
as preparation of refactoring of common hw initialization code to
separate function.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Davinci MDIO in most of the case implemented as module inside of TI
CPSW subsystem and fully depends on CPSW to be enabled, but historically
it's implemented as separate Platform device/driver and defined in DT files
in two ways:
- as standalone node
- as child node of CPSW subsystem.
In later case it's required to split CPSW subsystem "reg" property to
exclude MDIO I/O range which is not useful.
Hence, replace devm_ioremap_resource() with devm_ioremap() to allow define
full I/O range in parent CPSW subsystem without spliting.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not delete multicast supervisory packet's (SUPER) entries while flushing
multicast addresses from ALE table cpsw_ale_flush_multicast(). Those
entries have to be added/removed only explicitly.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now CPSW ALE will set/clean Host port bit in Unregistered Multicast Flood
Mask (UNREG_MCAST_FLOOD_MASK) for every VLAN without checking if this port
belongs to VLAN or not when ALLMULTI mode flag is set for nedev. This is
working in non dual_mac mode, but in dual_mac - it causes
enabling/disabling ALLMULTI flag for both ports.
Hence fix it by adding additional parameter to cpsw_ale_set_allmulti() to
specify ALE port number for which ALLMULTI has to be enabled and check if
port belongs to VLAN before modifying UNREG_MCAST_FLOOD_MASK.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use ALE_PORT_HOST define for host port in cpsw_ale_set_allmulti() instead
of constants.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use correct define ALE_SUPER for ALE Multicast Address Table Entry
Supervisory Packet (SUPER) bit setting instead of ALE_BLOCKED. No issues
were observed till now as it have never been set, but it's going to be used
by new CPSW switch driver.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop unnecessary wrapper function cpsw_tx_packet_submit() which is used
only in one place.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_alloc_etherdev_mqs() and simplify code.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop pinctrl_pm_select_default_state call from probe as default
pinctrl state is set by DD core.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use local variable struct device *dev in probe to simplify code.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update cpsw_split_res() to accept struct cpsw_common instead of
struct net_device to simplify code.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
All TI drivers CPSW/NETCP can't work without ALE, hence simplify
build of those drivers by always linking cpsw_ale and drop
CONFIG_TI_CPSW_ALE config option.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both drivers CPSW and EMAC can't work without CPDMA, hence simplify build
of those drivers by always linking davinci_cpdma and drop TI_DAVINCI_CPDMA
config option.
Note. the davinci_emac driver module was changed to "ti_davinci_emac" to
make build work.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace textual license with SPDX-License-Identifier.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add options to strictly validate messages and dump messages,
sometimes perhaps validating dump messages non-strictly may
be required, so add an option for that as well.
Since none of this can really be applied to existing commands,
set the options everwhere using the following spatch:
@@
identifier ops;
expression X;
@@
struct genl_ops ops[] = {
...,
{
.cmd = X,
+ .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
...
},
...
};
For new commands one should just not copy the .validate 'opt-out'
flags and thus get strict validation.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have two levels of strict validation:
1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size
The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().
Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.
We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated
Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)
@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.
Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.
Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.
In effect then, this adds fully strict validation for any new command.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.
Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().
Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:
@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)
@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* extended key ID support (from 802.11-2016)
* per-STA TX power control support
* mac80211 TX performance improvements
* HE (802.11ax) updates
* mesh link probing support
* enhancements of multi-BSSID support (also related to HE)
* OWE userspace processing support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlzC5YYACgkQB8qZga/f
l8QDWg/+N7wm+l7bTMx4hjJzZZ60n9fBvyGJx0gsnPVH8wdOiPoh/epuI04I8I4m
pGNbGvPB9Z4z2tD56XsIQnXf88ab3R27bRupSSW1vtzVSbDhg8wQ7jg0nABrdyDS
PgoTmDMfVERLewXdntqRANzVYGfoWSOzo1u6A0Xhys8FqxxX/eD+Vdo4dKzmeN47
+LDfuCpInVPn0TOpFp5IJ4+B4a0dhkz2/Q1BOE7NquXVvk4X77VJohV/BgQJ04Io
yt7mn5rzYM6j4o1XLACxUEHkXvht6h34abG0yHRnuoAEp/sdPz2jAXT4OxYqs6x0
XdLdr8gZgkMnnYaOQef74uJ2Ku+4A1ootjXSPazA7BWX0X5GqHnET/INk2S6cQPj
C95LYfKC0ICD0qfioBmmHx8icDGoovcaswCju2ozfqWaD4Lwr3BcesnNDFtkHD9o
aYaTTGGSxFyr2bZWTDpv4D4H5g3V4srRJsXs+SokL54nvlwd/smUJ4PVTLomP9y2
XswRtLdoiUsCrJy967CXfhsxnE5SRhmBQE38Jq8/pzetlRk2spvJJC5MGYF0O/nT
0UHbrjBCFUT2s8jv+gWWabOBUovsNJlgaxFwrZ/eNVIk0DK0ERoMV3V4MktU8uza
Y339T14kxw4wlY2z5pOmEgkxmKZbPb55dBba04JEZzz9zDTawTk=
=JQOx
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2019-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Various updates, notably:
* extended key ID support (from 802.11-2016)
* per-STA TX power control support
* mac80211 TX performance improvements
* HE (802.11ax) updates
* mesh link probing support
* enhancements of multi-BSSID support (also related to HE)
* OWE userspace processing support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It's meaningless to trigger reset when failed to send command to IMP,
because the failure is usually caused by no authority, illegal command
and so on. When that happened, we just need to return the status code
for further debugging.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a check for the hns3_put_ring_config() to prevent
double free, and for more readable, move the NULL assignment of
priv->ring_data into the hns3_put_ring_config().
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test results show that the maximum time of hardware return
to mac link state is 500MS.The software needs to set twice the
maximum time of hardware return state (1000MS).
If not modified, the loopback test returns probability failure.
Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When configure pause, current implementation returns directly
after setup PFC without setup BP, which is not sufficient.
So this patch fixes it, only return while setting PFC failed.
Fixes: 44e59e375b ("net: hns3: do not return GE PFC setting err when initializing")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the hardware does not handle mailboxes and the hardware
reset include TQP reset, so it is unnecessary to reset TQP
in the hclgevf_ae_stop() while doing VF reset. Also it is
unnecessary to reset the remaining TQP when one reset fails.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch uses a reserved byte in the hclge_mbx_vf_to_pf_cmd
to save the need_resp flag, so when PF received the mailbox,
it can use it to decise whether send a response to VF.
For hclge_set_vf_uc_mac_addr(), it should use mbx_need_resp flag
to decide whether send response to VF.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since irq handler and mailbox task will both update arq's count,
so arq's count should use atomic_t instead of u32, otherwise
its value may go wrong finally.
Fixes: 07a0556a3a ("net: hns3: Changes to support ARQ(Asynchronous Receive Queue)")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HCLGEVF_STATE_CMD_DISABLE is more suitable than
HCLGEVF_STATE_RST_HANDLING to stop sending keep alive msg,
since HCLGEVF_STATE_RST_HANDLING only be set when the reset
task is running.
Fixes: c59a85c07e ("net: hns3: stop sending keep alive msg to PF when VF is resetting")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bdinfo handled in hns3_handle_bdinfo is only valid on the
last BD of the current packet, currently the bd info may be handled
based on the first BD if the packet has more than two BDs, which
may cause rx error.
This patch fixes it by using the last BD of the current packet in
hns3_handle_bdinfo.
Also, hns3_set_rx_skb_rss_type has used RSS hash value from the last
BD of the current packet, so remove the same last BD calculation in
hns3_set_rx_skb_rss_type and call it from hns3_handle_bdinfo.
Fixes: e559709505 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hns3_desc_unused() returns how many BD have been cleaned, but new
buffer has not been attached to them. The register of
HNS3_RING_RX_RING_FBDNUM_REG returns how many BD need allocating new
buffer to or need to cleaned. So the remaining BD need to be clean
is HNS3_RING_RX_RING_FBDNUM_REG - hns3_desc_unused().
Also, new buffer can not attach to the pending BD when the last BD is
not handled, because memcpy has not been done on the first pending BD.
This patch fixes by subtracting the pending BD num from unused_count
after 'HNS3_RING_RX_RING_FBDNUM_REG - unused_count' is used to calculate
the BD bum need to be clean.
Fixes: e559709505 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hns3_clean_tx_ring calls hns3_nic_reclaim_one_desc to clean
buffers and set ring->next_to_clean, then hns3_nic_net_xmit
reuses the cleaned buffers. But there are no memory barriers
when buffers gets recycled, so the recycled buffers can be
corrupted.
This patch uses smp_store_release to update ring->next_to_clean
and smp_load_acquire to read ring->next_to_clean to properly
hand off buffers from hns3_clean_tx_ring to hns3_nic_net_xmit.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device may be shutdown without the hardware being reinitialized, in
which case we want to ensure we cleanup properly.
This is especially important for kexec with traffic flowing.
The shutdown procedures resembles the remove procedures, so we can reuse
those common tasks.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHY's behave differently when being reset. Some reset registers to
defaults, some don't. Some trigger an autoneg restart, some don't.
So let's also set the autoneg restart bit when resetting. Then PHY
behavior should be more consistent. Clearing BMCR_ISOLATE serves the
same purpose and is borrowed from genphy_restart_aneg.
BMCR holds the speed / duplex settings in fixed mode. Therefore
we may have an issue if a soft reset resets BMCR to its default.
So better call genphy_setup_forced() afterwards in fixed mode.
We've seen no related complaint in the last >10 yrs, so let's
treat it as an improvement.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All Apple products use the same protocol for tethering over USB.
To simplify the code and make it future proof, use
USB_VENDOR_AND_INTERFACE_INFO() instead of
USB_DEVICE_AND_INTERFACE_INFO() to automatically detect and support
all existing and future Apple products using the same interface.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, refactor code a
bit and mark switch cases where we are expecting to fall through.
This patch fixes the following warning:
drivers/net/ethernet/broadcom/cnic.c: In function ‘cnic_cm_process_kcqe’:
drivers/net/ethernet/broadcom/cnic.c:4044:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
opcode = L4_KCQE_OPCODE_VALUE_CLOSE_COMP;
drivers/net/ethernet/broadcom/cnic.c:4050:2: note: here
case L4_KCQE_OPCODE_VALUE_RESET_RECEIVED:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warning:
drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c: In function ‘fwevtq_handler’:
drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c:520:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
cpl = (void *)p;
~~~~^~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c:524:2: note: here
case CPL_SGE_EGR_UPDATE: {
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warning:
In file included from drivers/net/wimax/i2400m/debug-levels.h:30,
from drivers/net/wimax/i2400m/control.c:86:
drivers/net/wimax/i2400m/control.c: In function ‘i2400m_report_tlv_system_state’:
./include/linux/wimax/debug.h:200:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
do { \
^
./include/linux/wimax/debug.h:404:36: note: in expansion of macro ‘_d_printf’
#define d_printf(l, _dev, f, a...) _d_printf(l, "", _dev, f, ## a)
^~~~~~~~~
drivers/net/wimax/i2400m/control.c:354:3: note: in expansion of macro ‘d_printf’
d_printf(1, dev, "entering BS-negotiated idle mode\n");
^~~~~~~~
drivers/net/wimax/i2400m/control.c:355:2: note: here
case I2400M_SS_DISCONNECTING:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
In file included from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:129:
drivers/net/ethernet/amd/xgbe/xgbe-drv.c: In function ‘xgbe_set_hwtstamp_settings’:
drivers/net/ethernet/amd/xgbe/xgbe-common.h:1392:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
(_var) |= (((_val) & ((0x1 << (_width)) - 1)) << (_index)); \
~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-common.h:1419:2: note: in expansion of macro ‘SET_BITS’
SET_BITS((_var), \
^~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1614:3: note: in expansion of macro ‘XGMAC_SET_BITS’
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSVER2ENA, 1);
^~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1616:2: note: here
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
^~~~
In file included from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:129:
drivers/net/ethernet/amd/xgbe/xgbe-common.h:1392:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
(_var) |= (((_val) & ((0x1 << (_width)) - 1)) << (_index)); \
~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-common.h:1419:2: note: in expansion of macro ‘SET_BITS’
SET_BITS((_var), \
^~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1625:3: note: in expansion of macro ‘XGMAC_SET_BITS’
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSVER2ENA, 1);
^~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1627:2: note: here
case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
^~~~
In file included from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:129:
drivers/net/ethernet/amd/xgbe/xgbe-common.h:1392:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
(_var) |= (((_val) & ((0x1 << (_width)) - 1)) << (_index)); \
~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-common.h:1419:2: note: in expansion of macro ‘SET_BITS’
SET_BITS((_var), \
^~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1636:3: note: in expansion of macro ‘XGMAC_SET_BITS’
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSVER2ENA, 1);
^~~~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1638:2: note: here
case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Notice that, in this particular case, the code comments are modified
in accordance with what GCC is expecting to find.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow Extended Key ID to be used with hwsim.
Hwsim can only communicate with other hwsim cards, allowing it to bypass
creation of A-MPDUs in the first place.
Mixing keyIDs in an A-MPDU is therefore impossible and can never cause
interoperability issues with other cards.
Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
[reword comment slightly]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
P2P device interface type was not indicated in the supported
interface types even when hwsim was configured with p2p device
support. Fix it.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Implement ndo_get_devlink_port and allow switch_id and port_name to be
handled by devlink.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the existing way to create netdevsim over rtnetlink and move the
netdev creation/destruction to dev probe, so for every probed port,
a netdevsim-netdev instance is created.
Adjust selftests to work with new interface.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to test flows in core, it is beneficial to maintain previously
supported possibility to add and delete ports during netdevsim lifetime.
Do it by extending device sysfs attrs by "new_port" and "del_port".
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement netdevsim bus probing of netdevsim devices. For every probed
device create a devlink instance. According to the user-passed value,
create a number of ports represented by devlink port instances.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the model where dev is represented by devlink and ports are
represented by devlink ports, make debugfs file names independent
on netdev names. Change the topology to the one illustrated
by the following example:
$ ls /sys/kernel/debug/netdevsim/
netdevsim1
$ ls /sys/kernel/debug/netdevsim/netdevsim1/
bpf_bind_accept bpf_bind_verifier_delay bpf_bound_progs ports
$ ls /sys/kernel/debug/netdevsim/netdevsim1/ports/
0 1
$ ls /sys/kernel/debug/netdevsim/netdevsim1/ports/0/
bpf_map_accept bpf_offloaded_id bpf_tc_accept bpf_tc_non_bound_accept bpf_xdpdrv_accept bpf_xdpoffload_accept dev ipsec
$ ls /sys/kernel/debug/netdevsim/netdevsim1/ports/0/dev -l
lrwxrwxrwx 1 root root 0 Apr 13 15:58 /sys/kernel/debug/netdevsim/netdevsim1/ports/0/dev -> ../../../netdevsim1
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation of parent_id/switch_id does not follow the
original idea of being unique. The values are "0", "1", etc. Instead of
that, generate 32 random bytes.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As previously introduce dev which is mapped 1:1 to a bus device covers
the purpose of the original shared device, merge the sdev code into dev.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These functions are going to be called from bus probe/release(),
therefore make them independent on ns struct and rename accordingly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a way to add new netdevsim device on netdevsim bus and also to
delete existing netdevsim device from the bus. Track the bus devices
in using a list.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of increments of u32 value, use ida to manage bus device ids.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to bus probing to work correctly, register a simple netdevsim
driver implementation.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move netdevsim device registration into bus.c and alongside with that
the related sysfs attributes. Introduce new struct nsim_bus_dev to
represent a netdevsim device on netdevsim bus.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the code related to netdevsim bus is going to get bigger, move the
existing code to a separate file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing devlink.c code is going to be extended to represent asic
device on a bus. As this is about more than just devlink,
rename the file. Do appropriate prefix renaming alongside with that.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is one devlink instance created per network namespace.
That is quite odd considering the fact that devlink instance should
represent an ASIC. The following patches are going to move the devlink
instance even more down to a bus device, but until then, have one
devlink instance per netdevsim instance. Struct nsim_devlink is
introduced to hold fib setting.
The changes in the fib code are only related to holding the
configuration per devlink instance instead of network namespace.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a dependency of the subsequent patch, mode device registration to be
done earlier, directly in nsim_newlink().
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It allows some of the code to be simplified.
Tested on Turris Omnia.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vrf device is not able to change mac address now because lack of
ndo_set_mac_address. Complete this in case some apps need to do
this.
Reported-by: Hui Wang <wanghui104@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw currently does not support v6 gateways with v4 routes. Commit
19a9d136f1 ("ipv4: Flag fib_info with a fib_nh using IPv6 gateway")
prevents a route from being added, but nothing stops the replace or
append. Add a catch for them too.
$ ip ro add 172.16.2.0/24 via 10.99.1.2
$ ip ro replace 172.16.2.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
$ ip ro append 172.16.2.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netdev variant is usable on any context since it disables interrupts.
The napi variant of the call should only be used within softirq context.
Replace napi_alloc_frag on driver init with the correct netdev_alloc_frag
call
Changes since v1:
- Adjusted commit message
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jassi Brar <jaswinder.singh@linaro.org>
Fixes: 4acb20b462 ("net: socionext: different approach on DMA")
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are spelling mistakes in structure elements, fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the R-Car Gen3 Hardware Manual Rev 1.50 of Nov 30, 2018, the
TX clock internal delay mode isn't supported on R-Car E3 (r8a77990) or D3
(r8a77995). And by extension it is also not supported by RZ/G2E (r9a774c0).
This matches all ES versions of the affected SoCs as it is
not clear if this problem will be resolved in newer chips.
This can be revisited, as necessary.
This patch does not error-out if PHY_INTERFACE_MODE_RGMII_ID or
PHY_INTERFACE_MODE_RGMII_TXID are used on SoCs where TX clock delay
mode is not supported as there is a risk of introducing a regression
when used in conjunction with older DT blobs present in the field.
Rather, a warning is logged in such cases.
Based on work by Kazuya Mizuguchi.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series includes updates to mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
bottlenecks.
From Tariq:
1) Some Enhancements in rq->flags
2) Stabilize RX packet rate (on Striding RQ) with
multiple outstanding UMR posts
In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.
Performance test:
As expected, huge improvement in large-scale (48 cores).
xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Before: Unstable, 7 to 30 Mpps
After: Stable, at 70.5 Mpps
From Shay:
3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow
Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) would be sent inline within
its WQE TX descrptor (mem-copied), when the hardware tx queue is congested
beyond a pre-defined water-mark.
Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
* Tested with hyper-threading disabled
XDP_TX:
| | before | after | |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring | 12Mpps | 12Mpps | same |
XDP_REDIRECT:
** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.
| | before | after | |
| 32 rings | 64Mpps | 92Mpps | +43% |
| 1 ring | 6.4Mpps | 6.4Mpps | same |
As we can see, feature significantly improves scaling, without
hurting single ring performance.
From Maxim:
4) Some trivial refactoring and code improvements prior to a larger series
to support AF_XDP.
-Saeed.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJcv2LjAAoJEEg/ir3gV/o+90gIAI8+4lwkXZAVk4mxf9PMjxuB
bQiKd80e++26sgrNHCyuWZnIzTQqYAnUJ3WRC+Kk1pFTo1O23A+fvweT8m1dqAvP
Z/5ktfbAeF3fwOVu7aGu9vh4zJEWJj8oO+I+G+OaOe2iV7FVTTFnWHxiiCfungAW
oUnXozq4vERSQLechqqgz6nACxOPgEOCJrp4T9lDYSbqZizHgFttmInMQguq/7KS
LvITcNu3EF5l4y2LxwCFiKRgGc2y/belU63AK+2pQUXhH46kQPEHdncdLg5d9QYA
xJwthn697qxS0PIP5oHPHNVN+qJXfuUHVonXqVOAJebGQnV82of6+sPweRxwh1s=
=MfAR
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2019-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-04-22
This series includes updates to mlx5e driver RX data path and some
significant XDP RX/TX improvements to overcome/mitigate HW and PCIE
bottlenecks.
From Tariq:
1) Some Enhancements in rq->flags
2) Stabilize RX packet rate (on Striding RQ) with
multiple outstanding UMR posts
In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.
Performance test:
As expected, huge improvement in large-scale (48 cores).
xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Before: Unstable, 7 to 30 Mpps
After: Stable, at 70.5 Mpps
From Shay:
3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow
Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) would be sent inline within
its WQE TX descrptor (mem-copied), when the hardware tx queue is congested
beyond a pre-defined water-mark.
Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
* Tested with hyper-threading disabled
XDP_TX:
| | before | after | |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring | 12Mpps | 12Mpps | same |
XDP_REDIRECT:
** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.
| | before | after | |
| 32 rings | 64Mpps | 92Mpps | +43% |
| 1 ring | 6.4Mpps | 6.4Mpps | same |
As we can see, feature significantly improves scaling, without
hurting single ring performance.
From Maxim:
4) Some trivial refactoring and code improvements prior to a larger series
to support AF_XDP.
====================
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a #define for the timeout of mlx5e_wait_for_min_rx_wqes to
clarify the meaning of a magic number.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Remove the no longer used page_reuse stat of RQs.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5e_trigger_irq posts a NOP to the ICO SQ just to trigger an IRQ and
enter the NAPI poll on the right CPU according to the affinity. Use it
in mlx5e_activate_rq.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5e_mpwqe_get_log_rq_size calculates the number of WQEs (N) based on
the requested number of frames in the RQ (F) and the number of packets
per WQE (P). It ensures that N is not less than the minimum number of
WQEs in an RQ (N_min). Arithmetically, it means that F / P >= N_min
should be true. This function deals with logarithms, so it should check
that log(F) - log(P) >= log(N_min). However, if F < P, this expression
will cause an unsigned underflow. Check log(F) >= log(P) + log(N_min)
instead.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This commit moves the parameter calculation functions to a separate file
for better modularity and code sharing with future features.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If the channels fail to reopen after setting an XDP program, return the
error code instead of 0. A proper fix is still needed, as now any error
while reopening the channels brings the interface down. This patch only
adds error reporting.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).
When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) would be sent inline within
its WQE TX descrptor (mem-copied), when the hardware tx queue is congested
beyond a pre-defined water-mark.
This is added to better utilize the HW resources (which now makes
one less packet data prefetch) and allow better scalability, on the
account of CPU usage (which now 'memcpy's the packet into the WQE).
To load balance between HW and CPU and get max packet rate, we use
watermarks to detect how much the HW is congested and move the work
loads back and forth between HW and CPU.
Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
* Tested with hyper-threading disabled
XDP_TX:
| | before | after | |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring | 12Mpps | 12Mpps | same |
XDP_REDIRECT:
** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.
| | before | after | |
| 32 rings | 64Mpps | 92Mpps | +43% |
| 1 ring | 6.4Mpps | 6.4Mpps | same |
As we can see, feature significantly improves scaling, without
hurting single ring performance.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This counter tracks how many TX MPWQE sessions are started in XDP SQ
in XDP TX/REDIRECT flow. It counts per-channel and global stats.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The XDP redirect flush indication belongs to the receive queue,
not to its XDP send queue.
For this, use a new bit on rq->flags.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Values in enum mlx5e_rq_flag are used as bit indixes.
Intention was to use them with no BIT(i) wrapping.
No functional bug fix here, as the same (shifted)flag bit
is used for all set, test, and clear operations.
Fixes: 121e892754 ("net/mlx5e: Refactor RQ XDP_TX indication")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The buffers mapping of the Multi-Packet WQEs (of Striding RQ)
is done via UMR posts, one UMR WQE per an RX MPWQE.
A single MPWQE is capable of serving many incoming packets,
usually larger than the budget of a single napi cycle.
Hence, posting a single UMR WQE per napi cycle (and handling its
completion in the next cycle) works fine in many common cases,
but not always.
When an XDP program is loaded, every MPWQE is capable of serving less
packets, to satisfy the packet-per-page requirement.
Thus, for the same number of packets more MPWQEs (and UMR posts)
are needed (twice as much for the default MTU), giving less latency
room for the UMR completions.
In this patch, we add support for multiple outstanding UMR posts,
to allow faster gap closure between consuming MPWQEs and reposting
them back into the WQ.
For better SW and HW locality, we combine the UMR posts in bulks of
(at least) two.
This is expected to improve packet rate in high CPU scale.
Performance test:
As expected, huge improvement in large-scale (48 cores).
xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Before: Unstable, 7 to 30 Mpps
After: Stable, at 70.5 Mpps
No degradation in other tested scenarios.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add support for VSC8514 in Microsemi driver (mscc.c)
with more features.
Signed-off-by: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VSC8514 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX,
1000BASE-X, can communicate with the MAC via QSGMII.
The MAC interface protocol for each port within QSGMII can
be either 1000BASE-X or SGMII, if the QSGMII MAC that the VSC8514 is
connecting to supports this functionality.
VSC8514 also supports SGMII MAC-side autonegotiation on each individual
port, downshifting, can set the blinking pattern of each of its 4 LEDs,
SyncE, 1000BASE-T Ring Resiliency as well as HP Auto-MDIX detection.
This adds support for 10BASE-T, 100BASE-TX, and 1000BASE-T,
QSGMII link with the MAC, downshifting, HP Auto-MDIX detection
and blinking pattern for its 4 LEDs.
The GPIO register bank is a set of registers that are common to all PHYs
in the package. So any modification in any register of this bank affects
all PHYs of the package.
If the PHYs haven't been reset before booting the Linux kernel and were
configured to use interrupts for e.g. link status updates, it is
required to clear the interrupts mask register of all PHYs before being
able to use interrupts with any PHY. The first PHY of the package that
will be init will take care of clearing all PHYs interrupts mask
registers. Thus, we need to keep track of the init sequence in the
package, if it's already been done or if it's to be done.
Most of the init sequence of a PHY of the package is common to all PHYs
in the package, thus we use the SMI broadcast feature which enables us
to propagate a write in one register of one PHY to all PHYs in the same
package.
Signed-off-by: Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Co-developed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing <of_device_id> table for SPI driver relying on SPI
device match since compatible is in a DT binding or in a DTS.
Before this patch:
modinfo drivers/net/phy/spi_ks8995.ko | grep alias
alias: spi:ksz8795
alias: spi:ksz8864
alias: spi:ks8995
After this patch:
modinfo drivers/net/phy/spi_ks8995.ko | grep alias
alias: spi:ksz8795
alias: spi:ksz8864
alias: spi:ks8995
alias: of:N*T*Cmicrel,ksz8795C*
alias: of:N*T*Cmicrel,ksz8795
alias: of:N*T*Cmicrel,ksz8864C*
alias: of:N*T*Cmicrel,ksz8864
alias: of:N*T*Cmicrel,ks8995C*
alias: of:N*T*Cmicrel,ks8995
Reported-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Daniel Gomez <dagmcr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default m88e151x LED configuration is 0x1177, used LED[0]
for 1000M link, LED[1] for 100M link, and LED[2] for active.
But for some boards, which use LED[0] for link, and LED[1] for
active, prefer to be 0x1040. To be compatible with this case,
this patch defines a new dev_flag, and set it before connect
phy in HNS3 driver. When phy initializing, using the new
LED configuration if this dev_flag is set.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update all users of eth_get_headlen to pass network device, fetch
network namespace from it and pass it down to the flow dissector.
This commit is a noop until administrator inserts BPF flow dissector
program.
Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
All we do is write the length/status and address bits to a DMA
descriptor only to write its contents into on-chip registers right
after, eliminate this unnecessary step.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this patch the socket address family sporadically gets wrong
value ends up the dev_set_mac_address() fails to set the desired MAC
address.
Fixes: 25766271e4 ("r8152: Refresh MAC address during USBDEVFS_RESET")
Signed-off-by: Crag.Wang <crag.wang@dell.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-By: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the CPU port to use the new dedicated egress pool instead the
previously used egress pool which was shared with normal front panel
ports.
Add per-port quotas for the amount of traffic that can be buffered for
the CPU port and also adjust the per-{port, TC} quotas.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CPU port is used to transmit traffic that is trapped to the host
CPU. It is therefore irrelevant to define ingress quota for it.
Add a 'skip_ingress' argument to the function tasked with configuring
per-port quotas, so that ingress quotas could be skipped in case the
passed local port is the CPU port.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function is used to set the per-port shared buffer quotas.
Currently, these quotas are only set for front panel ports, but a
subsequent patch will configure these quotas for the CPU port as well.
The configuration required for the CPU port is a bit different than that
of the front panel ports, so split the business logic into a separate
function which will be called with different parameters for the CPU
port.
No functional changes intended.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new ingress pool that was added in the previous patch for
control packets (e.g., STP, LACP) that are trapped to the CPU.
The previous management pool is no longer necessary and therefore its
size is set to 0.
The maximum quota for traffic towards the CPU is increased to 50% of the
free space in the new ingress pool and therefore the reserved space is
reduced by half, to 10KB - in both the shared and headroom buffer. This
allows for more efficient utilization of the shared buffer as reserved
space cannot be used for other purposes.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets that are trapped to the CPU are transmitted through the CPU port
to the attached host. The CPU port is therefore like any other port and
needs to have shared buffer configuration.
The maximum quotas configured for the CPU are provided using dynamic
threshold and cannot be changed by the user. In order to make sure that
these thresholds are always valid, the configuration of the threshold
type of these pools is forbidden.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code currently assumes that ingress pools have lower indices than
egress pools. This makes it impossible to add more ingress pools
without breaking user configuration that relies on a certain pool index
to correspond to an egress pool.
Remove such assumptions from the code, so that more ingress pools could
be added by subsequent patches.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e83c045e53 ("mlxsw: spectrum_buffers: Configure MC pool")
configured the threshold of the multicast TCs as infinite so that the
admission of multicast packets is only depended on per-switch priority
threshold.
Forbid the user from changing the thresholds of these multicast TCs and
their binding to a different pool.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>