Commit Graph

29979 Commits

Author SHA1 Message Date
Mauro S. M. Rodrigues
54579ca837 i40e: Implement debug macro hw_dbg using dev_dbg
There are several uses of hw_dbg in the code, producing no output. This
patch implements it using dev_debug.

Initially the intention was to implement it using netdev_dbg, analogously
to what is done in ixgbe for instance. That approach was avoided due to
some early usages of hw_dbg, like i40e_pf_reset, before the VSI structure
initialization causing NULL pointer dereference during the driver probe if
the debug messages were turned on as soon as the module is probed.

v2:
 - Use dev_dbg instead of pr_debug, and take advantage of dev_name
instead of crafting pretty much the same device name locally as suggested
by Jakub Kicinski.

Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 11:22:20 -07:00
Mauro S. M. Rodrigues
e1a8ca11c7 i40e: fix hw_dbg usage in i40e_hmc_get_object_va
The mentioned function references a i40e_hw attribute, as parameter for
hw_dbg, but it doesn't exist in the function scope.
Fixes it by changing  parameters from i40e_hmc_info to i40e_hw which can
retrieve the necessary i40e_hmc_info.

v2:
 - Fixed reverse xmas tree code style issue as suggested by Jakub Kicinski

Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:18:18 -07:00
Sasha Neftin
00c0916618 igc: Remove unneeded PCI bus defines
PCIe device control 2 defines does not use internally.
This patch comes to clean up those.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Mitch Williams
155f0ac2c9 iavf: allow permanent MAC address to change
Allow the VF to override the "permanent" MAC address set by the host.
This allows bonding to work in the case where the administrator has set
the VF MAC.

Note that the VF must still be set to Trusted on the host if this change
is to be accepted by the PF driver.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Sasha Neftin
9b924edd8f igc: Add NVM checksum validation
Add NVM checksum validation during probe functionality.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Jacob Keller
0ea7e88d3f fm10k: use a local variable for the frag pointer
In the function fm10k_xmit_frame_ring, we recently switched to using
the skb_frag_size accessor instead of directly using the size member of
the skb fragment.

This made the for loop slightly harder to read because it created a very
long line that is difficult to split up. Avoid this by using a local
variable in the for loop, so that we do not have to break the line on an
open parenthesis.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Sasha Neftin
10ce2c00cf igc: Remove useless forward declaration
Move igc_phy_setup_autoneg, igc_wait_autoneg and igc_set_fc_watermarks
up to avoid forward declaration.
It is not necessary to forward declare these static methods.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Kai-Heng Feng
dee23594d5 e1000e: Make speed detection on hotplugging cable more reliable
After hot plugging an 1Gbps Ethernet cable with 1Gbps link partner, the
MII_BMSR may report 10Mbps, renders the network rather slow.

The issue has much lower fail rate after commit 59653e6497 ("e1000e:
Make watchdog use delayed work"), which essentially introduces some
delay before running the watchdog task.

But there's still a chance that the hot plugging event and the queued
watchdog task gets run at the same time, then the original issue can be
observed once again.

So let's use mod_delayed_work() to add a deterministic 1 second delay
before running watchdog task, after an interrupt.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Radoslaw Tyl
d7cb9da186 ixgbevf: Link lost in VM on ixgbevf when restoring from freeze or suspend
This patch fixed issue in VM which shows no link when hypervisor is
restored from low-power state. The driver is responsible for re-enabling
any features of the device that had been disabled during suspend calls,
such as IRQs and bus mastering.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
YueHaibing
2410a3dad4 iavf: remove unused debug function iavf_debug_d
There is no caller of function iavf_debug_d() in tree since
commit 75051ce4c5 ("iavf: Fix up debug print macro"),
so it can be removed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09 10:08:38 -07:00
Fred Lotter
28abe57962 nfp: flower: cmsg rtnl locks can timeout reify messages
Flower control message replies are handled in different locations. The truly
high priority replies are handled in the BH (tasklet) context, while the
remaining replies are handled in a predefined Linux work queue. The work
queue handler orders replies into high and low priority groups, and always
start servicing the high priority replies within the received batch first.

Reply Type:			Rtnl Lock:	Handler:

CMSG_TYPE_PORT_MOD		no		BH tasklet (mtu)
CMSG_TYPE_TUN_NEIGH		no		BH tasklet
CMSG_TYPE_FLOW_STATS		no		BH tasklet
CMSG_TYPE_PORT_REIFY		no		WQ high
CMSG_TYPE_PORT_MOD		yes		WQ high (link/mtu)
CMSG_TYPE_MERGE_HINT		yes		WQ low
CMSG_TYPE_NO_NEIGH		no		WQ low
CMSG_TYPE_ACTIVE_TUNS		no		WQ low
CMSG_TYPE_QOS_STATS		no		WQ low
CMSG_TYPE_LAG_CONFIG		no		WQ low

A subset of control messages can block waiting for an rtnl lock (from both
work queue priority groups). The rtnl lock is heavily contended for by
external processes such as systemd-udevd, systemd-network and libvirtd,
especially during netdev creation, such as when flower VFs and representors
are instantiated.

Kernel netlink instrumentation shows that external processes (such as
systemd-udevd) often use successive rtnl_trylock() sequences, which can result
in an rtnl_lock() blocked control message to starve for longer periods of time
during rtnl lock contention, i.e. netdev creation.

In the current design a single blocked control message will block the entire
work queue (both priorities), and introduce a latency which is
nondeterministic and dependent on system wide rtnl lock usage.

In some extreme cases, one blocked control message at exactly the wrong time,
just before the maximum number of VFs are instantiated, can block the work
queue for long enough to prevent VF representor REIFY replies from getting
handled in time for the 40ms timeout.

The firmware will deliver the total maximum number of REIFY message replies in
around 300us.

Only REIFY and MTU update messages require replies within a timeout period (of
40ms). The MTU-only updates are already done directly in the BH (tasklet)
handler.

Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
caused by a blocked work queue waiting on rtnl locks.

Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:05:50 +02:00
Colin Ian King
e9ac25b70d net: hns3: make array spec_opcode static const, makes object smaller
Don't populate the array spec_opcode on the stack but instead make it
static const. Makes the object code smaller by 48 bytes.

Before:
   text	   data	    bss	    dec	    hex	filename
   6914	   1040	    128	   8082	   1f92	hns3/hns3vf/hclgevf_cmd.o

After:
   text	   data	    bss	    dec	    hex	filename
   6866	   1040	    128	   8034	   1f62	hns3/hns3vf/hclgevf_cmd.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:03:05 +02:00
Colin Ian King
f4ee147686 be2net: make two arrays static const, makes object smaller
Don't populate the arrays on the stack but instead make them
static const. Makes the object code smaller by 281 bytes.

Before:
   text	   data	    bss	    dec	    hex	filename
  87553	   5672	      0	  93225	  16c29	benet/be_cmds.o

After:
   text	   data	    bss	    dec	    hex	filename
  87112	   5832	      0	  92944	  16b10	benet/be_cmds.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:02:17 +02:00
YueHaibing
52d5654046 ionic: Remove unused including <linux/version.h>
Remove including <linux/version.h> that don't need it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:01:14 +02:00
Jose Abreu
d9da2c8717 net: stmmac: Limit max speeds of XGMAC if asked to
We may have some SoCs that can't achieve XGMAC max speed. Limit it if
asked to.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:57:41 +02:00
Jose Abreu
5f8475daa2 net: stmmac: selftests: Add Split Header test
Add a test to validate that Split Header feature is working correctly.
It works by using the rececently introduced counter that increments each
time a packet with split header is received.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:57:41 +02:00
Jose Abreu
41f2a3e636 net: stmmac: dwmac4: Enable RX Jumbo frame support
We are already doing it by default in the TX path so we can also enable
Jumbo Frame support in the RX path independently of MTU value.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:57:41 +02:00
Jose Abreu
b3138c5b0f net: stmmac: selftests: Set RX tail pointer in Flow Control test
We need to set the RX tail pointer so that RX engine starts working
again after finishing the Flow Control test.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:57:41 +02:00
Jose Abreu
034c8fadba net: stmmac: selftests: Add missing checks for support of SA
Add checks for support of Source Address Insertion/Replacement before
running the test.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:57:41 +02:00
David S. Miller
22c63d9c94 mlx5-updates-2019-09-05
1) Allover mlx5 cleanups
 
 2) Added port congestion counters to ethtool stats:
 
 Add 3 counters per priority to ethtool using PPCNT:
   2.1) rx_prio[p]_buf_discard - the number of packets discarded by device
        due to lack of per host receive buffers
   2.2) rx_prio[p]_cong_discard - the number of packets discarded by device
        due to per host congestion
   2.3) rx_prio[p]_marked - the number of packets ECN marked by device due
        to per host congestion
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl1xgcsACgkQSD+KveBX
 +j7XSwf6A6Ri61kZ4fLRfrKcMn7rq7uzUn855UjtNQUpDlmoKSu9lY+SGtMQMENq
 7AMvUmgZJe2Sw47o3N9mxqPa86HS9uGDFhnF7HOKxa/4uzoqqvvdget4BhP0h1xS
 tRDXScnuZfPIs1nuA0w1obgzYb0FwOJhtB1m3rQ6iywAohwmM8mYe9jPREfIGaoy
 U9p+DZEJHIC2YKy4G7hymbNbaKMgMG9IYl9axNyqbGaA9xPTPO4+pBBFvEXffLh6
 UAk7p3XHV0ZIiKq1STIqDJpz6ucP88ywg9Hnkkrz2U82xuk/E43tBvIM9pxgl/bh
 UhJcoL2Wlr5aHNHgIlBgWZ07Q+J0aQ==
 =iM0A
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-09-05

1) Allover mlx5 cleanups

2) Added port congestion counters to ethtool stats:

Add 3 counters per priority to ethtool using PPCNT:
  2.1) rx_prio[p]_buf_discard - the number of packets discarded by device
       due to lack of per host receive buffers
  2.2) rx_prio[p]_cong_discard - the number of packets discarded by device
       due to per host congestion
  2.3) rx_prio[p]_marked - the number of packets ECN marked by device due
       to per host congestion
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:40:18 +02:00
Juliet Kim
1c2977c094 net/ibmvnic: free reset work of removed device from queue
Commit 36f1031c51 ("ibmvnic: Do not process reset during or after
 device removal") made the change to exit reset if the driver has been
removed, but does not free reset work items of the adapter from queue.

Ensure all reset work items are freed when breaking out of the loop early.

Fixes: 36f1031c51 ("ibmnvic: Do not process reset during or after device removal”)
Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:36:14 +02:00
zhong jiang
9b789f476e ethernet: micrel: Use DIV_ROUND_CLOSEST directly to make it readable
The kernel.h macro DIV_ROUND_CLOSEST performs the computation (x + d/2)/d
but is perhaps more readable.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:17:02 +02:00
David S. Miller
6938843dd8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-09-05

This series contains updates to ice driver.

Brett fixes the setting of num_q_vectors by using the maximum number
between the allocated transmit and receive queues.

Anirudh simplifies the code to use a helper function to return the main
VSI, which is the first element in the pf->vsi array.  Adds a pointer
check to prevent a NULL pointer dereference.  Adds a check to ensure we
do not initialize DCB on devices that are not DCB capable.  Does some
housekeeping on the code to remove unnecessary indirection and reduce
the PF structure by removing elements that are not needed since the
values they were storing can be readily gotten from
ice_get_avail_*_count()'s.  Updates the printed strings to make it
easier to search the logs for driver capabilities.

Jesse cleans up unnecessary function arguments.  Updated the code to use
prefetch() to add some efficiency to the driver to avoid a cache miss.
Did some housekeeping on the code to remove the configurable transmit
work limit via ethtool which ended up creating performance overhead.
Made additional performance enhancements by updating the driver to start
out with a reasonable number of descriptors by changing the default to
2048.

Mitch fixes the reset logic for VFs by clearing VF_MBX_ARQLEN register
when the source of the reset is not PFR.

Lukasz updates the driver to include a similar fix for the i40e driver
by reporting link down for VF's when the PF queues are not enabled.

Akeem updates the driver to report the VF link status once we get VF
resources so that we can reflect the link status similarly to how the PF
reports link speed.

Ashish updates the transmit context structure based on recent changes to
the hardware specification.

Dave updates the DCB logic to allow a delayed registration for MIB
change events so that the driver is not accepting events before it is
ready for them.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 15:24:50 +02:00
David S. Miller
1e46c09ec1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add the ability to use unaligned chunks in the AF_XDP umem. By
   relaxing where the chunks can be placed, it allows to use an
   arbitrary buffer size and place whenever there is a free
   address in the umem. Helps more seamless DPDK AF_XDP driver
   integration. Support for i40e, ixgbe and mlx5e, from Kevin and
   Maxim.

2) Addition of a wakeup flag for AF_XDP tx and fill rings so the
   application can wake up the kernel for rx/tx processing which
   avoids busy-spinning of the latter, useful when app and driver
   is located on the same core. Support for i40e, ixgbe and mlx5e,
   from Magnus and Maxim.

3) bpftool fixes for printf()-like functions so compiler can actually
   enforce checks, bpftool build system improvements for custom output
   directories, and addition of 'bpftool map freeze' command, from Quentin.

4) Support attaching/detaching XDP programs from 'bpftool net' command,
   from Daniel.

5) Automatic xskmap cleanup when AF_XDP socket is released, and several
   barrier/{read,write}_once fixes in AF_XDP code, from Björn.

6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf
   inclusion as well as libbpf versioning improvements, from Andrii.

7) Several new BPF kselftests for verifier precision tracking, from Alexei.

8) Several BPF kselftest fixes wrt endianess to run on s390x, from Ilya.

9) And more BPF kselftest improvements all over the place, from Stanislav.

10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub.

11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan.

12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni.

13) Small optimization in arm64 JIT to spare 1 insns for BPF_MOD, from Jerin.

14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar.

15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari,
    Peter, Wei, Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:49:17 +02:00
Colin Ian King
f9bcfe214b lan743x: remove redundant assignment to variable rx_process_result
The variable rx_process_result is being initialized with a value that
is never read and is being re-assigned immediately afterwards. The
assignment is redundant, so replace it with the return from function
lan743x_rx_process_packet.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:47:07 +02:00
Simon Horman
fd8ab76a85 ravb: TROCR register is only present on R-Car Gen3
Only use the TROCR register on R-Car Gen3 as it is not present on other
SoCs.

Offsets used for the undocumented registers are considered reserved and
should not be written to. After some internal investigation with Renesas it
remains unclear why this driver accesses these fields on R-Car Gen2 but
regardless of what the historical reasons are the current code is
considered incorrect.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:46:10 +02:00
Simon Horman
2d957a7e2a ravb: remove undocumented endianness selection
This patch removes the use of the undocumented BOC bit of the CCC register.

Current documentation for EtherAVB (ravb) describes the offset of what the
driver uses as the BOC bit as reserved and that only a value of 0 should be
written. After some internal investigation with Renesas it remains unclear
why this driver accesses these fields but regardless of what the historical
reasons are the current code is considered incorrect.

Based on work by Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:46:09 +02:00
Simon Horman
009a470365 ravb: remove undocumented counter processing
This patch removes the use of the undocumented counter registers
CDCR, LCCR, CERCR, CEECR.

Offsets used for undocumented registers are considered reserved and
should not be written to. After some internal investigation with Renesas
it remains unclear why this driver accesses these fields but regardless of
what the historical reasons are the current code is considered incorrect.

Based on work by Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:46:09 +02:00
Simon Horman
845e4b8014 ravb: correct typo in FBP field of SFO register
The field name is FBP rather than FPB.

This field is unused and could equally be removed from the driver entirely.
But there seems no harm in leaving as documentation of the presence of the
field.

Based on work by Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:46:09 +02:00
Guojia Liao
91f8ff09ad net: hns3: make hclge_dbg_get_m7_stats_info static
hclge_dbg_get_m7_info is used only in the hclge_debugfs.c,
so it should be declared with static.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:35 +02:00
Yufeng Mo
1cbc662dd8 net: hns3: disable loopback setting in hclge_mac_init
If the selftest and reset are performed at the same time, the loopback
setting may be still in the enable state after the reset. As a result,
packets cannot be sent out.

This patch fixes this issue by disabling loopback in hclge_mac_init.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:35 +02:00
Guojia Liao
1483fa4946 net: hns3: remove explicit conversion to bool
Relational and logical operators evaluate to bool,
explicit conversion is overly verbose and unnecessary.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:34 +02:00
Peng Li
b7cf22b74a net: hns3: add client node validity judgment
HNS3 driver can only unregister client which included in hnae3_client_list.
This patch adds the client node validity judgment.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:34 +02:00
Huazhong Tan
525a294e60 net: hns3: fix mis-assignment to hdev->reset_level in hclge_reset
Since hclge_get_reset_level may return HNAE3_NONE_RESET,
so hdev->reset_level can not be assigned with the return
value in the hclge_reset(), otherwise, it will cause
the use of hdev->reset_level in hclge_reset_event get
into error.

Fixes: 012fcb52f6 ("net: hns3: activate reset timer when calling reset_event")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:34 +02:00
Huazhong Tan
323a2ac522 net: hns3: fix double free bug when setting ringparam
The system will panic when change the ringparam in HNS3 drivers:

[ 1459.627727] hns3 0000:bd:00.0 eth6: Changing Tx/Rx ring ds from 1024/1024 to 24/24
[ 1459.635766] hns3 0000:bd:00.0 eth6: link down
[ 1459.640788] BUG: Bad page state in process ethtool  pfn:203f75c18
[ 1459.646940] page:ffff7ee4ffd70600 refcount:0 mapcount:0 mapping:ffff993fff40f400 index:0x0 compound_mapcount: 0
[ 1459.656987] flags: 0x9fffe00000010200(slab|head)
[ 1459.661591] raw: 9fffe00000010200 dead000000000100 dead000000000122 ffff993fff40f400
[ 1459.669302] raw: 0000000000000000 0000000080100010 00000000ffffffff 0000000000000000
[ 1459.677016] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 1459.683432] bad because of flags: 0x200(slab)
[ 1459.687775] Modules linked in: ib_ipoib ib_umad rpcrdma ib_iser libiscsi scsi_transport_iscsi hns_roce_hw_v2 crct10dif_ce hns3 ses hclge hnae3 hisi_hpre hisi_zip qm uacce ip_tables x_tables hisi_sas_v3_hw hisi_sas_main libsas scsi_transport_sas
[ 1459.709329] CPU: 14 PID: 17244 Comm: ethtool Tainted: G           O      5.3.0-rc4-00415-gc86f057 #1
[ 1459.718419] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B040.01 07/26/2019
[ 1459.727248] Call trace:
[ 1459.729688]  dump_backtrace+0x0/0x150
[ 1459.733335]  show_stack+0x24/0x30
[ 1459.736639]  dump_stack+0xa0/0xc4
[ 1459.739943]  bad_page+0xf0/0x158
[ 1459.743157]  free_pages_check_bad+0x84/0xa0
[ 1459.747322]  __free_pages_ok+0x348/0x378
[ 1459.751228]  page_frag_free+0x80/0x88
[ 1459.754877]  skb_free_head+0x38/0x48
[ 1459.758436]  skb_release_data+0x134/0x160
[ 1459.762427]  skb_release_all+0x30/0x40
[ 1459.766158]  consume_skb+0x38/0x108
[ 1459.769633]  __dev_kfree_skb_any+0x58/0x68
[ 1459.773718]  hns3_fini_ring+0x48/0x58 [hns3]
[ 1459.777970]  hns3_set_ringparam+0x2a8/0x418 [hns3]
[ 1459.782741]  dev_ethtool+0x5f4/0x2080
[ 1459.786390]  dev_ioctl+0x190/0x3d8
[ 1459.789777]  sock_do_ioctl+0xf8/0x220
[ 1459.793423]  sock_ioctl+0x3bc/0x490
[ 1459.796896]  do_vfs_ioctl+0xc4/0x868
[ 1459.800454]  ksys_ioctl+0x8c/0xa0
[ 1459.803752]  __arm64_sys_ioctl+0x28/0x38
[ 1459.807658]  el0_svc_common.constprop.0+0xe0/0x1e0
[ 1459.812426]  el0_svc_handler+0x34/0x90
[ 1459.816158]  el0_svc+0x10/0x14
[ 1459.819220] Disabling lock debugging due to kernel taint
[ 1459.825182] ------------[ cut here ]------------

Since ndo_stop will reclaim the RX's skb allocated by the driver,
so the backed up ring parameter should not keep this info.

Fixes: a723fb8efe ("net: hns3: refine for set ring parameters")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:34 +02:00
Jian Shen
d9c0f2756a net: hns3: fix error VF index when setting VLAN offload
In original codes, the VF index used incorrectly in function
hclge_set_vlan_rx_offload_cfg() and hclge_set_vlan_rx_offload_cfg().
When VF id is greater than 8, for example 9, it will set the
same bit with VF id 1.

This patch fixes it by using  vport->vport_id % HCLGE_VF_NUM_PER_CMD /
HCLGE_VF_NUM_PER_BYTE as the array index, instead of vport->vport_id /
HCLGE_VF_NUM_PER_CMD.

Fixes: 052ece6dc1 ("net: hns3: add ethtool related offload command")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:20:34 +02:00
Andy Shevchenko
c3a502deaf stmmac: platform: adjust messages and move to dev level
This patch amends the error and warning messages across the platform driver.
It includes the following changes:
 - append \n to the end of messages
 - change pr_* macros to dev_*

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:18:35 +02:00
Zhu Yanjun
f4b633b911 forcedeth: use per cpu to collect xmit/recv statistics
When testing with a background iperf pushing 1Gbit/sec traffic and running
both ifconfig and netstat to collect statistics, some deadlocks occurred.

Ifconfig and netstat will call nv_get_stats64 to get software xmit/recv
statistics. In the commit f5d827aece ("forcedeth: implement
ndo_get_stats64() API"), the normal tx/rx variables is to collect tx/rx
statistics. The fix is to replace normal tx/rx variables with per
cpu 64-bit variable to collect xmit/recv statistics. The per cpu variable
will avoid deadlocks and provide fast efficient statistics updates.

In nv_probe, the per cpu variable is initialized. In nv_remove, this
per cpu variable is freed.

In xmit/recv process, this per cpu variable will be updated.

In nv_get_stats64, this per cpu variable on each cpu is added up. Then
the driver can get xmit/recv packets statistics.

A test runs for several days with this commit, the deadlocks disappear
and the performance is better.

Tested:
   - iperf SMP x86_64 ->
   Client connecting to 1.1.1.108, TCP port 5001
   TCP window size: 85.0 KByte (default)
   ------------------------------------------------------------
   [  3] local 1.1.1.105 port 38888 connected with 1.1.1.108 port 5001
   [ ID] Interval       Transfer     Bandwidth
   [  3]  0.0-10.0 sec  1.10 GBytes   943 Mbits/sec

   ifconfig results:

   enp0s9 Link encap:Ethernet  HWaddr 00:21:28:6f:de:0f
          inet addr:1.1.1.105  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5774764531 errors:0 dropped:0 overruns:0 frame:0
          TX packets:633534193 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7646159340904 (7.6 TB) TX bytes:11425340407722 (11.4 TB)

   netstat results:

   Kernel Interface table
   Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
   ...
   enp0s9 1500 0  5774764531 0    0 0      633534193      0      0  0 BMRU
   ...

Fixes: f5d827aece ("forcedeth: implement ndo_get_stats64() API")
CC: Joe Jin <joe.jin@oracle.com>
CC: JUNXIAO_BI <junxiao.bi@oracle.com>
Reported-and-tested-by: Nan san <nan.1986san@gmail.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:06:42 +02:00
Mao Wenan
6e1cdedcf0 net: sonic: return NETDEV_TX_OK if failed to map buffer
NETDEV_TX_BUSY really should only be used by drivers that call
netif_tx_stop_queue() at the wrong moment. If dma_map_single() is
failed to map tx DMA buffer, it might trigger an infinite loop.
This patch use NETDEV_TX_OK instead of NETDEV_TX_BUSY, and change
printk to pr_err_ratelimited.

Fixes: d9fb9f3842 ("*sonic/natsemi/ns83829: Move the National Semi-conductor drivers")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:04:57 +02:00
zhong jiang
47e2527769 nfp: Drop unnecessary continue in nfp_net_pf_alloc_vnics
Continue is not needed at the bottom of a loop.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 14:58:21 +02:00
Aya Levin
1297d97f48 net/mlx5e: Add port buffer's congestion counters
Add 3 counters per priority to ethtool using PPCNT:
1) rx_prio[p]_buf_discard - the number of packets discarded by device
   due to lack of per host receive buffers
2) rx_prio[p]_cong_discard - the number of packets discarded by device
   due to per host congestion
3) rx_prio[p]_marked - the number of packets ECN marked by device due
   to per host congestion

Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:43 -07:00
Saeed Mahameed
63d67f3059 net/mlx5: DR, Remove redundant dev_name print from err log
mlx5_core_err already prints the name of the device.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:43 -07:00
Wei Yongjun
83de91f826 net/mlx5: DR, Fix error return code in dr_domain_init_resources()
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 4ec9e7b026 ("net/mlx5: DR, Expose steering domain functionality")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:42 -07:00
Wei Yongjun
f6a8cddfb5 net/mlx5: DR, Remove useless set memory to zero use memset()
The memory return by kzalloc() has already be set to zero, so
remove useless memset(0).

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:42 -07:00
Maxim Mikityanskiy
7f7edefda1 net/mlx5e: Remove unnecessary clear_bit()s
Don't clear MLX5E_SQ_STATE_ENABLED on error in mlx5e_open_txqsq and
mlx5e_open_icosq, because it's not set there, and is 0 by default.

Fixes: acc6c5953a ("net/mlx5e: Split open/close channels to stages")
Fixes: 9d18b5144a ("net/mlx5e: Split open/close ICOSQ into stages")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:42 -07:00
Tariq Toukan
fa9e01c895 net/mlx5e: kTLS, Remove unused function parameter
SKB parameter is no longer used in tx_post_resync_dump(),
remove it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:42 -07:00
zhong jiang
a2b7189be6 net/mlx5: Use PTR_ERR_OR_ZERO rather than its implementation
PTR_ERR_OR_ZERO contains if(IS_ERR(...)) + PTR_ERR. It is better
to use it directly. hence just replace it.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:41 -07:00
Colin Ian King
e53e665558 net/mlx5: fix missing assignment of variable err
The error return from a call to mlx5_flow_namespace_set_peer is not
being assigned to variable err and hence the error check following
the call is currently not working.  Fix this by assigning ret as
intended.

Addresses-Coverity: ("Logically dead code")
Fixes: 8463daf17e ("net/mlx5: Add support to use SMFS in switchdev mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:41 -07:00
Colin Ian King
4938c3d845 net/mlx5: fix spelling mistake "offlaods" -> "offloads"
There is a spelling mistake in a NL_SET_ERR_MSG_MOD error message.
Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:41 -07:00
Roi Dayan
a6d35fb47a net/mlx5e: Remove leftover declaration
This function was removed in the cited commit below.

Fixes: 13e509a4c1 ("net/mlx5e: Remove leftover code from the PF netdev being uplink rep")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:41 -07:00
Saeed Mahameed
5cc3a8c66d net/mlx5e: Use ipv6_stub to avoid dependency with ipv6 being a module
mlx5 is dependent on IPv6 tristate since we use ipv6's nd_tbl directly,
alternatively we can use ipv6_stub->nd_tbl and remove the dependency.

Reported-by: Walter Harms <wharms@bfs.de>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:40 -07:00
Mao Wenan
4057a7652b net/mlx5: Kconfig: Fix MLX5_CORE dependency with PCI_HYPERV_INTERFACE
When MLX5_CORE=y and PCI_HYPERV_INTERFACE=m, below errors are found:
drivers/net/ethernet/mellanox/mlx5/core/en_main.o: In function `mlx5e_nic_enable':
en_main.c:(.text+0xb649): undefined reference to `mlx5e_hv_vhca_stats_create'
drivers/net/ethernet/mellanox/mlx5/core/en_main.o: In function `mlx5e_nic_disable':
en_main.c:(.text+0xb8c4): undefined reference to `mlx5e_hv_vhca_stats_destroy'

Fix this by making MLX5_CORE imply PCI_HYPERV_INTERFACE.

Fixes: cef35af34d ("net/mlx5e: Add mlx5e HV VHCA stats agent")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:40 -07:00
Eran Ben Elisha
394cf13c24 net/mlx5e: Fix static checker warning of potential pointer math issue
Cited patch have an issue in WARN_ON_ONCE check, with wrong address ranges
are compared. Fix that by changing pointer types from u64* to void*. This
will also make code simpler to read.

In addition mlx5e_hv_vhca_fill_ring_stats can get void pointer, so remove
the unnecessary casting when calling it.

Found by static checker:
drivers/net/ethernet/mellanox/mlx5/core/en/hv_vhca_stats.c:41 mlx5e_hv_vhca_fill_stats()
warn: potential pointer math issue ('buf' is a u64 pointer)

Fixes: cef35af34d ("net/mlx5e: Add mlx5e HV VHCA stats agent")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05 14:44:40 -07:00
Anirudh Venkataramanan
5c875c1af8 ice: Rework around device/function capabilities
ice_parse_caps is printing capabilities in a different way when
compared to the variable names. This makes it difficult to search for
the right strings in the debug logs. So this patch updates the
print strings to be exactly the same as the fields' name in the
structure.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Jesse Brandeburg
dd47e1fd86 ice: change default number of receive descriptors
The driver should start out with a reasonable number of descriptors that
can prevent drops due to a CPU being in a power management state.
Change the default number of descriptors to 2048.
The user can always change the value at runtime.  Transmit descriptor
counts are not modified because they don't need to change due to the
speed of the interface, or for power managed CPUs, but the code is
simplified to a fixed value for the transmit default.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan
8c243700ab ice: Minor refactor in queue management
Remove q_left_tx and q_left_rx from the PF struct as these can be
obtained by calling ice_get_avail_txq_count and ice_get_avail_rxq_count
respectively.

The function ice_determine_q_usage is only setting num_lan_tx and
num_lan_rx in the PF structure, and these are later assigned to
vsi->alloc_txq and vsi->alloc_rxq respectively. This is an unnecessary
indirection, so remove ice_determine_q_usage and just assign values
for vsi->alloc_txq and vsi->alloc_rxq in ice_vsi_set_num_qs and use
these to set num_lan_tx and num_lan_rx respectively.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Dave Ertman
ea300f41bb ice: Allow for delayed LLDP MIB change registration
Add an additional boolean parameter to the ice_init_dcb
function.  This boolean controls if the LLDP MIB change
events are registered for.  Also, add a new function
defined ice_cfg_lldp_mib_change.  The additional function
is necessary to be able to register for LLDP MIB change
events after calling ice_init_dcb.  The net effect of these
two changes is to allow a delayed registration for MIB change
events so that the driver is not accepting events before it
is ready for them.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Ashish Shah
201beeb715 ice: update Tx context struct
Add internal usage flag, bit 91 as described in spec.
Update width of internal queue state to 122 also as described in spec.

Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Akeem G Abodunrin
dfc6240012 ice: Report VF link status with opcode to get resources
This patch changes how and when the driver report link status, instead of
waiting till the call to enable queues for VF, we should report link
status earlier with opcode to get VF resources - So as to avoid reporting
erroneous information, especially when queues have not been configured.
In addition, we can also make a call to get and report link status change
after when queue is enabled, at least to report netdev or PHY link status.
This is in accordance to how link speed is being reported for PF...

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan
80739b57b1 ice: Check for DCB capability before initializing DCB
Check the ICE_FLAG_DCB_CAPABLE before calling ice_init_pf_dcb.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Lukasz Czapnik
c61d234234 ice: report link down for VF when PF's queues are not enabled
This is port of a fix from i40e commit 2ad1274fa3 ("i40e: don't
report link up for a VF who hasn't enabled queues")

Older VF drivers do not respond well to receiving a link
up notification before queues are enabled. This can cause their state
machine to think that it is safe to send traffic. This results in a Tx
hang on the VF.

Record whether the PF has actually enabled queues for the VF. When
reporting link status, always report link down if the queues aren't
enabled. In this way, the VF driver will never receive a link up
notification until after its queues are enabled.

Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Mitch Williams
29d42f1f3a ice: Reliably reset VFs
When a PFR (or bigger reset) occurs, the device clears the VF_MBX_ARQLEN
register for all VFs. But if a VFR is triggered by a VF, the device does
NOT clear this register, and the VF driver will never see the reset.

When this happens, the VF driver will eventually timeout and attempt
recovery, and usually it will be successful. But this makes resets take
a long time and there are occasional failures.

We cannot just blithely clear this register on every reset; this has
been shown to cause synchronization problems when a PFR is triggered
with a large number of VFs.

Fix this by clearing VF_MBX_ARQLEN when the reset source is not PFR.
GlobR will trigger PFR, so this test catches that occurrence as well.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Jesse Brandeburg
9d56b7fd6a ice: change work limit to a constant
The driver has supported a transmit work limit
that was configurable from ethtool for a long time, but
there are no good use cases for having it be a variable
that can be changed at run time.  In addition, this
variable was noted to be causing performance overhead
due to cache misses.

Just remove the variable and let the code use a constant
so that the functionality is maintained (a limit on the
number of transmits that will be cleaned in any one call
to the clean routines) without the cache miss.

Removes code, removes a variable, removes testing surface. Yay.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Jesse Brandeburg
d27525ec1f ice: small efficiency fixes
Add a small bit of efficiency to the code by adding a
prefetch of the port_info structure in order to help
avoid a cache miss a little later on in execution.

Also add an unlikely statement to a branch which
generally will never happen in normal operation.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Jesse Brandeburg
6503b65930 ice: move code closer together
This is a simple patch to move the assignment to a local variable
closer to the site where the local variable is used.  This
can help readability and also maybe performance, although the
performance enhancement is really dependent upon the compiler.

No functional change.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Jesse Brandeburg
2fb0821fd5 ice: clean up arguments
There are a couple of functions that don't need two arguments
passed in when the second argument already had access to
the pointer pointed to by the first.

Remove the unnecessary arguments.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan
ade78c2ec1 ice: Check root pointer for validity
ice_sched_get_tc_node uses pi->root without checking for NULL. Add a
check to prevent NULL pointer dereference.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan
208ff75135 ice: Add ice_get_main_vsi to get PF/main VSI
There are multiple places where we currently use ice_find_vsi_by_type
to get the PF (a.k.a. main) VSI. The PF VSI by definition is always
the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead
add and use a new helper function ice_get_main_vsi, which just returns
pf->vsi[0].

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Brett Creeley
34cdcb165b ice: Update fields in ice_vsi_set_num_qs when reconfiguring
Currently when vsi->req_txqs or vsi->req_rxqs are set we don't
correctly set the number of vsi->num_q_vectors. Fix this by
setting the number of queue vectors based on the max
between the vsi->alloc_txqs and vsi->alloc_rxqs.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Kevin Laatz
7cbbf9f1fa ixgbe: fix xdp handle calculations
Currently, we don't add headroom to the handle in ixgbe_zca_free,
ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc. The addition of the
headroom to the handle was removed in
commit d8c3061e5e ("ixgbe: modify driver for handling offsets"), which
will break things when headroom isvnon-zero. This patch fixes this and uses
xsk_umem_adjust_offset to add it appropritely based on the mode being run.

Fixes: d8c3061e5e ("ixgbe: modify driver for handling offsets")
Reported-by: Bjorn Topel <bjorn.topel@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 13:53:43 +02:00
Kevin Laatz
4c5d9a7fa1 i40e: fix xdp handle calculations
Currently, we don't add headroom to the handle in i40e_zca_free,
i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc. The addition of the
headroom to the handle was removed in
commit 2f86c806a8 ("i40e: modify driver for handling offsets"), which
will break things when headroom is non-zero. This patch fixes this and uses
xsk_umem_adjust_offset to add it appropritely based on the mode being run.

Fixes: 2f86c806a8 ("i40e: modify driver for handling offsets")
Reported-by: Bjorn Topel <bjorn.topel@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 13:53:02 +02:00
Krzysztof Wilczynski
5e5d8bc4a0 net: hns: Move static keyword to the front of declaration
Move the static keyword to the front of declaration of g_dsaf_mode_match,
and resolve the following compiler warning that can be seen when building
with warnings enabled (W=1):

drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:27:1: warning:
  ‘static’ is not at beginning of declaration [-Wold-style-declaration]

Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:38:42 +02:00
Krzysztof Wilczynski
ee4c3deac7 net: qed: Move static keyword to the front of declaration
Move the static keyword to the front of declaration of iwarp_state_names,
and resolve the following compiler warning that can be seen when building
with warnings enabled (W=1):

drivers/net/ethernet/qlogic/qed/qed_iwarp.c:385:1: warning:
  ‘static’ is not at beginning of declaration [-Wold-style-declaration]

Also, resolve checkpatch.pl script warning:

WARNING: static const char * array should probably be
  static const char * const

Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:38:22 +02:00
Arseny Solokha
8e578e73ef gianfar: use DT more consistently when selecting PHY connection type
Historically, gianfar only used phy-connection-type DT property when
connected to PHY in the rgmii-id mode. It ignored the property otherwise,
relying on the connection type auto-detection carried out by MAC and
providing that reconstructed mode to of_phy_connect(). It also did not
consider alternative phy-mode property at all.

Make the driver properly query DT node for PHY connection type first and
use an obtained value if it was specified there. Otherwise, if a particular
DT relies on connection type auto-detection, fall back to reconstructing
the value from MAC registers, as before.

Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:28:15 +02:00
Arseny Solokha
887b8194fb gianfar: cleanup gianfar.h
Remove now unused macro and structure definitions from gianfar.h that have
accumulated there over time.

Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:28:15 +02:00
Arseny Solokha
7ad387840a gianfar: make five functions static
Make functions that do not have callers outside the translation unit they
are defined in static.

Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:28:14 +02:00
Arseny Solokha
7d993c5f86 gianfar: remove forward declarations
Remove forward declarations of various static functions located in two
driver implementation files and rearrange the corresponding definitions
accordingly.

This patch only introduces mechanical changes, namely it removes forward
declarations and moves function definitions around; it does not change any
functionality.

Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:28:14 +02:00
Jose Abreu
427849e8c3 net: stmmac: selftests: Add Jumbo Frame tests
Add a test to validate the Jumbo Frame support in stmmac in single
channel and multichannel mode.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:55 +02:00
Jose Abreu
8a488c3f97 net: stmmac: xgmac: Enable RX Jumbo frame support
We are already doing it by default in the TX path so we can also enable
Jumbo Frame support in the RX path independently of MTU value.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:55 +02:00
Jose Abreu
56bcd59122 net: stmmac: Correctly assing MAX MTU in XGMAC cores case
Maximum MTU for XGMAC cores is 16k thus the check for presence of XGMAC
shall be done first in order to assign correct value.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:55 +02:00
Jose Abreu
c2b69474d6 net: stmmac: xgmac: Correct RAVSEL field interpretation
RAVSEL means that only RX side is available for AVB features. As we use
both TX and RX features we need to check if RAVSEL is selected and
disable AVB if only RX side is available.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:55 +02:00
Jose Abreu
8f9e5b5db4 net: stmmac: ethtool: Let user configure TX coalesce without RIWT
When RX Watchdog is disabled its currently not possible to configure TX
coalesce settings. Let user configure it anyway.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
0b273ca41f net: stmmac: Only consider RX error when HW Timestamping is not enabled
Only consider that we have an error when HW Timestamping is not enabled
as this can give false positives due to the fact the RX Timestamping in
XGMAC and GMAC cores comes from context descriptors.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
5e3fb0a6e2 net: stmmac: selftests: Implement the ARP Offload test
Implement a test for ARP Offload feature.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
5904a980f9 net: stmmac: xgmac: Implement ARP Offload
Implement the ARP Offload feature in XGMAC cores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
4647e02119 net: stmmac: selftests: Add selftest for L3/L4 Filters
Adds the selftests for L3 and L4 filters with DA/SA/DP/SP support.

Changes from v1:
	- Reduce stack usage (kbuild test robot)

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
425eabddaf net: stmmac: Implement L3/L4 Filters using TC Flower
Implement filters for Layer 3 and Layer 4 using TC Flower API. Add the
corresponding callbacks in XGMAC core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
c104891c4b net: stmmac: Do not return error code in TC Initialization
As we can still use the remaining TC callbacks, e.g. CBS. We should not
fail in the initialization only because RX Parser is not available.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
6338488356 net: stmmac: xgmac: Add RBU handling in DMA interrupt
Add the handling of Receive Buffer Unavailable interrupt in the DMA
handler of XGMAC cores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jose Abreu
9513321069 net: stmmac: selftests: Return proper error code to userspace
We can do better than just return 1 to userspace. Lets return a proper
Linux error code.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:19:54 +02:00
Jiri Pirko
8330f73fe9 rocker: add missing init_net check in FIB notifier
Take only FIB events that are happening in init_net into account. No other
namespaces are supported.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:14:10 +02:00
zhong jiang
10ae8f4e81 ixgbe: Use kzfree() rather than its implementation.
Use kzfree() instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 12:06:04 +02:00
Shannon Nelson
8c15440bce ionic: Add coalesce and other features
Interrupt coalescing, tunable copybreak value, and
tx timeout.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
aa3198819b ionic: Add RSS support
Add code to manipulate through ethtool the RSS configuration
used by the NIC.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
e470355bd9 ionic: Add driver stats
Add in the detailed statistics for ethtool -S that the driver
keeps as it processes packets.  Display of the additional
debug statistics can be enabled through the ethtool priv-flags
feature.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
1a371ea1b7 ionic: Add netdev-event handling
When the netdev gets a new name from userland, pass that name
down to the NIC for internal tracking.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
0f3154e6bc ionic: Add Tx and Rx handling
Add both the Tx and Rx queue setup and handling.  The related
stats display comes later.  Instead of using the generic napi
routines used by the slow-path commands, the Tx and Rx paths
are simplified and inlined in one file in order to get better
compiler optimizations.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
4d03e00a21 ionic: Add initial ethtool support
Add in the basic ethtool callbacks for device information
and control.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
8d61aad4e8 ionic: Add async link status check and basic stats
Add code to handle the link status event, and wire up the
basic netdev hardware stats.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
2a654540be ionic: Add Rx filter and rx_mode ndo support
Add the Rx filtering and rx_mode NDO callbacks.  Also add
the deferred work thread handling needed to manage the filter
requests outside of the netif_addr_lock spinlock.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:44 +02:00
Shannon Nelson
c1e329ebec ionic: Add management of rx filters
Set up the infrastructure for managing Rx filters.  We can't ask the
hardware for what filters it has, so we keep a local list of filters
that we've pushed into the HW.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
beead698b1 ionic: Add the basic NDO callbacks for netdev support
Set up the initial NDO structure and callbacks for netdev
to use, and register the netdev.  This will allow us to do
a few basic operations on the device, but no traffic yet.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
77ceb68e29 ionic: Add notifyq support
The AdminQ is fine for sending messages and requests to the NIC,
but we also need to have events published from the NIC to the
driver.  The NotifyQ handles this for us, using the same interrupt
as AdminQ.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
938962d552 ionic: Add adminq action
Add AdminQ specific message requests and completion handling.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
1d062b7b6f ionic: Add basic adminq support
Most of the NIC configuration happens through the AdminQ message
queue.  NAPI is used for basic interrupt handling and message
queue management.  These routines are set up to be shared among
different types of queues when used in slow-path handling.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
6461b446f2 ionic: Add interrupts and doorbells
The ionic interrupt model is based on interrupt control blocks
accessed through the PCI BAR.  Doorbell registers are used by
the driver to signal to the NIC that requests are waiting on
the message queues.  Interrupts are used by the NIC to signal
to the driver that answers are waiting on the completion queues.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
1a58e19646 ionic: Add basic lif support
The LIF is the Logical Interface, which represents the external
connections.  The NIC can multiplex many LIFs to a single port,
but in most setups, LIF0 is the primary control for the port.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
04436595c4 ionic: Add port management commands
The port management commands apply to the physical port
associated with the PCI device, which might be shared among
several logical interfaces.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
fbfb803153 ionic: Add hardware init and device commands
The ionic device has a small set of PCI registers, including a
device control and data space, and a large set of message
commands.

Also adds new DEVLINK_INFO_VERSION_GENERIC tags for
ASIC_ID, ASIC_REV, and FW.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Shannon Nelson
df69ba4321 ionic: Add basic framework for IONIC Network device driver
This patch adds a basic driver framework for the Pensando IONIC
network device.  There is no functionality right now other than
the ability to load and unload.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:24:43 +02:00
Ioana Radulescu
52b6a4ffe2 dpaa2-eth: Poll Tx pending frames counter on if down
Starting with firmware version MC10.18.0, a new counter for in flight
Tx frames is offered. Use it when bringing down the interface to
determine when all pending Tx frames have been processed by hardware
instead of sleeping a fixed amount of time.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 00:24:06 +02:00
Ioana Radulescu
d84c3a4ded dpaa2-eth: Add new DPNI statistics counters
Recent firmware versions expose more  DPNI counters.
Export relevant ones via ethtool -S.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 00:24:06 +02:00
Ioana Radulescu
ae90a6f0d9 dpaa2-eth: Minor refactoring in ethtool stats
As we prepare to read more pages from the DPNI stat counters,
reorganize the code a bit to make it easier to extend.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 00:24:06 +02:00
Arnd Bergmann
00d2fbf73d net: remove w90p910-ether driver
The ARM w90x900 platform is getting removed, so this driver is obsolete.

Link: https://lore.kernel.org/r/20190809202749.742267-14-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-04 17:57:52 +02:00
Arnd Bergmann
13b0aefee1 net: remove ks8695 driver
The platform is getting removed, so there are no remaining
users of this driver.

Link: https://lore.kernel.org/r/20190809202749.742267-6-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-04 17:57:43 +02:00
David S. Miller
2c1f9e2634 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-09-03

This series contains updates to ice driver only.

Anirudh adds the ability for the driver to handle EMP resets correctly
by adding the logic to the existing ice_reset_subtask().

Jeb fixes up the logic to properly free up the resources for a switch
rule whether or not it was successful in the removal.

Brett fixes up the reporting of ITR values to let the user know odd ITR
values are not allowed.  Fixes the driver to only disable VLAN pruning
on VLAN deletion when the VLAN being deleted is the last VLAN on the VF
VSI.

Chinh updates the driver to determine the TSA value from the priority
value when in CEE mode.

Bruce aligns the driver with the hardware specification by ensuring that
a PF reset is done as part of the unload logic.  Also update the driver
unloading field, based on the latest hardware specification, which
allows us to remove an unnecessary endian conversion.  Moves #defines
based on their need in the code.

Jesse adds the current state of auto-negotiation in the link up message.
In addition, adds additional information to inform the user of an issue
with the topology/configuration of the link.

Usha updates the driver to allow the maximum TCs that the firmware
supports, rather than hard coding to a set value.

Dave updates the DCB initialization flow to handle the case of an actual
error during DCB init.  Updated the driver to report the current stats,
even when the netdev is down, which aligns with our other drivers.

Mitch fixes the VF reset code flows to ensure that it properly calls
ice_dis_vsi_txq() to notify the firmware that the VF is being reset.

Michal fixes the driver so the DCB is not enabled when the SW LLDP is
activated, which was causing a communication issue with other NICs.  The
problem lies in that DCB was being enabled without checking the number
of TCs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-03 21:51:25 -07:00
David S. Miller
94810bd365 mlx5-updates-2019-09-01 (Software steering support)
Abstract:
 --------
 Mellanox ConnetX devices supports packet matching, packet modification and
 redirection. These functionalities are also referred to as flow-steering.
 To configure a steering rule, the rule is written to the device owned
 memory, this memory is accessed and cached by the device when processing
 a packet.
 Steering rules are constructed from multiple steering entries (STE).
 
 Rules are configured using the Firmware command interface. The Firmware
 processes the given driver command and translates them to STEs, then
 writes them to the device memory in the current steering tables.
 This process is slow due to the architecture of the command interface and
 the processing complexity of each rule.
 
 The highlight of this patchset is to cut the middle man (The firmware) and
 do steering rules programming into device directly from the driver, with
 no firmware intervention whatsoever.
 
 Motivation:
 -----------
 Software (driver managed) steering allows for high rule insertion rates
 compared to the FW steering described above, this is achieved by using
 internal RDMA writes to the device owned memory instead of the slow
 command interface to program steering rules.
 
 Software (driver managed) steering, doesn't depend on new FW
 for new steering functionality, new implementations can be done in the
 driver skipping the FW layer.
 
 Performance:
 ------------
 The insertion rate on a single core using the new approach allows
 programming ~300K rules per sec. (Done via direct raw test to the new mlx5
 sw steering layer, without any kernel layer involved).
 
 Test: TC L2 rules
 33K/s with Software steering (this patchset).
 5K/s  with FW and current driver.
 This will improve OVS based solution performance.
 
 Architecture and implementation details:
 ----------------------------------------
 Software steering will be dynamically selected via devlink device
 parameter. Example:
 $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
           pci/0000:06:00.0:
           name flow_steering_mode type driver-specific
           values:
              cmode runtime value smfs
 
 mlx5 software steering module a.k.a (DR - Direct Rule) is implemented
 and contained in mlx5/core/steering directory and controlled by
 MLX5_SW_STEERING kconfig flag.
 
 mlx5 core steering layer (fs_core) already provides a shim layer for
 implementing different steering mechanisms, software steering will
 leverage that as seen at the end of this series.
 
 When Software Steering for a specific steering domain
 (NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules
 targeting this domain to be created using  SW steering instead of FW.
 
 The implementation includes:
 Domain - The steering domain is the object that all other object resides
     in. It holds the memory allocator, send engine, locks and other shared
     data needed by lower objects such as table, matcher, rule, action.
     Each domain can contain multiple tables. Domain is equivalent to
     namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented
     currently in mlx5_core fs_core (flow steering core).
 
 Table - Table objects are used for holding multiple matchers, each table
     has a level used to prevent processing loops. Packets are being
     directed to this table once it is set as the root table, this is done
     by fs_core using a FW command. A packet is being processed inside the
     table matcher by matcher until a successful hit, otherwise the packet
     will perform the default action.
 
 Matcher - Matchers objects are used to specify the fields mask for
     matching when processing a packet. A matcher belongs to a table, each
     matcher can hold multiple rules, each rule with different matching
     values corresponding to the matcher mask. Each matcher has a priority
     used for rule processing order inside the table.
 
 Action - Action objects are created to specify different steering actions
     such as count, reformat (encapsulate, decapsulate, ...), modify
     header, forward to table and many other actions. When creating a rule
     a sequence of actions can be provided to be executed on a successful
     match.
 
 Rule - Rule objects are used to specify a specific match on packets as
     well as the actions that should be executed. A rule belongs to a
     matcher.
 
 STE - This layer is used to hold the specific STE format for the device
     and to convert the requested rule to STEs. Each rule is constructed of
     an STE chain, Multiple rules construct a steering graph. Each node in
     the graph is a hash table containing multiple STEs. The index of each
     STE in the hash table is being calculated using a CRC32 hash function.
 
 Memory pool - Used for managing and caching device owned memory for rule
     insertion. The memory is being allocated using DM (device memory) API.
 
 Communication with device - layer for standard RDMA operation using  RC QP
     to configure the device steering.
 
 Command utility - This module holds all of the FW commands that are
     required for SW steering to function.
 
 Patch planning and files:
 -------------------------
 1) First patch, adds the support to Add flow steering actions to fs_cmd
 shim layer.
 
 2) Next 12 patch will add a file per each Software steering
 functionality/module as described above. (See patches with title: DR, *)
 
 3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable
 build with the new files
 
 4) Next two patches will add the support for software steering in mlx5
 steering shim layer
 net/mlx5: Add API to set the namespace steering mode
 net/mlx5: Add direct rule fs_cmd implementation
 
 5) Last two patches will add the new devlink parameter to select mlx5
 steering mode, will be valid only for switchdev mode for now.
 Two modes are supported:
     1. DMFS - Device managed flow steering
     2. SMFS - Software/Driver managed flow steering.
 
     In the DMFS mode, the HW steering entities are created through the
     FW. In the SMFS mode this entities are created though the driver
     directly.
 
     The driver will use the devlink steering mode only if the steering
     domain supports it, for now SMFS will manages only the switchdev
     eswitch steering domain.
 
     User command examples:
     - Set SMFS flow steering mode::
 
         $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime
 
     - Read device flow steering mode::
 
         $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
           pci/0000:06:00.0:
           name flow_steering_mode type driver-specific
           values:
              cmode runtime value smfs
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl1uxPAACgkQSD+KveBX
 +j5AkggAymoYqG2G+s8cLa4vQFySaD1Td3VzzWglp7PlpDBE3UcSoMAMg/gIDU1D
 8F04PeCsJ6snt1ICk56vPNyAEHWfWeBUd56+QK5lEJBuwozyFvBh6HP81Bnr6T/n
 n6uTx45ljAFQPTHJjEOLBPSzEXecLu07+mvpzSoW0F3ehfGbELhL1IkVobr/RELx
 z4xZW9uM2vm5ylheWvjf4V1S/SvokgJazW9+4fh//rl8tfXgun5IfPoS0hqKie1/
 h5sjcMSYkYR4gLVqrhKmBYHmHVl/h0TYROckW8iC/+XX7ailSo9uPG7lPa6cm+GE
 7Bajlbz4oD/K5RWoByo+q+dmyjeVhQ==
 =M9bS
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-09-01-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-09-01  (Software steering support)

Abstract:
--------
Mellanox ConnetX devices supports packet matching, packet modification and
redirection. These functionalities are also referred to as flow-steering.
To configure a steering rule, the rule is written to the device owned
memory, this memory is accessed and cached by the device when processing
a packet.
Steering rules are constructed from multiple steering entries (STE).

Rules are configured using the Firmware command interface. The Firmware
processes the given driver command and translates them to STEs, then
writes them to the device memory in the current steering tables.
This process is slow due to the architecture of the command interface and
the processing complexity of each rule.

The highlight of this patchset is to cut the middle man (The firmware) and
do steering rules programming into device directly from the driver, with
no firmware intervention whatsoever.

Motivation:
-----------
Software (driver managed) steering allows for high rule insertion rates
compared to the FW steering described above, this is achieved by using
internal RDMA writes to the device owned memory instead of the slow
command interface to program steering rules.

Software (driver managed) steering, doesn't depend on new FW
for new steering functionality, new implementations can be done in the
driver skipping the FW layer.

Performance:
------------
The insertion rate on a single core using the new approach allows
programming ~300K rules per sec. (Done via direct raw test to the new mlx5
sw steering layer, without any kernel layer involved).

Test: TC L2 rules
33K/s with Software steering (this patchset).
5K/s  with FW and current driver.
This will improve OVS based solution performance.

Architecture and implementation details:
----------------------------------------
Software steering will be dynamically selected via devlink device
parameter. Example:
$ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
          pci/0000:06:00.0:
          name flow_steering_mode type driver-specific
          values:
             cmode runtime value smfs

mlx5 software steering module a.k.a (DR - Direct Rule) is implemented
and contained in mlx5/core/steering directory and controlled by
MLX5_SW_STEERING kconfig flag.

mlx5 core steering layer (fs_core) already provides a shim layer for
implementing different steering mechanisms, software steering will
leverage that as seen at the end of this series.

When Software Steering for a specific steering domain
(NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules
targeting this domain to be created using  SW steering instead of FW.

The implementation includes:
Domain - The steering domain is the object that all other object resides
    in. It holds the memory allocator, send engine, locks and other shared
    data needed by lower objects such as table, matcher, rule, action.
    Each domain can contain multiple tables. Domain is equivalent to
    namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented
    currently in mlx5_core fs_core (flow steering core).

Table - Table objects are used for holding multiple matchers, each table
    has a level used to prevent processing loops. Packets are being
    directed to this table once it is set as the root table, this is done
    by fs_core using a FW command. A packet is being processed inside the
    table matcher by matcher until a successful hit, otherwise the packet
    will perform the default action.

Matcher - Matchers objects are used to specify the fields mask for
    matching when processing a packet. A matcher belongs to a table, each
    matcher can hold multiple rules, each rule with different matching
    values corresponding to the matcher mask. Each matcher has a priority
    used for rule processing order inside the table.

Action - Action objects are created to specify different steering actions
    such as count, reformat (encapsulate, decapsulate, ...), modify
    header, forward to table and many other actions. When creating a rule
    a sequence of actions can be provided to be executed on a successful
    match.

Rule - Rule objects are used to specify a specific match on packets as
    well as the actions that should be executed. A rule belongs to a
    matcher.

STE - This layer is used to hold the specific STE format for the device
    and to convert the requested rule to STEs. Each rule is constructed of
    an STE chain, Multiple rules construct a steering graph. Each node in
    the graph is a hash table containing multiple STEs. The index of each
    STE in the hash table is being calculated using a CRC32 hash function.

Memory pool - Used for managing and caching device owned memory for rule
    insertion. The memory is being allocated using DM (device memory) API.

Communication with device - layer for standard RDMA operation using  RC QP
    to configure the device steering.

Command utility - This module holds all of the FW commands that are
    required for SW steering to function.

Patch planning and files:
-------------------------
1) First patch, adds the support to Add flow steering actions to fs_cmd
shim layer.

2) Next 12 patch will add a file per each Software steering
functionality/module as described above. (See patches with title: DR, *)

3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable
build with the new files

4) Next two patches will add the support for software steering in mlx5
steering shim layer
net/mlx5: Add API to set the namespace steering mode
net/mlx5: Add direct rule fs_cmd implementation

5) Last two patches will add the new devlink parameter to select mlx5
steering mode, will be valid only for switchdev mode for now.
Two modes are supported:
    1. DMFS - Device managed flow steering
    2. SMFS - Software/Driver managed flow steering.

    In the DMFS mode, the HW steering entities are created through the
    FW. In the SMFS mode this entities are created though the driver
    directly.

    The driver will use the devlink steering mode only if the steering
    domain supports it, for now SMFS will manages only the switchdev
    eswitch steering domain.

    User command examples:
    - Set SMFS flow steering mode::

        $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime

    - Read device flow steering mode::

        $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
          pci/0000:06:00.0:
          name flow_steering_mode type driver-specific
          values:
             cmode runtime value smfs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-03 21:46:13 -07:00
Brett Creeley
cd186e5151 ice: Only disable VLAN pruning for the VF when all VLANs are removed
Currently if the VF adds a VLAN, VLAN pruning will be enabled for that VSI.
Also, when a VLAN gets deleted it will disable VLAN pruning even if other
VLAN(s) exists for the VF. Fix this by only disabling VLAN pruning on the
VF VSI when removing the last VF (i.e. vf->num_vlan == 0).

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 17:17:13 -07:00
Michal Swiatkowski
03bba02016 ice: Remove enable DCB when SW LLDP is activated
Remove code that enables DCB in initialization when SW LLDP is
activated. DCB flag is set or reset before in ice_init_pf_dcb
based on number of TCs. So there is not need to overwrite it.

Setting DCB without checking number of TCs can cause communication
problems with other cards. Host card sends packet with VLAN priority
tag, but client card doesn't strip this tag and ping doesn't work.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 17:14:37 -07:00
Dave Ertman
3d57fd10f2 ice: Report stats when VSI is down
There is currently a check in get_ndo_stats that
returns before updating stats if the VSI is down
or there are no Tx or Rx queues.  This causes the
netdev to report zero stats with the netdev is down.

Remove the check so that the behavior of reporting
stats is the same as it was in IXGBE.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 17:07:50 -07:00
Mitch Williams
06914ac20a ice: Always notify FW of VF reset
The call to ice_dis_vsi_txq() acts as the notification to the firmware
that the VF is being reset. Because of this, we need to make this call
every time we reset, regardless of whatever else we do to stop the Tx
queues.

Without this change, VF resets would fail to complete on interfaces that
were up and running.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 17:04:14 -07:00
Dave Ertman
473ca57488 ice: Correctly handle return values for init DCB
In the init path for DCB, the call to ice_init_dcb()
can return a non-zero value for either an actual
error, or due to the FW lldp engine being stopped.

We are currently treating all non-zero values only as
an indication that the FW LLDP engine is stopped.

Check for an actual error in the DCB init flow.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 17:02:23 -07:00
Usha Ketineni
a257f188b7 ice: Limit Max TCs on devices with more than 4 ports
This patch limits the max TCs set by the driver to the value provided by
the firmware as per the capabilities of the device. Otherwise, hard coding
to 8 TC max would fail the device configurations with more than 4 ports.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:35:58 -07:00
Tony Nguyen
6a025730e0 ice: Cleanup defines in ice_type.h
Conventionally, if the #defines/other are not needed by other header
files being included, #includes are done first followed by #defines
and other stuff. Move the #defines before the #includes to follow this
convention.

Suggested by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:32:30 -07:00
Jesse Brandeburg
2e0ab37c04 ice: print extra message if topology issue
The driver needs to inform the user if there is an issue
with the topology / configuration of the link.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:27:45 -07:00
Jesse Brandeburg
432609887a ice: add print of autoneg state to link message
Print the state of auto-negotiation when printing the Link
up message.  Adds new text to the "NIC Link is up" line like
Autoneg: <True | False>

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:25:34 -07:00
Bruce Allan
7404e84a23 ice: update driver unloading field for Queue Shutdown AQ command
According to recent specification versions, the field in the Queue Shutdown
AdminQ command consisting of the "driver unloading" indication is not a 4
byte field (it is byte.bit 16.0).  Change it to a byte and remove the
unnecessary endian conversion.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:23:35 -07:00
Bruce Allan
18057cb357 ice: add needed PFR during driver unload
According to the specification, a PF Reset must be done as part of the
driver unload flow.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:18:52 -07:00
Chinh T Cao
d24ef08a9d ice: Deduce TSA value from the priority value in the CEE mode
In CEE mode, the TSA information can be derived from the reported
priority value.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:16:36 -07:00
Brett Creeley
567af267fa ice: Report what the user set for coalesce [tx|rx]-usecs
Currently if the user sets an odd value for [tx|rx]-usecs we align the
value because the hardware only understands ITR values in multiples of
2. This seems misleading because we are essentially telling the user
that the ITR value is odd, when in fact we have changed it internally.
Fix this by reporting that setting odd ITR values is not allowed.

Also, while making changes to ice_set_rc_coalesce() I noticed a bit of
code/error duplication. Make the necessary changes to remove the
duplication.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:11:10 -07:00
Jeb Cramer
8132e17dfb ice: Fix resource leak in ice_remove_rule_internal()
We don't free s_rule if ice_aq_sw_rules() returns a non-zero status.  If
it returned a zero status, s_rule would be freed right after, so this
implies it should be freed within the scope of the function regardless.

Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:08:54 -07:00
Anirudh Venkataramanan
03af840650 ice: Fix EMP reset handling
ice_reset_subtask needs to handle EMP resets as well, as EMP resets
can be triggered by the firmware. This patch adds the logic to do
this.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 13:47:12 -07:00
Maor Gottlieb
e890acd5ff net/mlx5: Add devlink flow_steering_mode parameter
Add new parameter (flow_steering_mode) to control the flow steering
mode of the driver.
Two modes are supported:
1. DMFS - Device managed flow steering
2. SMFS - Software/Driver managed flow steering.

In the DMFS mode, the HW steering entities are created through the
FW. In the SMFS mode this entities are created though the driver
directly.

The driver will use the devlink steering mode only if the steering
domain supports it, for now SMFS will manages only the switchdev eswitch
steering domain.

User command examples:
- Set SMFS flow steering mode::

    $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime

- Read device flow steering mode::

    $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
      pci/0000:06:00.0:
      name flow_steering_mode type driver-specific
      values:
         cmode runtime value smfs

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:24 -07:00
Maor Gottlieb
8463daf17e net/mlx5: Add support to use SMFS in switchdev mode
In case that flow steering mode of the driver is SMFS (Software Managed
Flow Steering), then use the DR (SW steering) API to create the steering
objects.

In addition, add a call to the set peer namespace when switchdev gets
devcom pair event. It is required to support VF LAG in SMFS.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:24 -07:00
Maor Gottlieb
38b9d1c62a net/mlx5: Add API to set the namespace steering mode
Add API to set the flow steering root namesapce mode.
Setting new mode should be called before any steering operation
is executed on the namespace.
This API is going to be used by steering users such switchdev.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:24 -07:00
Maor Gottlieb
6a48faeeca net/mlx5: Add direct rule fs_cmd implementation
Add support to create flow steering objects
via direct rule API (SW steering).
New layer is added - fs_dr, this layer translates the command that
fs_core sends to the FW into direct rule API. In case that direct
rule is not supported in some feature then -EOPNOTSUPP is
returned.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:23 -07:00
Alex Vesker
fb86f1210a net/mlx5: DR, Add CONFIG_MLX5_SW_STEERING for software steering support
Add new mlx5 Kconfig flag to allow selecting software steering
support and compile all the steering files only if the flag is
selected.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:23 -07:00
Alex Vesker
70605ea545 net/mlx5: DR, Expose APIs for direct rule managing
Expose APIs for direct rule managing to increase insertion rate by
bypassing the firmware.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:23 -07:00
Alex Vesker
c47ff7baff net/mlx5: DR, Add required FW steering functionality
SW steering is capable of doing many steering functionalities
but there are still some functionalities which are not exposed
to upper layers and therefore performed by the FW.

This is the support for recalculating checksum using a hairpin QP.
The recalculation is required after a modify TTL action which skips
the needed CS calculation in HW.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:22 -07:00
Alex Vesker
41d0707415 net/mlx5: DR, Expose steering rule functionality
Rules are the actual objects that tie matchers, header values and
actions. Each rule belongs to a matcher, which can hold multiple rules
sharing the same mask. Each rule is a specific set of values and
actions.
When a packet reaches a matcher it is being matched against the
matcher`s rules. In case of a match over a rule its actions will be
executed. Each rule object contains a set of STEs, where each STE is a
definition of match values and actions defined by the rule.
This file handles the rule operations and processing.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:22 -07:00
Alex Vesker
9db810ed2d net/mlx5: DR, Expose steering action functionality
On rule creation a set of actions can be provided, the actions describe
what to do with the packet in case of a match. It is possible to provide
a set of actions which will be done by order.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:22 -07:00
Alex Vesker
852f660bd7 net/mlx5: DR, Expose steering matcher functionality
Matcher defines which packets fields are matched when a packet arrives.
Matcher is a part of a table and can contain one or more rules. Where
rule defines specific values of the matcher's mask definition.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:21 -07:00
Alex Vesker
7838e17253 net/mlx5: DR, Expose steering table functionality
Tables are objects which are used for storing matchers, each table
belongs to a domain and defined by the domain type. When a packet
reaches the table it is being processed by each of its matchers until a
successful match. Tables can hold multiple matchers ordered by matcher
priority. Each table has a level.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:21 -07:00
Alex Vesker
4ec9e7b026 net/mlx5: DR, Expose steering domain functionality
Domain is the frame for all of the dr (direct rule) objects.
There are different domain types which also affect the object under that
domain. Each domain can hold multiple tables which can hold multiple
matchers and so on, this means that all of the dr (direct rule) objects
exist under a specific domain. The domain object also holds the
resources needed for other objects such as memory management and
communication with the device.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:21 -07:00
Alex Vesker
26d688e33f net/mlx5: DR, Add Steering entry (STE) utilities
Steering Entry (STE) object is the basic building block of the steering
map. There are several types of STEs. Each rule can be constructed of
multiple STEs. Each STE dictates which fields of the packet's header are
being matched as well as the information about the next step in map (hit
and miss pointers). The hardware gets a packet and tries to match it
against the STEs, going to either the hit pointer or the miss pointer.
This file handles the STE operations.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:21 -07:00
Alex Vesker
297cccebdc net/mlx5: DR, Expose an internal API to issue RDMA operations
Inserting or deleting a rule is done by RDMA read/write operation to SW
ICM device memory. This file provides the support for executing these
operations. It includes allocating the needed resources and providing an
API for writing steering entries to the memory.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:20 -07:00
Alex Vesker
29cf8febd1 net/mlx5: DR, ICM pool memory allocator
ICM device memory is used for writing steering rules (STEs) to the NIC.
An ICM memory pool allocator was implemented to manage the required
memory. The pool consists of buckets, a bucket per chunk size.
Once a bucket is empty we will cut a row of memory from the latest
allocated MR, if the MR size is not sufficient we will allocate a new MR.
HW design requires that chunks memory address should be aligned to the
chunk size, this is the reason for managing the MR with row size that
insures memory alignment.
Current design is greedy in memory but provides quick allocation times
in steady state.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:20 -07:00
Alex Vesker
1d9186476e net/mlx5: DR, Add direct rule command utilities
Add direct rule command utilities which consists of all the FW
commands that are executed to provide the SW steering functionality.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:20 -07:00
Alex Vesker
14c32fd17c net/mlx5: DR, Add the internal direct rule types definitions
Add the internal header file that contains various types
definition that will be used in coming patches as well as
the internal functions decelerations.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:19 -07:00
Maor Gottlieb
2b688ea5ef net/mlx5: Add flow steering actions to fs_cmd shim layer
Add flow steering actions: modify header and packet reformat
to the fs_cmd shim layer. This allows each namespace to define
possibly different functionality for alloc/dealloc action commands.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03 12:54:19 -07:00
Matteo Croce
7d04b0b13b mvpp2: percpu buffers
Every mvpp2 unit can use up to 8 buffers mapped by the BM (the HW buffer
manager). The HW will place the frames in the buffer pool depending on the
frame size: short (< 128 bytes), long (< 1664) or jumbo (up to 9856).

As any unit can have up to 4 ports, the driver allocates only 2 pools,
one for small and one long frames, and share them between ports.
When the first port MTU is set higher than 1664 bytes, a third pool is
allocated for jumbo frames.

This shared allocation makes impossible to use percpu allocators,
and creates contention between HW queues.

If possible, i.e. if the number of possible CPU are less than 8 and jumbo
frames are not used, switch to a new scheme: allocate 8 per-cpu pools for
short and long frames and bind every pool to an RXQ.

When the first port MTU is set higher than 1664 bytes, the allocation
scheme is reverted to the old behaviour (3 shared pools), and when all
ports MTU are lowered, the per-cpu buffers are allocated again.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 12:07:46 -07:00
Matteo Croce
136163618e mvpp2: refactor BM pool functions
Refactor mvpp2_bm_pool_create(), mvpp2_bm_pool_destroy() and
mvpp2_bm_pools_init() so that they accept a struct device instead
of a struct platform_device, as they just need platform_device->dev.

Removing such dependency makes the BM code more reusable in context
where we don't have a pointer to the platform_device.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 12:07:46 -07:00
Yizhuo
e33b4325e6 net: stmmac: dwmac-sun8i: Variable "val" in function sun8i_dwmac_set_syscon() could be uninitialized
In function sun8i_dwmac_set_syscon(), local variable "val" could
be uninitialized if function regmap_field_read() returns -EINVAL.
However, it will be used directly in the if statement, which
is potentially unsafe.

Signed-off-by: Yizhuo <yzhai003@ucr.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 11:48:15 -07:00
Jiri Pirko
a21cf11bc5 mlx5: Add missing init_net check in FIB notifier
Take only FIB events that are happening in init_net into account. No other
namespaces are supported.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 11:44:14 -07:00
David S. Miller
765b7590c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
r8152 conflicts are the NAPI fixes in 'net' overlapping with
some tasklet stuff in net-next

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 11:20:17 -07:00
Saeed Mahameed
a06ebb8d95 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Merge mlx5-next patches needed for upcoming mlx5 software steering.

1) Alex adds HW bits and definitions required for SW steering
2) Ariel moves device memory management to mlx5_core (From mlx5_ib)
3) Maor, Cleanups and fixups for eswitch mode and RoCE
4) Mark, Set only stag for match untagged packets

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-02 00:16:05 -07:00
Mark Bloch
fc60329426 net/mlx5: Set only stag for match untagged packets
cvlan_tag enabled in match criteria and disabled in
match value means both S & C tags don't exist (untagged of both).

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-01 23:44:42 -07:00
Maor Gottlieb
3a6ef5158d net/mlx5: Avoid disabling RoCE when uninitialized
Move the check if RoCE steering is initialized to the
disable RoCE function, it will ensure that we disable
RoCE only if we succeeded in enabling it before.

Fixes: 80f09dfc23 ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-01 23:44:42 -07:00
Ariel Levkovich
c9b9dcb430 net/mlx5: Move device memory management to mlx5_core
Move the device memory allocation and deallocation commands
SW ICM memory to mlx5_core to expose this API for all
mlx5_core users.

This comes as preparation for supporting SW steering in kernel
where it will be required to allocate and register device
memory for direct rule insertion.

In addition, an API to register this device memory for future
remote access operations is introduced using the create_mkey
commands.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-01 23:44:41 -07:00
YueHaibing
b943e03341 net: hns3: remove set but not used variable 'qos'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function 'hclge_restore_vlan_table':
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8016:18: warning:
 variable 'qos' set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 70a214903d ("net: hns3: reduce the parameters of some functions")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-01 12:12:59 -07:00
Colin Ian King
bdad7529ee net: hns3: remove redundant assignment to pointer reg_info
Pointer reg_info is being initialized with a value that is never read and
is being re-assigned a little later on. The assignment is redundant
and hence can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-01 12:12:16 -07:00
Christophe JAILLET
e1e54ec7fb net: seeq: Fix the function used to release some memory in an error handling path
In commit 99cd149efe ("sgiseeq: replace use of dma_cache_wback_inv"),
a call to 'get_zeroed_page()' has been turned into a call to
'dma_alloc_coherent()'. Only the remove function has been updated to turn
the corresponding 'free_page()' into 'dma_free_attrs()'.
The error hndling path of the probe function has not been updated.

Fix it now.

Rename the corresponding label to something more in line.

Fixes: 99cd149efe ("sgiseeq: replace use of dma_cache_wback_inv")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-01 12:10:11 -07:00
Heiner Kallweit
dc161162e4 r8169: don't set bit RxVlan on RTL8125
RTL8125 uses a different register for VLAN offloading config,
therefore don't set bit RxVlan.

Fixes: f1bce4ad2f ("r8169: add support for RTL8125")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-01 11:37:36 -07:00
Christophe JAILLET
dd7078f05e enetc: Add missing call to 'pci_free_irq_vectors()' in probe and remove functions
Call to 'pci_free_irq_vectors()' are missing both in the error handling
path of the probe function, and in the remove function.
Add them.

Fixes: 19971f5ea0 ("enetc: add PTP clock driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 23:53:17 -07:00
Ryan M. Collins
dd1bf47a84 net: bcmgenet: use ethtool_op_get_ts_info()
This change enables the use of SW timestamping on the Raspberry Pi 4.

bcmgenet's transmit function bcmgenet_xmit() implements software
timestamping. However the SOF_TIMESTAMPING_TX_SOFTWARE capability was
missing and only SOF_TIMESTAMPING_RX_SOFTWARE was announced. By using
ethtool_ops bcmgenet_ethtool_ops() as get_ts_info(), the
SOF_TIMESTAMPING_TX_SOFTWARE capability is announced.

Similar to commit a8f5cb9e79 ("smsc95xx: use ethtool_op_get_ts_info()")

Signed-off-by: Ryan M. Collins <rmc032@bucknell.edu>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 23:51:19 -07:00
Dmitry Bogdanov
be6cef69ba net: aquantia: fix out of memory condition on rx side
On embedded environments with hard memory limits it is a normal although
rare case when skb can't be allocated on rx part under high traffic.

In such OOM cases napi_complete_done() was not called.
So the napi object became in an invalid state like it is "scheduled".
Kernel do not re-schedules the poll of that napi object.

Consequently, kernel can not remove that object the system hangs on
`ifconfig down` waiting for a poll.

We are fixing this by gracefully closing napi poll routine with correct
invocation of napi_complete_done.

This was reproduced with artificially failing the allocation of skb to
simulate an "out of memory" error case and check that traffic does
not get stuck.

Fixes: 970a2e9864 ("net: ethernet: aquantia: Vector operations")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 19:07:11 -07:00
Igor Russkikh
5c47e3ba6f net: aquantia: linkstate irq should be oneshot
Declaring threaded irq handler should also indicate the irq is
oneshot. It is oneshot indeed, because HW implements irq automasking
on trigger.

Not declaring this causes some kernel configurations to fail
on interface up, because request_threaded_irq returned an err code.

The issue was originally hidden on normal x86_64 configuration with
latest kernel, because depending on interrupt controller, irq driver
added ONESHOT flag on its own.

Issue was observed on older kernels (4.14) where no such logic exists.

Fixes: 4c83f170b3 ("net: aquantia: link status irq handling")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reported-by: Michael Symolkin <Michael.Symolkin@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 19:07:11 -07:00
Dmitry Bogdanov
c2ef057ee7 net: aquantia: reapply vlan filters on up
In case of device reconfiguration the driver may reset the device invisible
for other modules, vlan module in particular. So vlans will not be
removed&created and vlan filters will not be configured in the device.
The patch reapplies the vlan filters at device start.

Fixes: 7975d2aff5 ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 19:07:11 -07:00
Dmitry Bogdanov
392349f601 net: aquantia: fix limit of vlan filters
Fix a limit condition of vlans on the interface before setting vlan
promiscuous mode

Fixes: 48dd73d08d ("net: aquantia: fix vlans not working over bridged network")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 19:07:11 -07:00
Dmitry Bogdanov
6fdc060d74 net: aquantia: fix removal of vlan 0
Due to absence of checking against the rx flow rule when vlan 0 is being
removed, the other rule could be removed instead of the rule with vlan 0

Fixes: 7975d2aff5 ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 19:07:10 -07:00
Sudarsana Reddy Kalluru
849dbf0923 qede: Add support for dumping the grc data.
This patch adds driver support for configuring grc dump config flags, and
dumping the grc data via ethtool get/set-dump interfaces.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 13:32:30 -07:00
Sudarsana Reddy Kalluru
3b86bd0762 qed: Add APIs for configuring grc dump config flags.
The patch adds driver support for configuring the grc dump config flags.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 13:32:30 -07:00
Sudarsana Reddy Kalluru
d44a3ced70 qede: Add support for reading the config id attributes.
Add driver support for dumping the config id attributes via ethtool dump
interfaces.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 13:32:30 -07:00
Sudarsana Reddy Kalluru
2d4c849530 qed: Add APIs for reading config id attributes.
The patch adds driver support for reading the config id attributes from NVM
flash partition.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 13:32:30 -07:00
Denis Efremov
7cf92ccb85 net/mlx5e: Remove unlikely() from WARN*() condition
"unlikely(WARN_ON_ONCE(x))" is excessive. WARN_ON_ONCE() already uses
unlikely() internally.

Signed-off-by: Denis Efremov <efremov@linux.com>
Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: netdev@vger.kernel.org
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 19:49:03 -07:00
David S. Miller
94880a5b2e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-08-31

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix 32-bit zero-extension during constant blinding which
   has been causing a regression on ppc64, from Naveen.

2) Fix a latency bug in nfp driver when updating stack index
   register, from Jiong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 17:39:37 -07:00
Michael Chan
e72cb7d624 bnxt_en: Fix compile error regression with CONFIG_BNXT_SRIOV not set.
Add a new function bnxt_get_registered_vfs() to handle the work
of getting the number of registered VFs under #ifdef CONFIG_BNXT_SRIOV.
The main code will call this function and will always work correctly
whether CONFIG_BNXT_SRIOV is set or not.

Fixes: 230d1f0de7 ("bnxt_en: Handle firmware reset.")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 17:38:24 -07:00
Maxim Mikityanskiy
282c0c798f net/mlx5e: Allow XSK frames smaller than a page
Relax the requirements to the XSK frame size to allow it to be smaller
than a page and even not a power of two. The current implementation can
work in this mode, both with Striding RQ and without it.

The code that checks `mtu + headroom <= XSK frame size` is modified
accordingly. Any frame size between 2048 and PAGE_SIZE is accepted.

Functions that worked with pages only now work with XSK frames, even if
their size is different from PAGE_SIZE.

With XSK queues, regardless of the frame size, Striding RQ uses the
stride size of PAGE_SIZE, and UMR MTTs are posted using starting
addresses of frames, but PAGE_SIZE as page size. MTU guarantees that no
packet data will overlap with other frames. UMR MTT size is made equal
to the stride size of the RQ, because UMEM frames may come in random
order, and we need to handle them one by one. PAGE_SIZE is just a power
of two that is bigger than any allowed XSK frame size, and also it
doesn't require making additional changes to the code.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 01:08:26 +02:00
Kevin Laatz
beb3e4b295 mlx5e: modify driver for handling offsets
With the addition of the unaligned chunks option, we need to make sure we
handle the offsets accordingly based on the mode we are currently running
in. This patch modifies the driver to appropriately mask the address for
each case.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 01:08:26 +02:00
Kevin Laatz
d8c3061e5e ixgbe: modify driver for handling offsets
With the addition of the unaligned chunks option, we need to make sure we
handle the offsets accordingly based on the mode we are currently running
in. This patch modifies the driver to appropriately mask the address for
each case.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 01:08:26 +02:00
Kevin Laatz
2f86c806a8 i40e: modify driver for handling offsets
With the addition of the unaligned chunks option, we need to make sure we
handle the offsets accordingly based on the mode we are currently running
in. This patch modifies the driver to appropriately mask the address for
each case.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 01:08:26 +02:00
Kevin Laatz
b35a2d3e89 ixgbe: simplify Rx buffer recycle
Currently, the dma, addr and handle are modified when we reuse Rx buffers
in zero-copy mode. However, this is not required as the inputs to the
function are copies, not the original values themselves. As we use the
copies within the function, we can use the original 'obi' values
directly without having to mask and add the headroom.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 01:08:26 +02:00
Kevin Laatz
10912fc9fa i40e: simplify Rx buffer recycle
Currently, the dma, addr and handle are modified when we reuse Rx buffers
in zero-copy mode. However, this is not required as the inputs to the
function are copies, not the original values themselves. As we use the
copies within the function, we can use the original 'old_bi' values
directly without having to mask and add the headroom.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 01:08:26 +02:00
Jakub Kicinski
f24e29099f nfp: bpf: add simple map op cache
Each get_next and lookup call requires a round trip to the device.
However, the device is capable of giving us a few entries back,
instead of just one.

In this patch we ask for a small yet reasonable number of entries
(4) on every get_next call, and on subsequent get_next/lookup calls
check this little cache for a hit. The cache is only kept for 250us,
and is invalidated on every operation which may modify the map
(e.g. delete or update call). Note that operations may be performed
simultaneously, so we have to keep track of operations in flight.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 00:49:05 +02:00
Jakub Kicinski
bc2796db5a nfp: bpf: rework MTU checking
If control channel MTU is too low to support map operations a warning
will be printed. This is not enough, we want to make sure probe fails
in such scenario, as this would clearly be a faulty configuration.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31 00:49:05 +02:00
Vlad Buslov
daa664a5cd net/mlx5e: Move local var definition into ifdef block
New local variable "struct flow_block_offload *f" was added to
mlx5e_setup_tc() in recent rtnl lock removal patches. The variable is used
in code that is only compiled when CONFIG_MLX5_ESWITCH is enabled. This
results compilation warning about unused variable when CONFIG_MLX5_ESWITCH
is not set. Move the variable definition into eswitch-specific code block
from the beginning of mlx5e_setup_tc() function.

Fixes: c9f14470d0 ("net: sched: add API for registering unlocked offload block callbacks")
Reported-by: tanhuazhong <tanhuazhong@huawei.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 15:12:05 -07:00
Stephen Rothwell
27382472ad net: stmmac: depend on COMMON_CLK
Fixes: 190f73ab4c ("net: stmmac: setup higher frequency clk support for EHL & TGL")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:35:02 -07:00
Chen-Yu Tsai
3b25528e1e net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
The devicetree binding lists the phy phy as optional. As such, the
driver should not bail out if it can't find a regulator. Instead it
should just skip the remaining regulator related code and continue
on normally.

Skip the remainder of phy_power_on() if a regulator supply isn't
available. This also gets rid of the bogus return code.

Fixes: 2e12f53663 ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:16:26 -07:00
YueHaibing
b6b4dc4c1f amd-xgbe: Fix error path in xgbe_mod_init()
In xgbe_mod_init(), we should do cleanup if some error occurs

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: efbaa82833 ("amd-xgbe: Add support to handle device renaming")
Fixes: 47f164deab ("amd-xgbe: Add PCI device support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:15:31 -07:00
Vasundhara Volam
acfb50e4e7 bnxt_en: Add FW fatal devlink_health_reporter.
Health show command example and output:

$ devlink health show pci/0000:af:00.0 reporter fw_fatal

pci/0000:af:00.0:
  name fw_fatal
    state healthy error 1 recover 1 grace_period 0 auto_recover true

Fatal events from firmware or missing periodic heartbeats will
be reported and recovery will be handled.

We also turn on the support flags when we register with the firmware to
enable this health and recovery feature in the firmware.

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
d1db9e166b bnxt_en: Add bnxt_fw_exception() to handle fatal firmware errors.
This call will handle fatal firmware errors by forcing a reset on the
firmware.  The master function driver will carry out the forced reset.
The sequence will go through the same bnxt_fw_reset_task() workqueue.
This fatal reset differs from the non-fatal reset at the beginning
stages.  From the BNXT_FW_RESET_STATE_ENABLE_DEV state onwards where
the firmware is coming out of reset, it is practically identical to the
non-fatal reset.

The next patch will add the periodic heartbeat check and the devlink
reporter to report the fatal event and to initiate the bnxt_fw_exception()
call.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
cbb51067a5 bnxt_en: Add RESET_FW state logic to bnxt_fw_reset_task().
This state handles driver initiated chip reset during error recovery.
Only the master function will perform this step during error recovery.
The next patch will add code to initiate this reset from the master
function.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
b4fff2079d bnxt_en: Do not send firmware messages if firmware is in error state.
Add a flag to mark that the firmware has encountered fatal condition.
The driver will not send any more firmware messages and will return
error to the caller.  Fix up some clean up functions to continue
and not abort when the firmware message function returns error.

This is preparation work to fully handle firmware error recovery
under fatal conditions.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Vasundhara Volam
2cd8696850 bnxt_en: Retain user settings on a VF after RESET_NOTIFY event.
Retain the VF MAC address, default VLAN, TX rate control, trust settings
of VFs after firmware reset.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Vasundhara Volam
657a33c8a0 bnxt_en: Add devlink health reset reporter.
Add devlink health reporter for the firmware reset event.  Once we get
the notification from firmware about the impending reset, the driver
will report this to devlink and the call to bnxt_fw_reset() will be
initiated to complete the reset sequence.

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
230d1f0de7 bnxt_en: Handle firmware reset.
Add the bnxt_fw_reset() main function to handle firmware reset.  This
is triggered by firmware to initiate an orderly reset, for example
when a non-fatal exception condition has been detected.  bnxt_fw_reset()
will first wait for all VFs to shutdown and then start the
bnxt_fw_reset_task() work queue to go through the sequence of reset,
re-probe, and re-initialization.

The next patch will add the devlink reporter to start the sequence and
call bnxt_fw_reset().

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
2151fe0830 bnxt_en: Handle RESET_NOTIFY async event from firmware.
This event from firmware signals a coordinated reset initiated by the
firmware.  It may be triggered by some error conditions encountered
in the firmware or other orderly reset conditions.

We store the parameters from this event.  Subsequent patches will
add logic to handle reset itself using devlink reporters.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Vasundhara Volam
6763c779c2 bnxt_en: Add new FW devlink_health_reporter
Create new FW devlink_health_reporter, to know the current health
status of FW.

Command example and output:
$ devlink health show pci/0000:af:00.0 reporter fw

pci/0000:af:00.0:
  name fw
    state healthy error 0 recover 0

 FW status: Healthy; Reset count: 1

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
3bc7d4a352 bnxt_en: Add BNXT_STATE_IN_FW_RESET state.
The new flag will be set in subsequent patches when firmware is
going through reset.  If bnxt_close() is called while the new flag
is set, the FW reset sequence will have to be aborted because the
NIC is prematurely closed before FW reset has completed.  We also
reject SRIOV configurations while FW reset is in progress.

v2: No longer drop rtnl_lock() in close and wait for FW reset to complete.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
7e914027f7 bnxt_en: Enable health monitoring.
Handle the async event from the firmware that enables firmware health
monitoring.  Store initial health metrics.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:19 -07:00
Michael Chan
9ffbd67734 bnxt_en: Pre-map the firmware health monitoring registers.
Pre-map the GRC registers for periodic firmware health monitoring.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
07f83d72d2 bnxt_en: Discover firmware error recovery capabilities.
Call the new firmware API HWRM_ERROR_RECOVERY_QCFG if it is supported
to discover the firmware health and recovery capabilities and settings.
This feature allows the driver to reset the chip if firmware crashes and
becomes unresponsive.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
ec5d31e3c1 bnxt_en: Handle firmware reset status during IF_UP.
During IF_UP, newer firmware has a new status flag that indicates that
firmware has reset.  Add new function bnxt_fw_init_one() to re-probe the
firmware and re-setup VF resources on the PF if necessary.  If the
re-probe fails, set a flag to prevent bnxt_open() from proceeding again.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Vasundhara Volam
91b9be4870 bnxt_en: Register buffers for VFs before reserving resources.
When VFs need to be reconfigured dynamically after firmwware reset, the
configuration sequence on the PF needs to be changed to register the VF
buffers first.  Otherwise, some VF firmware commands may not succeed as
there may not be PF buffers ready for the re-directed firmware commands.

This sequencing did not matter much before when we only supported
the normal bring-up of VFs.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
702d5011ab bnxt_en: Refactor bnxt_sriov_enable().
Refactor the hardware/firmware configuration portion in
bnxt_sriov_enable() into a new function bnxt_cfg_hw_sriov().  This
new function can be called after a firmware reset to reconfigure the
VFs previously enabled.

v2: straight refactor of the code.  Reordering done in the next patch.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
ba642ab773 bnxt_en: Prepare bnxt_init_one() to be called multiple times.
In preparation for the new firmware reset feature, some of the logic
in bnxt_init_one() and related functions will be called again after
firmware has reset.  Reset some of the flags and capabilities so that
everything that can change can be re-initialized.  Refactor some
functions to probe firmware versions and capabilities.  Check some
buffers before allocating as they may have been allocated previously.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
5bedb5296e bnxt_en: Suppress all error messages in hwrm_do_send_msg() in silent mode.
If the silent parameter is set, suppress all messages when there is
no response from firmware.  When polling for firmware to come out of
reset, no response may be normal and we want to suppress the error
messages.  Also, don't poll for the firmware DMA response if Bus Master
is disabled.  This is in preparation for error recovery when firmware
may be in error or reset state or Bus Master is disabled.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
a798302d56 bnxt_en: Simplify error checking in the SR-IOV message forwarding functions.
There are 4 functions handling message forwarding for SR-IOV.  They
check for non-zero firmware response code and then return -1.  There
is no need to do this anymore.  The main messaging function will
now return standard error code.  Since we don't need to examine the
response, we can use the hwrm_send_message() variant which will
take the mutex automatically.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
d4f1420d36 bnxt_en: Convert error code in firmware message response to standard code.
The main firmware messaging function returns the firmware defined error
code and many callers have to convert to standard error code for proper
propagation to userspace.  Convert bnxt_hwrm_do_send_msg() to return
standard error code so we can do away with all the special error code
handling by the many callers.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
a935cb7ec4 bnxt_en: Remove the -1 error return code from bnxt_hwrm_do_send_msg().
Replace the non-standard -1 code with -EBUSY when there is no firmware
response after waiting for the maximum timeout.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Michael Chan
b3b0ddd07e bnxt_en: Use a common function to print the same ethtool -f error message.
The same message is printed 3 times in the code, so use a common function
to do that.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 14:02:18 -07:00
Thomas Bogendoerfer
70359dbe24 net: sgi: ioc3-eth: no need to stop queue set_multicast_list
netif_stop_queue()/netif_wake_qeue() aren't needed for changing
multicast filters.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
d1c9454274 net: sgi: ioc3-eth: protect emcr in all cases
emcr in private struct wasn't always protected by spinlock.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
3498cb272e net: sgi: ioc3-eth: Fix IPG settings
The half/full duplex settings for inter packet gap counters/timer were
reversed.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
8dff19a6dc net: sgi: ioc3-eth: use csum_fold
replace open coded checksum folding by csum_fold.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
ed870f6a7a net: sgi: ioc3-eth: use dma-direct for dma allocations
Replace the homegrown DMA memory allocation, which only works on
SGI-IP27 machines, with the generic dma allocations.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
850d2fed5b net: sgi: ioc3-eth: refactor rx buffer allocation
Move common code for rx buffer setup into ioc3_alloc_skb and deal
with allocation failures. Also clean up allocation size calculation.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
19a957b6b4 net: sgi: ioc3-eth: split ring cleaning/freeing and allocation
Do tx ring cleaning and freeing of rx buffers, when chip is shutdown and
allocate buffers before bringing chip up.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:36 -07:00
Thomas Bogendoerfer
fcd0da5a6d net: sgi: ioc3-eth: introduce chip start function
ioc3_init did everything from reset to init rings to starting the chip.
This change move out chip start into a new function as preparation
for easier handling of receive buffer allocation failures.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Thomas Bogendoerfer
9c328b0544 net: sgi: ioc3-eth: separate tx and rx ring handling
After allocation of descriptor memory is now done once in probe
handling of tx ring is completely done by ioc3_clean_tx_ring. So
we remove the remaining tx ring actions out of ioc3_alloc_rings
and ioc3_free_rings and rename it to ioc3_[alloc|free]_rx_bufs
to better describe what they are doing.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Thomas Bogendoerfer
489467e524 net: sgi: ioc3-eth: get rid of ioc3_clean_rx_ring()
Move clearing of the descriptor valid bit into ioc3_alloc_rings. This
makes ioc3_clean_rx_ring obsolete.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Thomas Bogendoerfer
c7b5727475 net: sgi: ioc3-eth: allocate space for desc rings only once
Memory for descriptor rings are allocated/freed, when interface is
brought up/down. Since the size of the rings is not changeable by
hardware, we now allocate rings now during probe and free it, when
device is removed.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Thomas Bogendoerfer
141a7dbb88 net: sgi: ioc3-eth: use defines for constants dealing with desc rings
Descriptor ring sizes of the IOC3 are more or less fixed size. To
make clearer where there is a relation to ring sizes use defines.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Thomas Bogendoerfer
c1b6a3d85d net: sgi: ioc3-eth: remove checkpatch errors/warning
Before massaging the driver further fix oddities found by checkpatch like
- wrong indention
- comment formatting
- use of printk instead or netdev_xxx/pr_xxx

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Thomas Bogendoerfer
cbe7d51745 MIPS: SGI-IP27: restructure ioc3 register access
Break up the big ioc3 register struct into functional pieces to
make use in sub-function drivers more straightforward. And while
doing that get rid of all volatile access by using readX/writeX.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30 13:54:35 -07:00
Gustavo A. R. Silva
3f1071ec39 net: spider_net: Use struct_size() helper
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct spider_net_card {
	...
        struct spider_net_descr darray[0];
};

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

So, replace the following form:

sizeof(struct spider_net_card) + (tx_descriptors + rx_descriptors) * sizeof(struct spider_net_descr)

with:

struct_size(card, darray, tx_descriptors + rx_descriptors)

Notice that, in this case, variable alloc_size is not necessary, hence it
is removed.

Building: allmodconfig powerpc.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:53:59 -07:00
Heiner Kallweit
b3a42e3a78 r8169: add support for EEE on RTL8125
This adds EEE support for RTL8125 based on the vendor driver.
Supported is EEE for 100Mbps and 1Gbps. Realtek recommended to not yet
enable EEE for 2.5Gbps due to potential compatibility issues. Also
ethtool doesn't support yet controlling EEE for 2.5Gbps and 5Gbps.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:27 -07:00
Heiner Kallweit
02bf642b18 r8169: add RTL8125 PHY initialization
This patch adds PHY initialization magic copied from the r8125 vendor
driver. In addition it supports loading the firmware for chip version
RTL_GIGA_MAC_VER_61.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:27 -07:00
Heiner Kallweit
f1bce4ad2f r8169: add support for RTL8125
This adds support for 2.5Gbps chip RTL8125, it's partially based on the
r8125 vendor driver. Tested with a Delock 89531 PCIe card against a
Netgear GS110MX Multi-Gig switch. Firmware isn't strictly needed,
but on some systems there may be compatibility issues w/o firmware.
Firmware has been submitted to linux-firmware.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:27 -07:00
Heiner Kallweit
ae84bc1873 r8169: don't use bit LastFrag in tx descriptor after send
On RTL8125 this bit is always cleared after send. Therefore check for
tx_skb->skb being set what is functionally equivalent.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:27 -07:00
Heiner Kallweit
7366016d2d r8169: read common register for PCI commit
RTL8125 uses a different register number for IntrMask.
To net have side effects by reading a random register let's
use a register that is the same on all supported chip families.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:27 -07:00
Heiner Kallweit
bcf2b868a5 r8169: move disabling interrupt coalescing to RTL8169/RTL8168 init
RTL8125 doesn't support the same coalescing registers, therefore move
this initialization to the 8168/6169-specific init.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:27 -07:00
Heiner Kallweit
ce37115e3a r8169: factor out reading MAC address from registers
For RTL8125 we will have to read the MAC address also from another
register range, therefore create a small helper.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:26 -07:00
Heiner Kallweit
c623305bf4 r8169: restrict rtl_is_8168evl_up to RTL8168 chip versions
Extend helper rtl_is_8168evl_up to properly work once we add
mac version numbers >51 for RTL8125.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:26 -07:00
Heiner Kallweit
c1d532d268 r8169: change interrupt mask type to u32
RTL8125 uses a 32 bit interrupt mask even though only bits in the
lower 16 bits are used. Change interrupt mask size to u32 to be
prepared and reintroduce helper rtl_get_events.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:47:26 -07:00
David S. Miller
1a4f1a024c mlx5-updates-2019-08-22
Misc updates for mlx5e net device driver
 
 1) Maxim and Tariq add the support for LAG TX port affinity distribution
 When VF LAG is enabled, VFs netdevs will round-robin the TX affinity
 of their tx queues among the different LAG ports.
 2) Aya adds the support for ip-in-ip RSS.
 3) Marina adds the support for ip-in-ip TX TSO and checksum offloads.
 4) Moshe adds a device internal drop counter to mlx5 ethtool stats.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl1mzKEACgkQSD+KveBX
 +j7n9QgAhabOmJtGTT9HP2u3ilbWW6oi2aHr244IDvmvJvuwNIcIll/HaNuj4no9
 XSr5aW0zjVENJ73r5V7slIcyCyjB4AoeEEt2QTBB/UINTkx1Yd56AWd7qgMC1LD0
 A+ZpwEqd6ArRnt8elZJ/w5JlyrjUCMVSqSU8HcuOT1pRnpF5628HmM9w5f33R7iJ
 KJaiNpbjb3zFDbQsRdItPAy4JtxLnhvz660Ti+fXff24DDpap8VSiaj7QsH0DamG
 DTrR0AIu7XQZzwyVthzBXMc/Pe/ord6nBoRzGzQGTaK07OwAP7N8Mc1+dk//FEbe
 xJh71SdoAoJQbNoDTUSJeYZw4mfxuA==
 =Ggn4
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-08-22

Misc updates for mlx5e net device driver

1) Maxim and Tariq add the support for LAG TX port affinity distribution
When VF LAG is enabled, VFs netdevs will round-robin the TX affinity
of their tx queues among the different LAG ports.
2) Aya adds the support for ip-in-ip RSS.
3) Marina adds the support for ip-in-ip TX TSO and checksum offloads.
4) Moshe adds a device internal drop counter to mlx5 ethtool stats.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 17:25:18 -07:00
Yufeng Mo
dd2956eab1 net: hns3: not allow SSU loopback while execute ethtool -t dev
The current loopback mode is to add 0x1F to the SMAC address
as the DMAC address and enable the promiscuous mode.
However, if the VF address is the same as the DMAC address,
the loopback test fails.

Loopback can be enabled in three places: SSU, MAC, and serdes.
By default, SSU loopback is enabled, so if the SMAC and the DMAC
are the same, the packets are looped back in the SSU. If SSU loopback
is disabled, packets can reach MAC even if SMAC is the same as DMAC.

Therefore, this patch disables the SSU loopback before the loopback
test. In this way, the SMAC and DMAC can be the same, and the
promiscuous mode does not need to be enabled. And this is not
valid in version 0x20.

This patch also uses a macro to replace 0x1F.

Fixes: c39c4d98dc ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:44 -07:00
Huazhong Tan
2336f19d78 net: hns3: check reset interrupt status when reset fails
Currently, the reset interrupt will be cleared firstly, so when
reset fails, if interrupt status register has reset interrupt,
it means there is a new coming reset.

Fixes: 72e2fb0799 ("net: hns3: clear reset interrupt status in hclge_irq_handle()")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:44 -07:00
Yufeng Mo
c9765a89d1 net: hns3: add phy selftest function
Currently, the loopback test supports only mac selftest and serdes
selftest. This patch adds phy selftest.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:44 -07:00
Weihang Li
a83d29618b net: hns3: implement .process_hw_error for hns3 client
When hardware or IMP get specified error it may need the client
to take some special operations.

This patch implements the hns3 client's process_hw_errorx.

Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Zhongzhu Liu
e8df45c281 net: hns3: optimize waiting time for TQP reset
This patch optimizes the waiting time for TQP reset.

Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Guojia Liao
82f7d0576f net: hns3: fix incorrect type in assignment.
This patch fixes some incorrect type in assignment reported by sparse.
Those sparse warning as below:
- warning : restricted __le16 degrades to integer
- warning : cast from restricted __le32
- warning : expected restricted __le32
- warning : cast from restricted __be32
- warning : cast from restricted __be16
- warning : cast to restricted __le16

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Yonglong Liu
199d2dd416 net: hns3: make some reusable codes into a function
In hclge_dcb.c, these pair of codes:
	hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
	hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
and
	hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
	hclge_notify_client(hdev, HNAE3_UP_CLIENT);
are called many times, so make them into a function.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Yufeng Mo
ed5b255ba6 net: hns3: optimize some log printings
To better identify abnormal conditions, this patch modifies or
adds some logs to show driver status more accurately.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Guojia Liao
70a214903d net: hns3: reduce the parameters of some functions
This patch simplifies parameters of some functions by deleting
unused parameter.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Huazhong Tan
6125b52d26 net: hns3: modify base parameter of kstrtouint in hclge_dbg_dump_tm_map
This patch replaces kstrtouint()'s patameter base with 0 in the
hclge_dbg_dump_tm_mac(), which makes it more flexible. Also
uses a macro to replace string "dump tm map", since it has been
used multiple times.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Huazhong Tan
6f92bfd70a net: hns3: use macro instead of magic number
This patch uses macro to replace some magic number.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Zhongzhu Liu
a582b78dfc net: hns3: code optimization for debugfs related to "dump reg"
For making the code more readable, this patch uses a array to
keep the information about the dumping register, and then uses
it to parse the parameter cmd_buf which passing into
hclge_dbg_dump_reg_cmd().

Also replaces parameter "base" of kstrtouint with 0 in the
hclge_dbg_dump_reg_common(), which makes it more flexible.

Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:57:43 -07:00
Ioana Radulescu
8eb3cef8d2 dpaa2-eth: Add pause frame support
Starting with firmware version MC10.18.0, we have support for
L2 flow control. Asymmetrical configuration (Rx or Tx only) is
supported, but not pause frame autonegotioation.

Pause frame configuration is done via ethtool. By default, we start
with flow control enabled on both Rx and Tx. Changes are propagated
to hardware through firmware commands, using two flags (PAUSE,
ASYM_PAUSE) to specify Rx and Tx pause configuration, as follows:

PAUSE | ASYM_PAUSE | Rx pause | Tx pause
----------------------------------------
  0   |     0      | disabled | disabled
  0   |     1      | disabled | enabled
  1   |     0      | enabled  | enabled
  1   |     1      | enabled  | disabled

The hardware can automatically send pause frames when the number
of buffers in the pool goes below a predefined threshold. Due to
this, flow control is incompatible with Rx frame queue taildrop
(both mechanisms target the case when processing of ingress
frames can't keep up with the Rx rate; for large frames, the number
of buffers in the pool may never get low enough to trigger pause
frames as long as taildrop is enabled). So we set pause frame
generation and Rx FQ taildrop as mutually exclusive.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:52:12 -07:00
Ioana Radulescu
cce62943c0 dpaa2-eth: Use stored link settings
Whenever a link state change occurs, we get notified and save
the new link settings in the device's private data. In ethtool
get_link_ksettings, use the stored state instead of interrogating
the firmware each time.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-29 16:52:12 -07:00