Commit Graph

24501 Commits

Author SHA1 Message Date
Ivan Khoronzhuk
fea49f60c9 net: ethernet: ti: cpsw: replace unnecessarily macroses on functions
Replace ugly macroses on functions.

Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:29:24 -07:00
Bart Van Assche
eb2463bab4 rdma/cxgb4: Fix SRQ endianness annotations
This patch avoids that sparse complains about casts to restricted __be32.

Fixes: a3cdaa69e4 ("cxgb4: Adds CPL support for Shared Receive Queues")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-31 16:57:23 -06:00
Govindarajulu Varadarajan
cb5c656886 enic: do not call enic_change_mtu in enic_probe
In commit ab123fe071 ("enic: handle mtu change for vf properly")
ASSERT_RTNL() is added to _enic_change_mtu() to prevent it from being
called without rtnl held. enic_probe() calls enic_change_mtu()
without rtnl held. At this point netdev is not registered yet.
Remove call to enic_change_mtu and assign the mtu to netdev->mtu.

Fixes: ab123fe071 ("enic: handle mtu change for vf properly")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31 14:42:49 -07:00
Feras Daoud
8e1d162d8e net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow
After introduction of the cited commit, mlx5e_build_nic_params
receives the netdevice mtu in order to set the sw_mtu of mlx5e_params.
For enhanced IPoIB, the netdevice mtu is not set in this stage,
therefore, the initial sw_mtu equals zero. As a result, the hw_mtu
of the receive queue will be calculated incorrectly causing traffic
issues.

To fix this issue, query for port mtu before building the nic params.

Fixes: 472a1e44b3 ("net/mlx5e: Save MTU in channels params")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:45 -07:00
Adi Nissim
eacecf2760 net/mlx5e: Fix null pointer access when setting MTU of vport representor
MTU helper function is used by both conventional mlx5e
instances (PF/VF) and the eswitch representors. The representor
shouldn't change the nic vport context MTU, the VF is responsible for
that. Therefore set_mtu_cb has a null value when changing the
representor MTU.

Fixes: 250a42b6a7 ("net/mlx5e: Support configurable MTU for vport representors")
Signed-off-by: Adi Nissim <adin@mellanox.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:44 -07:00
Or Gerlitz
2e8e70d249 net/mlx5e: Set port trust mode to PCP as default
The hairpin offload code has dependency on the trust mode being PCP.

Hence we should set PCP as the default for handling cases where we are
disallowed to read the trust mode from the FW, or failed to initialize it.

Fixes: 106be53b6b ('net/mlx5e: Set per priority hairpin pairs')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:43 -07:00
Eli Cohen
5f5991f36d net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager
Execute mlx5_eswitch_init() only if we have MLX5_ESWITCH_MANAGER
capabilities.
Do the same for mlx5_eswitch_cleanup().

Fixes: a9f7705ffd ("net/mlx5: Unify vport manager capability check")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:43 -07:00
Jesper Dangaard Brouer
39c64d8c87 mlx5: handle DMA mapping error case for XDP redirect
Commit 58b99ee3e3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
forgot to return/free the xdp_frame in case the DMA mapping failed, correct this.

Also DMA unmap the frame in case mlx5e_xmit_xdp_frame() fails.

Fixes: 58b99ee3e3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31 09:44:41 -07:00
Jakub Kicinski
240b74fde3 nfp: fix variable dereferenced before check in nfp_app_ctrl_rx_raw()
'app' is dereferenced before used for the devlink trace point.
In case FW is buggy and sends a control message to a VF queue
we should make sure app is not NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-31 09:28:19 +02:00
YueHaibing
c87fffc57a liquidio: remove redundant function cn23xx_dump_iq_regs
There are no in-tree callers of cn23xx_dump_iq_regs.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:27:21 -07:00
YueHaibing
b23641fe73 qed: remove redundant functions qed_get_cm_pq_idx_rl
There are no in-tree callers of qed_get_cm_pq_idx_rl since it be there,
so it can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 13:18:35 -07:00
Sean Wang
2d14ba7228 net-next: mediatek: cleanup unnecessary get chip id and its user
Since driver is devicetree-based, all device type and charateristic can be
determined by the compatible string and its data. It's unnecessary to
create another dependent function to check chip ID and then decide whether
the specific funciton is being supported on a certain device. It can be
totally replaced by the existing flag, so a cleanup is made by removing
the function and the only user, HWLRO.

MT2701 also have a missing HWLRO support in old code, so add it the same
patch.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 13:15:57 -07:00
Sean Wang
6c21da204a net-next: mediatek: improve more with using dma_zalloc_coherent
Improve more in the existing code by reusing dma_zalloc_coherent instead
of dma_alloc_coherent with __GFP_ZERO or superfluous zeroing buffer.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 13:15:57 -07:00
David S. Miller
ebe023a424 mlx5e-updates-2018-07-27 (Vxlan updates)
This series from Gal and Saeed provides updates to mlx5 vxlan implementation.
 
 Gal, started with three cleanups to reflect the actual hardware vxlan state
 - reflect 4789 UDP port default addition to software database
 - check maximum number of vxlan  UDP ports
 - cleanup an unused member in vxlan work
 
 Then Gal provides performance optimization by replacing the
 vxlan radix tree with a hash table.
 
 Measuring mlx5e_vxlan_lookup_port execution time:
 
                       Radix Tree   Hash Table
      --------------- ------------ ------------
       Single Stream   161 ns       79  ns (51% improvement)
       Multi Stream    259 ns       136 ns (47% improvement)
 
     Measuring UDP stream packet rate, single fully utilized TX core:
     Radix Tree: 498,300 PPS
     Hash Table: 555,468 PPS (11% improvement)
 
 Next, from Saeed, vxlan refactoring to allow sharing the vxlan table
 between different mlx5 netdevice instances like PF and VF representors,
 this is done by making mlx5 vxlan interface more generic and decoupling
 it from PF netdevice structures and logic, then moving it into mlx5 core
 as a low level interface so it can be used by VF representors, which is
 illustrated in the last patch of the serious.
 
 -Saeed.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbW7KLAAoJEEg/ir3gV/o+YZ8H/2D4Ogav0zxQaNnw4ASYPpVA
 luW1tvGUyk3C+fgbZO7tp/DETGJKbSodSMV9ZasEFFuHPft37mQaEIZf2rq58DvT
 Q6vaaewyRCB6SzIGYjCZWtLI0aE5QtwMWDRbRBlRyQ0zUV6wr26W3WWWM2SpCJGK
 zSuAF3Np0dEPTzBB566CY0nhYpsBiBXm2QJcgySL2WIx1nwQ6X8MJxjErhgQV+2L
 1wVT6YTm1K2cdx9LsORt3FB/mFTQcOJCpl88AyhBB+pc7+i6pBtBp95tcqD8wmtF
 dnMBmR+JzT108VxvcBnpwCxuVI7lLCcAs9hYITTGzUbo6rqT8xXV2HQ1KwU17Ow=
 =tJj8
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-07-27 (Vxlan updates)

This series from Gal and Saeed provides updates to mlx5 vxlan implementation.

Gal, started with three cleanups to reflect the actual hardware vxlan state
- reflect 4789 UDP port default addition to software database
- check maximum number of vxlan  UDP ports
- cleanup an unused member in vxlan work

Then Gal provides performance optimization by replacing the
vxlan radix tree with a hash table.

Measuring mlx5e_vxlan_lookup_port execution time:

                      Radix Tree   Hash Table
     --------------- ------------ ------------
      Single Stream   161 ns       79  ns (51% improvement)
      Multi Stream    259 ns       136 ns (47% improvement)

    Measuring UDP stream packet rate, single fully utilized TX core:
    Radix Tree: 498,300 PPS
    Hash Table: 555,468 PPS (11% improvement)

Next, from Saeed, vxlan refactoring to allow sharing the vxlan table
between different mlx5 netdevice instances like PF and VF representors,
this is done by making mlx5 vxlan interface more generic and decoupling
it from PF netdevice structures and logic, then moving it into mlx5 core
as a low level interface so it can be used by VF representors, which is
illustrated in the last patch of the serious.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 13:08:42 -07:00
Ivan Khoronzhuk
193736c817 net: ethernet: ti: cpsw: add missed RX_CTAG feature for second slave
Seems it was missed while adding for first net dev in dual-emac mode.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 12:51:31 -07:00
Eugeniy Paltsev
9939a46d90 NET: stmmac: align DMA stuff to largest cache line length
As for today STMMAC_ALIGN macro (which is used to align DMA stuff)
relies on L1 line length (L1_CACHE_BYTES).
This isn't correct in case of system with several cache levels
which might have L1 cache line length smaller than L2 line. This
can lead to sharing one cache line between DMA buffer and other
data, so we can lose this data while invalidate DMA buffer before
DMA transaction.

Fix that by using SMP_CACHE_BYTES instead of L1_CACHE_BYTES for
aligning.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 12:33:30 -07:00
YueHaibing
4be90c79d6 qed: remove redundant functions qed_set_gft_event_id_cm_hdr
There are no in-tree callers of qed_set_gft_event_id_cm_hdr.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 08:33:30 -07:00
YueHaibing
5ae42de56b liquidio: remove redundant function cn23xx_dump_vf_iq_regs
There are no in-tree callers.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 08:31:07 -07:00
Geert Uytterhoeven
4042cd756e net: mac8390: Use standard memcpy_{from,to}io()
The mac8390 driver defines its own variants of memcpy_fromio() and
memcpy_toio(), using similar implementations, but different function
signatures.

Remove the custom definitions of memcpy_fromio() and memcpy_toio(), and
adjust all callers to the standard signatures.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
2018-07-29 10:48:18 +02:00
Yelena Krivosheev
562e2f467e net: mvneta: Improve the buffer allocation method for SWBM
With system having a small memory (around 256MB), the state "cannot
allocate memory to refill with new buffer" is reach pretty quickly.

By this patch we changed buffer allocation method to a better handling of
this use case by avoiding memory allocation issues.

Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory: extract from a larger patch]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Yelena Krivosheev
f945cec88c net: mvneta: Verify hardware checksum only when offload checksum feature is set
If the checksum offload feature is not set, then there is no point to
check the status of the hardware.

[gregory: extract from a larger patch]
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Gregory CLEMENT
7e47fd84b5 net: mvneta: Allocate page for the descriptor
Instead of trying to allocate the exact amount of memory for each
descriptor use a page for each of them, it allows to simplify the
allocation management and increase the performance of the driver.

Based on the work of Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Gregory CLEMENT
17a96da627 net: mvneta: discriminate error cause for missed packet
In order to improve the diagnostic in case of error, make the distinction
between refill error and skb allocation error. Also make the information
available through the ethtool state.

Based on the work of Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Yelena Krivosheev
c307e2a895 net: mvneta: increase number of buffers in RX and TX queue
The initial values were too small leading to poor performance when using
the software buffer management.

Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory: extract from a larger patch]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Gregory CLEMENT
965cbbec7f net: mvneta: remove data pointer usage from device_node structure
On year ago Rob Herring wanted to remove the data pointer from the
device_node structure[1]. The mvneta driver seemed to be the only one
which used (abused ?) it. However, the proposal of Rob to remove this
pointer from the driver introduced a regression, and I tested and fixed an
alternative way, but it was never submitted as a proper patch.

Now here it is: Instead of using the device_node structure ->data
pointer, we store the BM private data as the driver data of the BM
platform_device. The core mvneta code can retrieve it by doing a lookup
on which platform_device corresponds to the BM device tree node using
of_find_device_by_node(), and get its driver data

[1]https://www.spinics.net/lists/netdev/msg445197.html

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Yelena Krivosheev
8466baf788 net: mvneta: fix mtu change on port without link
It is incorrect to enable TX/RX queues (call by mvneta_port_up()) for
port without link. Indeed MTU change for interface without link causes TX
queues to stuck.

Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP
network unit")
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory.clement: adding Fixes tags and rewording commit log]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Andrew Lunn
7a86f05faf net: ethernet: mvneta: Fix napi structure mixup on armada 3700
The mvneta Ethernet driver is used on a few different Marvell SoCs.
Some SoCs have per cpu interrupts for Ethernet events. Some SoCs have
a single interrupt, independent of the CPU. The driver handles this by
having a per CPU napi structure when there are per CPU interrupts, and
a global napi structure when there is a single interrupt.

When the napi core calls mvneta_poll(), it passes the napi
instance. This was not being propagated through the call chain, and
instead the per-cpu napi instance was passed to napi_gro_receive()
call. This breaks when there is a single global napi instance.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 2636ac3cc2 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 22:12:55 -07:00
Govindarajulu Varadarajan
ab123fe071 enic: handle mtu change for vf properly
When driver gets notification for mtu change, driver does not handle it for
all RQs. It handles only RQ[0].

Fix is to use enic_change_mtu() interface to change mtu for vf.

Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 19:04:38 -07:00
John Hurley
ee614c8710 nfp: flower: fix port metadata conversion bug
Function nfp_flower_repr_get_type_and_port expects an enum nfp_repr_type
return value but, if the repr type is unknown, returns a value of type
enum nfp_flower_cmsg_port_type.  This means that if FW encodes the port
ID in a way the driver does not understand instead of dropping the frame
driver may attribute it to a physical port (uplink) provided the port
number is less than physical port count.

Fix this and ensure a net_device of NULL is returned if the repr can not
be determined.

Fixes: 1025351a88 ("nfp: add flower app")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-28 14:27:32 -07:00
Saeed Mahameed
a3e673660b net/mlx5e: Issue direct lookup on vxlan ports by vport representors
Remove uplink representor netdevice private structure lookup, and use
mlx5 core handle directly from representor private structure to lookup
vxlan ports.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:46:13 -07:00
Saeed Mahameed
358aa5ce28 net/mlx5e: Vxlan, move vxlan logic to core driver
Move vxlan logic and objects to mlx5 core dirver.
Since it going to be used from different mlx5 interfaces.
e.g. mlx5e PF NIC netdev and mlx5e E-Switch representors.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:46:13 -07:00
Saeed Mahameed
aec4eab9af net/mlx5e: Vxlan, add sync lock for add/del vxlan port
Vxlan API can and will be called from different mlx5 modules, we should
not count on mlx5e private state lock only, hence we introduce a vxlan
private mutex to sync between add/del vxlan port operations.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:46:11 -07:00
Saeed Mahameed
1b318a92f3 net/mlx5e: Vxlan, return values for add/del port
For a better API mlx5_vxlan_{add/del}_port can fail, make them return
error values.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:43:06 -07:00
Saeed Mahameed
a3c785d73c net/mlx5e: Vxlan, rename from mlx5e to mlx5
Rename vxlan functions from mlx5e_vxlan_* to mlx5_vxlan_*.
Rename mlx5e_vxlan_db to mlx5_vxlan and move it from en.h to vxlan.c
since it is not related to mlx5e anymore.

Allocate mlx5_vxlan structure dynamically in order to make it easier to
move later to core driver and to make it private in vxlan.c.

This is in preparation to move vxlan API to mlx5 core.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:43:04 -07:00
Saeed Mahameed
5006eb221e net/mlx5e: Vxlan, rename struct mlx5e_vxlan to mlx5_vxlan_port
The name mlx5e_vxlan will be used in downstream patch to describe
mlx5 vxlan structure that will replace mlx5e_vxlan_db.

Hence we rename struct mlx5e_vxlan to mlx5_vxlan_port which describes a
mlx5 vxlan port.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:36:51 -07:00
Saeed Mahameed
dccea6bf38 net/mlx5e: Vxlan, move netdev only logic to en_main.c
Create a direct vxlan API to add and delete vxlan ports from HW.
+void mlx5e_vxlan_add_port(struct mlx5e_priv *priv, u16 port);
+void mlx5e_vxlan_del_port(struct mlx5e_priv *priv, u16 port);

And move vxlan_add/del_work to en_main.c since they are netdev only
logic.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:35:16 -07:00
Saeed Mahameed
0f647bfcd0 net/mlx5e: Vxlan, add direct delete function
Add direct vxlan delete function to be called from vxlan_delete_work.
Needed in downstream patch.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:35:14 -07:00
Gal Pressman
278d7f3dc0 net/mlx5e: Vxlan, cleanup an unused member in vxlan work
Cleanup the sa_family member of the vxlan work, it is unused/needed
anywhere in the code.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 15:30:32 -07:00
Gal Pressman
d30d8cde19 net/mlx5e: Vxlan, replace ports radix-tree with hash table
The VXLAN database is accessed in the data path for each VXLAN TX skb in
order to check whether the UDP port is being offloaded or not.
The number of elements in the database is relatively small, we can
simplify the radix-tree to a hash table and speedup the lookup process.

Measuring mlx5e_vxlan_lookup_port execution time:

                  Radix Tree   Hash Table
 --------------- ------------ ------------
  Single Stream   161 ns       79  ns (51% improvement)
  Multi Stream    259 ns       136 ns (47% improvement)

Measuring UDP stream packet rate, single fully utilized TX core:
Radix Tree: 498,300 PPS
Hash Table: 555,468 PPS (11% improvement)

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 13:56:44 -07:00
Gal Pressman
22a65aa8b1 net/mlx5e: Vxlan, check maximum number of UDP ports
The NIC has a limited number of offloaded VXLAN UDP ports (usually 4).
Instead of letting the firmware fail when trying to add more ports than
it can handle, let the driver check it on its own.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 13:56:44 -07:00
Gal Pressman
a082c4f4f0 net/mlx5e: Vxlan, reflect 4789 UDP port default addition to software database
The hardware offloads 4789 UDP port (default VXLAN port) automatically.
Add it to the software database as well in order to reflect the hardware
state appropriately.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 13:56:44 -07:00
Jia-Ju Bai
6ae5cbc418 net: nvidia: forcedeth: Replace GFP_ATOMIC with GFP_KERNEL in nv_probe()
nv_probe() is never called in atomic context.
It calls dma_alloc_coherent() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:45:14 -07:00
Jia-Ju Bai
d818c59a8f net: jme: Replace mdelay() with msleep() and usleep_range() in jme_wait_link()
jme_wait_link() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep() and usleep_range().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:45:14 -07:00
Jia-Ju Bai
89036f233a net: hisilicon: hns: Replace mdelay() with msleep()
hns_ppe_common_init_hw() and hns_xgmac_init() are never
called in atomic context.
They call mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:45:14 -07:00
Jia-Ju Bai
2bcd619e6e net: amd: pcnet32: Replace GFP_ATOMIC with GFP_KERNEL in pcnet32_alloc_ring()
pcnet32_alloc_ring() is never called in atomic context.
It calls kcalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:45:14 -07:00
Rahul Lakkireddy
27defe9d8f cxgb4: print ULD queue information managed by LLD
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:36:04 -07:00
Petr Machata
b2b1dab688 mlxsw: spectrum: Support ieee_setapp, ieee_delapp
The APP TLVs are used for communicating priority-to-protocol ID maps for
a given netdevice. Support the following APP TLVs:

- DSCP (selector 5) to configure priority-to-DSCP code point maps. Use
  these maps to configure packet priority on ingress, and DSCP code
  point rewrite on egress.

- Default priority (selector 1, PID 0) to configure priority for the
  DSCP code points that don't have one assigned by the DSCP selector. In
  future this could also be used for assigning default port priority
  when a packet arrives without DSCP tagging.

Besides setting up the maps themselves, also configure port trust level
and rewrite bits.

Port trust level determines whether, for a packet arriving through a
certain port, the priority should be determined based on PCP or DSCP
header fields. So far, mlxsw kept the device default of trust-PCP. Now,
as soon as the first DSCP APP TLV is configured, switch to trust-DSCP.
Only when all DSCP APP TLVs are removed, switch back to trust-PCP again.
Note that the default priority APP TLV doesn't impact the trust level
configuration.

Rewrite bits determine whether DSCP and PCP fields of egressing packets
should be updated according to switch priority. When port trust is
switched to DSCP, enable rewrite of DSCP field.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:17:50 -07:00
Petr Machata
55fb71f481 mlxsw: reg: Add QoS Priority to DSCP Mapping Register
This register controls mapping from Priority to DSCP for purposes of
rewrite. Note that rewrite happens as the packet is transmitted provided
that the DSCP rewrite bit is enabled for the packet.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:17:50 -07:00
Petr Machata
e67131d9b8 mlxsw: reg: Add QoS ReWrite Enable Register
This register configures the rewrite enable (whether PCP or DSCP value
in packet should be updated according to packet priority) per receive
port.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:17:50 -07:00
Petr Machata
746da42a1f mlxsw: reg: Add QoS Priority Trust State Register
The QPTS register controls the port policy to calculate the switch
priority and packet color based on incoming packet fields.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:17:50 -07:00
Petr Machata
02837d7267 mlxsw: reg: Add QoS Port DSCP to Priority Mapping Register
The QPDPM register controls the mapping from DSCP field to Switch
Priority for IP packets.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:17:50 -07:00
Krzysztof Kozlowski
5d258b48ef net: ethernet: Use existing define with polynomial
Do not define again the polynomial but use header with existing define.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-27 19:16:37 +08:00
Jakub Kicinski
17082566a9 nfp: bpf: improve map offload info messages
FW can put constraints on map element size to maximize resource
use and efficiency.  When user attempts offload of a map which
does not fit into those constraints an informational message is
printed to kernel logs to inform user about the reason offload
failed.  Map offload does not have access to any advanced error
reporting like verifier log or extack.  There is also currently
no way for us to nicely expose the FW capabilities to user
space.  Given all those constraints we should make sure log
messages are as informative as possible.  Improve them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
ab01f4ac5f nfp: bpf: remember maps by ID
Record perf maps by map ID, not raw kernel pointer.  This helps
with debug messages, because printing pointers to logs is frowned
upon, and makes debug easier for the users, as map ID is something
they should be more familiar with.  Note that perf maps are offload
neutral, therefore IDs won't be orphaned.

While at it use a rate limited print helper for the error message.

Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
0958762748 nfp: bpf: allow receiving perf events on data queues
Control queue is fairly low latency, and requires SKB allocations,
which means we can't even reach 0.5Msps with perf events.  Allow
perf events to be delivered to data queues.  This allows us to not
only use multiple queues, but also receive and deliver to user space
more than 5Msps per queue (Xeon E5-2630 v4 2.20GHz, no retpolines).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
20c5420421 nfp: bpf: pass raw data buffer to nfp_bpf_event_output()
In preparation for SKB-less perf event handling make
nfp_bpf_event_output() take buffer address and length,
not SKB as parameters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
79ca38e80c nfp: allow control message reception on data queues
Port id 0xffffffff is reserved for control messages.  Allow reception
of messages with this id on data queues.  Hand off a raw buffer to
the higher layer code, without allocating SKB for max efficiency.
The RX handle can't modify or keep the buffer, after it returns
buffer is handed back over to the NIC RX free buffer list.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
Jakub Kicinski
78a8b760e4 nfp: move repr handling on RX path
Representor packets are received on PF queues with special metadata tag
for demux.  There is no reason to resolve the representor ID -> netdev
after the skb has been allocated.  Move the code, this will allow us to
handle special FW messages without SKB allocation overhead.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-27 07:14:35 +02:00
David S. Miller
ecbcd689d7 mlx5e-updates-2018-07-26 (XDP redirect)
This series from Tariq adds the support for device-out XDP redirect.
 
 Start with a simple RX and XDP cleanups:
 - Replace call to MPWQE free with dealloc in interface down flow
 - Do not recycle RX pages in interface down flow
 - Gather all XDP pre-requisite checks in a single function
 - Restrict the combination of large MTU and XDP
 
 Since now XDP logic is going to be called from TX side as well,
 generic XDP TX logic is not RX only anymore, for that Tariq creates
 a new xdp.c file and moves XDP related code into it, and generalizes
 the code to support XDP TX for XDP redirect, such as the xdp tx sq
 structures and xdp counters.
 
 XDP redirect support:
 Add implementation for the ndo_xdp_xmit callback.
 
 Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
 requests.  These instances are totally separated from the existing
 XDP-SQ objects that satisfy local XDP_TX actions.
 
 Performance tests:
 
 xdp_redirect_map from ConnectX-5 to ConnectX-5.
 CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
 Packet-rate of 64B packets.
 
 Single queue: 7 Mpps.
 Multi queue: 55 Mpps.
 
 -Saeed.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbWk7YAAoJEEg/ir3gV/o+b+4H/RKfHrDgqRcz+qw2hYqMc/0Q
 N/s6SaZm/AZi63lOwUeeSGORJSGYxxbwcWFNuI6CZrHEupi0zQivDko+E3fTPeDQ
 3Y7jAaLjB0ZsWwEukcgZ1PG1QIYUg2/EBijG+aah9q/TuUfsFYJDRImlXRi5zvSe
 34yuLls6ctuoxoet5S5SVWFssdQZ2YpmmMNqVmQIRGDSmd03dhE1WUGUGmPhR5mK
 vxUBL5X3ucb1lcqxpFnZqsrIZphxkimX0q8qUK1yHk7VU4JXZgikPkRFM1mnlNt0
 YosvDlnDqJugBvYspBUx+9au8wxgPwpHEbnGKdS/npsIuNXa0ZFnEky1tp6CzV0=
 =3CNK
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-07-26 (XDP redirect)

This series from Tariq adds the support for device-out XDP redirect.

Start with a simple RX and XDP cleanups:
- Replace call to MPWQE free with dealloc in interface down flow
- Do not recycle RX pages in interface down flow
- Gather all XDP pre-requisite checks in a single function
- Restrict the combination of large MTU and XDP

Since now XDP logic is going to be called from TX side as well,
generic XDP TX logic is not RX only anymore, for that Tariq creates
a new xdp.c file and moves XDP related code into it, and generalizes
the code to support XDP TX for XDP redirect, such as the xdp tx sq
structures and xdp counters.

XDP redirect support:
Add implementation for the ndo_xdp_xmit callback.

Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
requests.  These instances are totally separated from the existing
XDP-SQ objects that satisfy local XDP_TX actions.

Performance tests:

xdp_redirect_map from ConnectX-5 to ConnectX-5.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet-rate of 64B packets.

Single queue: 7 Mpps.
Multi queue: 55 Mpps.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:33:24 -07:00
Gal Pressman
101f0cd4f2 net: ena: Fix use of uninitialized DMA address bits field
UBSAN triggers the following undefined behaviour warnings:
[...]
[   13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22
[   13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
[   13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4
[   13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

When splitting the address to high and low, GENMASK_ULL is used to generate
a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and
ena_com_add_single_rx_desc).
The problem is that dma_addr_bits is not initialized with a proper value
(besides being cleared in ena_com_create_io_queue).
Assign dma_addr_bits the correct value that is stored in ena_dev when
initializing the SQ.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Gal Pressman <pressmangal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:30:13 -07:00
Jia-Ju Bai
d8ad2f31f8 net: adaptec: Replace mdelay() with msleep() in starfire_init_one()
starfire_init_one() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:24:23 -07:00
YueHaibing
ff7b91262b net: hns: make hns_dsaf_roce_reset non static
hns_dsaf_roce_reset is exported and used in hns_roce_hw_v1.c
In commit 336a443bd9 ("net: hns: Make many functions static") I make
it static wrongly.

drivers/infiniband/hw/hns/hns_roce_hw_v1.o: In function `hns_roce_v1_reset':
hns_roce_hw_v1.c:(.text+0x37ac): undefined reference to `hns_dsaf_roce_reset'
hns_roce_hw_v1.c:(.text+0x37cc): undefined reference to `hns_dsaf_roce_reset'

Fixes: 336a443bd9 ("net: hns: Make many functions static")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:19:31 -07:00
Tariq Toukan
8ee4823356 net/mlx5e: TX, Use function to access sq_dma object in fifo
Use designated function mlx5e_dma_get() to get
the mlx5e_sq_dma object to be pushed into fifo.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:59 -07:00
Tariq Toukan
9a3956da1c net/mlx5e: TX, Move DB fields in TXQ-SQ struct
Pointers in DB are static, move them to read-only area so they
do not share a cacheline with fields modified in datapath.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:58 -07:00
Tariq Toukan
d3398a4f18 net/mlx5e: RX, Prefetch the xdp_frame data area
A loaded XDP program might write to the xdp_frame data area,
prefetchw() it to avoid a potential cache miss.

Performance tests:
ConnectX-5, XDP_TX packet rate, single ring.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Before: 13,172,976 pps
After:  13,456,248 pps
2% gain.

Fixes: 22f4539881 ("net/mlx5e: Support XDP over Striding RQ")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:57 -07:00
Tariq Toukan
58b99ee3e3 net/mlx5e: Add support for XDP_REDIRECT in device-out side
Add implementation for the ndo_xdp_xmit callback.

Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
requests.  These instances are totally separated from the existing
XDP-SQ objects that satisfy local XDP_TX actions.

Performance tests:

xdp_redirect_map from ConnectX-5 to ConnectX-5.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet-rate of 64B packets.

Single queue: 7 Mpps.
Multi queue: 55 Mpps.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:57 -07:00
Tariq Toukan
dac0d15fff net/mlx5e: Re-order fields of struct mlx5e_xdpsq
In the downstream patch that adds support to XDP_REDIRECT-out,
the XDP xmit frame function doesn't share the same run context as
the NAPI that polls the XDP-SQ completion queue.

Hence, need to re-order the XDP-SQ fields to avoid cacheline
false-sharing.

Take redirect_flush and doorbell out of DB, into separated
cachelines.

Add a cacheline breaker within the stats struct.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:56 -07:00
Tariq Toukan
890388ad6f net/mlx5e: Refactor XDP counters
Separate the XDP counters into two sets:
(1) One set reside in the RQ stats, and they monitor XDP stats
in the RQ side.
(2) Another set is per XDP-SQ, and they monitor XDP stats that
are related to XDP transmit flow.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:55 -07:00
Tariq Toukan
c94e4f117e net/mlx5e: Make XDP xmit functions more generic
Convert the XDP xmit functions to use the generic xdp_frame API
in XDP_TX flow.
Same functions will be used later in this series to transmit
the XDP redirect-out packets as well.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:55 -07:00
Tariq Toukan
86690b4b4a net/mlx5e: Add counter for XDP redirect in RX
Add per-ring and total stats for received packets that
goes into XDP redirection.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:54 -07:00
Tariq Toukan
159d213134 net/mlx5e: Move XDP related code into new XDP files
Take XDP code out of the general EN header and RX file into
new XDP files.

Currently, XDP-SQ resides only within an RQ and used from a
single flow (XDP_TX) triggered upon RX completions.
In a downstream patch, additional type of XDP-SQ instances will be
presented and used for the XDP_REDIRECT flow, totally unrelated to
the RX context.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:53 -07:00
Tariq Toukan
a26a5bdf3e net/mlx5e: Restrict the combination of large MTU and XDP
Add checks in control path upon an MTU change or an XDP program set,
to prevent reaching cases where large MTU and XDP are set simultaneously.

This is to make sure we allow XDP only with the linear RX memory scheme,
i.e. a received packet is not scattered to different pages.
Change mlx5e_rx_get_linear_frag_sz() accordingly, so that we make sure
the XDP configuration can really be set, instead of assuming that it is.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:53 -07:00
Tariq Toukan
0ec13877ce net/mlx5e: Gather all XDP pre-requisite checks in a single function
Dedicate a function to all checks done when setting an XDP program.
Take indications from priv instead of netdev features.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:52 -07:00
Tariq Toukan
cb5189d173 net/mlx5e: Do not recycle RX pages in interface down flow
Keep all page-pool recycle calls within NAPI context.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:51 -07:00
Tariq Toukan
afab995e06 net/mlx5e: Replace call to MPWQE free with dealloc in interface down flow
No need to expose the MPWQE free function to control path.
The dealloc function already exposed, use it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:51 -07:00
David S. Miller
6a8fab1794 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-07-26

This series contains updates to ixgbe and igb.

Tony fixes ixgbe to add checks to ensure jumbo frames or LRO get enabled
after an XDP program is loaded.

Shannon Nelson adds the missing security configuration registers to the
ixgbe register dump, which will help in debugging.

Christian Grönke fixes an issue in igb that occurs on SGMII based SPF
mdoules, by reverting changes from 2 previous patches.  The issue was
that initialization would fail on the fore mentioned modules because the
driver would try to reset the PHY before obtaining the PHY address of
the SGMII attached PHY.

Venkatesh Srinivas replaces wmb() with dma_wmb() for doorbell writes,
which avoids SFENCEs before the doorbell writes.

Alex cleans up and refactors ixgbe Tx/Rx shutdown to reduce time needed
to stop the device.  The code refactor allows us to take the completion
time into account when disabling queues, so that on some platforms with
higher completion times, would not result in receive queues disabled
messages.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 14:14:01 -07:00
YueHaibing
336a443bd9 net: hns: Make many functions static
Fixes the following sparse warning:

drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:73:20: warning: symbol 'hns_ae_get_handle' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:332:6: warning: symbol 'hns_ae_stop' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:360:6: warning: symbol 'hns_ae_toggle_ring_irq' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:580:6: warning: symbol 'hns_ae_update_stats' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:663:6: warning: symbol 'hns_ae_get_stats' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:695:6: warning: symbol 'hns_ae_get_strings' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:728:5: warning: symbol 'hns_ae_get_sset_count' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:774:6: warning: symbol 'hns_ae_update_led_status' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:786:5: warning: symbol 'hns_ae_cpld_set_led_id' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:798:6: warning: symbol 'hns_ae_get_regs' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:823:5: warning: symbol 'hns_ae_get_regs_len' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c:342:6: warning: symbol 'hns_gmac_update_stats' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c:934:12: warning: symbol 'hns_mac_get_vaddr' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c:953:5: warning: symbol 'hns_mac_get_cfg' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c:343:6: warning: symbol 'hns_dsaf_srst_chns' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c:366:1: warning: symbol 'hns_dsaf_srst_chns_acpi' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c:373:6: warning: symbol 'hns_dsaf_roce_srst' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c:387:6: warning: symbol 'hns_dsaf_roce_srst_acpi' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c:571:5: warning: symbol 'hns_mac_get_sfp_prsnt' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c:589:5: warning: symbol 'hns_mac_get_sfp_prsnt_acpi' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:31:12: warning: symbol 'g_dsaf_mode_match' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:45:5: warning: symbol 'hns_dsaf_get_cfg' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:962:6: warning: symbol 'hns_dsaf_tcam_addr_get' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:2087:6: warning: symbol 'hns_dsaf_port_work_rate_cfg' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:2837:5: warning: symbol 'hns_dsaf_roce_reset' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c:76:5: warning: symbol 'hns_ppe_common_get_cfg' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c:107:6: warning: symbol 'hns_ppe_common_free_cfg' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c:340:6: warning: symbol 'hns_ppe_uninit_ex' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c:708:5: warning: symbol 'hns_rcb_get_ring_num' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c:744:14: warning: symbol 'hns_rcb_common_get_vaddr' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c:314:6: warning: symbol 'hns_xgmac_update_stats' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1303:6: warning: symbol 'hns_nic_update_stats' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1585:6: warning: symbol 'hns_nic_poll_controller' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1938:6: warning: symbol 'hns_set_multicast_list' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_enet.c:1960:6: warning: symbol 'hns_nic_set_rx_mode' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:661:6: warning: symbol 'hns_get_ringparam' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:811:6: warning: symbol 'hns_get_channels' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:828:6: warning: symbol 'hns_get_ethtool_stats' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:886:6: warning: symbol 'hns_get_strings' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:976:5: warning: symbol 'hns_get_sset_count' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:1010:5: warning: symbol 'hns_phy_led_set' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:1032:5: warning: symbol 'hns_set_phys_id' was not declared. Should it be static?
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:1106:6: warning: symbol 'hns_get_regs' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 09:41:48 -07:00
tangpengpeng
7f3fc7ddf7 net: fix amd-xgbe flow-control issue
If we enable or disable xgbe flow-control by ethtool ,
it does't work.Because the parameter is not properly
assigned,so we need to adjust the assignment order
of the parameters.

Fixes: c1ce2f7736 ("amd-xgbe: Fix flow control setting logic")
Signed-off-by: tangpengpeng <tangpengpeng@higon.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 09:29:59 -07:00
Alexander Duyck
1918e937ca ixgbe: Refactor queue disable logic to take completion time into account
This change is meant to allow us to take completion time into account when
disabling queues. Previously we were just working with hard coded values
for how long we should wait. This worked fine for the standard case where
completion timeout was operating in the 50us to 50ms range, however on
platforms that have higher completion timeout times this was resulting in
Rx queues disable messages being displayed as we weren't waiting long
enough for outstanding Rx DMA completions.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:06 -07:00
Alexander Duyck
3b5f14b50e ixgbe: Reorder Tx/Rx shutdown to reduce time needed to stop device
This change is meant to help reduce the time needed to shutdown the
transmit and receive paths for the device. Specifically what we now do
after this patch is disable the transmit path first at the netdev level,
and then work on disabling the Rx. This way while we are waiting on the Rx
queues to be disabled the Tx queues have an opportunity to drain out.

In addition I have dropped the 10ms timeout that was left in the ixgbe_down
function that seems to have been carried through from back in e1000 as far
as I can tell. We shouldn't need it since we don't actually disable the Tx
until much later and we have additional logic in place for verifying the Tx
queues have been disabled.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:06 -07:00
Venkatesh Srinivas
73017f4e05 igb: Use dma_wmb() instead of wmb() before doorbell writes
igb writes to doorbells to post transmit and receive descriptors;
after writing descriptors to memory but before writing to doorbells,
use dma_wmb() rather than wmb(). wmb() is more heavyweight than
necessary before doorbell writes.

On x86, this avoids SFENCEs before doorbell writes in both the
tx and rx refill paths.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Christian Grönke
2a83fba6ca igb: Remove superfluous reset to PHY and page 0 selection
This patch reverts two previous applied patches to fix an issue
that appeared when using SGMII based SFP modules. In the current
state the driver will try to reset the PHY before obtaining the
phy_addr of the SGMII attached PHY. That leads to an error in
e1000_write_phy_reg_sgmii_82575. Causing the initialization to
fail:

    igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
    igb: Copyright (c) 2007-2014 Intel Corporation.
    igb: probe of ????:??:??.? failed with error -3

The patches being reverted are:

    commit 1827853354
    Author: Aaron Sierra <asierra@xes-inc.com>
    Date:   Tue Nov 29 10:03:56 2016 -0600

        igb: reset the PHY before reading the PHY ID

    commit 440aeca4b9
    Author: Matwey V Kornilov <matwey@sai.msu.ru>
    Date:   Thu Nov 24 13:32:48 2016 +0300

         igb: Explicitly select page 0 at initialization

The first reverted patch directly causes the problem mentioned above.
In case of SGMII the phy_addr is not known at this point and will
only be obtained by 'igb_get_phy_id_82575' further down in the code.
The second removed patch selects forces selection of page 0 in the
PHY. Something that the reset tries to address as well.

As pointed out by Alexander Duzck, the patch below fixes the same
issue but in the proper location:

    commit 4e684f59d7
    Author: Chris J Arges <christopherarges@gmail.com>
    Date:   Wed Nov 2 09:13:42 2016 -0500

        igb: Workaround for igb i210 firmware issue

Reverts: 440aeca4b9.
Reverts: 1827853354.

Signed-off-by: Christian Grönke <c.groenke@infodas.de>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Shannon Nelson
7f6cdbdafb ixgbe: add ipsec security registers into ethtool register dump
Add the ixgbe's security configuration registers into
the register dump.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Tony Nguyen
38b7e7f8ae ixgbe: Do not allow LRO or MTU change with XDP
XDP does not support jumbo frames or LRO.  These checks are being made
outside the driver when an XDP program is loaded, however, there is
nothing preventing these from changing after an XDP program is loaded.
Add the checks so that while an XDP program is loaded, do not allow MTU
to be changed or LRO to be enabled.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Arjun Vynipadath
942a656f1f cxgb4: Added missing break in ndo_udp_tunnel_{add/del}
Break statements were missing for Geneve case in
ndo_udp_tunnel_{add/del}, thereby raw mac matchall
entries were not getting added.

Fixes: c746fc0e8b2d("cxgb4: add geneve offload support for T6")
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 22:38:49 -07:00
Jakub Kicinski
5ea14712d7 nfp: protect from theoretical size overflows on HW descriptor ring
Use array_size() and store the size as full size_t to protect from
theoretical size overflow when handling HW descriptor rings.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 22:17:44 -07:00
Jakub Kicinski
e76c1d3d2a nfp: restore correct ordering of fields in rx ring structure
Commit 7f1c684a89 ("nfp: setup xdp_rxq_info") mixed the cache
cold and cache hot data in the nfp_net_rx_ring structure (ignoring
the feedback), to try to fit the structure into 2 cache lines
after struct xdp_rxq_info was added.  Now that we are about to add
a new field the structure will grow back to 3 cache lines, so
order the members correctly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 22:17:44 -07:00
Jakub Kicinski
4662717038 nfp: use kvcalloc() to allocate SW buffer descriptor arrays
Use kvcalloc() instead of tmp variable + kzalloc() when allocating
SW buffer information to allow falling back to vmalloc and to protect
from theoretical integer overflow.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 22:17:44 -07:00
Jakub Kicinski
5b0ced17ed nfp: don't fail probe on pci_sriov_set_totalvfs() errors
On machines with buggy ACPI tables or when SR-IOV is already enabled
we may not be able to set the SR-IOV VF limit in sysfs, it's not fatal
because the limit is imposed by the driver anyway.  Only the sysfs
'sriov_totalvfs' attribute will be too high.  Print an error to inform
user about the failure but allow probe to continue.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 22:17:44 -07:00
YueHaibing
b24dbfe9ce amd-xgbe: use dma_mapping_error to check map errors
The dma_mapping_error() returns true or false, but we want
to return -ENOMEM if there was an error.

Fixes: 174fd2597b ("amd-xgbe: Implement split header receive support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 21:17:17 -07:00
Ido Schimmel
a0a777b940 mlxsw: spectrum_acl: Start using A-TCAM
Now that all the pieces are in place we can start using the A-TCAM
instead of only using the C-TCAM. This allows for much higher scale and
better performance (to be improved further by follow-up patch sets).

Perform the integration with the A-TCAM and the eRP core by reverting
the changes introduced by "mlxsw: spectrum_acl: Enable C-TCAM only mode
in eRP core" and add calls from the C-TCAM code into the eRP core.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:02 -07:00
Ido Schimmel
a8758b67bf mlxsw: spectrum_acl: Add A-TCAM rule insertion and deletion
Implement rule insertion and deletion into the A-TCAM before we flip the
driver to start using the A-TCAM.

Rule insertion into the A-TCAM is very similar to C-TCAM, but there are
subtle differences between regions of different sizes (i.e., different
number of key blocks).

Specifically, as explained in "mlxsw: spectrum_acl: Allow encoding a
partial key", in 12 key blocks regions a rule is split into two and the
two halves of the rule are linked using a "large entry key ID".

Such differences are abstracted away by using different region
operations per region type.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:02 -07:00
Ido Schimmel
a20ff8eb3f mlxsw: spectrum_acl: Pass C-TCAM region and entry to insert function
When A-TCAM will be used together with C-TCAM, the C-TCAM code will need
to call into the eRP core in order to get an eRP for an inserted entry.

The eRP core takes an A-TCAM region as one of its arguments, so pass the
C-TCAM region to the insertion function which will later allow us to
derive the A-TCAM region, given it contains the C-TCAM one.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
6d240650bc mlxsw: spectrum_acl: Add A-TCAM region initialization
Before we start using the A-TCAM we need to make sure the region is
properly initialized.

This includes the setting of its type (which affects the size of its eRP
table, for example) and its registration with the eRP core.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
f58df510f8 mlxsw: spectrum_acl: Make global TCAM resources available to regions
Each TCAM region currently uses its own resources and there is no
sharing between the different regions.

This is going to change with A-TCAM as each region will need to allocate
an eRP table from the global eRP tables array.

Make the global TCAM resources available to each region by passing the
TCAM private data to the region initialization routine.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
57e56d3699 mlxsw: spectrum_acl: Encapsulate C-TCAM region in A-TCAM region
In Spectrum-2 the C-TCAM is only used for rules that can't fit in the
A-TCAM due to a limited number of masks per A-TCAM region.

In addition, rules inserted into the C-TCAM may affect rules residing in
the A-TCAM, by clearing their C-TCAM prune bit.

The two regions are thus closely related and can be thought of as if the
C-TCAM region is encapsulated in the A-TCAM one.

Change the data structures to reflect that before introducing A-TCAM
support and make C-TCAM region initialization part of the A-TCAM region
initialization sequence.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
174c0adb69 mlxsw: spectrum_acl: Add A-TCAM initialization
Initialize the A-TCAM as part of the driver's initialization routine.

Specifically, initialize the eRP tables so that A-TCAM regions will be
able to perform allocations of eRP tables upon rule insertion in
subsequent patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
ca49544ed6 mlxsw: spectrum_acl: Allow encoding a partial key
When working with 12 key blocks in the A-TCAM, rules are split into two
records, which constitute two lookups. The two records are linked using
a "large entry key ID". The ID is assigned to key blocks 6 to 11 and
resolved during the first lookup. The second lookup is performed using
the ID and the remaining key blocks.

Allow encoding a partial key so that it can be later used to check if an
ID can be reused.

This is done by adding two arguments to the existing encode function
that specify the range of the block indexes we would like to encode. The
key and mask arguments become optional, as we will not need to encode
both of them all the time.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
befc7747df mlxsw: spectrum_acl: Extend Spectrum-2 region struct
In a similar fashion to Spectrum-1's region struct, Spectrum-2's struct
needs to store a pointer to the common region struct.

The pointer will be used in follow-up patches that implement rules
insertion and deletion.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
b17b113e0c mlxsw: spectrum_acl: Add support for C-TCAM eRPs
The number of eRPs that can be used by a single A-TCAM region is limited
to 16. When more eRPs are needed, an ordinary circuit TCAM (C-TCAM) can
be used to hold the extra eRPs.

Unlike the A-TCAM, only a single (last) lookup is performed in the
C-TCAM and not a lookup per-eRP. However, modeling the C-TCAM as extra
eRPs will allow us to easily introduce support for pruning in a
follow-up patch set and is also logically correct.

The following diagram depicts the relation between both TCAMs:
                                                                 C-TCAM
+-------------------+               +--------------------+    +-----------+
|                   |               |                    |    |           |
|  eRP #1 (A-TCAM)  +----> ... +----+  eRP #16 (A-TCAM)  +----+  eRP #17  |
|                   |               |                    |    |    ...    |
+-------------------+               +--------------------+    |  eRP #N   |
                                                              |           |
                                                              +-----------+
Lookup order is from left to right.

Extend the eRP core APIs with a C-TCAM parameter which indicates whether
the requested eRP is to be used with the C-TCAM or not.

Since the C-TCAM is only meant to absorb rules that can't fit in the
A-TCAM due to exceeded number of eRPs or key collision, an error is
returned when a C-TCAM eRP needs to be created when the eRP state
machine is in its initial state (i.e., 'no masks'). This should only
happen in the face of very unlikely errors when trying to push rules
into the A-TCAM.

In order not to perform unnecessary lookups, the eRP core will only
enable a C-TCAM lookup for a given region if it knows there are C-TCAM
eRPs present.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
c19df1d88d mlxsw: spectrum_acl: Enable C-TCAM only mode in eRP core
Currently, no calls are performed into the eRP core, but in order to
make review easier we would like to gradually add these calls.

Have the eRP core initialize a region's master mask to all ones and
allow it to use an empty eRP table. This directs the lookup to the
C-TCAM and allows the C-TCAM only mode to continue working.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
f465261aa1 mlxsw: spectrum_acl: Implement common eRP core
When rules are inserted into the A-TCAM they are associated with a mask,
which is part of the lookup key: { masked key, mask ID, region ID }.

These masks are called rule patterns (RP) and the aggregation of several
masks into one (to be introduced in follow-up patch sets) is called an
extended RP (eRP).

When a packet undergoes a lookup in an ACL region it is masked by the
current set of eRPs used by the region, looking for an exact match.
Eventually, the rule with the highest priority is picked.

These eRPs are stored in several global banks to allow for lookup to
occur using several eRPs simultaneously.

At first, an ACL region will only require a single mask - upon the
insertion of the first rule. In this case, the region can use the
"master RP" which is composed by OR-ing all the masks used by the
region. This mask is a property of the region and thus there is no need
to use the above mentioned banks.

At some point, a second mask will be needed. In this case, the region
will need to allocate an eRP table from the above mentioned banks and
insert its masks there.

>From now on, upon lookup, the eRP table used by the region will be
fetched from the eRP banks - using {eRP bank, Index within the bank} -
and the eRPs present in the table will be used to mask the packet. Note
that masks with consecutive indexes are inserted into consecutive banks.

When rules are deleted and a region only needs a single mask once again
it can free its eRP table and use the master RP.

The above logic is implemented in the eRP core and represented using the
following state machine:

    +------------+   create mask - as master RP   +---------------+
    |            +-------------------------------->               |
    |  no masks  |                                |  single mask  |
    |            <--------------------------------+               |
    +------------+          delete mask           +-----+--^------+
                                                        |  |
                                                        |  |
                                  create mask -         |  |  delete mask -
    create mask                   transition to use eRP |  |  transition to
     +--------+                   table                 |  |  use master RP
     |        |                                         |  |
     |        |                                         |  |
+----v--------+----+         create mask           +----v--+-----+
|                  <-------------------------------+             |
|  multiple masks  |                               |  two masks  |
|                  +------------------------------->             |
+------------------+      delete mask - if two     +-------------+
                          remaining

The code that actually configures rules in the A-TCAM will interface
with the eRP core by getting or putting an eRP based on the required
mask used by the rule.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
489142eca9 mlxsw: resources: Add Spectrum-2 eRP resources
Add the following resources to be used by A-TCAM code:
* Maximum number of eRP banks
* Maximum size of eRP bank
* Number of eRP entries required for a 2/4/8/12 key blocks mask

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
541e249cdc mlxsw: resources: Add Spectrum-2 maximum large key ID resource
Add a resource to make sure we do not exceed the maximum number of
supported large key IDs in a region.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
8c0d1cdd05 mlxsw: reg: Add Policy-Engine eRP Table Register
The register is used to add and delete eRPs from the eRP table.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
aecefac903 mlxsw: reg: Add Policy-Engine TCAM Entry Register Version 3
The register is used to configure rules in the A-TCAM.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Ido Schimmel
91329e27f3 mlxsw: reg: Prepare PERERP register for A-TCAM usage
Before introducing A-TCAM support we need to make sure all the necessary
fields are configurable and not hard coded to values that worked for the
C-TCAM only use case.

This includes - for example - the ability to configure the eRP table
used by the TCAM region.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:46:01 -07:00
Wei Yongjun
41147bb18a lan743x: Make symbol lan743x_pm_ops static
Fixes the following sparse warning:

drivers/net/ethernet/microchip/lan743x_main.c:2944:25: warning:
 symbol 'lan743x_pm_ops' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:37:55 -07:00
Tariq Toukan
7cc77bf4c2 net/mlx4_core: Allow MTTs starting at any index
Allow obtaining MTTs starting at any index,
thus give a better cache utilization.

For this, allow setting log_mtts_per_seg to 0, and use
this in default.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Anaty Rahamim Bar Kat <anaty@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:30:38 -07:00
Or Gerlitz
bcef735c59 net/mlx5e: Offload TC matching on tos/ttl for ip tunnels
Enable offloading of TC matching on tos/ttl for ipv4/6 tunnels.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:28:57 -07:00
Or Gerlitz
f35f800d35 net/mlx5e: Support setup of tos and ttl for tunnel key TC action offload
Use the values provided by user-space for the encapsulation headers.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:28:57 -07:00
Or Gerlitz
6360cd625e net/mlx5e: Use ttl from route lookup on tc encap offload only if needed
Currnetly, the ttl for the encapsulation headers is taken from the
route lookup result. As a pre-step to allow for an offload case when
the user specifies the ttl, take it from the route lookup only if
not zero. While here, also move to use u8 instead int for the ttl.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 16:28:57 -07:00
dann frazier
7856e86162 hinic: Link the logical network device to the pci device in sysfs
Otherwise interfaces get exposed under /sys/devices/virtual, which
doesn't give udev the context it needs for PCI-based predictable
interface names.

Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 13:54:52 -07:00
Bjorn Helgaas
bc510d59cf vxge: Remove unnecessary include of <linux/pci_hotplug.h>
The vxge driver doesn't need anything provided by pci_hotplug.h, so remove
the unnecessary include of it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 13:52:38 -07:00
Krzysztof Kozlowski
d805f6a868 net: ethernet: fs-enet: Use generic CRC32 implementation
Use generic kernel CRC32 implementation because it:
1. Should be faster (uses lookup tables),
2. Removes duplicated CRC generation code,
3. Uses well-proven algorithm instead of coding it one more time.

Suggested-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 13:40:39 -07:00
Krzysztof Kozlowski
16f6e9835b net: ethernet: freescale: Use generic CRC32 implementation
Use generic kernel CRC32 implementation because it:
1. Should be faster (uses lookup tables),
2. Removes duplicated CRC generation code,
3. Uses well-proven algorithm instead of coding it one more time.

Suggested-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25 13:40:25 -07:00
David S. Miller
19725496da Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2018-07-24 19:21:58 -07:00
Shubhrajyoti Datta
03bc7cab7d net: axienet: Fix double deregister of mdio
If the registration fails then mdio_unregister is called.
However at unbind the unregister ia attempted again resulting
in the below crash

[   73.544038] kernel BUG at drivers/net/phy/mdio_bus.c:415!
[   73.549362] Internal error: Oops - BUG: 0 [#1] SMP
[   73.554127] Modules linked in:
[   73.557168] CPU: 0 PID: 2249 Comm: sh Not tainted 4.14.0 #183
[   73.562895] Hardware name: xlnx,zynqmp (DT)
[   73.567062] task: ffffffc879e41180 task.stack: ffffff800cbe0000
[   73.572973] PC is at mdiobus_unregister+0x84/0x88
[   73.577656] LR is at axienet_mdio_teardown+0x18/0x30
[   73.582601] pc : [<ffffff80085fa4cc>] lr : [<ffffff8008616858>]
pstate: 20000145
[   73.589981] sp : ffffff800cbe3c30
[   73.593277] x29: ffffff800cbe3c30 x28: ffffffc879e41180
[   73.598573] x27: ffffff8008a21000 x26: 0000000000000040
[   73.603868] x25: 0000000000000124 x24: ffffffc879efe920
[   73.609164] x23: 0000000000000060 x22: ffffffc879e02000
[   73.614459] x21: ffffffc879e02800 x20: ffffffc87b0b8870
[   73.619754] x19: ffffffc879e02800 x18: 000000000000025d
[   73.625050] x17: 0000007f9a719ad0 x16: ffffff8008195bd8
[   73.630345] x15: 0000007f9a6b3d00 x14: 0000000000000010
[   73.635640] x13: 74656e7265687465 x12: 0000000000000030
[   73.640935] x11: 0000000000000030 x10: 0101010101010101
[   73.646231] x9 : 241f394f42533300 x8 : ffffffc8799f6e98
[   73.651526] x7 : ffffffc8799f6f18 x6 : ffffffc87b0ba318
[   73.656822] x5 : ffffffc87b0ba498 x4 : 0000000000000000
[   73.662117] x3 : 0000000000000000 x2 : 0000000000000008
[   73.667412] x1 : 0000000000000004 x0 : ffffffc8799f4000
[   73.672708] Process sh (pid: 2249, stack limit = 0xffffff800cbe0000)

Fix the same by making the bus NULL on unregister.

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 16:38:01 -07:00
Sudarsana Reddy Kalluru
ae2dcb28c2 bnx2x: Fix invalid memory access in rss hash config path.
Rx hash/filter table configuration uses rss_conf_obj to configure filters
in the hardware. This object is initialized only when the interface is
brought up.
This patch adds driver changes to configure rss params only when the device
is in opened state. In port disabled case, the config will be cached in the
driver structure which will be applied in the successive load path.

Please consider applying it to 'net' branch.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 16:33:02 -07:00
Jack Morgenstein
958c696f5a net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
Function mlx4_RST2INIT_QP_wrapper saved the qp number passed in the qp
context, rather than the one passed in the input modifier.

However, the qp number in the qp context is not defined as a
required parameter by the FW. Therefore, drivers may choose to not
specify the qp number in the qp context for the reset-to-init transition.

Thus, we must save the qp number passed in the command input modifier --
which is always present. (This saved qp number is used as the input
modifier for command 2RST_QP when a slave's qp's are destroyed).

Fixes: c82e9aa0a8 ("mlx4_core: resource tracking for HCA resources used by guests")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 16:30:48 -07:00
Heiner Kallweit
18041b5236 r8169: restore previous behavior to accept BIOS WoL settings
Commit 7edf6d314c tried to resolve an inconsistency (BIOS WoL
settings are accepted, but device isn't wakeup-enabled) resulting
from a previous broken-BIOS workaround by making disabled WoL the
default.
This however had some side effects, most likely due to a broken BIOS
some systems don't properly resume from suspend when the MagicPacket
WoL bit isn't set in the chip, see
https://bugzilla.kernel.org/show_bug.cgi?id=200195
Therefore restore the WoL behavior from 4.16.

Reported-by: Albert Astals Cid <aacid@kde.org>
Fixes: 7edf6d314c ("r8169: disable WOL per default")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 14:06:39 -07:00
Jason Gunthorpe
eda98779f7 Merge branch 'mellanox/mlx5-next' into rdma.git for-next
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

This is required to resolve dependencies of the next series of RDMA
patches.

* branch 'mellanox/mlx5-next':
  net/mlx5: Add support for flow table destination number
  net/mlx5: Add forward compatible support for the FTE match data
  net/mlx5: Fix tristate and description for MLX5 module
  net/mlx5: Better return types for CQE API
  net/mlx5: Use ERR_CAST() instead of coding it
  net/mlx5: Add missing SET_DRIVER_VERSION command translation
  net/mlx5: Add XRQ commands definitions
  net/mlx5: Add core support for double vlan push/pop steering action
  net/mlx5: Expose MPEGC (Management PCIe General Configuration) structures
  net/mlx5: FW tracer, add hardware structures
  net/mlx5: fix uaccess beyond "count" in debugfs read/write handlers

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24 13:10:23 -06:00
Rahul Lakkireddy
ae2a922fae cxgb4: move Tx/Rx free pages collection to common code
This information needs to be collected in vmcore device dump as well.
So, move to common code.

Fixes: fa145d5dfd ("cxgb4: display number of rx and tx pages free")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 10:12:22 -07:00
Rahul Lakkireddy
9d0f180cd5 cxgb4: collect number of free PSTRUCT page pointers
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 10:12:21 -07:00
Nir Dotan
27c203cd14 mlxsw: spectrum_flower: Add extack messages
Return extack messages in order to explain failures
of unsupported actions, keys and invalid user input.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 10:10:33 -07:00
Nir Dotan
af1fe78643 mlxsw: spectrum_acl: Add extack messages
Return extack messages for failures in action set creation.
Messages provide reasons for not being able to implement
the action in HW.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 10:10:33 -07:00
Nir Dotan
9c10812afe mlxsw: core_acl_flex_actions: Add extack messages
Return extack messages for failures in action set creation.
Errors may occur when action is not currently supported or due
to lack of resources.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 10:10:33 -07:00
Nir Dotan
ad7769ca2d mlxsw: spectrum_acl: Propagate extack pointer
Propagate extack pointer in order to add extack messages for ACL.
In the follow-up patches, appropriate messages will be added
in various points.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24 10:10:33 -07:00
Yishai Hadas
664000b6bb net/mlx5: Add support for flow table destination number
Add support to set a destination from a flow table number.
This functionality will be used in downstream patches from this
series by the DEVX stuff.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-24 08:51:20 +03:00
Yishai Hadas
2aada6c0c9 net/mlx5: Add forward compatible support for the FTE match data
Use the PRM size including the reserved when working with the FTE
match data.

This comes to support forward compatibility for cases that current
reserved data will be exposed by the firmware by an application that
uses the DEVX API without changing the kernel.

Also drop some driver checks around the match criteria leaving the work
for firmware to enable forward compatibility for future bits there.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-24 08:50:30 +03:00
Jiri Pirko
e2f2a1fd5b mlxsw: spectrum: Implement chain template hinting
Since cld_flower provides information about the filter template for
specific chain, use this information in order to prepare a region.
Use the template to find out what elements are going to be used
and pass that down to mlxsw_sp_acl_tcam_group_add(). Later on, when the
first filter is inserted, the mlxsw_sp_acl_tcam_group_use_patterns()
function would use this element usage information instead of looking
up a pattern.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 20:44:13 -07:00
Ivan Khoronzhuk
4b4255ed06 net: ethernet: ti: cpsw: restore shaper configuration while down/up
Need to restore shapers configuration after interface was down/up.
This is needed as appropriate configuration is still replicated in
kernel settings. This only shapers context restore, so vlan
configuration should be restored by user if needed, especially for
devices with one port where vlan frames are sent via ALE.

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk
57d9014825 net: ethernet: ti: cpsw: add CBS Qdisc offload
The cpsw has up to 4 FIFOs per port and upper 3 FIFOs can feed rate
limited queue with shaping. In order to set and enable shaping for
those 3 FIFOs queues the network device with CBS qdisc attached is
needed. The CBS configuration is added for dual-emac/single port mode
only, but potentially can be used in switch mode also, based on
switchdev for instance.

Despite the FIFO shapers can work w/o cpdma level shapers the base
usage must be in combine with cpdma level shapers as described in TRM,
that are set as maximum rates for interface queues with sysfs.

One of the possible configuration with txq shapers and CBS shapers:

                      Configured with echo RATE >
                  /sys/class/net/eth0/queues/tx-0/tx_maxrate
             /---------------------------------------------------
            /
           /            cpdma level shapers
        +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+
        | c7 | | c6 | | c5 | | c4 | | c3 | | c2 | | c1 | | c0 |
        \    / \    / \    / \    / \    / \    / \    / \    /
         \  /   \  /   \  /   \  /   \  /   \  /   \  /   \  /
          \/     \/     \/     \/     \/     \/     \/     \/
+---------|------|------|------|-------------------------------------+
|    +----+      |      |  +---+                                     |
|    |      +----+      |  |                                         |
|    v      v           v  v                                         |
| +----+ +----+ +----+ +----+ p        p+----+ +----+ +----+ +----+  |
| |    | |    | |    | |    | o        o|    | |    | |    | |    |  |
| | f3 | | f2 | | f1 | | f0 | r  CPSW  r| f3 | | f2 | | f1 | | f0 |  |
| |    | |    | |    | |    | t        t|    | |    | |    | |    |  |
| \    / \    / \    / \    / 0        1\    / \    / \    / \    /  |
|  \  X   \  /   \  /   \  /             \  /   \  /   \  /   \  /   |
|   \/ \   \/     \/     \/               \/     \/     \/     \/    |
+-------\------------------------------------------------------------+
         \
          \ FIFO shaper, set with CBS offload added in this patch,
           \ FIFO0 cannot be rate limited
            ------------------------------------------------------

CBS shaper configuration is supposed to be used with root MQPRIO Qdisc
offload allowing to add sk_prio->tc->txq maps that direct traffic to
appropriate tx queue and maps L2 priority to FIFO shaper.

The CBS shaper is intended to be used for AVB where L2 priority
(pcp field) is used to differentiate class of traffic. So additionally
vlan needs to be created with appropriate egress sk_prio->l2 prio map.

If CBS has several tx queues assigned to it, the sum of their
bandwidth has not overlap bandwidth set for CBS. It's recomended the
CBS bandwidth to be a little bit more.

The CBS shaper is configured with CBS qdisc offload interface using tc
tool from iproute2 packet.

For instance:

$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1

$ tc -g class show dev eth0
+---(100:ffe2) mqprio
|    +---(100:3) mqprio
|    +---(100:4) mqprio
|    
+---(100:ffe1) mqprio
|    +---(100:2) mqprio
|    
+---(100:ffe0) mqprio
     +---(100:1) mqprio

$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1440 \
hicredit 60 sendslope -960000 idleslope 40000 offload 1

$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
hicredit 62 sendslope -980000 idleslope 20000 offload 1

The above code set CBS shapers for tc0 and tc1, for that txq0 and
txq1 is used. Pay attention, the real set bandwidth can differ a bit
due to discreteness of configuration parameters.

Here parameters like locredit, hicredit and sendslope are ignored
internally and are supposed to be set with assumption that maximum
frame size for frame - 1500.

It's supposed that interface speed is not changed while reconnection,
not always is true, so inform user in case speed of interface was
changed, as it can impact on dependent shapers configuration.

For more examples see Documentation.

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk
7929a66871 net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
That's possible to offload vlan to tc priority mapping with
assumption sk_prio == L2 prio.

Example:
$ ethtool -L eth0 rx 1 tx 4

$ qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1

$ tc -g class show dev eth0
+---(100:ffe2) mqprio
|    +---(100:3) mqprio
|    +---(100:4) mqprio
|    
+---(100:ffe1) mqprio
|    +---(100:2) mqprio
|    
+---(100:ffe0) mqprio
     +---(100:1) mqprio

Here, 100:1 is txq0, 100:2 is txq1, 100:3 is txq2, 100:4 is txq3
txq0 belongs to tc0, txq1 to tc1, txq2 and txq3 to tc2
The offload part only maps L2 prio to classes of traffic, but not
to transmit queues, so to direct traffic to traffic class vlan has
to be created with appropriate egress map.

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk
4bb6c356a0 net: ethernet: ti: cpdma: fit rated channels in backward order
According to TRM tx rated channels should be in 7..0 order,
so correct it.

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk
79b3325d0d net: ethernet: ti: cpsw: use cpdma channels in backward order for txq
The cpdma channel highest priority is from hi to lo number.
The driver has limited number of descriptors that are shared between
number of cpdma channels. Number of queues can be tuned with ethtool,
that allows to not spend descriptors on not needed cpdma channels.
In AVB usually only 2 tx queues can be enough with rate limitation.
The rate limitation can be used only for hi priority queues. Thus, to
use only 2 queues the 8 has to be created. It's wasteful.

So, in order to allow using only needed number of rate limited
tx queues, save resources, and be able to set rate limitation for
them, let assign tx cpdma channels in backward order to queues.

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 20:34:36 -07:00
Boris Pismenny
3f44899ef2 net/mlx5e: Use PARTIAL_GSO for UDP segmentation
This patch removes the splitting of UDP_GSO_L4 packets in the driver,
and exposes UDP_GSO_L4 as a PARTIAL_GSO feature. Thus, the network stack
is not responsible for splitting the packet into two.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Jianbo Liu
cc495188a8 net/mlx5e: Support offloading double vlan push/pop tc actions
As we can configure two push/pop actions in one flow table entry,
add support to offload those double vlan actions in a rule to HW.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Jianbo Liu
1482bd3d50 net/mlx5e: Refactor tc vlan push/pop actions offloading
Extract actions offloading code to a new function, and also extend data
structures for double vlan actions.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Jianbo Liu
699e96ddf4 net/mlx5e: Support offloading tc double vlan headers match
We can match on both outer and inner vlan tags, add support for
offloading that.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Roi Dayan
c7f7ba8df8 net/mlx5e: Remove redundant WARN when we cannot find neigh entry
It is possible for neigh entry not to exist if it was cleaned already.
When we bring down an interface the neigh gets deleted but it could be
that our listener for neigh event to clear the encap valid bit didn't
start yet and the neigh update last used work is started first.
In this scenario the encap entry has valid bit set but the neigh entry
doesn't exist.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Saeed Mahameed
3101d1fc6b net/mlx5: FW tracer, Add debug prints
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Feras Daoud
244069532f net/mlx5: FW tracer, Enable tracing
Add the tracer file to the makefile and add the init
function to the load one flow.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Feras Daoud
70dd6fdb89 net/mlx5: FW tracer, parse traces and kernel tracing support
For each message the driver should do the following:
1- Find the message string in the strings database
2- Count the param number of each message
3- Wait for the param events and accumulate them
4- Calculate the event timestamp using the local event timestamp
and the first timestamp event following it.
5- Print message to trace log

Enable the tracing by:
echo 1 > /sys/kernel/debug/tracing/events/mlx5/mlx5_fw/enable

Read traces by:
cat /sys/kernel/debug/tracing/trace

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Feras Daoud
c71ad41ccb net/mlx5: FW tracer, events handling
The tracer has one event, event 0x26, with two subtypes:
- Subtype 0: Ownership change
- Subtype 1: Traces available

An ownership change occurs in the following cases:
1- Owner releases his ownership, in this case, an event will be
sent to inform others to reattempt acquire ownership.
2- Ownership was taken by a higher priority tool, in this case
the owner should understand that it lost ownership, and go through
tear down flow.

The second subtype indicates that there are traces in the trace buffer,
in this case, the driver polls the tracer buffer for new traces, parse
them and prepares the messages for printing.

The HW starts tracing from the first address in the tracer buffer.
Driver receives an event notifying that new trace block exists.
HW posts a timestamp event at the last 8B of every 256B block.
Comparing the timestamp to the last handled timestamp would indicate
that this is a new trace block. Once the new timestamp is detected,
the entire block is considered valid.

Block validation and parsing, should be done after copying the current
block to a different location, in order to avoid block overwritten
during processing.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Saeed Mahameed
e9cad2cea7 net/mlx5: FW tracer, register log buffer memory key
Create a memory key and protection domain for the tracer log buffer.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Feras Daoud
48967ffdeb net/mlx5: FW tracer, create trace buffer and copy strings database
For each PF do the following:
1- Allocate memory for the tracer strings database and read the
strings from the FW to the SW. These strings will be used later for
parsing traces.
2- Allocate and dma map tracer buffers.

Traces that will be written into the buffer will be parsed as a group
of one or more traces, referred to as trace message. The trace message
represents a C-like printf string.
First trace of a message holds the pointer to the correct string in
strings database. The following traces holds the variables of the
message.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Feras Daoud
f53aaa31cc net/mlx5: FW tracer, implement tracer logic
Implement FW tracer logic and registers access, initialization and
cleanup flows.

Initializing the tracer will be part of load one flow, as multiple
PFs will try to acquire ownership but only one will succeed and will
be the tracer owner.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 15:01:11 -07:00
Saeed Mahameed
7854ac44fe Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
mlx5 core infrastructure updates and fixes.

From Eran:
 - Add MPEGC (Management PCIe General Configuration) registers and btis
 - Fix tristate and description for MLX5 module

rom Feras:
 - Add hardware structures for the firmware tracer

From Jainbo:
 - Core support for double vlan push/pop steering action

From Max:
 - Add XRQ commands definitions

From Noa:
 - Add missing SET_DRIVER_VERSION command translation

From Roi:
 - Use ERR_CAST() instead of coding it

From Tariq:
 - Better return types for CQE API

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-23 14:58:46 -07:00
Bryan Whitehead
43e8fe9b84 lan743x: Add RSS support
Implement RSS support

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:19 -07:00
Bryan Whitehead
c9cf96bb5f lan743x: Add EEE support
Implement EEE support

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:19 -07:00
Bryan Whitehead
4d94282afd lan743x: Add power management support
Implement power management
Supports suspend, resume, and Wake on LAN

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:18 -07:00
Bryan Whitehead
695846047a lan743x: Add support for ethtool eeprom access
Implement ethtool eeprom access
Also provides access to OTP (One Time Programming)

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:18 -07:00
Bryan Whitehead
2958337d68 lan743x: Add support for ethtool message level
Implement ethtool message level

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:18 -07:00
Bryan Whitehead
8114e8a2f1 lan743x: Add support for ethtool statistics
Implement ethtool statistics

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:18 -07:00
Bryan Whitehead
63b92a91a4 lan743x: Add support for ethtool link settings
Use default link setting functions

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:18 -07:00
Bryan Whitehead
0cf632265d lan743x: Add support for ethtool get_drvinfo
Implement ethtool get_drvinfo

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 14:09:18 -07:00
Sergei Shtylyov
51459d4c76 sh_eth: make sh_eth_tsu_{read|write}_entry() prototypes symmetric
sh_eth_tsu_read_entry() is still asymmetric with sh_eth_tsu_write_entry()
WRT their prototypes -- make them symmetric by passing to the former a TSU
register offset instead of its address and also adding the (now necessary)
'ndev' parameter...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 12:34:50 -07:00
Sergei Shtylyov
7a54c867ba sh_eth: make sh_eth_tsu_write_entry() take 'offset' parameter
We can add the TSU register base address to a TSU register offset right
in sh_eth_tsu_write_entry(),  no need to do it in its callers...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 12:34:50 -07:00
Sergei Shtylyov
ecbecb0a90 sh_eth: call sh_eth_tsu_get_offset() from TSU register accessors
With sh_eth_tsu_get_offset() now actually returning TSU register's offset,
we  can at last use it in sh_eth_tsu_{read|write}(). Somehow this saves 248
bytes of object code with AArch64 gcc 4.8.5... :-)

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 12:34:50 -07:00
Sergei Shtylyov
41414f0a85 sh_eth: make sh_eth_tsu_get_offset() match its name
sh_eth_tsu_get_offset(), despite its name, returns a TSU register's address,
not its offset.  Make this  function match its name and return a register's
offset  from the TSU  registers base address instead.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 12:34:50 -07:00
Sergei Shtylyov
388c4bb4dc sh_eth: uninline sh_eth_tsu_get_offset()
sh_eth_tsu_get_offset() is called several  times  by the driver, remove
*inline* and move  that function  from the header to the driver  itself
to let gcc decide  whether to expand it inline or not...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 12:34:50 -07:00
Sakari Ailus
977d5ad39f ACPI: Convert ACPI reference args to generic fwnode reference args
Convert all users of struct acpi_reference_args to more generic
fwnode_reference_args. This will

 1) avoid an ACPI specific references to device nodes with integer
    arguments as well as

 2) allow making references to nodes other than device nodes in ACPI.

As a by-product, convert the fwnode interger arguments to u64. The
arguments were 64-bit integers on ACPI but the fwnode arguments were
just 32-bit.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-23 12:44:52 +02:00
YueHaibing
0a78c3803d net: mediatek: use dma_zalloc_coherent instead of allocator/memset
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22 20:51:40 -07:00
Randy Dunlap
c9ce1fa1c2 net: prevent ISA drivers from building on PPC32
Prevent drivers from building on PPC32 if they use isa_bus_to_virt(),
isa_virt_to_bus(), or isa_page_to_bus(), which are not available and
thus cause build errors.

../drivers/net/ethernet/3com/3c515.c: In function 'corkscrew_open':
../drivers/net/ethernet/3com/3c515.c:824:9: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]

../drivers/net/ethernet/amd/lance.c: In function 'lance_rx':
../drivers/net/ethernet/amd/lance.c:1203:23: error: implicit declaration of function 'isa_bus_to_virt'; did you mean 'bus_to_virt'? [-Werror=implicit-function-declaration]

../drivers/net/ethernet/amd/ni65.c: In function 'ni65_init_lance':
../drivers/net/ethernet/amd/ni65.c:585:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]

../drivers/net/ethernet/cirrus/cs89x0.c: In function 'net_open':
../drivers/net/ethernet/cirrus/cs89x0.c:897:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22 11:12:29 -07:00
Jakub Kicinski
07300f774f nfp: avoid buffer leak when FW communication fails
After device is stopped we reset the rings by moving all free buffers
to positions [0, cnt - 2], and clear the position cnt - 1 in the ring.
We then proceed to clear the read/write pointers.  This means that if
we try to reset the ring again the code will assume that the next to
fill buffer is at position 0 and swap it with cnt - 1.  Since we
previously cleared position cnt - 1 it will lead to leaking the first
buffer and leaving ring in a bad state.

This scenario can only happen if FW communication fails, in which case
the ring will never be used again, so the fact it's in a bad state will
not be noticed.  Buffer leak is the only problem.  Don't try to move
buffers in the ring if the read/write pointers indicate the ring was
never used or have already been reset.

nfp_net_clear_config_and_disable() is now fully idempotent.

Found by code inspection, FW communication failures are very rare,
and reconfiguring a live device is not common either, so it's unlikely
anyone has ever noticed the leak.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22 10:58:52 -07:00
Jakub Kicinski
042f882556 nfp: bring back support for offloading shared blocks
Now that we have offload replay infrastructure added by
commit 326367427c ("net: sched: call reoffload op on block callback reg")
and flows are guaranteed to be removed correctly, we can revert
commit 951a8ee6de ("nfp: reject binding to shared blocks").

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22 10:58:52 -07:00
John Hurley
b809ec869b nfp: flower: ensure dead neighbour entries are not offloaded
Previously only the neighbour state was checked to decide if an offloaded
entry should be removed. However, there can be situations when the entry
is dead but still marked as valid. This can lead to dead entries not
being removed from fw tables or even incorrect data being added.

Check the entry dead bit before deciding if it should be added to or
removed from fw neighbour tables.

Fixes: 8e6a9046b6 ("nfp: flower vxlan neighbour offload")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22 10:55:01 -07:00
Florian Westphal
6e56830776 atl1c: reserve min skb headroom
Got crash report with following backtrace:
BUG: unable to handle kernel paging request at ffff8801869daffe
RIP: 0010:[<ffffffff816429c4>]  [<ffffffff816429c4>] ip6_finish_output2+0x394/0x4c0
RSP: 0018:ffff880186c83a98  EFLAGS: 00010283
RAX: ffff8801869db00e ...
  [<ffffffff81644cdc>] ip6_finish_output+0x8c/0xf0
  [<ffffffff81644d97>] ip6_output+0x57/0x100
  [<ffffffff81643dc9>] ip6_forward+0x4b9/0x840
  [<ffffffff81645566>] ip6_rcv_finish+0x66/0xc0
  [<ffffffff81645db9>] ipv6_rcv+0x319/0x530
  [<ffffffff815892ac>] netif_receive_skb+0x1c/0x70
  [<ffffffffc0060bec>] atl1c_clean+0x1ec/0x310 [atl1c]
  ...

The bad access is in neigh_hh_output(), at skb->data - 16 (HH_DATA_MOD).
atl1c driver provided skb with no headroom, so 14 bytes (ethernet
header) got pulled, but then 16 are copied.

Reserve NET_SKB_PAD bytes headroom, like netdev_alloc_skb().

Compile tested only; I lack hardware.

Fixes: 7b70176421 ("atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22 10:28:26 -07:00
YueHaibing
4c30337349 libcxgb: replace vmalloc and memset with vzalloc
Use vzalloc instead of the vmalloc, memset combo

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 16:32:59 -07:00
YueHaibing
c1907e53ab net: hix5hd2_gmac: use dma_zalloc_coherent instead of allocator/memset
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 16:31:23 -07:00
Sudarsana Reddy Kalluru
25c020a909 qed: Correct Multicast API to reflect existence of 256 approximate buckets.
FW hsi contains 256 approximation buckets which are split in ramrod into
eight u32 values, but driver is using eight 'unsigned long' variables.

This patch fixes the mcast logic by making the API utilize u32.

Fixes: 83aeb933 ("qed*: Trivial modifications")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 16:19:04 -07:00
Sudarsana Reddy Kalluru
58874c7b24 qed: Fix possible race for the link state value.
There's a possible race where driver can read link status in mid-transition
and see that virtual-link is up yet speed is 0. Since in this
mid-transition we're guaranteed to see a mailbox from MFW soon, we can
afford to treat this as link down.

Fixes: cc875c2e ("qed: Add link support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 16:19:04 -07:00
Sudarsana Reddy Kalluru
4ad95a93a7 qed: Fix link flap issue due to mismatching EEE capabilities.
Apparently, MFW publishes EEE capabilities even for Fiber-boards that don't
support them, and later since qed internally sets adv_caps it would cause
link-flap avoidance (LFA) to fail when driver would initiate the link.
This in turn delays the link, causing traffic to fail.

Driver has been modified to not to ask MFW for any EEE config if EEE isn't
to be enabled.

Fixes: 645874e5 ("qed: Add support for Energy efficient ethernet.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 16:19:04 -07:00
David S. Miller
a6fc8594a5 mlx5-fixes-2018-07-18
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbT+cLAAoJEEg/ir3gV/o+I5QH/3LQemGzH33iNsg4khpPeNA+
 Q4mGd2jqbwfL17FTSGpTsPje6rpwzR+j8W1fGTx1vzYmE79ZyDu4EwHS7YZJcGyz
 q8P0HgrUe4NrJV8mlOpbIRbTuSwfqultw2qRpmCfLf5kK1nqSIPpUHIfBUMqwy0o
 O7GJrytUI4Av+r5Px/6bjb5kBaVe5YBe0tg8nSrN2vtzHVQWm+5/uaNRW2SrCN+4
 5SI2AsWyMwfGCC+IE8i9OlIFCy6Iu2vwcUabK+6EeGKP4Wb6rukyG01TkQPSd7gy
 ozcAjvj+ppHmVFath1uzLCFU3RbKt6GbVRGaFQg5jO5vvK3uzFJnm59Vqw/WzNs=
 =UXsy
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2018-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-07-18

The following series provides fixes to mlx5 core and net device driver.

Please pull and let me know if there's any problem.

For -stable v4.7
    net/mlx5e: Don't allow aRFS for encapsulated packets
    net/mlx5e: Fix quota counting in aRFS expire flow

For -stable v4.15
    net/mlx5e: Only allow offloading decap egress (egdev) flows
    net/mlx5e: Refine ets validation function
    net/mlx5: Adjust clock overflow work period

For -stable v4.17
    net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:18:28 -07:00
Jian Shen
d71d8381c5 net: hns3: Add SPDX tags to HNS3 PF driver
Add the SPDX identifiers to HNS3 PF driver.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
584b464f83 net: hns3: Remove unused struct member and definition
The struct hclge_desc_cb and hclge_desc_cb are never used in
anywhere. This patch removes them.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
ef0c500961 net: hns3: Fix misleading parameter name
The input parameter "dev" of hns3_irq_handle() is indeed
used as a tqp vector, it is misleadin.

The struct member "flag" is used to indicate ring type,
so rename it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
c79301d8d9 net: hns3: Modify inconsistent bit mask macros
Use BIT() and GENMASK() to convert the bit mask, modify
the inconsistent ones, and remove useless ones.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
f8a91784a1 net: hns3: Use decimal for bit offset macros
Using hex for bit offsets is inconsistent with the rest
of the file. Change them to decimal.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
fdace1bc4a net: hns3: Correct unreasonable code comments
This patch fixes some comment spelling errors, removes
redundant comments, rewrites misleading comments, and
adds some necessary comments.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
a10829c4ae net: hns3: Remove extra space and brackets
Remove extra space and brackets.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
3f639907e0 net: hns3: Standardize the handle of return value
Apply the standard minor cleanup by returning ret outside
the brackets.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
Jian Shen
646cb51228 net: hns3: Remove some redundant assignments
Remove some redundant assignments, because they have
been set to zero when allocate hdev.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 08:44:23 -07:00
David S. Miller
eae249b27f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-07-20

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add sharing of BPF objects within one ASIC: this allows for reuse of
   the same program on multiple ports of a device, and therefore gains
   better code store utilization. On top of that, this now also enables
   sharing of maps between programs attached to different ports of a
   device, from Jakub.

2) Cleanup in libbpf and bpftool's Makefile to reduce unneeded feature
   detections and unused variable exports, also from Jakub.

3) First batch of RCU annotation fixes in prog array handling, i.e.
   there are several __rcu markers which are not correct as well as
   some of the RCU handling, from Roman.

4) Two fixes in BPF sample files related to checking of the prog_cnt
   upper limit from sample loader, from Dan.

5) Minor cleanup in sockmap to remove a set but not used variable,
   from Colin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 23:58:30 -07:00
David S. Miller
c4c5551df1 Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 21:17:12 -07:00
Sudarsana Reddy Kalluru
97df0d6562 qede: Add driver callbacks for eeprom module query.
This patch implements the ethtool callbacks for querying sfp/eeprom module.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 23:35:37 -07:00
Sudarsana Reddy Kalluru
b51dab46c6 qed: Add qed APIs for PHY module query.
This patch adds qed APIs for reading the PHY module.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 23:35:37 -07:00
Zhao Chen
f7482683f1 net-next/hinic: fix a problem in hinic_xmit_frame()
The calculation of "wqe_size" is not correct when the tx queue is busy in
hinic_xmit_frame().

When there are no free WQEs, the tx flow will unmap the skb buffer, then
ring the doobell for the pending packets. But the "wqe_size" which used
to calculate the doorbell address is not correct. The wqe size should be
cleared to 0, otherwise, it will cause a doorbell error.

This patch fixes the problem.

Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 23:27:32 -07:00
Linus Torvalds
024ddc0ce1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Lots of fixes, here goes:

   1) NULL deref in qtnfmac, from Gustavo A. R. Silva.

   2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.

   3) Lost completion messages in AF_XDP, from Magnus Karlsson.

   4) Correct bogus self-assignment in rhashtable, from Rishabh
      Bhatnagar.

   5) Fix regression in ipv6 route append handling, from David Ahern.

   6) Fix masking in __set_phy_supported(), from Heiner Kallweit.

   7) Missing module owner set in x_tables icmp, from Florian Westphal.

   8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.

   9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.

  10) Fix NULL deref when using chains in act_csum, from Davide Caratti.

  11) XDP_REDIRECT needs to check if the interface is up and whether the
      MTU is sufficient. From Toshiaki Makita.

  12) Net diag can do a double free when killing TCP_NEW_SYN_RECV
      connections, from Lorenzo Colitti.

  13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a
      full minute, delaying device unregister. From Eric Dumazet.

  14) Update MAC entries in the correct order in ixgbe, from Alexander
      Duyck.

  15) Don't leave partial mangles bpf program in jit_subprogs, from
      Daniel Borkmann.

  16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.

  17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.

  18) Use after free in tun XDP_TX, from Toshiaki Makita.

  19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.

  20) Don't reuse remainder of RX page when XDP is set in mlx4, from
      Saeed Mahameed.

  21) Fix window probe handling of TCP rapair sockets, from Stefan
      Baranoff.

  22) Missing socket locking in smc_ioctl(), from Ursula Braun.

  23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.

  24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.

  25) Two spots in ipv6 do a rol32() on a hash value but ignore the
      result. Fixes from Colin Ian King"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
  tcp: identify cryptic messages as TCP seq # bugs
  ptp: fix missing break in switch
  hv_netvsc: Fix napi reschedule while receive completion is busy
  MAINTAINERS: Drop inactive Vitaly Bordug's email
  net: cavium: Add fine-granular dependencies on PCI
  net: qca_spi: Fix log level if probe fails
  net: qca_spi: Make sure the QCA7000 reset is triggered
  net: qca_spi: Avoid packet drop during initial sync
  ipv6: fix useless rol32 call on hash
  ipv6: sr: fix useless rol32 call on hash
  net: sched: Using NULL instead of plain integer
  net: usb: asix: replace mii_nway_restart in resume path
  net: cxgb3_main: fix potential Spectre v1
  lib/rhashtable: consider param->min_size when setting initial table size
  net/smc: reset recv timeout after clc handshake
  net/smc: add error handling for get_user()
  net/smc: optimize consumer cursor updates
  net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
  ipv6: ila: select CONFIG_DST_CACHE
  net: usb: rtl8150: demote allmulti message to dev_dbg()
  ...
2018-07-18 19:32:54 -07:00
Roi Dayan
7e29392eee net/mlx5e: Only allow offloading decap egress (egdev) flows
We get egress rules through the egdev mechanism when the ingress device
is not supporting offload, with the expected use-case of tunnel decap
ingress rule set on shared tunnel device.

Make sure to offload egress/egdev rules only if decap action (tunnel key
unset) exists there and err otherwise.

Fixes: 717503b9cf ("net: sched: convert cls_flower->egress_dev users to tc_setup_cb_egdev infra")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:59 -07:00
Tariq Toukan
d7037ad73d net/mlx5: Fix QP fragmented buffer allocation
Fix bad alignment of SQ buffer in fragmented QP allocation.
It should start directly after RQ buffer ends.

Take special care of the end case where the RQ buffer does not occupy
a whole page. RQ size is a power of two, so would be the case only for
small RQ sizes (RQ size < PAGE_SIZE).

Fix wrong assignments for sqb->size (mistakenly assigned RQ size),
and for npages value of RQ and SQ.

Fixes: 3a2f703312 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:58 -07:00
Raed Salem
8c49f54a3f net/mlx5: Fix 'DON'T_TRAP' functionality
The flow counters binding support commit introduced a code change where
none NULL 'rule_dest' is always passed to mlx5_add_flow_rules, this breaks
'DON'T_TRAP' rules insertion.

The fix uses the equivalent 'dest_num' value instead of dest pointer
at the failed check.

fixes: 3b3233fbf0 ('IB/mlx5: Add flow counters binding support')
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:57 -07:00
Saeed Mahameed
443a858158 net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode
With debug kernel UBSAN detects the following issue, which might happen
when eswitch instance is not created, fix this by testing the eswitch
pointer before returning the eswitch mode, if not set return mode =
SRIOV_NONE.

[   32.528951] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/eswitch.c:2219:12
[   32.528951] member access within null pointer of type 'struct mlx5_eswitch'
[   32.528951] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc3-dirty #181
[   32.528951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[   32.528951] Call Trace:
[   32.528951]  dump_stack+0xc7/0x13b
[   32.528951]  ? show_regs_print_info+0x5/0x5
[   32.528951]  ? __pm_runtime_use_autosuspend+0x140/0x140
[   32.528951]  ubsan_epilogue+0x9/0x49
[   32.528951]  ubsan_type_mismatch_common+0x1f9/0x2c0
[   32.528951]  ? ucs2_as_utf8+0x310/0x310
[   32.528951]  ? device_initialize+0x229/0x2e0
[   32.528951]  __ubsan_handle_type_mismatch+0x9f/0xc9
[   32.528951]  ? __ubsan_handle_divrem_overflow+0x19b/0x19b
[   32.578008]  ? ib_device_get_by_index+0xf0/0xf0
[   32.578008]  mlx5_eswitch_mode+0x30/0x40
[   32.578008]  mlx5_ib_add+0x1e0/0x4a0

Fixes: 57cbd893c4 ("net/mlx5: E-Switch, Move representors definition to a global scope")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-18 18:16:57 -07:00
Eran Ben Elisha
d2e1c57bcf net/mlx5e: Don't allow aRFS for encapsulated packets
Driver is yet to support aRFS for encapsulated packets, return early
error in such case.

Fixes: 18c908e477 ("net/mlx5e: Add accelerated RFS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:56 -07:00
Eran Ben Elisha
2630bae801 net/mlx5e: Fix quota counting in aRFS expire flow
Quota should follow the amount of rules which do expire, and not the
number of rules that were examined, fixed that.

Fixes: 18c908e477 ("net/mlx5e: Add accelerated RFS support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:55 -07:00
Ariel Levkovich
33180bee86 net/mlx5: Adjust clock overflow work period
When driver converts HW timestamp to wall clock time it subtracts
the last saved cycle counter from the HW timestamp and converts the
difference to nanoseconds.
The conversion is done by multiplying the cycles difference with the
clock multiplier value as a first step and therefore the cycles
difference should be small enough so that the multiplication product
doesn't exceed 64bit.

The overflow handling routine is in charge of updating the last saved
cycle counter in driver and it is called periodically using kernel
delayed workqueue.

The delay period for this work is calculated using the max HW cycle
counter value (a 41 bit mask) as a base which doesn't take the 64bit
limit into account so the delay period may be incorrect and too
long to prevent a large difference between the HW counter and the last
saved counter in SW.

This change adjusts the work period for the HW clock overflow work by
taking the minimum between the previous value and the quotient of max
u64 value and the clock multiplier value.

Fixes: ef9814deaf ("net/mlx5e: Add HW timestamping (TS) support")
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:55 -07:00
Shay Agroskin
e279d634f3 net/mlx5e: Refine ets validation function
Removed an error message received when configuring ETS total
bandwidth to be zero.
Our hardware doesn't support such configuration, so we shall
reject it in the driver. Nevertheless, we removed the error message
in order to eliminate error messages caused by old userspace tools
who try to pass such configuration.

Fixes: ff0891915c ("net/mlx5e: Fix ETS BW check")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 18:16:54 -07:00
Alexander Sverdlin
e40562abdf net: cavium: Add fine-granular dependencies on PCI
Add dependencies on PCI where necessary.

Fixes: 7e2bc7fb65 ("net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:21:33 -07:00
Stefan Wahren
5097399326 net: qca_spi: Fix log level if probe fails
In cases the probing fails the log level of the messages should
be an error.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Stefan Wahren
711c62dfa6 net: qca_spi: Make sure the QCA7000 reset is triggered
In case the SPI thread is not running, a simple reset of sync
state won't fix the transmit timeout. We also need to wake up the kernel
thread.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: ed7d42e24e ("net: qca_spi: fix transmit queue timeout handling")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Stefan Wahren
b2bab426dc net: qca_spi: Avoid packet drop during initial sync
As long as the synchronization with the QCA7000 isn't finished, we
cannot accept packets from the upper layers. So let the SPI thread
enable the TX queue after sync and avoid unwanted packet drop.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 15:19:08 -07:00
Eran Ben Elisha
048f31437a net/mlx5: Fix tristate and description for MLX5 module
Current description did not include new devices. Fix that by proving the
correct generic description.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 14:33:25 -07:00
Roi Dayan
d34c6efc59 net/mlx5: Use ERR_CAST() instead of coding it
This makes it more readable that rule is being used to return an err.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 14:33:25 -07:00
Noa Osherovich
0f4039104e net/mlx5: Add missing SET_DRIVER_VERSION command translation
When translating command opcodes to a string, SET_DRIVER_VERSION
command was missing.

Fixes: 42ca502e17 ('net/mlx5_core: Use a macro in mlx5_command_str()')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 14:33:25 -07:00
Max Gurtovoy
a451733909 net/mlx5: Add XRQ commands definitions
Update mlx5 command list and error return function to handle XRQ
commands.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 14:33:25 -07:00
Jianbo Liu
8da6fe2a18 net/mlx5: Add core support for double vlan push/pop steering action
As newer firmware supports double push/pop in a single FTE, we add
core bits and extend vlan action logic for it.

Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 14:33:25 -07:00
Florian Fainelli
7f7b757455 net: ethernet: broadcom: Drop dependency on OF
Both BCMGENET and SYSTEMPORT build just fine with CONFIG_OF=n, we do have a
dependency on HAS_IOMEM that was not being reflected for SYSTEMPORT so add
that.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:53:48 -07:00
Arnd Bergmann
74525cc5f5 net: cavium: add missing PCI dependencies
While some of the cavium drivers don't require PCI support, most
others do, as shown by these build failures:

WARNING: unmet direct dependencies detected for MDIO_THUNDER
  Depends on [n]: NETDEVICES [=y] && MDIO_BUS [=y] && 64BIT [=y] && PCI [=n]
  Selected by [y]:
  - THUNDER_NIC_BGX [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CAVIUM [=y] && 64BIT [=y]
  - THUNDER_NIC_RGX [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CAVIUM [=y] && 64BIT [=y]
drivers/net/ethernet/cavium/thunder/nicvf_main.c: In function 'nicvf_set_irq_affinity':
drivers/net/ethernet/cavium/thunder/nicvf_main.c:1095:25: error: implicit declaration of function 'pci_irq_vector'; did you mean 'rcu_irq_enter'? [-Werror=implicit-function-declaration]
drivers/net/ethernet/cavium/thunder/nic_main.c: In function 'nic_mbx_intr_handler':
drivers/net/ethernet/cavium/thunder/nic_main.c:1135:13: error: implicit declaration of function 'pci_irq_vector'; did you mean 'rcu_irq_enter'? [-Werror=implicit-function-declaration]
In file included from drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c:27:
drivers/net/ethernet/cavium/liquidio/octeon_main.h: In function 'octeon_unmap_pci_barx':
drivers/net/ethernet/cavium/liquidio/octeon_main.h:97:3: error: implicit declaration of function 'pci_release_region'; did you mean 'pci_release_regions'? [-Werror=implicit-function-declaration]
drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c: In function 'octeon_mbox_process_cmd':
drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c:263:3: error: implicit declaration of function 'pcie_capability_set_word'; did you mean 'has_capability_noaudit'? [-Werror=implicit-function-declaration]
drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c: In function 'setup_cn23xx_octeon_pf_device':
drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c:1315:22: error: 'data32' is used uninitialized in this function [-Werror=uninitialized]
drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c: In function 'cn23xx_dump_vf_iq_regs':
include/linux/dynamic_debug.h:135:3: error: 'regval' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/ethernet/cavium/liquidio/lio_core.c: In function 'octeon_setup_interrupt':
drivers/net/ethernet/cavium/liquidio/lio_core.c:1067:17: error: invalid application of 'sizeof' to incomplete type 'struct msix_entry'
drivers/net/ethernet/cavium/liquidio/octeon_main.h: In function 'octeon_unmap_pci_barx':
drivers/net/ethernet/cavium/liquidio/octeon_main.h:97:3: error: implicit declaration of function 'pci_release_region'; did you mean 'pci_release_regions'? [-Werror=implicit-function-declaration]

This adds back the minimum set of dependencies to get everything to
build cleanly again, but leaving the ones that build cleanly.

Fixes: 7e2bc7fb65 ("net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:47:40 -07:00
YueHaibing
bf20a5c1d5 liquidio: Using NULL instead of plain integer
Fixes the following sparse warnings:

drivers/net/ethernet/cavium/liquidio/lio_main.c:3068:23: warning:
 Using plain integer as NULL pointer
drivers/net/ethernet/cavium/liquidio/lio_main.c:2909:23: warning:
 Using plain integer as NULL pointer
drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c:385:27: warning:
 Using plain integer as NULL pointer

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:40:58 -07:00
Arnd Bergmann
6060d9d24e net/mlx5: fix an unused-function warning
These dummy helpers are all intended to be inline functions,
but one of them by accident came without the 'inline' keyword,
causing a harmless warning:

In file included from drivers/net/ethernet/mellanox/mlx5/core/main.c:63:
drivers/net/ethernet/mellanox/mlx5/core/accel/tls.h:79:1: error: 'mlx5_accel_tls_add_flow' defined but not used [-Werror=unused-function]
 mlx5_accel_tls_add_flow(struct mlx5_core_dev *mdev, void *flow,

Fixes: ab412e1dd7 ("net/mlx5: Accel, add TLS rx offload routines")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:38:54 -07:00
Gustavo A. R. Silva
676bcfece1 net: cxgb3_main: fix potential Spectre v1
t.qset_idx can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl()
warn: potential spectre issue 'adapter->msix_info'

Fix this by sanitizing t.qset_idx before using it to index
adapter->msix_info

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:31:15 -07:00
Gustavo A. R. Silva
e146471f58 net: mvpp2: debugfs: fix incorrect bitwise operator
The use of the | operator always leads to true, which looks rather
suspect in this case.

Fix this by using & instead.

Addresses-Coverity-ID: 1471903 ("Wrong operator used")
Fixes: dba1d918da ("net: mvpp2: debugfs: add entries for classifier flows")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 10:55:42 -07:00
Ganesh Goudar
fa145d5dfd cxgb4: display number of rx and tx pages free
display free rx and tx page count in the meminfo of
an adapter.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 10:52:46 -07:00
Jiri Pirko
c3ab435466 mlxsw: spectrum: Extend to support Spectrum-2 ASIC
Extend existing driver for Spectrum ASIC to support Spectrum-2 ASIC.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Jiri Pirko
9912e6b8c2 mlxsw: spectrum_acl: Add initial Spectrum-2 ACL implementation
Utilize only C-TCAM for now. Do very minimal A-TCAM initialization in
order to make C-TCAM work.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Ido Schimmel
a6b9c87daf mlxsw: spectrum_acl: Add region association callback
In Spectrum-2, ACL regions that use 8 or 12 key blocks require several
consecutive hardware regions.

In order to allow defragmentation, the device stores a mapping from a
logical region ID to an hardware region ID, which is similar to the page
table that is used to translate virtual addresses to physical addresses.

Add the region association callback to the region create sequence and
implement it as a NOP in Spectrum which does not require it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Ido Schimmel
7a921a1e58 mlxsw: spectrum_acl: Add support for Spectrum-2 block encoding
Encode each flexible key block in the general block scheme according its
block index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Ido Schimmel
a6d70a878e mlxsw: spectrum_acl: Prepare for Spectrum-2 block encoding
In Spectrum the key (and mask) block layout is very straight forward and
every block is 16 bytes aligned.

However, in Spectrum-2 the blocks are not even byte aligned, which makes
it difficult to encode them using current method.

Instead, first encode each block and then encode the block in the
general blocks layout.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Ido Schimmel
7050f439ef mlxsw: reg: Add Policy-Engine General Configuration Register
The PGCR register configures general Policy-Engine settings.

Specifically, we are going to use it in order to set the default action
base pointer, which determines where the default action (when there is
no hit) is located for each region.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Ido Schimmel
f1c7d9cce2 mlxsw: reg: Add Policy-Engine Region eRP Register
The PERERP register configures the region eRPs. It can be used, for
example, to enable lookup in the C-TCAM in addition to the A-TCAM.

To be able to perform a lookup in the C-TCAM we need to "use" the eRP
table. This is done by marking the pointer as valid, but zeroing the eRP
table vector.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:14 +09:00
Ido Schimmel
481662a8a3 mlxsw: reg: Add Policy-Engine Region Configuration Register
The PERCR register configures the region parameters such as whether to
consult the bloom filter before performing a lookup using a specific
eRP.

For C-TCAM only usage we don't need to accurately set the master mask.
Instead, we can set all of its bits to make sure all the extracted keys
are actually used.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
3390787b61 mlxsw: reg: Add Policy-Engine Region Association Register
The PERAR register is used to associate a hw region for region_id's.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
0f27e80aea mlxsw: acl: Introduce activity get operation for action block/set
In Spectrum-2, activity cannot be find out by TCAM rule (PTCEv2 register),
but rather by associated action set. For that purpose, extend action ops
to allow query activity from PEFA register. Block activity is decided
according to activity of the first set.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
2d186ed4dd mlxsw: reg: Add support for activity information from PEFA register
In Spectrum-2, the PEFA register is extend to report if the action set
was hit during processing of packets. Introduce this extension and
adjust the code around this accordingly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
dcdf01028e mlxsw: spectrum: Introduce flex key blocks for Spectrum-2
Introduce key blocks for Spectrum-2 that contains the same elements used
already for Spectrum1. Along with that, introduce encoder stub.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
d55ece4b6e mlxsw: spectrum: Add Spectrum-2 variant of flex actions ops
In Spectrum-2, no action set is stored directly in TCAM, all are located
in KVD linear. So ask core to treat the first set as dummy empty one,
to be just used for PTCEV2 purposes.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
18ce0e4e66 mlxsw: spectrum_mr_tcam: Add Spectrum-2 stubs
Add dummy ops for now. The ops are going to be implemented later on.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
742f75a600 mlxsw: spectrum: Add KVDL manager implementation for Spectrum-2
In Spectrum-2, KVD linear indexes are hashed into KVD hash. Therefore it
is possible for multiple resource types to use same indexes. There are
multiple index spaces. Also, the index space is bigger than the actual
KVD hash area, which allows to have holes in the index space without any
penalization. The HW has to be told in case the index for particular
resource type is no longer used so it can be freed from KVD hash. IEDR
register is used for that.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Jiri Pirko
c33d0cb192 mlxsw: reg: Add Infrastructure Entry Delete Register
The IEDR register is used for deleting entries from the entry tables.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-19 02:13:13 +09:00
Olof Johansson
380d685923 This is the pxa changes for 4.19 cycle :
- the pxa architecture is ported to dma slavemap
  - some minor AC97 fixes
  - some minor board fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEExgkueSa0u8n4Ls+HA/Z63R24yxIFAltM+OUXHHJvYmVydC5q
 YXJ6bWlrQGZyZWUuZnIACgkQA/Z63R24yxJZmw/8Cg/SXNLgShgoHAhDBth/pOc7
 rq4iZwRFi4d2SQowQAj+booRrEuUePnvnpjbPby4eY/nNsfEC+KsZLDJcdVjeqsN
 HG+V/0SQ6tTFzCfDLR9DS75DEyCQY5kjJmAhcWRQV2QKLbXHP2tnRbqvwNS++ltT
 xZaKwULvC6uVw8DveVyBmWz5aXCSwF9LjFdY2FY1fnfSd76phtjuaLiVF9akH+6C
 x8D+piuloMhxM8w3rX3jZu3RO/gaG+57gYBaVNM0soHXz73zRnf/qvn4bbNmcLaK
 K2m5slB+S5/QFLWBZG4uZFRF0eZ2aUtNLGjRsT3HFfP6iitc35FAIhdj76vvTk4U
 XDoD89Sjzi/jprvlnHkJESt16PZUP1F2jbrS8aTE7Ma9OMv0BogWMebEcJ0TpzU2
 d5SCvGdD+ZOhwUTrmPdLSYjnAjjJqxQMT9TAtA7oYCWJs9aEiIp02mxXQx6ddRCH
 6C5KOY8qC03x6kWqy1lEb7ySbwldcHprf1ZGvVI1VizDdWkjblhdvstJW10elIfd
 7lnUmyE0Dy7AsJclR91NEfE1p+w6HZ9z8SP9EH5/54g2/pLr8KEsEvXkPBks0Twj
 znDFx3ystqK26XHme8LBxtnQbY+ghl5XdERbeVHFBourCdjKol94asa6Yk9yET1h
 s31jjWQ1+ib9XokiWJY=
 =dvhR
 -----END PGP SIGNATURE-----

Merge tag 'pxa-for-4.19-v2' of https://github.com/rjarzmik/linux into next/soc

This is the pxa changes for 4.19 cycle :
 - the pxa architecture is ported to dma slavemap
 - some minor AC97 fixes
 - some minor board fixes

* tag 'pxa-for-4.19-v2' of https://github.com/rjarzmik/linux:
  net: smc91x: remove the dmaengine compat need
  net: smc911x: remove the dmaengine compat need
  ARM: pxa: zylonite: use the new ac97 bus support
  ARM: pxa: add the missing AC97 clocks
  ARM: pxa: mioa701 convert to the new AC97 bus
  ARM: pxa: hx4700: fix the usb client
  ARM: pxa: change SSP DMA channels allocation
  ARM: pxa: remove the DMA IO resources
  dmaengine: pxa: document pxad_param
  ata: pata_pxa: remove the dmaengine compat need
  mtd: rawnand: marvell: remove the dmaengine compat need
  media: pxa_camera: remove the dmaengine compat need
  mmc: pxamci: remove the dmaengine compat need
  dmaengine: pxa: add a default requestor policy
  ARM: pxa: add dma slave map
  dmaengine: pxa: use a dma slave map

Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-18 07:50:50 -07:00
Jakub Kicinski
b5faa20d6f nfp: bpf: allow program sharing within ASIC
Allow program sharing between netdevs of the same NFP ASIC.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 15:10:34 +02:00
Jakub Kicinski
602144c224 bpf: offload: keep the offload state per-ASIC
Create a higher-level entity to represent a device/ASIC to allow
programs and maps to be shared between device ports.  The extra
work is required to make sure we don't destroy BPF objects as
soon as the netdev for which they were loaded gets destroyed,
as other ports may still be using them.  When netdev goes away
all of its BPF objects will be moved to other netdevs of the
device, and only destroyed when last netdev is unregistered.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 15:10:34 +02:00
Jakub Kicinski
9fd7c55591 bpf: offload: aggregate offloads per-device
Currently we have two lists of offloaded objects - programs and maps.
Netdevice unregister notifier scans those lists to orphan objects
associated with device being unregistered.  This puts unnecessary
(even if negligible) burden on all netdev unregister calls in BPF-
-enabled kernel.  The lists of objects may potentially get long
making the linear scan even more problematic.  There haven't been
complaints about this mechanisms so far, but it is suboptimal.

Instead of relying on notifiers, make the few BPF-capable drivers
register explicitly for BPF offloads.  The programs and maps will
now be collected per-device not on a global list, and only scanned
for removal when driver unregisters from BPF offloads.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 15:10:34 +02:00
Jakub Kicinski
4612bebfa3 nfp: add .ndo_init() and .ndo_uninit() callbacks
BPF code should unregister the offload capabilities from .ndo_uninit(),
to make sure the operation is atomic with unlist_netdevice().  Plumb
the init/uninit NDOs for vNICs and representors.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 15:10:34 +02:00
Niklas Söderlund
e49b42faae ravb: fix byte order for TX descriptor tag field lower bits
The wrong helper is used to swap the bytes when adding the lower bits of
the TX descriptors tag field in the shared ds_tagl variable. The
variable contains the DS[11:0] field and then the TAG[3:0] bits.

The mistake was highlighted by the sparse warning:

ravb_main.c:1622:31:    left side has type restricted __le16
ravb_main.c:1622:31:    right side has type unsigned short
ravb_main.c:1622:31: warning: invalid assignment: |=
ravb_main.c:1622:34: warning: cast to restricted __le16

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 14:21:49 +09:00
Niklas Söderlund
49f3303a6e ravb: fix warning about memcpy length
This fixes sparse warning:

ravb_main.c:1257 ravb_get_strings() error: memcpy() '*ravb_gstrings_stats' too small (32 vs 960)

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 14:21:49 +09:00
Niklas Söderlund
c94f2fc4fc ravb: fix shadowing of symbol 'stats' in ravb_get_ethtool_stats()
Inside a loop in ravb_get_ethtool_stats() a variable 'stats' is declared
resulting in the argument also named 'stats' to be shadowed. Fix this
warning by renaming the unused argument 'stats' to 'estats'.

This fixes the sparse warning:

ravb_main.c:1225:36: originally declared here
ravb_main.c:1233:41: warning: symbol 'stats' shadows an earlier one

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 14:21:49 +09:00
Alexander Sverdlin
4aac0b4347 octeon_mgmt: Fix MIX registers configuration on MTU setup
octeon_mgmt driver doesn't drop RX frames that are 1-4 bytes bigger than
MTU set for the corresponding interface. The problem is in the
AGL_GMX_RX0/1_FRM_MAX register setting, which should not account for VLAN
tagging.

According to Octeon HW manual:
"For tagged frames, MAX increases by four bytes for each VLAN found up to a
maximum of two VLANs, or MAX + 8 bytes."

OCTEON_FRAME_HEADER_LEN "define" is fine for ring buffer management, but
should not be used for AGL_GMX_RX0/1_FRM_MAX.

The problem could be easily reproduced using "ping" command. If affected
system has default MTU 1500, other host (having MTU >= 1504) can
successfully "ping" the affected system with payload size 1473-1476,
resulting in IP packets of size 1501-1504 accepted by the mgmt driver.
Fixed system still accepts IP packets of 1500 bytes even with VLAN tagging,
because the limits are lifted in HW as expected, for every VLAN tag.

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 13:13:07 +09:00
Heiner Kallweit
07df5bd874 r8169: power down chip in probe
The removed code would be called in two situations:
1. interface is brought up never or >10s after driver load
2. after close()

Case 1 we can handle cleaner by ensuring chip is powered down when
leaving probe(). open() callback will power up the chip.

In case 2 we call rtl_pll_power_down() twice currently, from the
close() callback and 10s later when entering runtime-suspend.
This is avoided by this patch.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 10:03:35 +09:00
Heiner Kallweit
29a12b4953 r8169: don't read chip phy status register
Instead of accessing the PHYstatus register we can use the information
phylib stores in the phy_device structure.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
f7ffa9ae2b r8169: remove mii_if_info member from struct rtl8169_private
The only remaining usage of the struct mii_if_info member is to store the
information whether the chip is GMII-capable. So we can replace it with
a simple flag.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
a2965f12fd r8169: remove rtl8169_set_speed_xmii
We can remove rtl8169_set_speed_xmii() now that phylib handles all this.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
5b7ad4b75d r8169: use phy_speed_down / phy_speed_up
Use new phylib functions phy_speed_down() and phy_speed_up().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
69b3c59fe2 r8169: use phy_mii_ioctl
Switch to using phy_mii_ioctl().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
dd84957eee r8169: use phy_ethtool_nway_reset
Switch to using phy_ethtool_nway_reset().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
4577243392 r8169: use phy_ethtool_(g|s)et_link_ksettings
Use phy_ethtool_(g|s)et_link_ksettings() for the respective ethtool_ops
callbacks.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
f75222bce9 r8169: replace open-coded PHY soft reset with genphy_soft_reset
Use genphy_soft_reset() instead of open-coding a PHY soft reset. We have
to do an explicit PHY soft reset because some chips use the genphy driver
which uses a no-op as soft_reset callback.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
242cd9b586 r8169: use phy_resume/phy_suspend
Use phy_resume() / phy_suspend() instead of open coding this functionality.
The chip version specific differences are handled by the respective PHY
drivers.

The call to r8168_phy_power_down() in r8168_pll_power_down() can be
removed because phylib takes care now. The relevant scenarios are:
- rtl8169_close(): phy_disconnect() powers down PHY
- suspend: mdio_bus_phy_suspend() takes care
- runtime-suspend: WoL is active, don't suspend PHY
- rtl_shutdown(): no need to power down PHY

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Heiner Kallweit
f1e911d5d0 r8169: add basic phylib support
Add basic phylib support to r8169. All now unneeded old PHY handling code
will be removed in subsequent patches.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-18 09:46:32 +09:00
Rick Farrington
fcaccc8293 liquidio: correct error msg text when removing VLAN ID
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 22:55:29 -07:00
Surendra Mobiya
1eb94d441f cxgb4: collect ASIC LA dumps from ULP TX
Signed-off-by: Surendra Mobiya <surendra@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:46:16 -07:00
Sanjeev Bansal
3a498606bb tg3: Add higher cpu clock for 5762.
This patch has fix for TX timeout while running bi-directional
traffic with 100 Mbps using 5762.

Signed-off-by: Sanjeev Bansal <sanjeevb.bansal@broadcom.com>
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:42:11 -07:00
Siva Reddy Kallam
0f2605fbaf tg3: Update copyright
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:42:10 -07:00
John Allen
3578a7ecb6 ibmvnic: Fix error recovery on login failure
Testing has uncovered a failure case that is not handled properly. In the
event that a login fails and we are not able to recover on the spot, we
return 0 from do_reset, preventing any error recovery code from being
triggered.  Additionally, the state is set to "probed" meaning that when we
are able to trigger the error recovery, the driver always comes up in the
probed state. To handle the case properly, we need to return a failure code
here and set the adapter state to the state that we entered the reset in
indicating the state that we would like to come out of the recovery reset
in.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:39:47 -07:00
Saeed Mahameed
432e629e56 net/mlx4_en: Don't reuse RX page when XDP is set
When a new rx packet arrives, the rx path will decide whether to reuse
the remainder of the page or not according to one of the below conditions:
1. frag_info->frag_stride == PAGE_SIZE / 2
2. frags->page_offset + frag_info->frag_size > PAGE_SIZE;

The first condition is no met for when XDP is set.
For XDP, page_offset is always set to priv->rx_headroom which is
XDP_PACKET_HEADROOM and frag_info->frag_size is around mtu size + some
padding, still the 2nd release condition will hold since
XDP_PACKET_HEADROOM + 1536 < PAGE_SIZE, as a result the page will not
be released and will be _wrongly_ reused for next free rx descriptor.

In XDP there is an assumption to have a page per packet and reuse can
break such assumption and might cause packet data corruptions.

Fix this by adding an extra condition (!priv->rx_headroom) to the 2nd
case to avoid page reuse when XDP is set, since rx_headroom is set to 0
for non XDP setup and set to XDP_PACKET_HEADROOM for XDP setup.

No additional cache line is required for the new condition.

Fixes: 34db548bfb ("mlx4: add page recycling in receive path")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Martin KaFai Lau <kafai@fb.com>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:05:25 -07:00
Jiri Pirko
1222d15a01 mlxsw: spectrum: Expose counters for various packet sizes
Expose counters ASIC has in the group of RFC 2819 counters that count
number of packets within specific size range.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:04:42 -07:00
Rick Farrington
ac13d6d8ea liquidio: fix hang when re-binding VF host drv after running DPDK VF driver
When configuring SLI_PKTn_OUTPUT_CONTROL, VF driver was assuming that IPTR
mode was disabled by reset, which was not true.  Since DPDK driver had
set IPTR mode previously, the VF driver (which uses buf-ptr-only mode) was
not properly handling DROQ packets (i.e. it saw zero-length packets).

This represented an invalid hardware configuration which the driver could
not handle.

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:02:00 -07:00
Thomas Falcon
2d14d37952 ibmvnic: Revise RX/TX queue error messages
During a device failover, there may be latency between the loss
of the current backing device and a notification from firmware that
a failover has occurred. This latency can result in a large amount of
error printouts as firmware returns outgoing traffic with a generic
error code. These are not necessarily errors in this case as the
firmware is busy swapping in a new backing adapter and is not ready
to send packets yet. This patch reclassifies those error codes as
warnings with an explanation that a failover may be pending. All
other return codes will be considered errors.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 14:00:39 -07:00
Alexander Sverdlin
7e2bc7fb65 net: cavium: Drop dependency of NET_VENDOR_CAVIUM on PCI
Octeon Ethernet drivers work perfectly without PCI.

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:44:36 -07:00
Antoine Tenart
495083807f net: mscc: simplify retrieving the tag type from the frame header
The tag type in the frame extraction header is only a bit wide. There's
no need to use GENMASK when retrieving the information. This patch
simplify the code by dropping GENMASK and using BIT instead.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:43:31 -07:00
Ganesh Goudar
bc1b50309c cxgb4: do not return DUPLEX_UNKNOWN when link is down
We were returning DUPLEX_UNKNOWN in get_link_ksettings() when
the link was down.  Unfortunately, this causes a problem when
"ethtool -s autoneg on" is issued for a link which is down because
the ethtool code first reads the settings and then reapplies them
with only the changes provided on the command line. Which results
in us diving into set_link_ksettings() with DUPLEX_UNKNOWN which is
not DUPLEX_FULL, so set_link_ksettings() throws an -EINVAL error.
do not return DUPLEX_UNKNOWN to fix the issue.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:43:10 -07:00
Corentin Labbe
014dd7684e net: ethernet: stmmac: fix documentation warning
This patch remove the following documentation warning
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:103: warning: Excess function parameter 'priv' description in 'stmmac_axi_setup'
It was introduced in commit afea03656a ("stmmac: rework DMA bus setting and introduce new platform AXI structure")

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:42:44 -07:00
Corentin Labbe
56c266dcfa net: stmmac: dwmac-sun8i: fix typo descrive => describe
This patch fix a typo in the word Describe
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:42:21 -07:00
YueHaibing
48559af345 bnxt_en: remove redundant debug register dma mem allocation
hwrm_dbg_resp_addr and hwrm_dbg_resp_dma_addr are never used
and can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:37:38 -07:00
Helge Deller
6e85d7a8bc liquidio: Use %pad printk format for dma_addr_t values
Use the existing %pad printk format to print dma_addr_t values.
This avoids the following warnings when compiling on the parisc platform:

warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'dma_addr_t {aka unsigned int}' [-Wformat=]

Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 13:36:49 -07:00
Fuyun Liang
5550aa4d47 net: hns3: Fix comments for hclge_get_ring_chain_from_mbx
Actually, hclge_get_ring_chain_from_mbx is used to get ring type, tqp id,
and int_gl index from mailbox message. So the comments is incorrect. This
patch fixes it.

Fixes: dde1a86e93 ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:45 -07:00
Fuyun Liang
cf4103c699 net: hns3: Fix for using wrong mask and shift in hclge_get_ring_chain_from_mbx
HCLGE_INT_GL_IDX_M and HCLGE_INT_GL_IDX_S are used to set fireware
cmd. When getting int_gl value from mailbox message, we should use
HNAE3_RING_GL_IDX_M and HNAE3_RING_GL_IDX_S.

Fixes: 79eee41085 ("net: hns3: add int_gl_idx setup for VF")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:45 -07:00
Yunsheng Lin
82b5321460 net: hns3: Fix for reset_level default assignment probelm
handle->reset_level is assigned to HNAE3_NONE_RESET when client is
initialized, if a tx timeout happens right after initialization,
then handle->reset_level is not resetted to HNAE3_FUNC_RESET in
hclge_reset_event, which will cause reset event not properly
handled problem.

This patch fixes it by setting handle->reset_level properly when
client is initialized.

Fixes: 6d4c3981a8 ("net: hns3: Changes to make enet watchdog timeout func common for PF/VF")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Huazhong Tan
d62eccaed4 net: hns3: remove unnecessary ring configuration operation while resetting
The configuration of the ring will be used to reinitialize the
ring after the hardware reset is completed. So we should not
release and reacquire this configuration during reset.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Huazhong Tan
6b1385cc25 net: hns3: Fix return value error in hns3_reset_notify_down_enet
When doing reset, netdev has not been brought up is not an error,
it means that we do not need do the stop operation, so just return
zero.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Huazhong Tan
9ca8d1a73c net: hns3: Correct reset event status register
According to hardware's description, driver should get reset event
from VECTOR0_PF_OTHER_INT_ST(0x20800) instead of
VECTOR0_PF_OTHER_INT_SRC(0x20700).

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Huazhong Tan
9de0b86f64 net: hns3: Prevent to request reset frequently
Netdevice reset should not be requested frequently, a new one
must wait a moment since there may be some work not completed.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Huazhong Tan
6d4fab3953 net: hns3: Reset net device with rtnl_lock
Since current locking was not covering certain code where
netdev was being accessed or manipulated, this patch fixes
it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Huazhong Tan
1b3725781a net: hns3: Modify the order of initializing command queue register
According to hardware's description, the head pointer register should
be written before the tail pointer register while doing command queue
initialization.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 11:16:44 -07:00
Boris Pismenny
b3ccf97813 net/mlx5e: IPsec, fix byte count in CQE
This patch fixes the byte count indication in CQE for processed IPsec
packets that contain a metadata header.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:28 -07:00
Boris Pismenny
10e71acca2 net/mlx5: Accel, add common metadata functions
This patch adds common functions to handle mellanox metadata headers.
These functions are used by IPsec and TLS to process FPGA metadata.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
790af90c00 net/mlx5e: TLS, build TLS netdev from capabilities
This patch enables TLS Rx based on available HW capabilities.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
afd3baaa93 net/mlx5e: TLS, add software statistics
This patch adds software statistics for TLS to count important
events.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
00aebab27c net/mlx5e: TLS, add Innova TLS rx data path
Implement the TLS rx offload data path according to the
requirements of the TLS generic NIC offload infrastructure.

Special metadata ethertype is used to pass information to
the hardware.

When hardware loses synchronization a special resync request
metadata message is used to request resync.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
ca942c78f3 net/mlx5e: TLS, add innova rx support
Add the mlx5 implementation of the TLS Rx routines to add/del TLS
contexts, also add the tls_dev_resync_rx routine
to work with the TLS inline Rx crypto offload infrastructure.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
ab412e1dd7 net/mlx5: Accel, add TLS rx offload routines
In Innova TLS, TLS contexts are added or deleted
via a command message over the SBU connection.
The HW then sends a response message over the same connection.

Complete the implementation for Innova TLS (FPGA-based) hardware by
adding support for rx inline crypto offload.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
0aadb2fc09 net/mlx5e: TLS, refactor variable names
For symmetry, we rename mlx5e_tls_offload_context to
mlx5e_tls_offload_context_tx before we add mlx5e_tls_offload_context_rx.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:13:11 -07:00
Boris Pismenny
d80a1b9d18 tls: Refactor tls_offload variable names
For symmetry, we rename tls_offload_context to
tls_offload_context_tx before we add tls_offload_context_rx.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:12:09 -07:00
Maxime Chevallier
f9d30d5bd5 net: mvpp2: debugfs: add classifier hit counters
The classification operations that are used for RSS make use of several
lookup tables. Having hit counters for these tables is really helpful
to determine what flows were matched by ingress traffic, and see the
path of packets among all the classifier tables.

This commit adds hit counters for the 3 tables used at the moment :

 - The decoding table (also called lookup_id table), that links flows
   identified by the Header Parser to the flow table.

   There's one entry per flow, located at :
   .../mvpp2/<controller>/flows/XX/dec_hits

   Note that there are 21 flows in the decoding table, whereas there are
   52 flows in the Header Parser. That's because there are several kind
   of traffic that will match a given flow. Reading the hit counter from
   one sub-flow will clear all hit counter that have the same flow_id.

   This also applies to the flow_hits.

 - The flow table, that contains all the different lookups to be
   performed by the classifier for each packet of a given flow. The match
   is done on the first entry of the flow sequence.

 - The C2 engine entries, that are used to assign the default rx queue,
   and enable or disable RSS for a given port.

   There's one entry per flow, located at:
   .../mvpp2/<controller>/flows/XX/flow_hits

   There is one C2 entry per port, so the c2 hit counter is located at :
   .../mvpp2/<controller>/ethX/c2_hits

All hit counter values are 16-bits clear-on-read values.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:10:01 -07:00
Maxime Chevallier
dba1d918da net: mvpp2: debugfs: add entries for classifier flows
The classifier configuration for RSS is quite complex, with several
lookup tables being used. This commit adds useful info in debugfs to
see how the different tables are configured :

Added 2 new entries in the per-port directory :

  - .../eth0/default_rxq : The default rx queue on that port
  - .../eth0/rss_enable : Indicates if RSS is enabled in the C2 entry

Added the 'flows' directory :

  It contains one entry per sub-flow. a 'sub-flow' is a unique path from
  Header Parser to the flow table. Multiple sub-flows can point to the
  same 'flow' (each flow has an id from 8 to 29, which is its index in the
  Lookup Id table) :

  - .../flows/00/...
             /01/...
             ...
             /51/id : The flow id. There are 21 unique flows. There's one
                       flow per combination of the following parameters :
                       - L4 protocol (TCP, UDP, none)
                       - L3 protocol (IPv4, IPv6)
                       - L3 parameters (Fragmented or not)
                       - L2 parameters (Vlan tag presence or not)
              .../type : The flow type. This is an even higher level flow,
                         that we manipulate with ethtool. It can be :
                         "udp4" "tcp4" "udp6" "tcp6" "ipv4" "ipv6" "other".
              .../eth0/...
              .../eth1/engine : The hash generation engine used for this
	                        flow on the given port
                  .../hash_opts : The hash generation options indicating on
                                  what data we base the hash (vlan tag, src
                                  IP, src port, etc.)

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:10:01 -07:00
Maxime Chevallier
1203341cc9 net: mvpp2: debugfs: add hit counter stats for Header Parser entries
One helpful feature to help debug the Header Parser TCAM filter in PPv2
is to be able to see if the entries did match something when a packet
comes in. This can be done by using the built-in hit counter for TCAM
entries.

This commit implements reading the counter, and exposing its value on
debugfs for each filter entry.

The counter is a 16-bits clear-on-read value, located at:
 .../mvpp2/<controller>/parser/XXX/hits

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:10:01 -07:00
Maxime Chevallier
21da57a231 net: mvpp2: add a debugfs interface for the Header Parser
Marvell PPv2 Packer Header Parser has a TCAM based filter, that is not
trivial to configure and debug. Being able to dump TCAM entries from
userspace can be really helpful to help development of new features
and debug existing ones.

This commit adds a basic debugfs interface for the PPv2 driver, focusing
on TCAM related features.

<mnt>/mvpp2/ --- f2000000.ethernet
              \- f4000000.ethernet --- parser --- 000 ...
                                    |          \- 001
                                    |          \- ...
                                    |          \- 255 --- ai
                                    |                  \- header_data
                                    |                  \- lookup_id
                                    |                  \- sram
                                    |                  \- valid
                                    \- eth1 ...
                                    \- eth2 --- mac_filter
                                             \- parser_entries
                                             \- vid_filter

There's one directory per PPv2 instance, named after pdev->name to make
sure names are uniques. In each of these directories, there's :

 - one directory per interface on the controller, each containing :

   - "mac_filter", which lists all filtered addresses for this port
     (based on TCAM, not on the kernel's uc / mc lists)

   - "parser_entries", which lists the indices of all valid TCAM
      entries that have this port in their port map

   - "vid_filter", which lists the vids allowed on this port, based on
     TCAM

 - one "parser" directory (the parser is common to all ports), containing :

   - one directory per TCAM entry (256 of them, from 0 to 255), each
     containing :

     - "ai" : Contains the 1 byte Additional Info field from TCAM, and

     - "header_data" : Contains the 8 bytes Header Data extracted from
       the packet

     - "lookup_id" : Contains the 4 bits LU_ID

     - "sram" : contains the raw SRAM data, which is the result of the TCAM
		lookup. This readonly at the moment.

     - "valid" : Indicates if the entry is valid of not.

All entries are read-only, and everything is output in hex form.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:10:00 -07:00
Antoine Tenart
f1e37e3101 net: mvpp2: switch to SPDX identifiers
Use the appropriate SPDX license identifiers and drop the license text.
This patch is only cosmetic.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-16 00:10:00 -07:00
Greg Kroah-Hartman
83cf9cd6d5 Merge 4.18-rc5 into char-misc-next
We want the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-16 09:04:54 +02:00
David S. Miller
2aa4a3378a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-07-15

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Various different arm32 JIT improvements in order to optimize code emission
   and make the JIT code itself more robust, from Russell.

2) Support simultaneous driver and offloaded XDP in order to allow for advanced
   use-cases where some work is offloaded to the NIC and some to the host. Also
   add ability for bpftool to load programs and maps beyond just the cgroup case,
   from Jakub.

3) Add BPF JIT support in nfp for multiplication as well as division. For the
   latter in particular, it uses the reciprocal algorithm to emulate it, from Jiong.

4) Add BTF pretty print functionality to bpftool in plain and JSON output
   format, from Okash.

5) Add build and installation to the BPF helper man page into bpftool, from Quentin.

6) Add a TCP BPF callback for listening sockets which is triggered right after
   the socket transitions to TCP_LISTEN state, from Andrey.

7) Add a new cgroup tree command to bpftool which iterates over the whole cgroup
   tree and prints all attached programs, from Roman.

8) Improve xdp_redirect_cpu sample to support parsing of double VLAN tagged
   packets, from Jesper.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 18:47:44 -07:00
Ido Schimmel
c3a495409a mlxsw: spectrum_router: Optimize processing of VRRP MACs
Hosts using a VRRP router send their packets with a destination MAC of
the VRRP router which is of the following form [1]:

IPv4 - 00-00-5E-00-01-{VRID}
IPv6 - 00-00-5E-00-02-{VRID}

Where VRID is the ID of the virtual router. Such packets are directed to
the router block in the ASIC by an FDB entry that was added in the
previous patch.

However, in certain cases it is possible to skip this FDB lookup and
send such packets directly to the router. This is accomplished by adding
these special MAC addresses to the RIF cache. If the cache is hit, the
packet will skip the L2 lookup and ingress the router with the RIF
specified in the cache entry.

1. https://tools.ietf.org/html/rfc5798#section-7.3

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 11:23:26 -07:00
Ido Schimmel
11566d34f8 mlxsw: spectrum: Add VRRP traps
Virtual Router Redundancy Protocol packets are used to communicate the
state of the Master router associated with the virtual router ID (VRID).

These are link-local multicast packets sent with IP protocol 112 that
are trapped in the router block in the ASIC.

Add a trap for these packets and mark the trapped packets to prevent
them from potentially being re-flooded by the bridge driver.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 11:23:26 -07:00
Ido Schimmel
2db9937804 mlxsw: spectrum_router: Direct macvlans' MACs to router
An IP packet received on a netdev with a macvlan upper whose MAC matches
the packet's destination MAC will be re-injected to the Rx path as if it
was received by the macvlan, and perform an L3 lookup.

Reflect this functionality to the ASIC by programming FDB entries that
will direct MACs of macvlan uppers to the router.

In a similar fashion to router interfaces (RIFs) that are programmed
upon the addition of the first IP address on an interface and destroyed
upon the removal of the last IP address, the FDB entries for the macvlan
are added and destroyed based on the addition of the first and removal
of the last IP address on the macvlan.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 11:23:26 -07:00
Ido Schimmel
c55161852f mlxsw: spectrum: Enable macvlan upper devices
In order to allow more unicast MAC addresses (e.g., VRRP virtual MAC) to
be directed to the router we need to enable macvlan uppers on top of
mlxsw netdevs.

Allow macvlan upper devices on top of mlxsw netdevs and sanitize
configurations that can't work. For example, a macvlan can't be enslaved
to a bridge as without ACLs the device doesn't take the destination MAC
into account when classifying a packet to a bridge instance (i.e., a
FID).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 11:23:25 -07:00
kbuild test robot
9cee8c4375 net: mvpp2: mvpp2_cls_flow_get() can be static
Fixes: f9358e12a0 ("net: mvpp2: split ingress traffic into multiple flows")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-13 20:21:56 -07:00
Dan Carpenter
5fc853cc01 qlogic: check kstrtoul() for errors
We accidentally left out the error handling for kstrtoul().

Fixes: a520030e32 ("qlcnic: Implement flash sysfs callback for 83xx adapter")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-13 18:28:58 -07:00
Jakub Kicinski
5f4284015e nfp: add support for simultaneous driver and hw XDP
Split handling of offloaded and driver programs completely.  Since
offloaded programs always come with XDP_FLAGS_HW_MODE set in reality
there could be no sharing, anyway, programs would only be installed
in driver or in hardware.  Splitting the handling allows us to install
programs in HW and in driver at the same time.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-13 20:26:35 +02:00
Jakub Kicinski
a25717d2b6 xdp: support simultaneous driver and hw XDP attachment
Split the query of HW-attached program from the software one.
Introduce new .ndo_bpf command to query HW-attached program.
This will allow drivers to install different programs in HW
and SW at the same time.  Netlink can now also carry multiple
programs on dump (in which case mode will be set to
XDP_ATTACHED_MULTI and user has to check per-attachment point
attributes, IFLA_XDP_PROG_ID will not be present).  We reuse
IFLA_XDP_PROG_ID skb space for second mode, so rtnl_xdp_size()
doesn't need to be updated.

Note that the installation side is still not there, since all
drivers currently reject installing more than one program at
the time.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-13 20:26:35 +02:00
Jakub Kicinski
05296620f6 xdp: factor out common program/flags handling from drivers
Basic operations drivers perform during xdp setup and query can
be moved to helpers in the core.  Encapsulate program and flags
into a structure and add helpers.  Note that the structure is
intended as the "main" program information source in the driver.
Most drivers will additionally place the program pointer in their
fast path or ring structures.

The helpers don't have a huge impact now, but they will
decrease the code duplication when programs can be installed
in HW and driver at the same time.  Encapsulating the basic
operations in helpers will hopefully also reduce the number
of changes to drivers which adopt them.

Helpers could really be static inline, but they depend on
definition of struct netdev_bpf which means they'd have
to be placed in netdevice.h, an already 4500 line header.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-13 20:26:35 +02:00
Jakub Kicinski
6b86758973 xdp: don't make drivers report attachment mode
prog_attached of struct netdev_bpf should have been superseded
by simply setting prog_id long time ago, but we kept it around
to allow offloading drivers to communicate attachment mode (drv
vs hw).  Subsequently drivers were also allowed to report back
attachment flags (prog_flags), and since nowadays only programs
attached will XDP_FLAGS_HW_MODE can get offloaded, we can tell
the attachment mode from the flags driver reports.  Remove
prog_attached member.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-13 20:26:35 +02:00
Linus Walleij
430ac34de9 net: gemini: Indicate that we can handle jumboframes
The hardware supposedly handles frames up to 10236 bytes and
implements .ndo_change_mtu() so accept 10236 minus the ethernet
header for a VLAN tagged frame on the netdevices. Use
ETH_MIN_MTU as minimum MTU.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:39:15 -07:00
Linus Walleij
06d5151312 net: gemini: Move main init to port
The initialization sequence for the ethernet, setting up
interrupt routing and such things, need to be done after
both the ports are clocked and reset. Before this the
config will not "take". Move the initialization to the
port probe function and keep track of init status in
the state.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:39:15 -07:00
Linus Walleij
60cc7767b9 net: gemini: Allow multiple ports to instantiate
The code was not tested with two ports actually in use at
the same time. (I blame this on lack of actual hardware using
that feature.) Now after locating a system using both ports,
add necessary fix to make both ports come up.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:39:15 -07:00
Linus Walleij
9ab5c929e6 net: gemini: Improve connection prints
Switch over to using a module parameter and debug prints
that can be controlled by this or ethtool like everyone
else. Depromote all other prints to debug messages.

The phy_print_status() was already in place, albeit never
really used because the debuglevel hiding it had to be
set up using ethtool.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:39:15 -07:00
Linus Walleij
cedca41801 net: gemini: Look up L3 maxlen from table
The code to calculate the hardware register enumerator
for the maximum L3 length isn't entirely simple to read.
Use the existing defines and rewrite the function into a
table look-up.

Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:39:15 -07:00
Alex Vesker
3c641ba4a8 net/mlx4_core: Use devlink region_snapshot parameter
This parameter enables capturing region snapshot of the crspace
during critical errors. The default value of this parameter is
disabled, it can be enabled using devlink param commands.
It is possible to configure during runtime and also driver init.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:37:13 -07:00
Alex Vesker
bedc989b0c net/mlx4_core: Add Crdump FW snapshot support
Crdump allows the driver to create a snapshot of the FW PCI
crspace and health buffer during a critical FW issue.
In case of a FW command timeout, FW getting stuck or a non zero
value on the catastrophic buffer, a snapshot will be taken.

The snapshot is exposed using devlink, cr-space, fw-health
address regions are registered on init and snapshots are attached
once a new snapshot is collected by the driver.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:37:13 -07:00
Alex Vesker
523f9eb1ef net/mlx4_core: Add health buffer address capability
Health buffer address is a 32 bit PCI address offset provided by
the FW. This offset is used for reading FW health debug data
located on the shared CR space. Cr space is accessible in both
driver and FW and allows for different queries and configurations.
Health buffer size is always 64B of readable data followed by a
lock which is used to block volatile CR space access.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:37:13 -07:00
Maxime Chevallier
436d4fdb20 net: mvpp2: allow setting RSS flow hash parameters with ethtool
This commit allows setting the RSS hash generation parameters from
ethtool. When setting parameters for a given flow type from ethtool
(e.g. tcp4), all the corresponding flows in the flow table are updated,
according to the supported hash parameters.

For example, when configuring TCP over IPv4 hash parameters to be
src/dst IP  + src/dst port ("ethtool -N eth0 rx-flow-hash tcp4 sdfn"),
we only set the "src/dst port" hash parameters on the non-fragmented TCP
over IPv4 flows.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:49 -07:00
Maxime Chevallier
d33ec45250 net: mvpp2: add an RSS classification step for each flow
One of the classification action that can be performed is to compute a
hash of the packet header based on some header fields, and lookup a RSS
table based on this hash to determine the final RxQ.

This is done by adding one lookup entry per flow per port, so that we
can configure the hash generation parameters for each flow and each
port.

There are 2 possible engines that can be used for RSS hash generation :

 - C3HA, that generates a hash based on up to 4 header-extracted fields
 - C3HB, that does the same as c3HA, but also includes L4 info in the hash

There are a lot of fields that can be extracted from the header. For now,
we only use the ones that we can configure using ethtool :
 - DST MAC address
 - L3 info
 - Source IP
 - Destination IP
 - Source port
 - Destination port

The C3HB engine is selected when we use L4 fields (src/dst port).

               Header parser          Dec table
 Ingress pkt  +-------------+ flow id +----------------------------+
------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag  |
              +-------------+         |TCP IPv4 w/o VLAN, not frag |
                                      |TCP IPv4 w/ VLAN, frag      |--+
                                      |etc.                        |  |
                                      +----------------------------+  |
                                                                      |
                                            Flow table                |
  +---------+   +------------+         +--------------------------+   |
  | RSS tbl |<--| Classifier |<--------| flow 0: C2 lookup        |   |
  +---------+   +------------+         |         C3 lookup port 0 |   |
                 |         |           |         C3 lookup port 1 |   |
         +-----------+ +-------------+ |         ...              |   |
         | C2 engine | | C3H engines | | flow 1: C2 lookup        |<--+
         +-----------+ +-------------+ |         C3 lookup port 0 |
                                       |         ...              |
                                       | ...                      |
                                       | flow 51 : C2 lookup      |
                                       |           ...            |
                                       +--------------------------+

The C2 engine also gains the role of enabling and disabling the RSS
table lookup for this packet.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:49 -07:00
Maxime Chevallier
f9358e12a0 net: mvpp2: split ingress traffic into multiple flows
The PPv2 classifier allows to perform classification operations on each
ingress packet, based on the flow the packet is assigned to.

The current code uses only 1 flow per port, and the only classification
action consists of assigning the rx queue to the packet, depending on the
port.

In preparation for adding RSS support, we have to split all incoming
traffic into different flows. Since RSS assigns a rx queue depending on
the hash of some header fields, we have to make sure that the hash is
generated in a consistent way for all packets in the same flow.

What we call a "flow" is actually a set of attributes attached to a
packet that depends on various L2/L3/L4 info.

This patch introduces 52 flows, wich are a combination of various L2, L3
and L4 attributes :
 - Whether or not the packet has a VLAN tag
 - Whether the packet is IPv4, IPv6 or something else
 - Whether the packet is TCP, UDP or something else
 - Whether or not the packet is fragmented at L3 level.

The flow is associated to a packet by the Header Parser. Each flow
corresponds to an entry in the decoding table. This entry then points to
the sequence of classification lookups to be performed by the
classifier, represented in the flow table.

For now, the only lookup we perform is a C2 lookup to set the default
rx queue.

               Header parser          Dec table
 Ingress pkt  +-------------+ flow id +----------------------------+
------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag  |
              +-------------+         |TCP IPv4 w/o VLAN, not frag |
                                      |TCP IPv4 w/ VLAN, frag      |--+
                                      |etc.                        |  |
                                      +----------------------------+  |
                                                                      |
                                           Flow table                 |
                +------------+        +---------------------+         |
     To RxQ <---| Classifier |<-------| flow 0: C2 lookup   |<--------+
                +------------+        | flow 1: C2 lookup   |
                       |              | ...                 |
                +------------+        | flow 51 : C2 lookup |
		| C2 engine  |        +---------------------+
                +------------+

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:49 -07:00
Maxime Chevallier
b1a962c62c net: mvpp2: use classifier to assign default rx queue
The PPv2 Controller has a classifier, that can perform multiple lookup
operations for each packet, using different engines.

One of these engines is the C2 engine, which performs TCAM based lookups
on data extracted from the packet header. When a packet matches an
entry, the engine sets various attributes, used to perform
classification operations.

One of these attributes is the rx queue in which the packet should be sent.
The current code uses the lookup_id table (also called decoding table)
to assign the rx queue. However, this only works if we use one entry per
port in the decoding table, which won't be the case once we add RSS
lookups.

This patch uses the C2 engine to assign the rx queue to each packet.

The C2 engine is used through the flow table, which dictates what
classification operations are done for a given flow.

Right now, we have one flow per port, which contains every ingress
packet for this port.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:49 -07:00
Maxime Chevallier
e6e21c0242 net: mvpp2: rename per-port RSS init function
mvpp22_init_rss function configures the RSS parameters for each port, so
rename it accordingly. Since this function relies on classifier
configuration, move its call right after the classifier config.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
2a2f467daf net: mvpp2: make sure we don't spread load on disabled CPUs
When filling the RSS table, we have to make sure that the rx queue is
attached to an online CPU.

This patch is not a full support for cpu_hotplug, but rather a way to
make sure that we don't break network on system booted with the maxcpus
parameter.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Antoine Tenart
662ae3fe65 net: mvpp2: improve the distribution of packets on CPUs when using RSS
This patch adds an extra indirection when setting the indirection table
into the RSS hardware table to improve the packets distribution across
CPUs. For example, if 2 queues are used on a multi-core system this new
indirection will choose two queues on two different CPUs instead of the
two first queues which are on the same first CPU.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Antoine Tenart
8179642b52 net: mvpp2: RSS indirection table support
This patch adds the RSS indirection table support, allowing to use the
ethtool -x and -X options to dump and set this table.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
[Maxime: Small warning fixes, use one table per port]
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
a27a254c26 net: mvpp2: use one RSS table per port
PPv2 Controller has 8 RSS Tables, of 32 entries each. A lookup in the
RXQ2RSS_TABLE is performed for each incoming packet, and the RSS Table
to be used is chosen according to the default rx queue that would be
used for the packet.

This default rx queue is set in the Lookup_id Table (also called
Decoding Table), and is equal to the port->first_rxq.

Since the Classifier itself isn't active at any time for the moment,
this doesn't have a direct effect, the default rx queue at the moment is
the one where all packets end-up into.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
4b86097be7 net: mvpp2: fix RSS register definitions
There is no RSS_TABLE register in PPv2 Controller. The register 0x1510
which was specified is actually named "RSS_HASH_SEL", but isn't used by
this driver at all.

Based on how this register was used, it should have been the
RXQ2RSS_TABLE register, which allows to select the RSS table that will
be used for the incoming packet.

The RSS_TABLE_POINTER is actually a field of this RXQ2RSS_TABLE
register.

Since RSS tables are actually not used by the driver for now, this
commit does not fix a runtime bug.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Antoine Tenart
132baa0378 net: mvpp2: fix a typo in the RSS code
Cosmetic patch fixing a typo in one of the RSS comments.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
f8c6ba8424 net: mvpp2: use only one rx queue per port per CPU
The number of receive queue per port is :
 - MVPP2_DEFAULT_RXQ if in single queue mode
 - MVPP2_DEFAULT_RXQ * num_possible_cpus if in multi queue mode

with MVPP2_DEFAULT_RXQ = 4.

However, we don't use the extra rx queues at the moment, we really only
need one per port per CPU, until some more advanced classification rules
are implemented.

Suggested-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
790d32c6d3 net: mvpp2: fix hardcoded number of rx queues
There's a dedicated #define that indicates the number of rx queues per
port per cpu, this commit removes a harcoded use of that value

This doesn't fix any runtime bugs since the harcoded value matches the
expected value.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Yan Markman
4c4a5686c4 net: mvpp2: use RSS only when using multi-queue mode
Since RSS only applies when we have per-cpu rx queues, it should only
be enabled when the driver is configured to make use of multi-queue
mode.

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Maxime: Commit message]
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
3f6aaf7289 net: mvpp2: make multi queue mode the default mode
The multi queue mode is needed to have RSS available, and offers some
nice advantages, being able to have one rx queue vector per CPU.

This mode has been usable through the use of a module parameter, this
commit makes it the default value.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
1e27a628e3 net: mvpp2: make sure we use single queue mode on PPv2.1
The PPv2 driver defines 2 "queue_modes" :
 - QDIST_SINGLE_MODE, where each port share one rx queue vector
   between all CPUs
 - QDIST_MULTI_MODE, where each port has one rx queue vector per CPU.

Multi queue mode isn't available on PPv2.1, make sure we fallback to
single mode when running on this revision.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:48 -07:00
Maxime Chevallier
0ad2f53906 net: mvpp2: define the number of RSS entries per table in mvpp2.h
The size of the the RSS indirection tables should be defined in mvpp2.h,
so that we can use it in all files of the PPv2 driver.

This commit moves the define in mvpp2.h, and adds the missing #include
in mvpp2_cls.h.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:47 -07:00
Maxime Chevallier
53a40025c0 net: mvpp2: fix include guards in mvpp2_prs.h
Include guards should be put before #includes. This doesn't fix any bug,
but prevent future compilation issues when adding new files in the mvpp2
driver

The Header Parser init function needs the platform_device definition,
and with the fixed include guards we need to add the missing include.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 17:30:47 -07:00
Arnd Bergmann
51bef926be nfp: avoid using getnstimeofday64()
getnstimeofday64 is deprecated in favor of the ktime_get() family of
functions. The direct replacement would be ktime_get_real_ts64(),
but I'm picking the basic ktime_get() instead:

- using a ktime_t simplifies the code compared to timespec64
- using monotonic time instead of real time avoids issues caused
  by a concurrent settimeofday() or during a leap second adjustment.

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 14:55:39 -07:00
Arnd Bergmann
44c58899b0 liquidio: use ktime_get_real_ts64() instead of getnstimeofday64()
The two do the same thing, but we want to have a consistent
naming in the kernel.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 14:55:39 -07:00
Bert Kenward
193f20033c sfc: hold filter_sem consistently during reset
We should take and release the filter_sem consistently during the
reset process, in the same manner as the mac_lock and reset_lock.

For lockdep consistency we also take the filter_sem for write around
other calls to efx->type->init().

Fixes: c2bebe37c6 ("sfc: give ef10 its own rwsem in the filter table instead of filter_lock")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 14:52:04 -07:00
Bert Kenward
1c56c0994a sfc: avoid hang from nested use of the filter_sem
In some situations we may end up calling down_read while already
holding the semaphore for write, thus hanging. This has been seen
when setting the MAC address for the interface. The hung task log
in this situation includes this stack:
  down_read
  efx_ef10_filter_insert
  efx_ef10_filter_insert_addr_list
  efx_ef10_filter_vlan_sync_rx_mode
  efx_ef10_filter_add_vlan
  efx_ef10_filter_table_probe
  efx_ef10_set_mac_address
  efx_set_mac_address
  dev_set_mac_address

In addition, lockdep rightly points out that nested calling of
down_read is incorrect.

Fixes: c2bebe37c6 ("sfc: give ef10 its own rwsem in the filter table instead of filter_lock")
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 14:52:04 -07:00
Florian Fainelli
9e3bff9239 net: systemport: Fix CRC forwarding check for SYSTEMPORT Lite
SYSTEMPORT Lite reversed the logic compared to SYSTEMPORT, the
GIB_FCS_STRIP bit is set when the Ethernet FCS is stripped, and that bit
is not set by default. Fix the logic such that we properly check whether
that bit is set or not and we don't forward an extra 4 bytes to the
network stack.

Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 14:47:47 -07:00
Dan Carpenter
c411104115 ixgbe: Off by one in ixgbe_ipsec_tx()
The ipsec->tx_tbl[] has IXGBE_IPSEC_MAX_SA_COUNT elements so the > needs
to be changed to >= so we don't read one element beyond the end of the
array.

Fixes: 5925947047 ("ixgbe: process the Tx ipsec offload")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-12 08:03:09 -07:00
Alexander Duyck
d14c780c11 ixgbe: Be more careful when modifying MAC filters
This change makes it so that we are much more explicit about the ordering
of updates to the receive address register (RAR) table. Prior to this patch
I believe we may have been updating the table while entries were still
active, or possibly allowing for reordering of things since we weren't
explicitly flushing writes to either the lower or upper portion of the
register prior to accessing the other half.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-12 07:17:13 -07:00
Ivan Vecera
28ace84b10 be2net: move rss_flags field in rss_info to ensure proper alignment
The current position of .rss_flags field in struct rss_info causes
that fields .rsstable and .rssqueue (both 128 bytes long) crosses
cache-line boundaries. Moving it at the end properly align all fields.

Before patch:
struct rss_info {
        u64                        rss_flags;            /*     0     8 */
        u8                         rsstable[128];        /*     8   128 */
        /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
        u8                         rss_queue[128];       /*   136   128 */
        /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
        u8                         rss_hkey[40];         /*   264    40 */
};

After patch:
struct rss_info {
        u8                         rsstable[128];        /*     0   128 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        u8                         rss_queue[128];       /*   128   128 */
        /* --- cacheline 4 boundary (256 bytes) --- */
        u8                         rss_hkey[40];         /*   256    40 */
        u64                        rss_flags;            /*   296     8 */
};

Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:31 -07:00
Ivan Vecera
03d231a963 be2net: re-order fields in be_error_recovert to avoid hole
- Unionize two u8 fields where only one of them is used depending on NIC
chipset.
- Move recovery_supported field after that union

These changes eliminate 7-bytes hole in the struct and makes it smaller
by 8 bytes.

Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:31 -07:00
Ivan Vecera
f9520b86dc be2net: remove unused tx_jiffies field from be_tx_stats
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:30 -07:00
Ivan Vecera
646d2c10aa be2net: move txcp field in be_tx_obj to eliminate holes in the struct
Before patch:
struct be_tx_obj {
        u32                        db_offset;            /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        struct be_queue_info       q;                    /*     8    56 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct be_queue_info       cq;                   /*    64    56 */
        struct be_tx_compl_info    txcp;                 /*   120     4 */

        /* XXX 4 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
        struct sk_buff *           sent_skb_list[2048];  /*   128 16384 */
        ...
}:

After patch:
struct be_tx_obj {
        u32                        db_offset;            /*     0     4 */
        struct be_tx_compl_info    txcp;                 /*     4     4 */
        struct be_queue_info       q;                    /*     8    56 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct be_queue_info       cq;                   /*    64    56 */
        struct sk_buff *           sent_skb_list[2048];  /*   120 16384 */
        ...
};

Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:30 -07:00
Ivan Vecera
e9c74cd85c be2net: reorder fields in be_eq_obj structure
Re-order fields in struct be_eq_obj to ensure that .napi field begins
at start of cache-line. Also the .adapter field is moved to the first
cache-line next to .q field and 3 fields (idx,msi_idx,spurious_intr)
and the 4-bytes hole to 3rd cache-line.

Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:30 -07:00
Ivan Vecera
d6d9704af8 be2net: remove desc field from be_eq_obj
The event queue description (be_eq_obj.desc) field is used only to format
string for IRQ name and it is not really needed to hold this value.
Remove it and use local variable to format string for IRQ name.

Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:30 -07:00
Ivan Vecera
c1328a27bb be2net: remove unused old custom busy-poll fields
The commit fb6113e688 ("be2net: get rid of custom busy poll code")
replaced custom busy-poll code by the generic one but left several
macros and fields in struct be_eq_obj that are currently unused.
Remove this stuff.

Fixes: fb6113e688 ("be2net: get rid of custom busy poll code")
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:30 -07:00
Ivan Vecera
a5d7fcb689 be2net: remove unused old AIC info
The commit 2632bafd74 ("be2net: fix adaptive interrupt coalescing")
introduced a separate struct be_aic_obj to hold AIC information but
unfortunately left the old stuff in be_eq_obj. So remove it.

Fixes: 2632bafd74 ("be2net: fix adaptive interrupt coalescing")
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:03:30 -07:00
Ewan D. Milne
20c4515a1a qed: fix spelling mistake "successffuly" -> "successfully"
Trivial fix to spelling mistake in qed_probe message.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:02:05 -07:00
Ivan Khoronzhuk
d0c694fc7b net: ethernet: ti: cpts: break cycle once late ts is matched
The late ts queue can contain a bunch of skbs while hi rate testing,
no need to check all of them if timestamp is already matched.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:00:07 -07:00
Petr Machata
b5de82f3df mlxsw: spectrum_span: Change LAG lower selection
When offloading mirror-to-gretap, mlxsw needs to preroute the path that
the encapsulated packet will take. That path may include a LAG device
above a front panel port. So far, mlxsw resolved the path to the first
up front panel slave of the LAG interface, but that only reflects
administrative state of the port. It neglects to consider whether the
port actually has a carrier, and what the LACP state is.

So instead of checking upness of the device, check carrier state and
txability.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:10:19 -07:00
David S. Miller
e32f55f373 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
L2 Fwd Offload & 10GbE Intel Driver Updates 2018-07-09

This patch series is meant to allow support for the L2 forward offload, aka
MACVLAN offload without the need for using ndo_select_queue.

The existing solution currently requires that we use ndo_select_queue in
the transmit path if we want to associate specific Tx queues with a given
MACVLAN interface. In order to get away from this we need to repurpose the
tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
a means of accessing the queues on the lower device. As a result we cannot
offload a device that is configured as multiqueue, however it doesn't
really make sense to configure a macvlan interfaced as being multiqueue
anyway since it doesn't really have a qdisc of its own in the first place.

The big changes in this set are:
  Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
  Disable XPS for single queue devices
  Replace accel_priv with sb_dev in ndo_select_queue
  Add sb_dev parameter to fallback function for ndo_select_queue
  Consolidated ndo_select_queue functions that appeared to be duplicates
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:03:32 -07:00
Rahul Lakkireddy
31e5f5c3e9 cxgb4: expose stats fetched from firmware via debugfs
Expose stats obtained from firmware via debugfs. These stats can't
be part of ethtool -S because the slow firmware mailbox can cause
packet drops under heavy load.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:59:38 -07:00
Rahul Lakkireddy
b351b16d8a cxgb4: remove stats fetched from firmware
When running ethtool -S, some stats are requested from firmware.
Since getting these stats via firmware mailbox is slow, some packets
get dropped under heavy load while running ethtool -S.

So, remove these stats from ethtool -S.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:59:38 -07:00
Antoine Tenart
b32b088181 net: mvpp2: explicitly include linux/interrupt.h
The Marvell PPv2 driver uses interrupts and tasklet but does not
explicitly include linux/interrupt.h, relying on implicit includes. This
one particularly is included by chance after a long unlogical chain of
inclusions. Fix this so we do not get future build breaks.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:56:52 -07:00
Jan Dakinevich
fb8ed3af74 cnic: use kvzalloc to allocate memory for csk_tbl
Size of csk_tbl is about 58K, which means 3rd order page allocation.
kvzalloc provides a fallback if no high order memory is available.

Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:55:52 -07:00
Arjun Vynipadath
8dce04f1fd cxgb4: specify IQTYPE in fw_iq_cmd
congestion argument passed to t4_sge_alloc_rxq() is used
to differentiate between nic/ofld queues.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:52:10 -07:00
Jann Horn
31e33a5b41 net/mlx5: fix uaccess beyond "count" in debugfs read/write handlers
In general, accessing userspace memory beyond the length of the supplied
buffer in VFS read/write handlers can lead to both kernel memory corruption
(via kernel_read()/kernel_write(), which can e.g. be triggered via
sys_splice()) and privilege escalation inside userspace.

In this case, the affected files are in debugfs (and should therefore only
be accessible to root) and check that *pos is zero (which prevents the
sys_splice() trick). Therefore, this is not a security fix, but rather a
small cleanup.

For the read handlers, fix it by using simple_read_from_buffer() instead of
custom logic.
For the write handler, add a check.

changed in v2:
 - also fix dbg_write()

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-11 12:08:57 -07:00
Vikas Gupta
c58387ab16 bnxt_en: Fix for system hang if request_irq fails
Fix bug in the error code path when bnxt_request_irq() returns failure.
bnxt_disable_napi() should not be called in this error path because
NAPI has not been enabled yet.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:27:14 -07:00
Michael Chan
30f529473e bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.
Calling bnxt_set_max_func_irqs() to modify the max IRQ count requested or
freed by the RDMA driver is flawed.  The max IRQ count is checked when
re-initializing the IRQ vectors and this can happen multiple times
during ifup or ethtool -L.  If the max IRQ is reduced and the RDMA
driver is operational, we may not initailize IRQs correctly.  This
problem shows up on VFs with very small number of MSIX.

There is no other logic that relies on the IRQ count excluding the ones
used by RDMA.  So we fix it by just removing the call to subtract or
add the IRQs used by RDMA.

Fixes: a588e4580a ("bnxt_en: Add interface to support RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:27:14 -07:00
Michael Chan
30e338487a bnxt_en: Support clearing of the IFF_BROADCAST flag.
Currently, the driver assumes IFF_BROADCAST is always set and always sets
the broadcast filter.  Modify the code to set or clear the broadcast
filter according to the IFF_BROADCAST flag.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:27:14 -07:00
Michael Chan
78f058a4aa bnxt_en: Always set output parameters in bnxt_get_max_rings().
The current code returns -ENOMEM and does not bother to set the output
parameters to 0 when no rings are available.  Some callers, such as
bnxt_get_channels() will display garbage ring numbers when that happens.
Fix it by always setting the output parameters.

Fixes: 6e6c5a57fb ("bnxt_en: Modify bnxt_get_max_rings() to support shared or non shared rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:27:14 -07:00
Michael Chan
07f4fde53d bnxt_en: Fix inconsistent BNXT_FLAG_AGG_RINGS logic.
If there aren't enough RX rings available, the driver will attempt to
use a single RX ring without the aggregation ring.  If that also
fails, the BNXT_FLAG_AGG_RINGS flag is cleared but the other ring
parameters are not set consistently to reflect that.  If more RX
rings become available at the next open, the RX rings will be in
an inconsistent state and may crash when freeing the RX rings.

Fix it by restoring the BNXT_FLAG_AGG_RINGS if not enough RX rings are
available to run without aggregation rings.

Fixes: bdbd1eb59c ("bnxt_en: Handle no aggregation ring gracefully.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:27:14 -07:00
Venkat Duvvuru
e32d4e60b3 bnxt_en: Fix the vlan_tci exact match check.
It is possible that OVS may set don’t care for DEI/CFI bit in
vlan_tci mask. Hence, checking for vlan_tci exact match will endup
in a vlan flow rejection.

This patch fixes the problem by checking for vlan_pcp and vid
separately, instead of checking for the entire vlan_tci.

Fixes: e85a9be93c (bnxt_en: do not allow wildcard matches for L2 flows)
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:27:14 -07:00
Jiri Pirko
a8b9f232ec mlxsw: resources: Add couple of Spectrum-2 KVD resources
These resources are needed for Spectrum-2 KVD linear management
implementation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:18 -07:00
Jiri Pirko
abfd61825b mlxsw: spectrum: Prepare for multiple FW versions for Spectrum and Spectrum-2
Prepare for Spectrum-2 FW version checking and
make mlxsw_sp_fw_rev_validate() per-ASIC as well as required FW revision
and FW filename.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
ea8b2e28aa mlxsw: spectrum_acl: Implement priority setting for rules inserted to TCAM
For Spectrum-2, we need to insert priority to C-TCAM because HW
needs that info in order to correctly process scenarios where rules
are in both C-TCAM and A-TCAM.

So extend the mlxsw_sp_acl_ctcam_entry_add() args to accept indication
if priority needs to be filled up and implement the priority
computation and fill-up.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
42df8358c3 mlxsw: reg: Add priority field for PTCEV2 register
This is going to be needed for Spectrum-2 C-TCAM implementation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
a5995cc801 mlxsw: spectrum_acl: Move block items encoding into Spectrum op
Since Spectrum-2 encodes blocks into different HW layout, push this
code into Spectrum-specific op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
c17d20838e mlxsw: spectrum_acl: Convert mlxsw_afk_create args to ops
Since the flex keys for Spectrum-2 differ not only in blocks definitions
but also in encoding layout, prepare for the implementation and pass
Spectrum/Spectrum-2 specific ops down to mlxsw_afk_create.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
bab5c1cfb7 mlxsw: spectrum_acl: Add tcam init/fini ops
Add ops to be called on driver instance init and fini.
This is needed in order to be possible to do Spectrum-2 specific init
and fini work.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
64eccd0066 mlxsw: spectrum_acl: Split TCAM handling 3 ways
To allow easy and clean Spectrum-2 implementation for things that differ
from Spectrum, split the existing ACL TCAM code 3 ways:
1) common code that calls Spectrum/Spectrum-2 specific ops
2) Spectrum ops implementations
3) common C-TCAM code that is going to be shared between Spectrum and
   Spectrum-2 implementations

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
8fae4392d4 mlxsw: spectrum_mr_tcam: Push Spectrum-specific operations into a separate file
Since Spectrum-2 has different handling of TCAM, push Spectrum MR TCAM
bits to a separate file accessible by ops which allows to implement
Spectrum-2 specific ops.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:17 -07:00
Jiri Pirko
0304c00546 mlxsw: spectrum_kvdl: Pass entry_count to free function
For the Spectrum-2 KVD linear manager implementation, entry_count will be
needed even for the free function. So pass it down.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:16 -07:00
Jiri Pirko
4b6b18692a mlxsw: spectrum_kvdl: Pass entry type to alloc/free
Future Spectrum-2 KVD linear manager implementation needs to know type
of the entry to alloc and free. So define the types in an enum and
pass it down to alloc and free functions. Once the entry type
is passed down, KVDL common part knows sizes of each entry types,
so replace size function arg with entry count.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:16 -07:00
Jiri Pirko
ebcff74386 mlxsw: spectrum_kvdl: Push out KVD linear management into ops
In Spectrum-2 there is a different implementation of KVD linear
management. Unlike in Spectrum where there is a single index space,
in Spectrum-2 the indexes are per-resource. Also there is need to
explicitly tell HW that an entry is no longer used.
So push out the existing implementation into spectrum1_kvdl.c and
prepare ops infrastructure to allow new implementation in a follow-up.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:24:16 -07:00
Kees Cook
eec4edc9ee net/mlx5: Use 2-factor allocator calls
This restores the use of 2-factor allocation helpers that were already
fixed treewide. Please do not use open-coded multiplication; prefer,
instead, using 2-factor allocation helpers.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09 16:00:07 -07:00
Alexander Duyck
8ec56fc3c5 net: allow fallback function to pass netdev
For most of these calls we can just pass NULL through to the fallback
function as the sb_dev. The only cases where we cannot are the cases where
we might be dealing with either an upper device or a driver that would
have configured things to support an sb_dev itself.

The only driver that has any significant change in this patch set should be
ixgbe as we can drop the redundant functionality that existed in both the
ndo_select_queue function and the fallback function that was passed through
to us.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:57:25 -07:00
Alexander Duyck
4f49dec907 net: allow ndo_select_queue to pass netdev
This patch makes it so that instead of passing a void pointer as the
accel_priv we instead pass a net_device pointer as sb_dev. Making this
change allows us to pass the subordinate device through to the fallback
function eventually so that we can keep the actual code in the
ndo_select_queue call as focused on possible on the exception cases.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:41:34 -07:00
Alexander Duyck
a4ea8a3dac net: Add generic ndo_select_queue functions
This patch adds a generic version of the ndo_select_queue functions for
either returning 0 or selecting a queue based on the processor ID. This is
generally meant to just reduce the number of functions we have to change
in the future when we have to deal with ndo_select_queue changes.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:15:34 -07:00
Alexander Duyck
eadec877ce net: Add support for subordinate traffic classes to netdev_pick_tx
This change makes it so that we can support the concept of subordinate
device traffic classes to the core networking code. In doing this we can
start pulling out the driver specific bits needed to support selecting a
queue based on an upper device.

The solution at is currently stands is only partially implemented. I have
the start of some XPS bits in here, but I would still need to allow for
configuration of the XPS maps on the queues reserved for the subordinate
devices. For now I am using the reference to the sb_dev XPS map as just a
way to skip the lookup of the lower device XPS map for now as that would
result in the wrong queue being picked.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 12:53:58 -07:00
Alexander Duyck
58b0b3ed4c ixgbe: Add code to populate and use macvlan TC to Tx queue map
This patch makes it so that we use the tc_to_txq mapping in the macvlan
device in order to select the Tx queue for outgoing packets.

The idea here is to try and move away from using ixgbe_select_queue and to
come up with a generic way to make this work for devices going forward. By
encoding this information in the netdev this can become something that can
be used generically as a solution for similar setups going forward.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 12:27:52 -07:00
Linus Torvalds
8979319f2d pci-v4.18-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltBPacUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwx+hAAvzTP6o3VOtgNK7lm3nOBuzfykgCv
 TFhXP2yeDItWDBLDpWX7wjs2657W3Sjrpw6FyVIYvsoMKKRuOYeZ6ChDieG5ZgTj
 oxav5U6TlDHoF0fNq0LWfv78lP6+++7/6yaer6j9xDksVqE4/zxlFcExxuszhZlC
 8ptJ54ORn92RdfRHCDptA4PFReNlQWNw3bpKpGxu8xj0TN/sYPN3ggHfnmiEyGMZ
 8/KLNOzGrhoqztaBPyRoG3GhU1hpicT+fWzg11Li8hP1solQDht+S3mnCC1IxqdI
 JSU7/jhti4nUV56qE7QzYzdsVbTFcItGn0WImklIFCt7htp4Rms0wN+QCmwhtjKM
 0WH02gRDip4CcYcPZGeGaeexWJFEScamFK1yxxDV7059KsoQKUN1Sm+R7y9xORYw
 nQAHPTO/nv02Xo+ADAYzV0aBPD7fEvFaWtdXAuLocVWVj3eiEqp8ftoYmuWym6T3
 gHWt9Dod8olj3t9bkR1MWZ121Ar5MsobgJSF30smEx8VMKv+4Xx8eXvq0FCuUQwT
 s/WLTPqK31Sqjii0xe1wO8g0yexCVaBpXJB/e9PpYf/+ICItG1DHLz0Ygw01QWVK
 Tv7HTAofdO42kChHjErB6GbzSuxDxSVCPN26bwkZu0eV91uKiLKA7wLUv8+qSXlZ
 lK98Eypy0Ti7pvA=
 =IOF/
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Fix a use-after-free in the endpoint code (Dan Carpenter)

 - Stop defaulting CONFIG_PCIE_DW_PLAT_HOST to yes (Geert Uytterhoeven)

 - Fix an nfp regression caused by a change in how we limit the number
   of VFs we can enable (Jakub Kicinski)

 - Fix failure path cleanup issues in the new R-Car gen3 PHY support
   (Marek Vasut)

 - Fix leaks of OF nodes in faraday, xilinx-nwl, xilinx (Nicholas Mc
   Guire)

* tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  nfp: stop limiting VFs to 0
  PCI/IOV: Reset total_VFs limit after detaching PF driver
  PCI: faraday: Add missing of_node_put()
  PCI: xilinx-nwl: Add missing of_node_put()
  PCI: xilinx: Add missing of_node_put()
  PCI: endpoint: Use after free in pci_epf_unregister_driver()
  PCI: controller: dwc: Do not let PCIE_DW_PLAT_HOST default to yes
  PCI: rcar: Clean up PHY init on failure
  PCI: rcar: Shut the PHY down in failpath
2018-07-08 10:55:21 -07:00
Jiri Pirko
0317a6f4eb mlxsw: core_acl_flex_actions: Fix helper to get the first KVD linear index
The helper should return always KVD linear index of the second set.
It is unused now, but going to be used soon.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
5b9488fd5f mlxsw: core_acl_flex_actions: Allow the first set to be dummy
In Spectrum-2, the real action sets are always in KVD linear. The first
set is always empty and contains only pointer to the first real set in
KVD linear. So provide possibility to specify the first set is the dummy
one.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
9dbab6f588 mlxsw: spectrum: Put pointer to flex action ops to mlxsw_sp
Spectrum-2 need a slightly different handling of flexible actions. So
put an ops pointer in mlxsw_sp struct and rename it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
82b63bcf8c mlxsw: core_acl_flex_keys: Change SRC_SYS_PORT flex key element size
The SRC_SYS_PORT is passed as 8 bit value down to hw anyway, so cap it
in the driver as well. Also, in Spectrum-2 the FW iface for SRC_SYS_PORT
is only 8 bits, so prepare for it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
c43ea06dbd mlxsw: core_acl_flex_keys: Split MAC and IP address flex key elements
Since in Spectrum-2, MACs are split and IP addresses are split as well,
in order to use the same elements for Spectrum and Spectrum-2 split them
now.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
2139469b04 mlxsw: spectrum_acl: Ignore always-zeroed bits in tp->prio
The lowest 16 bits of tp->prio are always zero, so ignore them with a
shift.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
45e0620d5e mlxsw: reg: Introduce Flex2 key type for PTAR register
Introduce Flex2 key type for PTAR register which is used in Spectrum-2.
Also, extend mlxsw_reg_ptar_pack() to set the value according to the
caller.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Jiri Pirko
d4b0d20fec mlxsw: spectrum: Change name of mlxsw_sp_afk_blocks to mlxsw_sp1_afk_blocks
This is specific for Spectrum as Spectrum-2 has completely different key
blocks.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08 17:05:19 +09:00
Randy Dunlap
ac3167257b headers: separate linux/mod_devicetable.h from linux/platform_device.h
At over 4000 #includes, <linux/platform_device.h> is the 9th most
#included header file in the Linux kernel.  It does not need
<linux/mod_devicetable.h>, so drop that header and explicitly add
<linux/mod_devicetable.h> to source files that need it.

   4146 #include <linux/platform_device.h>

After this patch, there are 225 files that use <linux/mod_devicetable.h>,
for a reduction of around 3900 times that <linux/mod_devicetable.h>
does not have to be read & parsed.

    225 #include <linux/mod_devicetable.h>

This patch was build-tested on 20 different arch-es.

It also makes these drivers SubmitChecklist#1 compliant.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <lkp@intel.com> # drivers/media/platform/vimc/
Reported-by: kbuild test robot <lkp@intel.com> # drivers/pinctrl/pinctrl-u300.c
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-07 17:52:26 +02:00
Ivan Khoronzhuk
1c0e8123e3 net: ethernet: ti: cpsw: allow PTP 224.0.0.107 to be timestamped
Tested on AM572x with cpsw v1.15 for PTP sync and delay_req messages.
It doesn't work on cpsw v1.12, so added only for cpsw v > 1.15.

Command for testing:
ptp4l -P -4 -H -i eth0 -l 6 -m -q -p /dev/ptp0 -f ptp.cfg
where ptp.cfg:

[global]
tx_timestamp_timeout     20

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 21:22:57 +09:00
Ivan Khoronzhuk
1239a96a8f net: ethernet: ti: cpsw: use BIT macro
It's needed to avoid checkpatch warnings for farther changes.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 21:22:57 +09:00
Arnd Bergmann
8f704ef666 stmmac: fix signed 64-bit division
I link error on 32-bit ARM points to yet another arithmetic bug:

drivers/net/ethernet/stmicro/stmmac/stmmac_tc.o: In function `tc_setup_cbs':
stmmac_tc.c:(.text+0x148): undefined reference to `__aeabi_uldivmod'
stmmac_tc.c:(.text+0x1fc): undefined reference to `__aeabi_uldivmod'
stmmac_tc.c:(.text+0x308): undefined reference to `__aeabi_uldivmod'
stmmac_tc.c:(.text+0x320): undefined reference to `__aeabi_uldivmod'
stmmac_tc.c:(.text+0x33c): undefined reference to `__aeabi_uldivmod'
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.o:stmmac_tc.c:(.text+0x3a4): more undefined references to `__aeabi_uldivmod' follow

I observe that the last change to add the 'ul' prefix was incorrect,
as it did not turn the result of the multiplication into a 64-bit
expression on 32-bit architectures. Further, it seems that the
do_div() macro gets confused by the fact that we pass a signed
variable rather than unsigned into it.

This changes the code to instead use the div_s64() helper that is
meant for signed division, along with changing the constant suffix
to 'll' to actually make it a 64-bit argument everywhere, fixing
both of the issues I pointed out.

I'm not completely convinced that this makes the code correct, but
I'm fairly sure that we have two problems less than before.

Fixes: 1f705bc61a ("net: stmmac: Add support for CBS QDISC")
Fixes: c18a9c0966 ("net: stmmac_tc: use 64-bit arithmetic instead of 32-bit")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 21:16:53 +09:00
Harini Katakam
404cd086f2 net: macb: Allocate valid memory for TX and RX BD prefetch
GEM version in ZynqMP and most versions greater than r1p07 supports
TX and RX BD prefetch. The number of BDs that can be prefetched is a
HW configurable parameter. For ZynqMP, this parameter is 4.

When GEM DMA is accessing the last BD in the ring, even before the
BD is processed and the WRAP bit is noticed, it will have prefetched
BDs outside the BD ring. These will not be processed but it is
necessary to have accessible memory after the last BD. Especially
in cases where SMMU is used, memory locations immediately after the
last BD may not have translation tables triggering HRESP errors. Hence
always allocate extra BDs to accommodate for prefetch.
The value of tx/rx bd prefetch for any given SoC version is:
2 ^ (corresponding field in design config 10 register).
(value of this field >= 1)

Added a capability flag so that older IP versions that do not have
DCFG10 or this prefetch capability are not affected.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 20:54:25 +09:00
Harini Katakam
e50b770ea5 net: macb: Free RX ring for all queues
rx ring is allocated for all queues in macb_alloc_consistent.
Free the same for all queues instead of just Q0.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 20:54:25 +09:00
Arnd Bergmann
be9c64b19b mlxsw: spectrum_router: avoid uninitialized variable access
When CONFIG_BRIDGE_VLAN_FILTERING is disabled, gcc correctly points out
that the 'vid' variable is uninitialized whenever br_vlan_get_pvid
returns an error:

drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_rif_vlan_fid_get':
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6881:6: error: 'vid' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This changes the condition check to always return -EINVAL here,
which I guess is what the author intended here.

Fixes: e6f1960ae6 ("mlxsw: spectrum_router: Allocate FID according to PVID")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 20:06:08 +09:00
Casey Leedom
843789f6dd cxgb4: assume flash part size to be 4MB, if it can't be determined
t4_get_flash_params() fails in a fatal fashion if the FLASH part isn't
one of the recognized parts. But this leads to desperate efforts to update
drivers when various FLASH parts which we are using suddenly become
unavailable and we need to substitute new FLASH parts.  This has lead to
more than one Customer Field Emergency when a Customer has an old driver
and suddenly can't use newly shipped adapters.

This commit fixes this by simply assuming that the FLASH part is 4MB in
size if it can't be identified. Note that all Chelsio adapters will have
flash parts which are at least 4MB in size.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 19:57:33 +09:00
Huazhong Tan
8d40854fc1 net: hns3: Prevent sending command during global or core reset
According to hardware's description, driver should not send command to
IMP while hardware doing global or core reset.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:07 +09:00
Peng Li
a754e5c4ed net: hns3: Remove the warning when clear reset cause
Only the core/global/IMP reset need clear cause, other type does not
need do it. The warning may be treated as error as it is normal. This
patch removes the warning.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:07 +09:00
Yunsheng Lin
03718db97b net: hns3: Fix get_vector ops in hclgevf_main module
The hclgevf_free_vector function expects the caller to pass
the vector_id to it, and hclgevf_put_vector pass vector to
it now, which will cause vector allocation problem.

This patch fixes it by converting vector into vector_id before
calling hclgevf_free_vector.

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:07 +09:00
Yunsheng Lin
d7099d1547 net: hns3: Fix warning bug when doing lp selftest
The napi_alloc_skb is excepted to be called under the
non-preemptible code path when it is called by hns3_clean_rx_ring
during loopback selftest, otherwise the below warning will be
logged:

[   92.420780] BUG: using smp_processor_id() in preemptible
[00000000] code: ethtool/1873
<SNIP>
[   92.463202]  check_preemption_disabled+0xf8/0x100
[   92.467893]  debug_smp_processor_id+0x1c/0x28
[   92.472239]  __napi_alloc_skb+0x30/0x130
[   92.476158]  hns3_clean_rx_ring+0x118/0x5f0 [hns3]
[   92.480941]  hns3_self_test+0x32c/0x4d0 [hns3]
[   92.485375]  ethtool_self_test+0xdc/0x1e8
[   92.489372]  dev_ethtool+0x1020/0x1da8
[   92.493109]  dev_ioctl+0x188/0x3a0
[   92.496499]  sock_do_ioctl+0xf4/0x208
[   92.500148]  sock_ioctl+0x228/0x3e8
[   92.503626]  do_vfs_ioctl+0xc4/0x880
[   92.507189]  SyS_ioctl+0x94/0xa8
[   92.510404]  el0_svc_naked+0x30/0x34

This patch fix it by disabling preemption when calling
hns3_clean_rx_ring during loopback selftest.

Fixes: c39c4d98dc ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:07 +09:00
Jian Shen
8fc7346c84 net: hns3: Add configure for mac minimal frame size
When change the mtu, the minimal frame size of mac will be set
to zero, it is incorrect. This patch fixes it by set it to the
default value.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:06 +09:00
Fuyun Liang
ead5bd4d35 net: hns3: Fix for mailbox message truncated problem
The payload of mailbox message is 16 byte and the value of
HCLGE_MBX_MAX_ARQ_MSG_SIZE is 8. A message truncated problem will
happen when mailbox message is converted to ARQ message. This patch
replaces HCLGE_MBX_MAX_ARQ_MSG_SIZE with the size of ARQ message in
hclgevf_mbx_handler to fix this problem.

Fixes: b11a0bb231 ("net: hns3: Add mailbox support to VF driver")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:06 +09:00
Yunsheng Lin
5c8971979a net: hns3: Fix for l4 checksum offload bug
Hardware only support tcp/udp/sctp l4 checksum offload, but
the driver currently tell hardware to do l4 checksum offlad when
l3 is IPv4 or IPv6, which may cause checksumm error.

This patch fixes it by only enabling the l4 offload when l4 is
tcp/udp/sctp.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07 11:13:06 +09:00