Commit Graph

24994 Commits

Author SHA1 Message Date
Florian Fainelli
b5061778f8 net: systemport: Turn on offloads by default
We can turn on the RX/TX checksum offloads by default and make sure that
those are properly reflected back to e.g: stacked devices such as VLAN
or DSA.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Florian Fainelli
297357d1a1 net: systemport: Utilize bcm_sysport_set_features() during resume/open
During driver resume and open, the HW may have lost its context/state,
utilize bcm_sysport_set_features() to make sure we do restore the
correct set of features that were previously configured.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Florian Fainelli
10b476c57b net: systemport: Refactor bcm_sysport_set_features()
In preparation for unconditionally enabling TX and RX checksum offloads,
refactor bcm_sysport_set_features() a bit such that
__netdev_update_features() during register_netdev() can make sure that
features are correctly programmed during network device registration.

Since we can now be called during register_netdev() with clocks gated,
we need to temporarily turn them on/off in order to have a successful
register programming.

We also move the CRC forward setting read into
bcm_sysport_set_features() since priv->crc_fwd matters while turning on
RX checksum offload, that way we are guaranteed they are in sync in case
we ever add support for NETIF_F_RXFCS at some point in the future.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00
Jian Shen
c17852a893 net: hns3: Add support for enable/disable flow director
This patch adds switch for flow director with ethtool command

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
dc5e606477 net: hns3: Remove all flow director rules when unload hns3 driver
This patch removes all flow director rules when unload hns3 driver.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
6871af29b3 net: hns3: Add reset handle for flow director
When doing reset, remove all entries in TCAM block, and keep flow
director rules list. After finishing reset, restore all entries.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
05c2314fe6 net: hns3: Add support for rule query of flow director
This patch adds support for querying rule number and rule details
by ethtool commands.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
dd74f815dd net: hns3: Add support for rule add/delete for flow director
This patch adds support for add and delete rule by ethtool commands.
HNS3 driver supports several flow types, include ETHER_FLOW,
IP_USER_FLOW, TCP_V4_FLOW, UDP_V4_FLOW, SCTP_V4_FLOW, IPV6_USER_FLOW,
TCP_V6_FLOW, UDP_V6_FLOW and SCTP_V6_FLOW.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
1173286802 net: hns3: Add input key and action config support for flow director
Each flow director rule consists of input key and action. The input key
is the condition for matching, includes tuples of L2/L3/L4 header.
Action is the behaviour when a packet matches with the input key, such
as drop the packet, or forward to a specified queue.

The input key is stored in the tcam blocks, Each bit of input key can
be masked.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Jian Shen
d695964d72 net: hns3: Add flow director initialization
Flow director is a new feature supported by hardware with revision 0x21.
This patch adds flow direcor initialization for each PF. It queries flow
director mode and tcam resource from firmware, selects tuples used for
input key.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:57:45 -07:00
Andrew Lunn
719655a149 net: phy: Replace phy driver features u32 with link_mode bitmap
This is one step in allowing phylib to make use of link_mode bitmaps,
instead of u32 for supported and advertised features. Convert the phy
drivers to use bitmaps to indicates the features they support.

Build bitmap equivalents of the u32 values at runtime, and have the
drivers point to the appropriate bitmap. These bitmaps are shared, and
we don't want a driver to modify them. So mark them __ro_after_init.

Within phylib, the features bitmap is currently turned back into a
u32. This will be removed once the whole of phylib, and the drivers
are converted to use bitmaps.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:55:36 -07:00
Andrew Lunn
d0939c26c5 net: ethernet: xgbe: expand PHY_GBIT_FEAUTRES
The macro PHY_GBIT_FEAUTRES needs to change into a bitmap in order to
support link_modes. Remove its use from xgde by replacing it with its
definition.

Probably, the current behavior is wrong. It probably should be
ANDing not assigning.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:55:36 -07:00
Andrew Lunn
5f991f7bdd net: phy: Add helper for advertise to lcl value
Add a helper to convert the local advertising to an LCL capabilities,
which is then used to resolve pause flow control settings.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:55:36 -07:00
Jakub Kicinski
97ea8ac360 nfp: warn on experimental TLV types
Reserve two TLV types for feature development, and warn in the driver
if they ever leak into production.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:49:30 -07:00
Moritz Fischer
ea43a5907f net: nixge: Address compiler warnings when building for i386
Address compiler warning reported by kbuild autobuilders
when building for i386 as a result of dma_addr_t size on
different architectures.

warning: cast to pointer from integer of different size
	[-Wint-to-pointer-cast]

Fixes: 7e8d5755be ("net: nixge: Add support for 64-bit platforms")
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 22:48:08 -07:00
David S. Miller
3bd09b05b0 mlx5e-updates-2018-10-01
This series includes updates to mlx5e ethernet netdevice driver:
 
 From Or Gerlitz:
 1) Support masks for l3/l4 filters in ethtool flow steering
 2) Report checksum unnecessary also when the L3 checksum flag on the
    cqe is set and there's no L4 header
 3) Allow reporting of checksum unnecessary, using an ethtool private flag.
 
 From Gavi Teitz and Or, VF representors netdevs performance improvements
 4) Allow striding RQ in VF representor and bigger RQ size, ~3X performance improvement
 5) Enable stateless offloads for VF representor, csum and TSO, 1.5X performance improvement
 6) RSS Support for VF representors
    6.1) Allow flow table destination fir VF representor steering rule.
    6.2) Create RSS flow table per representor netdev
    6.3) Expose mlx5e RSS ethtool to be used by representor netdevs
    6.4) Enable multi-queue and RSS for VF representors, using mlx5e existing infrastructure
             for managing a multi-queue RX RSS tables.
 
 From Alaa Hleihel:
 7) Cache the system image guid, The system image guid is a read-only field
    Read this once and save it on the core device.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbsmk5AAoJEEg/ir3gV/o+S+IIAMpGR1b79uPjCIdZSTxLlCvg
 xGdc3MGQ4Lv+DuUEP1DQpZDIAJQeh10KqICGfRZxo7d4jUtajwujwaUylkZrb9da
 QNIeK2FLMXkxQo51owZxXgdAy8BNpPmg3D7xI/O/DET4UWCKRmZ/4eJfWLG7r8qp
 +BEqQmJjhY67mj1tuzJ1JsysrOvUhfXEwqJGgBCLlME8KVWh0No4fTsrAlKBmeec
 kYxaW9hS57LKOu32y8LdYb830yHDjUqtWAd+5cGWNTOaygm0dd5ZPVnR9oJ/Ga7I
 Ru64aMMQoLuRbMfryUh8gYZ/oIwP48oIIr3Wler5SXkHxEKR++F5fJrApgxvibA=
 =ivVt
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-10-01

This series includes updates to mlx5e ethernet netdevice driver:

From Or Gerlitz:
1) Support masks for l3/l4 filters in ethtool flow steering
2) Report checksum unnecessary also when the L3 checksum flag on the
   cqe is set and there's no L4 header
3) Allow reporting of checksum unnecessary, using an ethtool private flag.

From Gavi Teitz and Or, VF representors netdevs performance improvements
4) Allow striding RQ in VF representor and bigger RQ size, ~3X performance improvement
5) Enable stateless offloads for VF representor, csum and TSO, 1.5X performance improvement
6) RSS Support for VF representors
   6.1) Allow flow table destination fir VF representor steering rule.
   6.2) Create RSS flow table per representor netdev
   6.3) Expose mlx5e RSS ethtool to be used by representor netdevs
   6.4) Enable multi-queue and RSS for VF representors, using mlx5e existing infrastructure
            for managing a multi-queue RX RSS tables.

From Alaa Hleihel:
7) Cache the system image guid, The system image guid is a read-only field
   Read this once and save it on the core device.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 15:49:17 -07:00
David S. Miller
d96112b2ca Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-10-01

This series contains updates to ice driver only.

Anirudh provides several changes to "prep" the driver for upcoming
features.  Specifically, the functions that are used for PF VSI/netdev
setup will also be used in SR-IOV support and to allow the reuse of
these functions, code needs to move.

Dave provides the only other change in the series, updates the driver to
protect the reset patch in its entirety.  This is done by adding the
various bit checks to determine if a reset is scheduled/initiated and
whether it came from the software or firmware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 15:43:58 -07:00
Dave Ertman
5df7e45d54 ice: Change pf state behavior to protect reset path
Currently, there is no bit, or set of bits, that protect the entirety
of the reset path.

If the reset is originated by the driver, then the relevant
one of the following bits will be set when the reset is scheduled:
__ICE_PFR_REQ
__ICE_CORER_REQ
__ICE_GLOBR_REQ
This bit will not be cleared until after the rebuild has completed.

If the reset is originated by the FW, then the first the driver knows of
it will be the reception of the OICR interrupt.  The __ICE_RESET_OICR_RECV
bit will be set in the interrupt handler.  This will also be the indicator
in a SW originated reset that we have completed the pre-OICR tasks and
have informed the FW that a reset was requested.

To utilize these bits, change the function:
ice_is_reset_recovery_pending()
	to be:
ice_is_reset_in_progress()

The new function will check all of the above bits in the pf->state and
will return a true if one or more of these bits are set.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:50:50 -07:00
Anirudh Venkataramanan
37bb839012 ice: Move common functions out of ice_main.c part 7/7
This patch completes the code move out of ice_main.c

The following top level functions and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup
ice_vsi_cfg_tc

The following functions were made static again:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs
void ice_vsi_map_rings_to_vectors
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_alloc_arrays

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:50:30 -07:00
Anirudh Venkataramanan
df0f847915 ice: Move common functions out of ice_main.c part 6/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs

The following functions were made static again:
ice_vsi_free_arrays
ice_vsi_clear_rings

Also, in this patch, the netdev and NAPI registration logic was de-coupled
from the VSI creation logic (ice_vsi_setup) as for SR-IOV, while we want to
create VF VSIs using ice_vsi_setup, we don't want to create netdevs.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:50:06 -07:00
Anirudh Venkataramanan
07309a0e59 ice: Move common functions out of ice_main.c part 5/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_clear
ice_vsi_close
ice_vsi_free_arrays
ice_vsi_map_rings_to_vectors

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:49:45 -07:00
Anirudh Venkataramanan
28c2a64573 ice: Move common functions out of ice_main.c part 4/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_clear_rings
ice_vsi_alloc_arrays

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:49:21 -07:00
Anirudh Venkataramanan
5153a18e57 ice: Move common functions out of ice_main.c part 3/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_delete
ice_free_res
ice_get_res
ice_is_reset_recovery_pending
ice_vsi_put_qs
ice_vsi_dis_irq
ice_vsi_free_irq
ice_vsi_free_rx_rings
ice_vsi_free_tx_rings
ice_msix_clean_rings

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:48:58 -07:00
Anirudh Venkataramanan
72adf2421d ice: Move common functions out of ice_main.c part 2/7
This patch continues the code move out of ice_main.c

The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_start_rx_rings
ice_vsi_stop_rx_rings
ice_vsi_stop_tx_rings
ice_vsi_cfg_rxqs
ice_vsi_cfg_txqs
ice_vsi_cfg_msix

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:45:56 -07:00
Anirudh Venkataramanan
45d3d428ea ice: Move common functions out of ice_main.c part 1/7
The functions that are used for PF VSI/netdev setup will also be used
for SR-IOV support. To allow reuse of these functions, move these
functions out of ice_main.c to ice_common.c/ice_lib.c

This move is done across multiple patches. Each patch moves a few
functions and may have minor adjustments. For example, a function that was
previously static in ice_main.c will be made non-static temporarily in
its new location to allow the driver to build cleanly. These adjustments
will be removed in subsequent patches where more code is moved out of
ice_main.c

In this particular patch, the following functions were moved out of
ice_main.c:
int ice_add_mac_to_list
ice_free_fltr_list
ice_stat_update40
ice_stat_update32
ice_update_eth_stats
ice_vsi_add_vlan
ice_vsi_kill_vlan
ice_vsi_manage_vlan_insertion
ice_vsi_manage_vlan_stripping

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-10-01 12:39:40 -07:00
Alaa Hleihel
59c9d35ea9 net/mlx5: Cache the system image guid
The system image guid is a read-only field which is used by the TC
offloads code to determine if two mlx5 devices belong to the same
ASIC while adding flows.

Read this once and save it on the core device rather than querying each
time an offloaded flow is added.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:47 -07:00
Or Gerlitz
b856df28f9 net/mlx5e: Allow reporting of checksum unnecessary
Currently we practically never report checksum unnecessary, because
for all IP packets we take the checksum complete path.

Enable non-default runs with reprorting checksum unnecessary, using
an ethtool private flag. This can be useful for performance evals
and other explorations.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:47 -07:00
Or Gerlitz
b820e6fb09 net/mlx5e: Enable reporting checksum unnecessary also for L3 packets
We can report checksum unnecessary also when the L3 checksum
flag on the cqe is set and there's no L4 header.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:46 -07:00
Gavi Teitz
f128f138cc net/mlx5e: Add ethtool control of ring params to VF representors
Added ethtool control to the representors for setting and querying
the ring params.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
2018-10-01 11:32:46 -07:00
Gavi Teitz
84a0973386 net/mlx5e: Enable multi-queue and RSS for VF representors
Increased the amount of channels the representors can open to be the
amount of CPUs. The default amount opened remains one.

Used the standard NIC netdev functions to:
* Set RSS params when building the representors' params.
* Setup an indirect TIR and RQT for the representors upon
  initialization.
* Create a TTC flow table for the representors' indirect TIR (when
  creating the TTC table, mlx5e_set_ttc_basic_params() is not called,
  in order to avoid setting the inner_ttc param, which is not needed).

Added ethtool control to the representors for setting and querying
the amount of open channels. Additionally, included logic in the
representors' ethtool set channels handler which controls a
representor's vport rx rule, so that if there is one open channel
the rx rule steers traffic to the representor's direct TIR, whereas
if there is more than one channel, the rx rule steers traffic to the
new TTC flow table.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:45 -07:00
Or Gerlitz
a5355de878 net/mlx5e: Expose ethtool rss key size / indirection table functions
Towards enabling RSS for the vport representors, expose the functions for
querying the rss hash key size and indirection table size via ethtool.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:45 -07:00
Gavi Teitz
3edc0159c0 net/mlx5e: Expose function for building RSS params
Towards enabling RSS for the vport representors, extract the
procedure for building a device's RSS params, and expose the
function.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:44 -07:00
Or Gerlitz
46dc933cee net/mlx5e: Provide explicit directive if to create inner indirect tirs
Change the driver functions that deal with creating indirect tirs
to get a flag telling if inner ttc is desired.

A pre-step for enabling rss on the vport representors, where
inner ttc is not needed.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:44 -07:00
Gavi Teitz
c966f7d55d net/mlx5: E-Switch, Provide flow dest when creating vport rx rule
Currently the destination for the representor e-switch rx rule is
a TIR number. Towards changing that to potentially be a flow table,
as part of enabling RSS for representors, modify the signature of
the related e-switch API to get a flow destination.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:43 -07:00
Gavi Teitz
092297e09a net/mlx5e: Extract creation of rep's default flow rule
Cleaning up the flow of the representors' rx initialization, towards
enabling RSS for the representors.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:43 -07:00
Gavi Teitz
dabeb3b0d5 net/mlx5e: Enable stateless offloads for VF representor netdevs
Enabled checksum and TSO offloads for the representors, in
order to increase their performance, which is required to
increase the performance of flows that cannot be offloaded.

Checksum offloads contribute to a general acceleration of all
traffic (to around 150%), whereas the TSO offload contributes
to a prominent acceleration of the representor's TX for traffic
flows with larger than MTU sized packets (to around 200%). This
is the usual case for TCP streams, as the PF, which serves as
the uplink representor, and the VF representors employ GRO before
forwarding the packets to the representor.

GRO was enabled implicitly for the representors beforehand, and
is explicitly enabled here to ensure that the representors preserve
the performance boost it provides (of around 200%) when working in
tandem with the TSO offload by the forwardee, which is the standard
case as both the PF and the VF representors employ HW TSO.

The impact of these changes can be seen in the following
measurements taken on a setup of a VM over a VF, connected
to OVS via the VF representor, to an external host:

Before current changes:
                     TCP Throughput [Gb/s]
External host to VM         ~ 10.5
VM to external host         ~ 23.5

With just checksum offloads enabled:
                     TCP Throughput [Gb/s]
External host to VM         ~ 14.9
VM to external host         ~ 28.5

With the TSO offload also enabled:
                     TCP Throughput [Gb/s]
External host to VM         ~ 30.5

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:42 -07:00
Gavi Teitz
749359f4aa net/mlx5e: Change VF representors' RQ type
The representors' RQ size was not large enough for them to achieve
high enough performance, and therefore needed to be enlarged, while
suffering a minimum hit to its memory usage. To achieve this the
representors RQ size was increased, and its type was changed to be a
striding RQ if it is supported.

Towards that goal the following changes were made:

* Extracted the sequence for setting the standard netdev's RQ parmas
  into a function

* Replaced the sequence for setting the representor's RQ params with
  the standard sequence

The impact of this change can be seen in the following measurements
taken on a setup of a VM over a VF, connected to OVS via the VF
representor, to an external host:

Before current change:
                     TCP Throughput [Gb/s]
VM to external host         ~  7.2

With the current change (measured with a striding RQ):
                     TCP Throughput [Gb/s]
VM to external host         ~ 23.5

Each representor now consumes 2 [MB] of memory for its packet
buffers.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:42 -07:00
Or Gerlitz
3a95e0ccaf net/mlx5e: Ethtool steering, Support masks for l3/l4 filters
Allow using partial masks for L3 addresses and L4 ports across
the place.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 11:32:41 -07:00
Jianbo Liu
cee2648762 net/mlx5e: Set vlan masks for all offloaded TC rules
In flow steering, if asked to, the hardware matches on the first ethertype
which is not vlan. It's possible to set a rule as follows, which is meant
to match on untagged packet, but will match on a vlan packet:
    tc filter add dev eth0 parent ffff: protocol ip flower ...

To avoid this for packets with single tag, we set vlan masks to tell
hardware to check the tags for every matched packet.

Fixes: 095b6cfd69 ('net/mlx5e: Add TC vlan match parsing')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Eran Ben Elisha
11aa5800ed net/mlx5: E-Switch, Fix out of bound access when setting vport rate
The code that deals with eswitch vport bw guarantee was going beyond the
eswitch vport array limit, fix that.  This was pointed out by the kernel
address sanitizer (KASAN).

The error from KASAN log:
[2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in
mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core]

Fixes: c9497c9890 ("net/mlx5: Add support for setting VF min rate")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Alaa Hleihel
4d8fcf216c net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules
If the peer device was already unbound, then do not attempt to modify
it's resources, otherwise we will crash on dereferencing non-existing
device.

Fixes: 5c65c564c9 ("net/mlx5e: Support offloading TC NIC hairpin flows")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01 10:58:00 -07:00
Hans de Goede
ac8bd9e13b r8169: Disable clk during suspend / resume
Disable the clk during suspend to save power. Note that tp->clk may be
NULL, the clk core functions handle this without problems.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Carlo Caione <carlo@endlessm.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:47:57 -07:00
Shahed Shaikh
c333fa0c4f qlcnic: fix Tx descriptor corruption on 82xx devices
In regular NIC transmission flow, driver always configures MAC using
Tx queue zero descriptor as a part of MAC learning flow.
But with multi Tx queue supported NIC, regular transmission can occur on
any non-zero Tx queue and from that context it uses
Tx queue zero descriptor to configure MAC, at the same time TX queue
zero could be used by another CPU for regular transmission
which could lead to Tx queue zero descriptor corruption and cause FW
abort.

This patch fixes this in such a way that driver always configures
learned MAC address from the same Tx queue which is used for
regular transmission.

Fixes: 7e2cf4feba ("qlcnic: change driver hardware interface mechanism")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:46:07 -07:00
David S. Miller
3ff6cde846 hns3: Another build fix.
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function ‘hclge_get_sset_count’:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:496:31: error: ‘HNAE3_REVISION_ID_21’ undeclared (first use in this function); did you mean ‘FADT2_REVISION_ID’?
   if (hdev->pdev->revision >= HNAE3_REVISION_ID_21 ||
                               ^~~~~~~~~~~~~~~~~~~~
                               FADT2_REVISION_ID

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 11:21:06 -07:00
David S. Miller
d401766585 hns3: Fix the build.
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: In function ‘hns3_self_test’:
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c:278:15: error: ‘HNS3_SELF_TEST_TYPE_NUM’ undeclared (first use in this function); did you mean ‘HNS3_SELF_TEST_TPYE_NUM’?
  int st_param[HNS3_SELF_TEST_TYPE_NUM][2];
               ^~~~~~~~~~~~~~~~~~~~~~~
               HNS3_SELF_TEST_TPYE_NUM

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29 10:49:58 -07:00
Bruce Allan
3b6bf296c4 ice: fix changing of ring descriptor size (ethtool -G)
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.

Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Anirudh Venkataramanan
7d86cf3840 ice: Update to capabilities admin queue command
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.

1. Get device capabilities in addition to function capabilities

2. Align to latest spec by using cap_count to determine size of the
   buffer in case of length error.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Anirudh Venkataramanan
1886588fb6 ice: Query the Tx scheduler node before adding it
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Brett Creeley
8bc8d188cd ice: Update comment for ice_fltr_mgmt_list_entry
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Jacob Keller
b2ccf317ed ice: update fw version check logic
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.

This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.

Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.

This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...

In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!

Fix this by replacing the current logic with an updated check that
behaves as follows:

First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.

Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.

Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.

To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Bruce Allan
e4a0e1ee94 ice: update branding strings and supported device ids
Update branding strings and remove device ids 0x1594 and 0x1595.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Bruce Allan
95a525bee0 ice: replace unnecessary memcpy with direct assignment
Direct assignment is preferred over a memcpy()

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Jacob Keller
c913b73cd0 ice: use [sr]q.count when checking if queue is initialized
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.

Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.

Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28 11:16:36 -07:00
Colin Ian King
6a42b5128d qed: fix spelling mistake "b_cb_registred" -> "b_cb_registered"
Trivial fix to spelling mistake struct field name, rename it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:15:11 -07:00
Eric Dumazet
0c3b9d1b37 ibmvnic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ibmvnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

ibmvnic_netpoll_controller() was completely wrong anyway,
as it was scheduling NAPI to service RX queues (instead of TX),
so I doubt netpoll ever worked on this driver.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
a4f570be65 sfc-falcon: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

sfc-falcon uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
9447a10ff6 sfc: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

sfc uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
21627982e4 net: ena: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ena uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Netanel Belgazal <netanel@amazon.com>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Zorik Machulsky <zorik@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
3548fcf7d8 qlogic: netxen: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

netxen uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Rahul Verma <rahul.verma@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
81b059b218 qlcnic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

qlcnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
Eric Dumazet
4bd2c03be7 net: hns: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

hns uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
Eric Dumazet
226a2dd62c ehea: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ehea uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
Eric Dumazet
e71fb423e0 hinic: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

hinic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Note that hinic_netpoll() was incorrectly scheduling NAPI
on both RX and TX queues.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:28 -07:00
David S. Miller
ec72001d38 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-09-27

This series contains fixes to the ice driver only.

Jake fixes a potential crash due to attempting to access the mutex which
is already destroyed.  Fix this by using rq.count and sq.count to
determine if the queue was initialized.  Fixed the current logic for
checking the firmware version to properly handle situations when
firmware major/minor versions differ and when the branch version
differs.

Bruce replaces a memcpy() with a direct assignment, which is preferred.
Also updated the branding strings and device ids supported by the
driver.  Fixed the "ethtool -G" command in the driver, which was always
returning EINVAL when changing the descriptor ring size.

Brett update and clarified code comments.

Anirudh updates the driver to ensure we query the firmware for the
transmit scheduler node information before adding it to the driver
database, to ensure we have the current information.  Also update the
"get capabilities" command to get device and function capabilities.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:09:02 -07:00
Sudarsana Reddy Kalluru
5f672090e4 qed: Fix shmem structure inconsistency between driver and the mfw.
The structure shared between driver and the management FW (mfw) differ in
sizes. This would lead to issues when driver try to access the structure
members which are not-aligned with the mfw copy e.g., data_ptr usage in the
case of mfw_tlv request.
Align the driver structure with mfw copy, add reserved field(s) to driver
structure for the members not used by the driver.

Fixes: dd006921d6 ("qed: Add MFW interfaces for TLV request support.)
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
2018-09-28 10:39:44 -07:00
Huazhong Tan
e4fd75022c net: hns3: Fix loss of coal configuration while doing reset
The user's coal configuration will be lost after reset, so the tx_coal
and rx_coal fields are added to the struct hns_nic_priv to save the coal
configuration and used to restore the user's configuration after the reset
is complete.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Huazhong Tan
0d43bf45f4 net: hns3: Modify hns3_get_max_available_channels
The current hns3_get_max_available_channels returns the total number
of queues for the device, which makes ethtool -L set the number of queues
per channel queues incorrectly, so hns3_get_max_available_channels should
return the maximum available number of queues per channel, depending on
the total number of queues allocated and the hardware configurations.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Huazhong Tan
fe5eb04318 net: hns3: Change return type of hclge_tm_schd_info_update()
hclge_tm_schd_info_update should return an error when num_tc is greater
than alloc_tqps.

This patch changes the return type of hnae3_register_ae_algo from void
to int.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Yunsheng Lin
93d8daf460 net: hns3: Fix for netdev not up problem when setting mtu
Currently hns3_nic_change_mtu will try to down the netdev before
setting mtu, and it does not up the netdev when the setting fails,
which causes netdev not up problem.

This patch fixes it by not returning when the setting fails.

Fixes: a8e8b7ff35 ("net: hns3: Add support to change MTU in HNS3 hardware")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Yunsheng Lin
996ff91840 net: hns3: Fix for packet buffer setting bug
The hardware expects a unit of 128 bytes when setting
packet buffer. When calculating the packet buffer size,
hclge_rx_buffer_calc does not round up the size as a unit
of 128 byte, which may casue packet lost problem when stress
testing.

This patch fixes it by rounding up packet size when calculating.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:42 -07:00
Fuyun Liang
4dc13b9668 net: hns3: Add serdes parallel inner loopback support
This patch adds serdes parallel inner loopback support for self test.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
eb66d50352 net: hns3: Rename mac loopback to app loopback
In fact, our implementation of mac loopback is the implementation of app
loopback now. Current name is wrong. This patch renames mac loopback to
app loopback.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
a7b687b354 net: hns3: Rename loop mode
Our loop mode includes mac loop, serdes loop and phy loop. Not all of them
are related with mac. This patch corrects their names.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Fuyun Liang
cd2086bf49 net: hns3: Set extra mac address of pause param for HW
The extra mac address of pause param is used to do double check
for pause frame. This patch set it to HW. If we do not do that,
pfc pause frame will be transferred protocol stack when normal
flow control mode is enabled.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Peng Li
5b71ac3cc4 net: hns3: Add support for sctp checksum offload
This patch adds support for sctp checksum offload.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:37:41 -07:00
Arnd Bergmann
5e8cc3947d net: ethernet: dpaa: remove unused variables
The patch that removed the only users of the oldadv/newadv variables
accidentally left the now-unused declarations behind:

drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c: In function 'dpaa_set_pauseparam':
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c:185:14: error: unused variable 'oldadv' [-Werror=unused-variable]
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c:185:6: error: unused variable 'newadv' [-Werror=unused-variable]

Fixes: 70814e819c ("net: ethernet: Add helper for set_pauseparam for Asym Pause")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:25:11 -07:00
Wei Yongjun
3d5537f9d4 net: aquantia: Make function aq_fw1x_set_power() static
Fixes the following sparse warning:

drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c:873:5: warning:
 symbol 'aq_fw1x_set_power' was not declared. Should it be static?

Fixes: a0da96c08c ("net: aquantia: implement WOL support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:25:11 -07:00
YueHaibing
470b9254d4 qed: Remove set but not used variable 'p_archipelago'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qlogic/qed/qed_ooo.c: In function 'qed_ooo_delete_isles':
drivers/net/ethernet/qlogic/qed/qed_ooo.c:354:30: warning:
 variable 'p_archipelago' set but not used [-Wunused-but-set-variable]

drivers/net/ethernet/qlogic/qed/qed_ooo.c: In function 'qed_ooo_join_isles':
drivers/net/ethernet/qlogic/qed/qed_ooo.c:463:30: warning:
 variable 'p_archipelago' set but not used [-Wunused-but-set-variable]

Since commit 1eec2437d1 ("qed: Make OOO archipelagos into an array"),
'p_archipelago' is no longer in use.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 10:25:11 -07:00
Greg Kroah-Hartman
ad0371482b Second rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
 
 - srp, hfi1, bnxt_re: Various driver crashes from missing validation and
   other cases
 
 - Fixes for regressions in patches merged this window in the gid cache,
   devx, ucma and uapi.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlutG7QACgkQOG33FX4g
 mxps3Q//fPL1o4Na/nx/aSmtfcEuPxjXwczUbB0MzlN7rKu0HjrES/DOXbGJZgS7
 5XGu7TQyOBWzpZHZM4Xs+EjTdVY9pOK5+Q+t76Ns1vZsblCATJPOCIYMVkr+zM5T
 zQxEfarlz8kEPWVz/TBAM7djoSkGXNLQQ1ttSVJBSAindhb25NzoiV1fsFGkXKmT
 xbZHW03K1o4dZMf0JJO77eu1m9tkRfX0iFXBg4efgNOZtl9MP4l5lRwPqyRdGhMO
 MvBNCJte/4dKlg5TeoVtp8B0sJvrO6QOj878VqYH4nL8uW2lZWFuzB7ZWxWGxMb/
 VgXwpmXyDuTUjX9gn3kQToG8L9FJqsMzPZGBePMjilJqpgpqoVZpQ8i674JBZznF
 as/bRSLdkA4PL5upFepBD/j9ep9LGIb6ojDcUSmwB+aJ99ZtlFUOhdf4ehQ2x9MU
 WETIcdSqlmFjhY2IoHFKEr5PP3NGIDtPlvZbAmJrrne3uTCwXChXeZgzX1S6AHkY
 djvDEaV8DrMmdJc0XTIOIQvxzMnHuoCo3oHdVaJMLudGkw1NKMN84SkIPb6U2Qjk
 QiPf3pRRUazO6tqJfEZr8Es5lQjYkZUK4wKagOJKdcLBAubEnMhrsvECXQuoFWT8
 sYhzK3AwyEaXp4Y3VbA+aqjZA/AhtPk/l3QKesU6u7hoLtiawoc=
 =cjxG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Jason writes:
  "Second RDMA rc pull request

   - Fix a long standing race bug when destroying comp_event file descriptors

   - srp, hfi1, bnxt_re: Various driver crashes from missing validation
     and other cases

   - Fixes for regressions in patches merged this window in the gid
     cache, devx, ucma and uapi."

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Set right entry state before releasing reference
  IB/mlx5: Destroy the DEVX object upon error flow
  IB/uverbs: Free uapi on destroy
  RDMA/bnxt_re: Fix system crash during RDMA resource initialization
  IB/hfi1: Fix destroy_qp hang after a link down
  IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
  IB/hfi1: Invalid user input can result in crash
  IB/hfi1: Fix SL array bounds check
  RDMA/uverbs: Fix validity check for modify QP
  IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
  ucma: fix a use-after-free in ucma_resolve_ip()
  RDMA/uverbs: Atomically flush and mark closed the comp event queue
  cxgb4: fix abort_req_rss6 struct
2018-09-27 21:53:55 +02:00
Bruce Allan
f934bb9b8b ice: fix changing of ring descriptor size (ethtool -G)
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.

Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:44:17 -07:00
Anirudh Venkataramanan
32f13d0e61 ice: Update to capabilities admin queue command
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.

1. Get device capabilities in addition to function capabilities

2. Align to latest spec by using cap_count to determine size of the
   buffer in case of length error.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:42:18 -07:00
Anirudh Venkataramanan
56daee6c5a ice: Query the Tx scheduler node before adding it
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:20:15 -07:00
Brett Creeley
f31028bfd7 ice: Update comment for ice_fltr_mgmt_list_entry
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:18:34 -07:00
Jacob Keller
396fbf9cab ice: update fw version check logic
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.

This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.

Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.

This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...

In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!

Fix this by replacing the current logic with an updated check that
behaves as follows:

First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.

Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.

Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.

To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:16:49 -07:00
Bruce Allan
c185e39afb ice: update branding strings and supported device ids
Update branding strings and remove device ids 0x1594 and 0x1595.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:11:37 -07:00
Bruce Allan
daca32a2aa ice: replace unnecessary memcpy with direct assignment
Direct assignment is preferred over a memcpy()

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:06:57 -07:00
Jacob Keller
dec64ff10e ice: use [sr]q.count when checking if queue is initialized
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.

Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.

Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-27 08:00:21 -07:00
Michael Chan
73f21c653f bnxt_en: Fix TX timeout during netpoll.
The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events.  bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting.  In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.

We fix it by handling all TX completions and to always ARM the IRQ
when we exit ->poll() with 0 budget.

Also, the logic to ACK the completion ring in case it is almost filled
with TX completions need to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet <edumazet@google.com>

Reported-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:32:13 -07:00
Maxime Chevallier
da58a931f2 net: mvneta: Add support for 2500Mbps SGMII
The mvneta controller can handle speeds up to 2500Mbps on the SGMII
interface. This relies on serdes configuration, the lane must be
configured at 3.125Gbps and we can't use in-band autoneg at that speed.

The main issue when supporting that speed on this particular controller
is that the link partner can send ethernet frames with a shortened
preamble, which if not explicitly enabled in the controller will cause
unexpected behaviours.

This was tested on Armada 385, with the comphy configuration done in
bootloader.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:27:09 -07:00
Nathan Chancellor
77f2d75381 qed: Avoid implicit enum conversion in qed_iwarp_parse_rx_pkt
Clang warns when one enumerated type is implicitly converted to another.

drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1713:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
                cm_info->ip_version = TCP_IPV4;
                                    ~ ^~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1733:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
                cm_info->ip_version = TCP_IPV6;
                                    ~ ^~~~~~~~
2 warnings generated.

Use the appropriate values from the expected type, qed_tcp_ip_version:

TCP_IPV4 = QED_TCP_IPV4 = 0
TCP_IPV6 = QED_TCP_IPV6 = 1

Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:23:16 -07:00
Nathan Chancellor
1c492a9d55 qed: Avoid constant logical operation warning in qed_vf_pf_acquire
Clang warns when a constant is used in a boolean context as it thinks a
bitwise operation may have been intended.

drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: warning: use of logical
'&&' with constant operand [-Wconstant-logical-operand]
        if (!p_iov->b_pre_fp_hsi &&
                                 ^
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: use '&' for a
bitwise operation
        if (!p_iov->b_pre_fp_hsi &&
                                 ^~
                                 &
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: remove constant
to silence this warning
        if (!p_iov->b_pre_fp_hsi &&
                                ~^~
1 warning generated.

This has been here since commit 1fe614d10f ("qed: Relax VF firmware
requirements") and I am not entirely sure why since 0 isn't a special
case. Just remove the statement causing Clang to warn since it isn't
required.

Link: https://github.com/ClangBuiltLinux/linux/issues/126
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:23:04 -07:00
Nathan Chancellor
d3a315795b qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor
Clang warns when one enumerated type is implicitly converted to another.

drivers/net/ethernet/qlogic/qed/qed_roce.c:153:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
                flavor = ROCE_V2_IPV6;
                       ~ ^~~~~~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_roce.c:156:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
                flavor = MAX_ROCE_MODE;
                       ~ ^~~~~~~~~~~~~
2 warnings generated.

Use the appropriate values from the expected type, roce_flavor:

ROCE_V2_IPV6 = RROCE_IPV6 = 2
MAX_ROCE_MODE = MAX_ROCE_FLAVOR = 3

While we're add it, ditch the local variable flavor, we can just return
the value directly from the switch statement.

Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:18:44 -07:00
Nathan Chancellor
db803f36e5 qed: Fix mask parameter in qed_vf_prep_tunn_req_tlv
Clang complains when one enumerated type is implicitly converted to
another.

drivers/net/ethernet/qlogic/qed/qed_vf.c:686:6: warning: implicit
conversion from enumeration type 'enum qed_tunn_mode' to different
enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
                                 QED_MODE_L2GENEVE_TUNN,
                                 ^~~~~~~~~~~~~~~~~~~~~~

Update mask's parameter to expect qed_tunn_mode, which is what was
intended.

Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:17:24 -07:00
Nathan Chancellor
a898fba322 qed: Avoid implicit enum conversion in qed_set_tunn_cls_info
Clang warns when one enumerated type is implicitly converted to another.

drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:163:25: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
        p_tun->vxlan.tun_cls = type;
                             ~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:165:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
        p_tun->l2_gre.tun_cls = type;
                              ~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:167:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
        p_tun->ip_gre.tun_cls = type;
                              ~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:169:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
        p_tun->l2_geneve.tun_cls = type;
                                 ~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:171:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
        p_tun->ip_geneve.tun_cls = type;
                                 ~ ^~~~
5 warnings generated.

Avoid this by changing type to an int.

Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:16:22 -07:00
Colin Ian King
5a94df70d3 qed: fix spelling mistake "toogle" -> "toggle"
Trivial fix to spelling mistake in DP_VERBOSE message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 20:09:08 -07:00
YueHaibing
0a71515665 net: faraday: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 10:18:08 -07:00
YueHaibing
6323d57f33 net: smsc: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 10:15:17 -07:00
zhong jiang
880e1b2111 net: liquidio: list usage cleanup
Trival cleanup, list_move_tail will implement the same function that
list_del() + list_add_tail() will do. hence just replace them.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 10:12:10 -07:00
zhong jiang
631e871edc net: qed: list usage cleanup
Trival cleanup, list_move_tail will implement the same function that
list_del() + list_add_tail() will do. hence just replace them.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26 10:11:36 -07:00
David S. Miller
71f9b61c5b Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-09-25

This series contains updates to i40e and xsk.

Mariusz fixes an issue where the VF link state was not being updated
properly when the PF is down or up.  Also cleaned up the promiscuous
configuration during a VF reset.

Patryk simplifies the code a bit to use the variables for PF and HW that
are declared, rather than using the VSI pointers.  Cleaned up the
message length parameter to several virtchnl functions, since it was not
being used (or needed).

Harshitha fixes two potential race conditions when trying to change VF
settings by creating a helper function to validate that the VF is
enabled and that the VSI is set up.

Sergey corrects a double "link down" message by putting in a check for
whether or not the link is up or going down.

Björn addresses an AF_XDP zero-copy issue that buffers passed
from userspace to the kernel was leaked when the hardware descriptor
ring was torn down.  A zero-copy capable driver picks buffers off the
fill ring and places them on the hardware receive ring to be completed at
a later point when DMA is complete. Similar on the transmit side; The
driver picks buffers off the transmit ring and places them on the
transmit hardware ring.

In the typical flow, the receive buffer will be placed onto an receive
ring (completed to the user), and the transmit buffer will be placed on
the completion ring to notify the user that the transfer is done.

However, if the driver needs to tear down the hardware rings for some
reason (interface goes down, reconfiguration and such), the userspace
buffers cannot be leaked. They have to be reused or completed back to
userspace.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 20:25:00 -07:00
Björn Töpel
3ab52af58f i40e: disallow changing the number of descriptors when AF_XDP is on
When an AF_XDP UMEM is attached to any of the Rx rings, we disallow a
user to change the number of descriptors via e.g. "ethtool -G IFNAME".

Otherwise, the size of the stash/reuse queue can grow unbounded, which
would result in OOM or leaking userspace buffers.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:16:19 -07:00
Björn Töpel
411dc16ff1 i40e: clean zero-copy XDP Rx ring on shutdown/reset
Outstanding Rx descriptors are temporarily stored on a stash/reuse
queue. When/if the HW rings comes up again, entries from the stash are
used to re-populate the ring.

The latter required some restructuring of the allocation scheme for
the AF_XDP zero-copy implementation. There is now a fast, and a slow
allocation. The "fast allocation" is used from the fast-path and
obtains free buffers from the fill ring and the internal recycle
mechanism. The "slow allocation" is only used in ring setup, and
obtains buffers from the fill ring and the stash (if any).

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:15:16 -07:00
Björn Töpel
9dbb137045 i40e: clean zero-copy XDP Tx ring on shutdown/reset
When the zero-copy enabled XDP Tx ring is torn down, due to
configuration changes, outstanding frames on the hardware descriptor
ring are queued on the completion ring.

The completion ring has a back-pressure mechanism that will guarantee
that there is sufficient space on the ring.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:10:24 -07:00
Patryk Małek
679b05c053 i40e: Remove unused msglen parameter from virtchnl functions
msglen parameter seems to be unused in several virtchnl function.
This patch removes it from signatures of those functions.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:08:42 -07:00
Sergey Nemov
fd835129ab i40e: fix double 'NIC Link is Down' messages
When isup is false meaning that interface is going to shut down
set new speed to 0 to avoid double 'NIC Link is Down' messages.

Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:06:50 -07:00
Harshitha Ramamurthy
ed277c50c0 i40e: add a helper function to validate a VF based on the vf id
When we are trying to change VF settings, it is possible for 2 race
conditions to happen. One, when the VF is created but not yet enabled.
Second, the VF is enabled but the VSI is still not created or not yet
re-created in the VF reset flow.

This patch introduces a helper function to validate that the VF is
enabled and that the VSI is set up. This patch also calls this
function from other functions which could get into these race conditions.
While we are poking around here, remove unnecessary parenthesis that
checkpatch was complaining about.

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:05:37 -07:00
Patryk Małek
e7bac7afa6 i40e: use declared variables for pf and hw
In order to slightly simplify the code use the variables for pf and hw
that are declared in i40e_set_mac function.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:03:36 -07:00
Mariusz Stachura
0ce5233e6c i40e: Unset promiscuous settings on VF reset
This patch cleans up promiscuous configuration when a VF reset occurs.
Previously the promiscuous mode settings were still there after the VF
driver removal.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 12:49:05 -07:00
Mariusz Stachura
f3fc7915a5 i40e: Fix VF's link state notification
This resolves an issue where the VF link state was not being updated
when the PF is down or up, and the VF link state would always show
that it is running.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 12:46:49 -07:00
Yunsheng Lin
2e9361efa7 net: hns: fix for unmapping problem when SMMU is on
If SMMU is on, there is more likely that skb_shinfo(skb)->frags[i]
can not send by a single BD. when this happen, the
hns_nic_net_xmit_hw function map the whole data in a frags using
skb_frag_dma_map, but unmap each BD' data individually when tx is
done, which causes problem when SMMU is on.

This patch fixes this problem by ummapping the whole data in a
frags when tx is done.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:42:20 -07:00
David S. Miller
a06ee256e5 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Version bump conflict in batman-adv, take what's in net-next.

iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:35:29 -07:00
Michal Simek
e1e5d8a9fe net: macb: Clean 64b dma addresses if they are not detected
Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is
not detected on 64bit system.
The issue was observed when bootloader(u-boot) does not check macb
feature at DCFG6 register (DAW64_OFFSET) and enabling 64bit dma support
by default. Then macb driver is reading DMACFG register back and only
adding 64bit dma configuration but not cleaning it out.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:33:52 -07:00
Michal Simek
bd6207202d net: macb: Clean 64b dma addresses if they are not detected
Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is
not detected on 64bit system.
The issue was observed when bootloader(u-boot) does not check macb
feature at DCFG6 register (DAW64_OFFSET) and enabling 64bit dma support
by default. Then macb driver is reading DMACFG register back and only
adding 64bit dma configuration but not cleaning it out.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:33:06 -07:00
Heiner Kallweit
a0456790fe r8169: improve a check in rtl_init_one
The check for pci_is_pcie() is redundant here because all
chip versions >=18 are PCIe only anyway. In addition use
dma_set_mask_and_coherent() instead of separate calls to
pci_set_dma_mask() and pci_set_consistent_dma_mask().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:28:03 -07:00
Heiner Kallweit
de20e12f3f r8169: improve rtl8169_irq_mask_and_ack
Code can be slightly simplified by acking even events we're not
interested in. In addition add a comment making clear that the
read has no functional purpose and is just a PCI commit.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:28:03 -07:00
Heiner Kallweit
4bee64b417 r8169: use default watchdog timeout
The networking core has a default watchdog timeout of 5s. I see no
need to define an own timeout of 6s which is basically the same.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:28:03 -07:00
Ioana Ciocoi Radulescu
edad8d260e dpaa2-eth: Make Rx flow hash key configurable
Until now, the Rx flow hash key was a 5-tuple (IP src, IP dst,
IP nextproto, L4 src port, L4 dst port) fixed value that we
configured at probe.

Add support for configuring this hash key at runtime.
We support all standard header fields configurable through ethtool,
but cannot differentiate between flow types, so the same hash key
is applied regardless of protocol.

We also don't support the discard option.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-24 12:28:40 -07:00
Antoine Tenart
f4a518797b net: mvneta: fix the remaining Rx descriptor unmapping issues
With CONFIG_DMA_API_DEBUG enabled we get DMA unmapping warning in
various places of the mvneta driver, for example when putting down an
interface while traffic is passing through.

The issue is when using s/w buffer management, the Rx buffers are mapped
using dma_map_page but unmapped with dma_unmap_single. This patch fixes
this by using the right unmapping function.

Fixes: 562e2f467e ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-24 12:27:28 -07:00
Stefan Wahren
48c1699ec2 net: qca_spi: Introduce write register verification
The SPI protocol for the QCA7000 doesn't have any fault detection.
In order to increase the drivers reliability in noisy environments,
we could implement a write verification inspired by the enc28j60.
This should avoid situations were the driver wrongly assumes the
receive interrupt is enabled and miss all incoming packets.

This function is disabled per default and can be controlled via module
parameter wr_verify.

Signed-off-by: Michael Heimpold <michael.heimpold@i2se.com>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-24 12:26:06 -07:00
Maxime Chevallier
4251ea5b8e net: mvpp2: use round-robin scheduling for TX queues on the same CPU
This commit allows each TXQ to be picked in a round-robin fashion by
the PPv2 transmit scheduling mechanism. This is opposed to the default
behaviour that prioritizes the highest numbered queues.

Suggested-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-24 10:01:10 -07:00
Maxime Chevallier
0d283ab5b4 net: mvpp2: support XPS by mapping TX queues to CPUs
Since the PPv2 controller has multiple TX queues, we can spread traffic
by assining TX queues to CPUs, allowing to use XPS to balance egress
traffic between CPUs.

Suggested-by : Yan Markman <ymarkman@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-24 10:01:10 -07:00
Friedemann Gerold
d26ed6b0e5 net: aquantia: memory corruption on jumbo frames
This patch fixes skb_shared area, which will be corrupted
upon reception of 4K jumbo packets.

Originally build_skb usage purpose was to reuse page for skb to eliminate
needs of extra fragments. But that logic does not take into account that
skb_shared_info should be reserved at the end of skb data area.

In case packet data consumes all the page (4K), skb_shinfo location
overflows the page. As a consequence, __build_skb zeroed shinfo data above
the allocated page, corrupting next page.

The issue is rarely seen in real life because jumbo are normally larger
than 4K and that causes another code path to trigger.
But it 100% reproducible with simple scapy packet, like:

    sendp(IP(dst="192.168.100.3") / TCP(dport=443) \
          / Raw(RandString(size=(4096-40))), iface="enp1s0")

Fixes: 018423e90b ("net: ethernet: aquantia: Add ring support code")

Reported-by: Friedemann Gerold <f.gerold@b-c-s.de>
Reported-by: Michael Rauch <michael@rauch.be>
Signed-off-by: Friedemann Gerold <f.gerold@b-c-s.de>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 22:25:25 -07:00
Eric Dumazet
0825ce7031 nfp: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

nfp uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
58e0e22bff bnxt: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

bnxt uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
d8ea6a91ad bnx2x: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

bnx2x uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
9c29bcd189 mlx5: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

mlx5 uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
a24b66c249 mlx4: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

mlx4 uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
1aa28fb983 i40evf: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

i40evf uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
158a08a694 ice: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ice uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
0542997ede igb: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

igb uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
2753166e4b ixgb: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ixgb uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

This also removes a problematic use of disable_irq() in
a context it is forbidden, as explained in commit
af3e0fcf78 ("8139too: Use disable_irq_nosync() in
rtl8139_poll_controller()")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
dda9d57e2d fm10k: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
lasts for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

fm10k uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
6f5d941eba ixgbevf: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ixgbevf uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
b80e71a986 ixgbe: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ixgbe uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Petr Machata
12ba7e1045 mlxsw: Make MLXSW_SP1_FWREV_MINOR a hard requirement
Up until now, mlxsw tolerated firmware versions that weren't exactly
matching the required version, if the branch number matched. That
allowed the users to test various firmware versions as long as they were
on the right branch.

On the other hand, it made it impossible for mlxsw to put a hard lower
bound on a version that fixes all problems known to date. If a user had
a somewhat older FW version installed, mlxsw would start up just fine,
possibly performing non-optimally as it would use features that trigger
problematic behavior.

Therefore tweak the check to accept any FW version that is:

- on the same branch as the preferred version, and
- the same as or newer than the preferred version.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 12:30:01 -07:00
Peng Li
ebfefb8aa7 net: hns3: Remove redundant hclge_get_port_type()
This patch removes hclge_get_port_type which is redundant.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Fuyun Liang
5f373b1585 net: hns3: Fix speed/duplex information loss problem when executing ethtool ethx cmd of VF
Our VF has not implemented the ops for get_port_type. So when we executing
ethtool ethx cmd of VF, hns3_get_link_ksettings will return directly. And
we can not query anything.

To support get_link_ksettings for VF, this patch replaces get_port_type
with get_media_type. If the media type is HNAE3_MEDIA_TYPE_NONE,
hns3_get_link_ksettings will return link information of VF.

Fixes: 12f46bc1d4 ("net: hns3: Refine hns3_get_link_ksettings()")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Peng Li
c136b88425 net: hns3: Add get_media_type ops support for VF
This patch adds the ops of get_media_type support for VF.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
75e9853518 net: hns3: Remove print messages for error packet
There are already multiple types packets statistics for error packets,
it's unnecessary to print them, which may affect the rx performance if
print too many.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
2211f4e195 net: hns3: Add unlikely for dma_mapping_error check
For dma_mapping_error is unlikely happened, this patch adds unlikely for
dma_mapping_error check.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
7a8101109d net: hns3: Add nic state check before calling netif_tx_wake_queue
When nic down, it firstly calls netif_tx_stop_all_queues(), then calls
napi_disable(). But napi_disable() will wait current napi_poll finish,
it may call netif_tx_wake_queue(). This patch fixes it by add nic state
checking.

Fixes: 424eb834a9 ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
fa7a4bd564 net: hns3: Add handle for default case
There are a few "switch-case" codes missed handle for default case. For
some abnormal case, it should return error code instead of return 0.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
6cee6fc384 net: hns3: Unify the prefix of vf functions
The prefix of most functions for vf are hclgevf. This patch renames the
function with inconsistent prefix.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
b4f1d30327 net: hns3: Fix tqp array traversal condition for vf
There are two tqp_num variables "hdev->tqp_num" and "kinfo->tqp_num"
used in VF. "hdev->tqp_num" is the total tqp number allocated to the
VF, and "kinfo->tqp_num" indicates the tqp number being used by the
VF. Usually the two variables are equal. But for the case hdev->tqp_num
larger than rss_size_max, and num_tc is 1, "kinfo->tqp_num" will be
less than "hdev->tqp_num".

In original codes, "hdev->tqp_num" is always used to traverse the
tqp array of kinfo. It may cause null pointer error when "hdev->tqp_num"
is larger than "kinfo->tqp_num"

Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
0c21812302 net: hns3: Adjust prefix of tx/rx statistic names
Some prefix of tx/rx statistic names are redundant, this patch modifies
these names.

The new prefix looks like below:
rxq#1_ -> rxq1_
txq#1_ -> txq1_
tx_dropped -> dropped
tx_wake -> wake
tx_busy -> busy
rx_dropped -> dropped

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
d0d72bac02 net: hns3: Unify the type convert for desc.data
For desc.data is already point to the address of struct member "data[6]",
it's unnecessary to use '&' to get its address. This patch unifies all
the type convert for dest.data, using "req = (struct name *)dest.data".

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Jian Shen
adefc0a2ff net: hns3: Fix ets validate issue
There is a defect in hclge_ets_validate(). If each member of tc_tsa is
not IEEE_8021QAZ_TSA_ETS, the variable total_ets_bw won't be updated.
In this case, the check for value of total_ets_bw will fail. This patch
fixes it by checking total_ets_bw only after it has been updated.

Fixes: cacde272dd ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:29:32 -07:00
Andrew Lunn
65c5877f64 ravb: Disable Pause Advertisement
The previous commit to ravb had the side effect of making the PHY
advertise Pause and Asym Pause, which previously did not happen.  By
default, phydev->supported has both forms of pause enabled, but
phydev->advertising does not. The new phy_remove_link_mode() copies
phydev->supported to phydev->advertising after removing the requested
link mode. These Pause configuration bits appears it stops the PHY
from completing Auto-Neg and the link remains down.  Be explicit and
remove the Pause and Asym Pause modes, so restoring the old behavior.

Fixes: 41124fa64d ("net: ethernet: Add helper to remove a supported link mode")
Reported-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:26:52 -07:00
Nathan Chancellor
8ac1ee6f4d net/mlx4: Use cpumask_available for eq->affinity_mask
Clang warns that the address of a pointer will always evaluated as true
in a boolean context:

drivers/net/ethernet/mellanox/mlx4/eq.c:243:11: warning: address of
array 'eq->affinity_mask' will always evaluate to 'true'
[-Wpointer-bool-conversion]
        if (!eq->affinity_mask || cpumask_empty(eq->affinity_mask))
            ~~~~~^~~~~~~~~~~~~
1 warning generated.

Use cpumask_available, introduced in commit f7e30f01a9 ("cpumask: Add
helper cpumask_available()"), which does the proper checking and avoids
this warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/86
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:20:22 -07:00
YueHaibing
e6ce3822a9 net: apple: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:15:15 -07:00