In order to specify that the mlxsw spectrum driver needs additional
headroom for packets, there have been use of the hard_header_len field of
the netdevice struct.
This commit changes that to use needed_headroom instead, as this is the
correct way to do that.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This stops NCSI device when closing the network device so that the
NCSI device can be reenabled later.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2016-10-02
This series contains updates to fm10k only.
Jake fixes an issue where PTP applications requesting software timestamps
may complain that the requested mode is not supported, so add a generic
callback for those drivers that have software transmit timestamp support
enabled. Then provides a trivial cleanup where a code was not wrapped
properly. Got make sure that code looks good in a 80 character limit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Although rare, it's possible to hit PCI error early on device
probe, meaning possibly some structs are not entirely initialized,
and some might even be completely uninitialized, leading to NULL
pointer dereference.
The i40e driver currently presents a "bad" behavior if device hits
such early PCI error: firstly, the struct i40e_pf might not be
attached to pci_dev yet, leading to a NULL pointer dereference on
access to pf->state.
Even checking if the struct is NULL and avoiding the access in that
case isn't enough, since the driver cannot recover from PCI error
that early; in our experiments we saw multiple failures on kernel
log, like:
[549.664] i40e 0007:01:00.1: Initial pf_reset failed: -15
[549.664] i40e: probe of 0007:01:00.1 failed with error -15
[...]
[871.644] i40e 0007:01:00.1: The driver for the device stopped because the
device firmware failed to init. Try updating your NVM image.
[871.644] i40e: probe of 0007:01:00.1 failed with error -32
[...]
[872.516] i40e 0007:01:00.0: ARQ: Unknown event 0x0000 ignored
Between the first probe failure (error -15) and the second (error -32)
another PCI error happened due to the first bad probe. Also, driver
started to flood console with those ARQ event messages.
This patch will prevent these issues by allowing error recovery
mechanism to remove the failed device from the system instead of
trying to recover from early PCI errors during device probe.
CC: <stable@vger.kernel.org>
Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2016-10-03
This series contains fixes to i40e only.
Stefan Assmann provides the changes in this series to resolve an issue
where when we run out of MSIx vectors, iWARP gets disabled automatically.
First adds a check for "no vectors left" during MSIx vector allocation
for VMDq, which will prevent more vectors being allocated than available.
Then fixed the MSIx vector redistribution when we reach the hardware limit
for vectors so that additional features like VMDq, iWARP, etc do not get
starved for vectors because the PF is hogging all the resources. Lastly,
fix the issue for flow director by moving the check for the reaching the
vector limit earlier in the code so that a decision can be made on
disabling flow director.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the RoCE-specific LL2 logic [as well as GSI support] over
the 'generic' LL2 interface.
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add slowpath configuration support for user, dma and memory
regions registration.
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the slowpath configurations of Queue Pair verbs
which adds, deletes, modifies and queries Queue Pairs.
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the configurations of the protection domain and
completion queues.
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the backbone required for the various HW initalizations
which are necessary for the qedr driver - FW notification, resource
initializations, etc.
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds a skeletal implementation of the qede RoCE driver -
The qedr has some dependencies of the state of the underlying base
interface. This adds some logic required with mutual registrations
and the ability to pass updates on 'intresting' events.
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Other protocols beside the networking driver need the ability
of passing some L2 traffic, usually [although not limited] for the
purpose of some management traffic.
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: Ram Amrani <Ram.Amrani@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull CPU hotplug updates from Thomas Gleixner:
"Yet another batch of cpu hotplug core updates and conversions:
- Provide core infrastructure for multi instance drivers so the
drivers do not have to keep custom lists.
- Convert custom lists to the new infrastructure. The block-mq custom
list conversion comes through the block tree and makes the diffstat
tip over to more lines removed than added.
- Handle unbalanced hotplug enable/disable calls more gracefully.
- Remove the obsolete CPU_STARTING/DYING notifier support.
- Convert another batch of notifier users.
The relayfs changes which conflicted with the conversion have been
shipped to me by Andrew.
The remaining lot is targeted for 4.10 so that we finally can remove
the rest of the notifiers"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
cpufreq: Fix up conversion to hotplug state machine
blk/mq: Reserve hotplug states for block multiqueue
x86/apic/uv: Convert to hotplug state machine
s390/mm/pfault: Convert to hotplug state machine
mips/loongson/smp: Convert to hotplug state machine
mips/octeon/smp: Convert to hotplug state machine
fault-injection/cpu: Convert to hotplug state machine
padata: Convert to hotplug state machine
cpufreq: Convert to hotplug state machine
ACPI/processor: Convert to hotplug state machine
virtio scsi: Convert to hotplug state machine
oprofile/timer: Convert to hotplug state machine
block/softirq: Convert to hotplug state machine
lib/irq_poll: Convert to hotplug state machine
x86/microcode: Convert to hotplug state machine
sh/SH-X3 SMP: Convert to hotplug state machine
ia64/mca: Convert to hotplug state machine
ARM/OMAP/wakeupgen: Convert to hotplug state machine
ARM/shmobile: Convert to hotplug state machine
arm64/FP/SIMD: Convert to hotplug state machine
...
Currently if the MSI-X vector limit is reached the sideband flow
director gets disabled. A bit too early to make that decision, as
vectors may get re-distributed. So move the check further back.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver allocates 1 vector per CPU thread and the current hardware
limit for vectors is 129 per PF. On systems with 128 or more threads
this currently means all vectors are used by the PF leaving no room for
additional features like VMDq, iWARP, etc...
The code that should redistribute the vectors in this case is broken and
never triggers. Fixed the code so that it actually triggers if the
hardware limit is reached and adjust the number of queue pairs
accordingly.
Also the number of initially requested iWARP vectors was not properly
saved when the vector limit was reached, and therefore always zero.
Comparison with debug statement.
Before:
i40e 0000:2d:00.0: VMDq disabled, not enough MSI-X vectors
i40e 0000:2d:00.0: IWARP disabled, not enough MSI-X vectors
i40e 00.0 MSI-X vector distribution: PF 128, VMDq 0, FDSB 0, iWARP 0
After:
i40e 0000:2d:00.0: MSI-X vector limit reached, attempting to redistribute vectors
i40e 00.0 MSI-X vector distribution: PF 78, VMDq 8, FDSB 0, iWARP 42
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During MSI-X vector allocation for VMDq, a check for "no vectors left"
was missing, add it. This prevents more vectors to be allocated than
available.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In case of error, the function ioremap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.
Also add check for return value of platform_get_resource().
Fixes: 54e19bc74f ("net: qcom/emac: do not use devm on internal
phy pdev")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial change here to cleanup a checkpatch.pl warning that got
introduced when changing to alloc_workqueue.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is no need to clk_disable_unprepare(dev->clk)
before it was initialized.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
If fi->fib_nhs is zero, the router interface pointer is uninitialized, as shown by
this warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_router_fib_event':
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1674:21: error: 'r' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1643:23: note: 'r' was declared here
This changes the loop so we handle the case the same way as finding no router
interface pointer attached to one of the nexthops to ensure we always
trap here instead of using uninitialized data.
Fixes: b45f64d16d ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Build-testing this driver with -Wmaybe-uninitialized gives a new false-positive
warning that I can't really explain:
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'mlx5e_configure_flower':
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:509:3: error: 'old_attr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
It's obvious from the code that 'old_attr' is initialized whenever 'old'
is non-NULL here. The warning appears with all versions I tested from gcc-4.7
through gcc-6.1, and I could not come up with a way to rewrite the function
in a more readable way that avoids the warning, so I'm adding another
initialization to shut it up.
Fixes: 8b32580df1 ("net/mlx5e: Add TC vlan action for SRIOV offloads")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This generic callback is for drivers which have software Tx timestamp
support enabled. Without this, PTP applications requesting software
timestamps may complain that the requested mode is not supported.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A recent cleanup marked cxgb4_dcb_enabled as 'static', which is correct, but this ignored
how the symbol is also exported. In addition, the export can be compiled out when modules
are disabled, causing a harmless compiler warning in configurations for which it is not
used at all:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:282:12: error: 'cxgb4_dcb_enabled' defined but not used [-Werror=unused-function]
This removes the export and moves the function into the correct #ifdef so we only build
it when there are users.
Fixes: 50935857f8 ("cxgb4: mark symbols static where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the mac address origin is not dt, you can only safely assign a mac
address after "link up" of the device. If the link is off the clocks are
disabled and because of issues assigning registers when clocks are off the
new mac address cannot be written in .ndo_set_mac_address() on some soc's.
This fix sets the mac address unconditionally in fec_restart(...) and
ensures consistency between fec registers and the network layer.
Signed-off-by: Gavin Schenk <g.schenk@eckelmann.de>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 9638d19e48 ("net: fec: add netif status check before set mac address")
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 2 warnings when building kernel with W=1:
drivers/net/ethernet/mediatek/mtk_eth_soc.c:2041:5: warning: no previous prototype for 'mtk_get_link_ksettings' [-Wmissing-prototypes]
drivers/net/ethernet/mediatek/mtk_eth_soc.c:2052:5: warning: no previous prototype for 'mtk_set_link_ksettings' [-Wmissing-prototypes]
In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 1 warning when building kernel with W=1:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2715:5: warning: no previous prototype for 'cxgb_setup_tc' [-Wmissing-prototypes]
In fact, this function is only used in the file in which it is
declared and don't need a declaration, but can be made static.
so this patch marks this function with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This implements ndo_poll_controller in net_device_ops callbacks for mlx5,
which is necessary to use netconsole with this driver.
Acked-By: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set upper 32 bits of destination register to zeros after
load from the context structure.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This also can address following UBSAN warnings:
[ 36.640343] ================================================================================
[ 36.648772] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:857:26
[ 36.656853] shift exponent 64 is too large for 32-bit type 'int'
[ 36.663348] ================================================================================
[ 36.671783] ================================================================================
[ 36.680213] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx4/fw.c:861:27
[ 36.688297] shift exponent 35 is too large for 32-bit type 'int'
[ 36.694702] ================================================================================
Tested:
reboot with UBSAN, no warning.
Signed-off-by: David Decotigny <decot@googlers.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While the driver is probing the adapter, an error may occur before the
netdev structure is allocated and attached to pci_dev. In this case,
not only netdev isn't available, but the tg3 private structure is also
not available as it is just math from the NULL pointer, so dereferences
must be skipped.
The following trace is seen when the error is triggered:
[1.402247] Unable to handle kernel paging request for data at address 0x00001a99
[1.402410] Faulting instruction address: 0xc0000000007e33f8
[1.402450] Oops: Kernel access of bad area, sig: 11 [#1]
[1.402481] SMP NR_CPUS=2048 NUMA PowerNV
[1.402513] Modules linked in:
[1.402545] CPU: 0 PID: 651 Comm: eehd Not tainted 4.4.0-36-generic #55-Ubuntu
[1.402591] task: c000001fe4e42a20 ti: c000001fe4e88000 task.ti: c000001fe4e88000
[1.402742] NIP: c0000000007e33f8 LR: c0000000007e3164 CTR: c000000000595ea0
[1.402787] REGS: c000001fe4e8b790 TRAP: 0300 Not tainted (4.4.0-36-generic)
[1.402832] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000422 XER: 20000000
[1.403058] CFAR: c000000000008468 DAR: 0000000000001a99 DSISR: 42000000 SOFTE: 1
GPR00: c0000000007e3164 c000001fe4e8ba10 c0000000015c5e00 0000000000000000
GPR04: 0000000000000001 0000000000000000 0000000000000039 0000000000000299
GPR08: 0000000000000000 0000000000000001 c000001fe4e88000 0000000000000006
GPR12: 0000000000000000 c00000000fb40000 c0000000000e6558 c000003ca1bffd00
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000d52768
GPR24: c000000000d52740 0000000000000100 c000003ca1b52000 0000000000000002
GPR28: 0000000000000900 0000000000000000 c00000000152a0c0 c000003ca1b52000
[1.404226] NIP [c0000000007e33f8] tg3_io_error_detected+0x308/0x340
[1.404265] LR [c0000000007e3164] tg3_io_error_detected+0x74/0x340
This patch avoids the NULL pointer dereference by moving the access after
the netdev NULL pointer check on tg3_io_error_detected(). Also, we add a
check for netdev being NULL on tg3_io_resume() [suggested by Michael Chan].
Fixes: 0486a063b1 ("tg3: prevent ifup/ifdown during PCI error recovery")
Fixes: dfc8f37031 ("net/tg3: Release IRQs on permanent error")
Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for reading addresses, interrupts, and _DSD properties
from ACPI tables, just like with device tree. The HID for the
EMAC device itself is QCOM8070. The internal PHY is represented
by a child node with a HID of QCOM8071.
The EMAC also has some complex clock initialization requirements
that are not represented by this patch. This will be addressed
in a future patch.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the DT-specific of_get_mac_address() function with
device_get_mac_address, which works on both DT and ACPI platforms. This
change makes it easier to add ACPI support.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The platform_device returned by of_find_device_by_node() is not
automatically released when the driver unprobes. Therefore,
managed calls like devm_ioremap_resource() should not be used.
Instead, we manually allocate the resources and then free them
on driver release.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the first two members of the Renesas RZ/G family, RZ/G1M/E
(also known as R8A7743/5). The Ether core is the same as in the R-Car gen2
SoCs, so will share the code/data with them...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, in order to offload a FIB entry to HW we use switchdev op.
Use the newly introduced FIB notifier infrasturucture to process
FIB entries to offload the in HW.
Abort mechanism is now handled within the rocker driver.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, in order to offload a FIB entry to HW we use switchdev op.
However that has limits. Mainly in case we need to make the HW aware of
all route prefixes configured in kernel. HW needs to know those in order
to properly trap appropriate packets and pass the to kernel to do
the forwarding. Abort mechanism is now handled within the mlxsw driver.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a reset occurs, the PPS SYS_WRAP interrupt was not re-enabled which
resulted in disabling of the PPS signaling. Fix this by recording when
the interrupt is on and ensuring that we re-enable it every time we
reset.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bump igb version to match other igb drivers.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bump version to match other igbvf drivers.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fixes the following sparse warning:
drivers/net/ethernet/intel/igb/igb_ethtool.c:2707:5: warning:
symbol 'igb_rxnfc_write_vlan_prio_filter' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The group list must be freed prior to freeing the command otherwise
we have a use-after-free.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(1) Modify the register settings for LRO relinquishments
(2) Jump out from the waiting loop while LRO relinquishments are done
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stop PDMA while the frame engine is going to stop.
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 62469c7600 ("net: ethernet: bcmgenet: use phydev
from struct net_device") because it causes GENETv1/2/3 adapters to
expose the following behavior after an ifconfig down/up sequence:
PING fainelli-linux (10.112.156.244): 56 data bytes
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.352 ms
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.472 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.496 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.517 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.536 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.557 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=752.448 ms (DUP!)
This was previously fixed by commit 5dbebbb44a ("net: bcmgenet:
Software reset EPHY after power on") but the commit we are reverting was
essentially making this previous commit void, here is why.
Without commit 62469c7600 we would have the following scenario after
an ifconfig down then up sequence:
- bcmgenet_open() calls bcmgenet_power_up() to make sure the PHY is
initialized *before* we get to initialize the UniMAC, this is
critical to ensure the PHY is in a correct state, priv->phydev is
valid, this code executes fine
- second time from bcmgenet_mii_probe(), through the normal
phy_init_hw() call (which arguably could be optimized out)
Everything is fine in that case. With commit 62469c7600, we would have
the following scenario to happen after an ifconfig down then up
sequence:
- bcmgenet_close() calls phy_disonnect() which makes dev->phydev become
NULL
- when bcmgenet_open() executes again and calls bcmgenet_mii_reset() from
bcmgenet_power_up() to initialize the internal PHY, the NULL check
becomes true, so we do not reset the PHY, yet we keep going on and
initialize the UniMAC, causing MAC activity to occur
- we call bcmgenet_mii_reset() from bcmgenet_mii_probe(), but this is
too late, the PHY is botched, and causes the above bogus pings/packets
transmission/reception to occur
Reported-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 6b352ebccb ("net: ethernet: broadcom: bcmgenet:
use new api ethtool_{get|set}_link_ksettings").
We needs to revert the commit 62469c7600 ("net: ethernet: bcmgenet:
use phydev from struct net_device"), because this commit add a
regression. As the commit 6b352ebccb ("net: ethernet: broadcom:
bcmgenet: use new api ethtool_{get|set}_link_ksettings") depend
on the first one, we also need to revert it first.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 62469c7600 ("net: ethernet: bcmgenet: use phydev
from struct net_device") because it causes GENETv1/2/3 adapters to
expose the following behavior after an ifconfig down/up sequence:
PING fainelli-linux (10.112.156.244): 56 data bytes
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.352 ms
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.472 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.496 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.517 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.536 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.557 ms (DUP!)
64 bytes from 10.112.156.244: seq=1 ttl=61 time=752.448 ms (DUP!)
This was previously fixed by commit 5dbebbb44a ("net: bcmgenet:
Software reset EPHY after power on") but the commit we are reverting was
essentially making this previous commit void, here is why.
Without commit 62469c7600 we would have the following scenario after
an ifconfig down then up sequence:
- bcmgenet_open() calls bcmgenet_power_up() to make sure the PHY is
initialized *before* we get to initialize the UniMAC, this is
critical to ensure the PHY is in a correct state, priv->phydev is
valid, this code executes fine
- second time from bcmgenet_mii_probe(), through the normal
phy_init_hw() call (which arguably could be optimized out)
Everything is fine in that case. With commit 62469c7600, we would have
the following scenario to happen after an ifconfig down then up
sequence:
- bcmgenet_close() calls phy_disonnect() which makes dev->phydev become
NULL
- when bcmgenet_open() executes again and calls bcmgenet_mii_reset() from
bcmgenet_power_up() to initialize the internal PHY, the NULL check
becomes true, so we do not reset the PHY, yet we keep going on and
initialize the UniMAC, causing MAC activity to occur
- we call bcmgenet_mii_reset() from bcmgenet_mii_probe(), but this is
too late, the PHY is botched, and causes the above bogus pings/packets
transmission/reception to occur
Reported-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The FEC receive accelerator (RACC) supports shifting the data payload of
received packets by 16-bits, which aligns the payload (IP header) on a
4-byte boundary, which is, if not required, at least strongly suggested
by the Linux networking layer.
Without this patch, a huge number of alignment faults will be taken by the
IP stack, as seen in /proc/cpu/alignment:
~/$ cat /proc/cpu/alignment
User: 0
System: 72645 (inet_gro_receive+0x104/0x27c)
Skipped: 0
Half: 0
Word: 0
DWord: 0
Multi: 72645
User faults: 3 (fixup+warn)
This patch was suggested by Andrew Lunn in this message to linux-netdev:
http://marc.info/?l=linux-arm-kernel&m=147465452108384&w=2
and adapted from a patch by Russell King from 2014:
http://git.arm.linux.org.uk/cgit/linux-arm.git/commit/?id=70d8a8a
Signed-off-by: Eric Nelson <eric@nelint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the i.MX27 reference manual, this SoC does not have support
for the receive accelerator (RACC) register at offset 0x1C4.
http://cache.nxp.com/files/32bit/doc/ref_manual/MCIMX27RM.pdf
Signed-off-by: Eric Nelson <eric@nelint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the i.MX25 reference manual, this SoC does not have support
for the receive accelerator (RACC) register at offset 0x1C4.
http://www.nxp.com/files/dsp/doc/ref_manual/IMX25RM.pdf
Signed-off-by: Eric Nelson <eric@nelint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we can have high order page allocations that specify
GFP_ATOMIC when configuring multicast MAC address filters.
For example, we have seen order 2 page allocation failures with
~500 multicast addresses configured.
Convert the allocation for the pending list to be done in PAGE_SIZE
increments.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Acked-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we can have high order page allocations that specify
GFP_ATOMIC when configuring multicast MAC address filters.
For example, we have seen order 2 page allocation failures with
~500 multicast addresses configured.
Convert the allocation for 'mcast_list' to be done in PAGE_SIZE
increments.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I stumbled over a new warning during randconfig testing,
with CONFIG_BPF_SYSCALL disabled:
drivers/net/ethernet/netronome/nfp/nfp_net_offload.c: In function 'nfp_net_bpf_offload':
drivers/net/ethernet/netronome/nfp/nfp_net_offload.c:263:3: error: '*((void *)&res+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/ethernet/netronome/nfp/nfp_net_offload.c:263:3: error: 'res.n_instr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
As far as I can tell, this is a false positive caused by the compiler
getting confused about a function that is partially inlined, but it's
easy to avoid while improving the code:
The nfp_bpf_jit() stub helper for that configuration is unusual as it
is defined in a header file but not marked 'static inline'. By moving
the compile-time check into the caller using the IS_ENABLED() macro,
we can remove that stub and simplify the nfp_net_bpf_offload_prepare()
function enough to unconfuse the compiler.
Fixes: 7533fdc0f7 ("nfp: bpf: add hardware bpf offload")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 1 warning when building kernel with W=1:
drivers/net/ethernet/broadcom/genet/bcmgenet.c:2763:5: warning: no previous prototype for 'bcmgenet_hfb_add_filter' [-Wmissing-prototypes]
In fact, this function is implemented in
drivers/net/ethernet/broadcom/genet/bcmgenet.c, but be called
by no one, thus can be removed.
So this patch removes the unused functions.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 10 warnings when building kernel with W=1:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:304:5: warning: no previous prototype for 'cxgb4_dcb_enabled' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:194:5: warning: no previous prototype for 'setup_sge_queues_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:241:6: warning: no previous prototype for 'free_sge_queues_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:268:5: warning: no previous prototype for 'cfg_queues_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:344:6: warning: no previous prototype for 'free_queues_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:353:5: warning: no previous prototype for 'request_msix_queue_irqs_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:379:6: warning: no previous prototype for 'free_msix_queue_irqs_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:393:6: warning: no previous prototype for 'name_msix_vecs_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:433:6: warning: no previous prototype for 'enable_rx_uld' [-Wmissing-prototypes]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:442:6: warning: no previous prototype for 'quiesce_rx_uld' [-Wmissing-prototypes]
In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
so this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 2 warnings when building kernel with W=1:
drivers/net/ethernet/marvell/mvneta.c:639:27: warning: no previous prototype for 'mvneta_get_stats64' [-Wmissing-prototypes]
drivers/net/ethernet/marvell/mvneta.c:3529:5: warning: no previous prototype for 'mvneta_ethtool_set_link_ksettings' [-Wmissing-prototypes]
In fact, these two functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
so this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 1 warning when building kernel with W=1:
drivers/net/ethernet/hisilicon/hip04_eth.c:603:22: warning: no previous prototype for 'tx_done' [-Wmissing-prototypes]
In fact, this function is only used in the file in which it is
declared and don't need a declaration, but can be made static.
so this patch marks this function with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We get 2 warnings when building kernel with W=1:
drivers/net/ethernet/hisilicon/hisi_femac.c:943:5: warning: no previous prototype for 'hisi_femac_drv_suspend' [-Wmissing-prototypes]
drivers/net/ethernet/hisilicon/hisi_femac.c:960:5: warning: no previous prototype for 'hisi_femac_drv_resume' [-Wmissing-prototypes]
In fact, these two functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
so this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warnings:
drivers/net/ethernet/emulex/benet/be_main.c:47:25: warning:
symbol 'be_err_recovery_workq' was not declared. Should it be static?
drivers/net/ethernet/emulex/benet/be_main.c:63:25: warning:
symbol 'be_wq' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This aligns smc91x with its cousin, namely smc911x.c.
This also allows the driver to run also in a device-tree based lubbock
board build, on which it was tested.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
iq is unsigned, so the error check for iq < 0 has no effect so errors
can slip past this check. Fix this by making iq signed and also
get_filter_steerq return a signed int so a -ve error can be returned.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit a75e8005d5 ("i40e: queue-specific settings for interrupt
moderation") the i40e driver gained support for setting interrupt
moderation values per queue. This patch adds support for this feature
to the i40evf driver as well. In addition, a few changes are made to
the i40e implementation to add function header documentation comments,
as well.
This behaves in a similar fashion to the implementation in i40e. Thus,
requesting the moderation value when no queue is provided will report
queue 0 value, while setting the value without a queue will set all
queues at once.
Change-ID: I1f310a57c8e6c84a8524c178d44d1b7a6d3a848e
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In some rare cases, we might get a VSI with no queues. In this case, we
cannot configure RSS on this VSI as it will try to divide by zero when
configuring the lookup table.
Change-ID: I6ae173a7dd3481a081e079eb10eb80275de2adb0
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This interface was only ever meant for debug only. Since it is not
supposed to be here we are removing it.
Change-ID: Id771a1e5e7d3e2b4b7f56591b61fb48c921e1d04
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In an effort to improve code readability I am splitting the Flow Director
filter configuration out into a separate function like we have done for the
standard xmit path. The general idea is to provide a single block of code
that translates the flow specification into a proper Flow Director
descriptor.
Change-ID: Id355ad8030c4e6c72c57504fa09de60c976a8ffe
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a txring_txq function which allows us to convert a
i40e_ring/i40evf_ring to a netdev_tx_queue structure. This way we
can avoid having to make a multi-line function call for all the spots
that need access to this.
Change-ID: Ic063b71d8b92ea406d2c32e798c8e2b02809d65b
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The Tx cleanup flow was incorrectly assuming it could check for the flow
director bits after it had unmapped the buffer. However in this case it
results in us trying to free a raw_buf as though it is an sk_buff.
To fix this I am moving up the flag test for the FD_SB bit so that when
find a non-NULL skb or raw_buf value we then check the flag and use the
appropriate call to free the buffer.
Change-ID: I6284034ba1ea87c9922e56f6eb3181f7f09bddde
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
All of the code to support adaptive interrupt throttling is already in
the interrupt handler, it just needs to be enabled. Fill out the data
structures properly to make it happen. Single-flow traffic tests may
show slightly lower throughput, but interrupts per second will drop by
about 75%.
Change-ID: I9cd7d42c025b906bf1bb85c6aeb6112684aa6471
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch increases minimum number of allocated VSIs, so as to resolve
failure adding VSI for VF when 64-VFs assigned to a PF. The driver
supports up to 128 VFs per device, users can decide to enable up to
64-VFs on a single PF, especially 2 X 40 devices. In that scenario, with
VMDq co-existence, there would be starvation of VSIs - with this patch,
supported features would have enough VSIs for configuration now.
Change-ID: If084f4cd823667af8fe7fdc11489c705b32039d5
Signed-off-by: Akeem Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The return value from i40e_shutdown_adminq() is always 0
(I40E_SUCCESS). So, the test for non-0 will never be true. Cleanup
by removing the test and debug print statement.
Change-ID: Ie51e8e37515c3e3a6a9ff26fa951d0e5e24343c1
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In i40e_fdir_check_and_reenable(), the driver performs some checks to
determine whether it is safe to re-enable FD Sideband and FD ATR
support. The current check will only determine if there is available
space in the flow director table. However, this ignores the fact that
ATR should be disabled when there are TCP/IPv4 sideband rules in effect.
Add the missing check, and update the info message printed when
I40E_DEBUG_FD is enabled.
Change-ID: Ibb9c63e5be95d63c53a498fdd5dbf69f54a00e08
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some locations that disable ATR accidentally used the "full" disable by
disabling the flag in the standard flags field. This incorrectly forces
ATR off permanently instead of temporarily disabling it. In addition,
some code locations accidentally set the ATR flag enabled when they only
meant to clear the auto_disable_flags. This results in ignoring the
user's ethtool private flag settings.
Additionally, when disabling ATR via ethtool, we did not perform a flush
of the FD table. This results in the previously assigned ATR rules still
functioning which was not expected.
Cleanup all these areas so that automatic disable uses only the
auto_disable_flag. Fix the flush code so that we can trigger a flush
even when we've disabled ATR and SB support, as otherwise the flush
doesn't work. Fix ethtool setting to actually request a flush. Fix
NETIF_F_NTUPLE flag to only clear the auto_disable setting and not
enable the full feature.
Change-ID: Ib2486111f8031bd16943e9308757b276305c03b5
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add ENCAP_CSUM offload negotiation flag. Currently VF assumes checksum
offload for encapsulated packets is supported by default. Going forward,
this feature needs to be negotiated with PF before advertising to the
stack. Hence, we need a flag to control it.
This is in regards to prepping up for VF base mode functionality support.
Change-ID: Iaab1f25cc0abda5f2fbe3309092640f0e77d163e
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There exists a bug in which deleting a mac filter does not actually
occur. The driver reports that the filter has been deleted with no
error. The problem occurs because the wrong cmd_flag is passed to the
firmware when deleting the filter. The firmware reports an error back
to the driver but it is expressly ignored.
This fixes the bug by using the correct flag when deleting a filter.
Without this patch, deleted filters remain in firmware and function as
if they had not been deleted.
Change-ID: I5f22b874f3b83f457702f18f0d5602ca21ac40c3
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the problem where driver shows 100 Mbps as a supported speed,
and allows it to be configured for advertising on X722 devices. This patch
fixes the problem by not setting the 100 Mbps SGMII flag for X722 devices.
Without this patch, the user incorrectly thinks that 100 Mbps is supported
and hence might try to advertise it on X722 devices when it is actually not
a supported speed.
Change-ID: I8c3d7c4251a9402d98994ed29749b7b895a0f205
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a regression caused by previous commit
when irq name exceeds 20 byte array if interface's name
size is large.
Fixes: e412621394 ("net: thunderx: Use netdev's name for naming VF's interrupts")
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an earlier check and return if err is non-zero, so
the check to see if it is zero is redundant in every iteration
of the loop and hence the check can be removed.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2016-09-23
This series contains updates to ixgbe and ixgbevf.
Emil provides several changes, first simplifies the logic for setting VLAN
filtering by checking the VMDQ flag and the old 82598 MAC, instead of
having to maintain a list of MAC types. Then made two functions static
that are used only within the file, a by-product is sparse is now happy.
Added spinlocks to make sure that the MTU configuration is handled
properly. Fixed an issue where when SR-IOV is enabled while the
ixgbevf driver is loaded would result in all mailbox requests being
rejected by ixgbe, so call ixgbe_sriov_reinit() before pci_enable_sriov()
to ensure mailbox requests are properly handled.
Mark resolves a NULL pointer issue by simply setting the read and write
*_ref_mdi pointers for x550em_a devices. Then clearly indicates within
ethtool that all MACs support pause frames and made sure that the
advertising is set to the requested mode. Fixed an issue where
MDIO_PRTAD_NONE was not being used consistently to indicate no PHY
address.
Alex fixes an issue, where the support for multiple queues when SR-IOV
is enabled was added but the support was not reported. With that, fix
an issue where the hardware redirection table could support more queues
then the PF currently has when SR-IOV is enabled, so use the RSS mask to
trim off the bits that are not used. Lastly, instead of limiting the
VFs if we do not use 4 queues for RSS in the PF, we can instead just limit
the RSS queues used to a power of 2. We can now support use cases where
VFs are using more queues than the PF is currently using and can support
RSS if so desired.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2016-09-22
This series contains updates to i40e and i40evf only.
Sridhar fixes link state event handling by updating the carrier and
starts/stops the Tx queues based on the link state notification from PF.
Brady fixes an issue where a user defined RSS hash key was not being
set because a user defined indirection table is not supplied when changing
the hash key, so if an indirection table is not supplied now, then a
default one is created and the hash key is correctly set. Also fixed
an issue where when NPAR was enabled, we were still using pf->mac_seid
to perform the dump port query. Instead, go through the VSI to determine
the correct ID to use in either case.
Mitch provides one fix where a conditional return code was reversed, so
he does a "switheroo" to fix the issue.
Carolyn has two fixes, first fixes an issue in the virt channel code,
where a return code was not checked for NULL when applicable. Second,
fixes an issue where we were byte swapping the port parameter, then
byte swapping it again in function execution.
Colin Ian King fixes a potential NULL pointer dereference.
Bimmy changes up i40evf_up_complete() to be void since it always returns
success anyways, which allows cleaning up of code which checked the
return code from this function.
Alex fixed an issue where the driver was incorrectly assuming that we
would always be pulling no more than 1 descriptor from each fragment.
So to correct this, we just need to make certain to test all the way to
the end of the fragments as it is possible for us to span 2 descriptors
in the block before us so we need to guarantee that even the last 6
descriptors have enough data to fill a full frame.
v2: dropped patches 1-3, 10 and 12 from the original series since Or
Gerlitz pointed out several areas of improvement in the implementation
of the VF Port representor netdev. Sridhar is re-working the series
for later submission.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the vf to VST 802.1ad mode (mlx4 VST QinQ mode) by setting vf vlan
protocol to 802.1ad.
VST 802.1ad mode in mlx4, is used for STAG strip/insertion by PF, while
the CTAG is set by the VF.
Read current vlan protocol as part of the vf configuration state.
Upon setting vf vlan protocol to 802.1ad, we use a mechanism of handshake
to verify that both the vf and the pf driver version support it.
The handshake uses the command QUERY_FUNC_CAP:
- The vf sets a pre-defined support bit in input modifier.
- A pf that supports the feature sends the request to the vf through a
pre-defined field in the output mailbox.
- In case vf does not support the feature, the pf will fail the control
command (in this case, IP link tool command to set the vf vlan
protocol to 802.1ad).
No change in VST 802.1Q mode.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving
the ability for user-space application to specify it for the VF, as an
option to support 802.1ad.
We adjusted IP Link tool to support this option.
For future use cases, the new UAPI supports multiple vlans. For now we
limit the list size to a single vlan in kernel.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older versions of IP Link tool.
Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
We kept 802.1Q as the drivers' default vlan protocol.
Suitable ip link tool command examples:
Set vf vlan protocol 802.1ad:
ip link set eth0 vf 1 vlan 100 proto 802.1ad
Set vf to VST (802.1Q) mode:
ip link set eth0 vf 1 vlan 100 proto 802.1Q
Or by omitting the new parameter
ip link set eth0 vf 1 vlan 100
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In Ethernet VF, disable vlan HW acceleration on VF
while it is set to VF vlan protocol 802.1ad mode.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check device capability to support VF vlan protocol 802.1ad mode.
Add vport attribute vlan protocol.
Init vport vlan protocol by default to 802.1Q.
Add update QP support for VF vlan protocol 802.1ad.
Add func capability vlan_offload_disable to disable all
vlan HW acceleration on VF while the VF is set to VF vlan protocol
802.1ad mode.
No change in VF vlan protocol 802.1Q (VST) mode.
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate QUERY_FUNC_CAP flags0 from QUERY_FUNC_CAP flags, as 'flags' is
already used for another set of flags in FUNC CAP, while phv bit should be
part of a different set of flags.
Remove QUERY_FUNC_CAP port_flags field, as it is not in use.
Fixes: 77fc29c4bb ('net/mlx4_core: Preparations for 802.1ad VLAN support')
Fixes: 5cc914f108 ('mlx4_core: Added FW commands and their wrappers for supporting SRIOV')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current driver programs static value of MSS in hardware register for TSO
offload engine to segment the TCP payload regardless the MSS value
provided by network stack.
This patch fixes this by programming hardware registers with the
stack provided MSS value.
Since the hardware has the limitation of having only 4 MSS registers,
this patch uses reference count of mss values being used.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Toan Le <toanle@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change predecrement compare to post decrement compare to avoid an
unsigned integer wrap-around comparison when decrementing idx in
the while loop.
For example, when idx is zero, the current situation will
predecrement idx in the while loop, wrapping idx to the maximum
signed integer and cause out of bounds reads on rxq_info->msix_tbl[idx].
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enhance the parsing of offloaded TC rules matches to handle vlans.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parse TC vlan actions and set the required elements to allow offloading.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many virtualization systems use a policy under which a vlan tag is
pushed to packets sent by guests, and popped before the packet is
forwarded to the VM.
The current generation of the mlx5 HW doesn't fully support that on
a per flow level. As such, we are addressing the above common use
case with the SRIOV e-Switch abilities to push vlan into packets
sent by VFs and pop vlan from packets forwarded to VFs.
The HW can match on the correct vlan being present in packets
forwarded to VFs (eSwitch steering is done before stripping
the tag), so this part is offloaded as is.
A common practice for vlans is to avoid both push vlan and pop vlan
for inter-host VM/VM (east-west) communication because in this case,
push on egress cancels out with pop on ingress.
For supporting that, we use a global eswitch vlan pop policy, hence
allowing guest A to communicate with both remote VM B and local VM C.
This works since the HW pops the vlan only if it exists (e.g for
C --> A packets but not for B --> A packets).
On the slow path, when a VF vport has an offloaded flow which involves
pushing vlans, wheres another flow is not currently offloaded, the
packets from the 2nd flow seen by the VF representor on the host have
vlan. The VF rep driver removes such vlan before calling into the host
networking stack.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor the relevant code into a static inline helper (skb_from_cqe)
doing that.
Move the call to napi_gro_receive to be carried out just
after mlx5e_complete_rx_cqe returns.
Both changes are to be used for the VF representor as well
in the next commit.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Put the representors related to the source and dest vports and the
action in struct mlx5_esw_flow_attr which is used while setting the FDB rule.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The HW can be programmed to push vlan, pop vlan or both.
A factorization step towards using the push/pop capabilties in the
eswitch offloads mode. This patch doesn't add new functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The structure we use for the eswitch vport representor (mlx5_eswitch_rep)
has some fields which are set from upper layers in the driver when they
register the rep. Use explicit setting on registration time for them and
avoid global memcpy. This patch doesn't add new functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the vport value in the PF entry to be that of the uplink so
we can use it blindly over the tc / eswitch offload code without
translating it each time we deal with the uplink representor.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a NULL check before calling asynchronous MCDI completion functions
during device removal.
Fixes: 7014d7f6 ("sfc: allow asynchronous MCDI without completion function")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enabling SRIOV while the ixgbevf driver is loaded will result in all
mailbox requests from ixgbevf_open() being rejected by ixgbe because
adapter->clear_to_send is set to false on reset.
Call ixgbe_sriov_reinit() before pci_enable_sriov() to make sure that
mailbox requests are handled from the time ixgbevf is loaded.
Reported-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of limiting the VFs if we don't use 4 queues for RSS in the PF we
can instead just limit the RSS queues used to a power of 2. By doing this
we can support use cases where VFs are using more queues than the PF is
currently using and can support RSS if so desired.
The only limitation on this is that we cannot support 3 queues of RSS in
the PF or VF. In either of these cases we should fall back to 2 queues in
order to be able to use the power of 2 masking provided by the psrtype
register.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The hardware redirection table can support more queues then the PF
currently has when SR-IOV is enabled. In order to account for this use the
RSS mask to trim of the bits that are not used.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The maximum queue count reported was 1, however support for multiple queues
with SR-IOV was added some time ago so we should report support for it to
the user so that they can select multiple queues if they so desire.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The value MDIO_PRTAD_NONE should be used to indicate no PHY address.
Not 0, not 0xFFFF. Use the MDIO_PRTAD_NONE value consistently.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Protect set_rlpml with mailbox lock to make sure the MTU configuration
is handled properly.
This change resolves an issue where set_rlpml can fail when the VF
interface is brought up:
ixgbevf 0000:03:1d.6: Failed to set MTU at 1500
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
All the MACs supported by ixgbe support pause frames, so indicate
that support in ethtool. Also set advertising according to requested
mode.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set the read_reg_mdi and write_reg_mdi method pointers for
X550EM_A_10G_T devices to resolve jumping to NULL.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
These functions are only used in ixgbe_x550.c.
Fixes a warning when compiling with -Wmissing-prototypes
Reported-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simplify the logic for setting VLNCTRL.VFE by checking the VMDQ flag
and 82598 MAC instead of having to maintain a list of MAC types.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The i40e_shutdown_adminq function never returns failure. There is no need to
check the non-0 return value. Clean up the unnecessary error checking and
warning against it.
Change-ID: Ibb616f09cfb93bd1a872ebf3241a15fb8354b31b
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The i40e driver was incorrectly assuming that we would always be pulling
no more than 1 descriptor from each fragment. It is in fact possible for
us to end up with the case where 2 descriptors worth of data may be pulled
when a frame is larger than one of the pieces generated when aligning the
payload to either 4K or pieces smaller than 16K.
To adjust for this we just need to make certain to test all the way to the
end of the fragments as it is possible for us to span 2 descriptors in the
block before us so we need to guarantee that even the last 6 descriptors
have enough data to fill a full frame.
Change-ID: Ic2ecb4d6b745f447d334e66c14002152f50e2f99
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Function i40evf_up_complete() always returns success. Changed this to a
void type and removed the code that checks the return status and prints
an error message.
Change-ID: I8c400f174786b9c855f679e470f35af292fb50ad
Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently disabling the link state from PF via
ip link set enp5s0f0 vf 0 state disable
doesn't disable the CARRIER on the VF.
This patch updates the carrier and starts/stops the tx queues based on the
link state notification from PF.
PF: enp5s0f0, VF: enp5s2
#modprobe i40e
#echo 2 > /sys/class/net/enp5s0f0/device/sriov_numvfs
#ip link set enp5s2 up
#ip -d link show enp5s2
175: enp5s2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether ea:4d:60:bc:6f:85 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64
#ip link set enp5s0f0 vf 0 state disable
#ip -d link show enp5s0f0
171: enp5s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 68:05:ca:2e:72:68 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64 numtxqueues 72 numrxqueues 72 portid 6805ca2e7268
vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state disable, trust off
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto, trust off
#ip -d link show enp5s2
175: enp5s2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether ea:4d:60:bc:6f:85 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64 numtxqueues 16 numrxqueues 16
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a sanitcy check for desc being null in the first line of
function i40evf_debug_aq. However, before that, aq_desc is cast from
desc, and aq_desc is being dereferenced on the assignment of len, so
this could be a potential null pointer deference. Fix this by moving
the initialization of len to the code block where len is being used
and hence at this point we know it is OK to dereference aq_desc.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes an issue where we were byte swapping the port
parameter, then byte swapping it again in function execution.
Obviously, that's unnecessary, so take it out of the function calls.
Without this patch, the udp based tunnel configuration would
not be correct.
Change-ID: I788d83c5bd5732170f1a81dbfa0b1ac3ca8ea5b7
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes an issue in the virt channel code, where a return
from i40e_find_vsi_from_id was not checked for NULL when applicable.
Without this patch, there is a risk for panic and static analysis
tools complain. This patch fixes the problem by adding the check
and adding an additional input check for similar reasons.
Change-ID: I7e9be88eb7a3addb50eadc451c8336d9e06f5394
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This conditional is backward, so the driver responds back to the VF with
the wrong opcode. Do the old switcheroo to fix this.
Change-ID: I384035b0fef8a3881c176de4b4672009b3400b25
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When using the debugfs to issue the "dump port" command
with NPAR enabled, the firmware reports back with invalid argument.
The issue occurs because the pf->mac_seid was used to perform the query.
This is fine when NPAR is disabled because the switch ID == pf->mac_seid,
however this is not the case when NPAR is enabled. This fix instead
goes through the VSI to determine the correct ID to use in either case.
Change-ID: I0cd67913a7f2c4a2962e06d39e32e7447cc55b6a
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously, when using ethtool to change the RSS hash key, ethtool would
report back saying the old key was still being used and no error was
reported. It was unclear whether it was being reported incorrectly or
being set incorrectly. Debugging revealed 'i40e_set_rxfh()' returned
zero immediately instead of setting the key because a user defined
indirection table is not supplied when changing the hash key.
This fix instead changes it such that if an indirection table is not
supplied, then a default one is created and the hash key is now
correctly set.
Change-ID: Iddb621897ecf208650272b7ee46702cad7b69a71
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Potential dangerous invalid pointer might be accessed if
the error happens when couple phy_device to net_device so
cleanup the error path.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) use new api [get|set]_link_ksettings instead
of [get|set]_settings old ones.
2) dev->phydev is sure being ready before calling
these callbacks, so removing all the sanity check
if it is existing.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
remove the unused variable for parsing PHY address
and the related logic for sanity test which would
be all already handled done when of_mdiobus_register
was called
Reported-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
reuse phydev already in struct net_device instead of creating
another new one in private structure.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing dynamically source clock, TX/RX delay and interface mode
used by TRGMII hardware module inside PHY capability polling routine
for adapting to the various speed of RGMII used by external PHY for
GMAC0.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
adds PHY-mode "trgmii" as an extension for the operation
mode of the PHY interface for PHY_INTERFACE_MODE_TRGMII.
and adds a variable trgmii inside mtk_mac as the indication
to make the difference between the MAC connected to internal
switch or connected to external PHY by the given configuration
on the board and then to perform the corresponding setup on
TRGMII hardware module.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHYSTS_CHG (the ftgmac100's PHY IRQ) is telling the system to go
look at the PHY registers for a link status change.
The interrupt was causing issues on Aspeed SoC where some board designs
had an active high configuration, some active low, and in some cases
repurposed for other functions. When misconfigured Linux would chew 100%
of CPU cycles servicing interrupts:
[ 20.280000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
[ 20.280000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
[ 20.280000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
[ 20.300000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
While in the ftgmac100 IP can be configured for high, low and edge
sensitivity the current driver always polls the PHY, so we chose to mask
out the interrupt.
See https://patchwork.ozlabs.org/patch/672099/ for more discussion.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Aspeed SoCs have a new MDIO interface as an option in the G4 and G5
SoCs. The old one is still available, so select it in order to remain
compatible with the ftgmac100 driver.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is stale interrupt (PHYSTS_CHG in ISR, bit#6 in 0x0) from
the bootloader (uboot) when enabling the MAC. The stale interrupts
aren't part of kernel and should be cleared.
This clears the stale interrupts in ISR (0x0) when enabling the MAC.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RXDES and TXDES registers bits in the ftgmac100 indicates EDO{R,T}R
at bit position 15 for the Faraday Tech IP. However, the version of this
IP present in the Aspeed SoCs has these bits at position 30 in the
registers.
It appers that ast2400 SoCs support both positions, with the 15th bit
marked as reserved but still functional. In the ast2500 this bit is
reused for another function, so we need a work around.
This was confirmed with engineers from Aspeed that using bit 30 is
correct for both the ast2400 and ast2500 SoCs.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
These bits are #defined at a fixed location. In order to support future
hardware that has chosen to move these bits around move the bits into a
member of the struct ftgmac100.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ftgmac100 hardware revision in e.g. the Aspeed AST2500 no longer
reserves all bits in RXDES#2 but instead uses the bottom 16 bits to
store MAC frame metadata. Avoid corruption by shifting struct page
pointers out to their own member in struct ftgmac100.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove an open coded simple_open() function and replace file
operations references to the function with simple_open()
instead.
Generated by: scripts/coccinelle/api/simple_open.cocci
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we rang XDP SQ doorbell on every forwarded XDP packet.
Here we introduce a xmit more like mechanism that will queue up more
than one packet into SQ (up to RX napi budget) w/o notifying the hardware.
Once RX napi budget is consumed and we exit napi RX loop, we will
flush (doorbell) all XDP looped packets in case there are such.
XDP forward packet rate:
Comparing XDP with and w/o xmit more (bulk transmit):
RX Cores XDP TX XDP TX (xmit more)
---------------------------------------------------
1 6.5Mpps 12.4Mpps
2 13.2Mpps 24.2Mpps
4 25.2Mpps 36.3Mpps*
8 36.3Mpps* 36.3Mpps*
*My xmitter was limited to 36.3Mpps, so it is the bottleneck.
It seems that receive side can handle more.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding support for XDP_TX forwarding from xdp program.
Using XDP, now user can loop packets out of the same port.
We create a dedicated TX SQ for each channel that will serve
XDP programs that return XDP_TX action to loop packets back to
the wire directly from the channel RQ RX path.
For that RX pages will now need to be mapped bi-directionally,
and on XDP_TX action we will sync the page back to device then
queue it into SQ for transmission. The XDP xmit frame function will
report back to the RX path if the page was consumed (transmitted), if so,
RX path will forget about that page as if it were released to the stack.
Later on, on XDP TX completion, the page will be released back to the
page cache.
For simplicity this patch will hit a doorbell on every XDP TX packet.
Next patch will introduce a xmit more like mechanism that will
queue up more than one packet into SQ w/o notifying the hardware,
once RX napi loop is done we will hit doorbell once for all XDP TX
packets form the previous loop. This should drastically improve
XDP TX performance.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make a clear separate between Regular SQ (TXQ) and ICO SQ creation,
destruction and union their mutual information structures.
Don't allocate redundant TXQ skb/wqe_info/dma_fifo arrays for ICO SQ.
And have a different SQ edge for ICO SQ than TXQ SQ, to be more
accurate.
In preparation for XDP TX support.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the BPF_PROG_TYPE_PHYS_DEV hook in mlx5e driver.
When XDP is on we make sure to change channels RQs type to
MLX5_WQ_TYPE_LINKED_LIST rather than "striding RQ" type to
ensure "page per packet".
On XDP set, we fail if HW LRO is set and request from user to turn it
off. Since on ConnectX4-LX HW LRO is always on by default, this will be
annoying, but we prefer not to enforce LRO off from XDP set function.
Full channels reset (close/open) is required only when setting XDP
on/off.
When XDP set is called just to exchange programs, we will update
each RQ xdp program on the fly and for synchronization with current
data path RX activity of that RQ, we temporally disable that RQ and
ensure RX path is not running, quickly update and re-enable that RQ,
for that we do:
- rq.state = disabled
- napi_synnchronize
- xchg(rq->xdp_prg)
- rq.state = enabled
- napi_schedule // Just in case we've missed an IRQ
Packet rate performance testing was done with pktgen 64B packets and on
TX side and, TC drop action on RX side compared to XDP fast drop.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Comparison is done between:
1. Baseline, Before this patch with TC drop action
2. This patch with TC drop action
3. This patch with XDP RX fast drop
RX Cores Baseline(TC drop) TC drop XDP fast Drop
--------------------------------------------------------------
1 5.3Mpps 5.3Mpps 16.5Mpps
2 10.2Mpps 10.2Mpps 31.3Mpps
4 20.5Mpps 19.9Mpps 36.3Mpps*
*My xmitter was limited to 36.3Mpps, so it is the bottleneck.
It seems that receive side can handle more.
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two helper functions to allow dynamic changes of RQ type.
mlx5e_set_rq_priv_params and mlx5e_set_rq_type_params will be
used on netdev creation to determine the default RQ type.
This will be needed later for downstream patches of XDP support.
When enabling XDP we will dynamically move from striding RQ to
linked list RQ type.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>