The calculation of the Tx and Rx fifo sizes can be calculated rather
than hardcoded in a switch statement. Additionally, the per-queue fifo
sizes can be calculated rather than hardcoded using if/else if statements
that can possibly underutilize the available fifo area.
Change the code to calculate the fifo sizes and the per-queue fifo sizes
to simplify the code and make best use of the available fifo.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add error and dynamic debug messages to various ethtool functions in
the driver while also removing the DBGPR debug print calls. Also, change
the message level for some error messages from alert to err.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide the ethtool functions to support getting and setting the
msglevel for the driver.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device tree presence takes precedence over ACPI in the device_* APIs.
The amd-xgbe driver should follow the same precedence. Update the check
on whether to use DT / ACPI to follow this.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove an unneeded semicolon at the end of a switch statement block.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to deprecate the use of 'struct timespec' on 32-bit
architectures, as it is will overflow in 2038. The igb
driver uses it to read the current time, and can simply
be changed to use ktime_get_real_ts64() instead.
Because of hardware limitations, there is still an overflow
in year 2106, which we cannot really avoid, but this documents
the overflow.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to deprecate the use of 'struct timespec' on 32-bit
architectures, as it is will overflow in 2038. The stmmac
driver uses it to read the current time, and can simply
be changed to use ktime_get_real_ts64() instead.
Because of hardware limitations, there is still an overflow
in year 2106, which we cannot really avoid, but this documents
the overflow.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fec_ptp_enable_pps uses an open-coded implementation of ns_to_timespec,
which will be removed eventually as it is not y2038-safe on 32-bit
architectures. Two more instances of the same code in this file were
already converted to use the safe ns_to_timespec64 in commit 6630514fce
("ptp: fec: use helpers for converting ns to timespec"), this changes
the last one as well.
The seconds portion here is actually unused and we could just remove the
timespec variable, but using ns_to_timespec64 can still be better as the
implementation can be hand-optimized in the future.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Fugang Duan <b38611@freescale.com>
Cc: Luwei Zhou <b45643@freescale.com>
Cc: Frank Li <Frank.Li@freescale.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under certain circumstances, we can get an extra VF_RESOURCES message
from the PF driver at runtime. When this happens, we need to parse it
because our VSI may have changed out from underneath us, and that will
affect our relationship with the PF driver.
However, parsing the resources message also blows away our current MAC
address in the hardware struct, usually with all zeros. When this
happens, the next time the interface is opened, it will have no MAC
address and will a) not work and b) complain.
Fix this issue by restoring the current MAC address from the netdev
struct after we parse the resource message.
Change-ID: I6cd1b624fc20432f81dc901166c8de195b8e0e65
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make sure we have the spinlocks before we clear the ARQ and ASQ management
registers. Also, widen the locked portion and make a sanity check earlier
in the send function to be sure we're working with safe register values.
Change-ID: I34b56044b33461ed780f3d2de8d62826cdf933f9
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In any case free the memory allocated before exiting.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Function i40e_clean_rx_irq() tries to reuse memory pages allocated
from the nearest node. To better support memoryless node, use
numa_mem_id() instead of numa_node_id() to get the nearest node with
memory.
This change should only affect performance.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Function i40e_clean_rx_irq() tries to reuse memory pages allocated
from the nearest node. To better support memoryless node, use
numa_mem_id() instead of numa_node_id() to get the nearest node with
memory.
This change should only affect performance.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-09-30
This series contains updates to i40e and i40evf only.
Vasily Averin provides a couple of rtnl lock/unlock fixes for both i40e
and i40evf.
Shannon provides several updates and fixes, first fixes up a type clash
in i40e_aq_rc_to_posix(), where the error codes are signed values, so we
need to treat them as such. Then fixes up a padding issue where an
extra byte is added in i40e_aqc_get_cee_dcb_cfg_v1_resp to directly
acknowledge the padding. Updated i40e to keep debugfs register read
and writes from accessing outside of the io-remapped space. Added
support and device id for another 20 GbE device.
Jesse fixes the transmit hand workaround code for ARM that was causing
Tx hangs to still occur occasionally when there really was no hang. Then
fixed the receive dropped counter to show up in netstat interface.
Refactor the interrupt enable function since it was always making the
caller add the base_vector from the VSI struct which is already passed
to the function. Fix kbuild warnings found in 0day build infrastructure
by adding a harmless cast to a dev_info(), also fix 32 bit build
warnings found by sparse.
Greg fixed a configuration error that results if a port VLAN is set
for a VF before the VF driver is loaded, so that when the VF driver is
loaded the port VLAN is ignored.
Mitch fixes the use of QOS field consistently in
i40e_ndo_set_vf_port_vlan(). Modified the init timing of the driver
to increase stability on load/unload and SR-IOV enable/disable cycles.
Anjali updates i40e to not collect VEB stats if they are disabled in the
hardware for performance reasons.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch supports the r8a7795 SoC by:
- Using two interrupts
+ One for E-MAC
+ One for everything else
+ Both can be handled by the existing common interrupt handler, which
affords a simpler update to support the new SoC. In future some
consideration may be given to implementing multiple interrupt handlers
- Limiting the phy speed to 100Mbit/s for the new SoC;
at this time it is not clear how this restriction may be lifted
but I hope it will be possible as more information comes to light
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[horms: reworked]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is in preparation for using this driver on arm64 where the
implementation of __dma_alloc_coherent fails if a device parameter is not
provided.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com>
[horms: squashed into a single patch]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace "void *obj" with a generic structure. Introduce couple of
helpers along that.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the struct name in sync with object id name.
Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the struct name in sync with object id name.
Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To be aligned with obj.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix regression in SKB partial checksum handling, from Pravin B
Shalar.
2) Fix VLAN inside of VXLAN handling in i40e driver, from Jesse
Brandeburg.
3) Cure softlockups during accept() in SCTP, from Karl Heiss.
4) MSG_PEEK should return multiple SKBs worth of data in AF_UNIX, from
Aaron Conole.
5) IPV6 erroneously ignores output interface specifier in lookup key for
route lookups, fix from David Ahern.
6) In Marvell DSA driver, forward unknown frames to CPU port, from
Andrew Lunn.
7) Mission flow flag initializations in some code paths, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Initialize flow flags in input path
net: dsa: fix preparation of a port STP update
testptp: Silence compiler warnings on ppc64
net/mlx4: Handle return codes in mlx4_qp_attach_common
dsa: mv88e6xxx: Enable forwarding for unknown to the CPU port
skbuff: Fix skb checksum partial check.
net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set
net sysfs: Print link speed as signed integer
bna: fix error handling
af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
af_unix: Convert the unix_sk macro to an inline function for type safety
net: sctp: Don't use 64 kilobyte lookup table for four elements
l2tp: protect tunnel->del_work by ref_count
net/ibm/emac: bump version numbers for correct work with ethtool
sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
sctp: Whitespace fix
i40e/i40evf: check for stopped admin queue
i40e: fix VLAN inside VXLAN
r8169: fix handling rtl_readphy result
net: hisilicon: fix handling platform_get_irq result
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWCfALAAoJELgmozMOVy/dc+MQAKoD6echYpTkWE0otMuHQcYf
zMaVVots+JdRKpA6OqHYQHgKGA80z21BpnjGYwcwB5zB1zPrJwz4vxwGlOBHt01T
xLBReFgSKyJlgOWLXKfPx4bXUdivOBKm203wY0dh+/dC/VROGYoiXYTmSDsfsuKa
8OXT1kWgzRVLtqwqj5GSkgWvtFZ28CjKh6d9egjqcj9tpbh2UupQDZzMyOtZ52X6
Nz/Vo3u4T7qjzlhHOlCwHCDw+97x0yvmvLY1mWweGPfKOnxtXjkzQmTQEpyzU5Mo
EwcqJucrBnmjbLAIBMrbR1mzTUQeD4dHz1jx+EzWE0lVnRL3twe1UaY40176sNlm
aCBA4bIOQ242r3IJ++ss15ol1k5hu7PYKRn9Q8d2sSbQGcSnCHe/YOutQQ+FTEFG
yE9xiLL+pgT8koauROnxg66E3HDM78NGTpjP3EuG4r2Qwa1iFANPfDB6kikuv8bO
rG3qUJcloEPvfatZY+h5QC4UCoB0/W1DAhlfzE3tPBYPmhSEgQDfEOzXTKDakeF0
VB903bYrOL3CVOun4I7fLrDc1leVeiAUKqO2orZs3qIpRWvAKyV/VjolAusMv2+F
/4xPyh95AEMTFfmZogOCofQFk3eOnkWpLdrVTYCKy3i6NVBoy2wHldrl+LuCAN/m
r/DNRBmazShashbeU6wg
=8+cX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/ipoib: increase the max mcast backlog queue
IB/ipoib: Make sendonly multicast joins create the mcast group
IB/ipoib: Expire sendonly multicast joins
IB/mlx5: Remove pa_lkey usages
IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY
IB/iser: Add module parameter for always register memory
xprtrdma: Replace global lkey with lkey local to PD
Sparse found some issues with 32 bit compilation, which probably should
at least work without warning. Not only that, but the code was wrong.
Thanks sparse!!
And thanks to the kbuild robot zero day testing for finding this issue.
$ make ARCH=i386 M=drivers/net/ethernet/intel/i40e C=2 CF="-D__CHECK_ENDIAN__"
CHECK drivers/net/ethernet/intel/i40e/i40e_main.c
include/linux/etherdevice.h:79:32: warning: restricted __be16 degrades to integer
drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (32) for type unsigned long
drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (42) for type unsigned long
drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (39) for type unsigned long
drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (40) for type unsigned long
CC: kbuild-all@01.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The 0day build infrastructure found some issues in i40e, this
removes the warnings by adding a harmless cast to a dev_info.
CC: kbuild-all@01.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch tweaks the init timing of the driver just a little bit to
increase stability on load/unload and SR-IOV enable/disable cycles.
First, run the init_task loop a little quicker in order to reduce
overall init time.
Second, stagger the start of the init task based on the device's
PCIe function ID. This lessens the impact on the firmware when a
whole bunch of VFs are initialized simultaneously, e.g. enabling
SR-IOV without the VF driver blacklisted. For single VFs assigned
to VMs this will have no effect as the function ID will always be 0.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Down was requesting queue disables, but then exited immediately without
waiting for the queues to actually disable. This could allow any
function called after i40evf_down to run immediately, including
i40evf_up, and causes a memory leak.
This issue has been fixed in a recent refactor of the reset code, but
add a couple WARN_ONs in the slow path to help us recognize if we
reintroduce this issue or if we missed any cases.
Change-ID: I27b6b5c9a79c1892f0ba453129f116bc32647dd0
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt enable function was always making the caller add
the base_vector from the VSI struct which is already passed to
the function. Just collapse the math into the helper function.
Change-ID: I54ef33aa7ceebc3231c3cc48f7b39fd0c3ff5806
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Due to performance reasons, VEB stats have been disabled in the hw. This
patch adds code to check for that condition before accumulating these
stats.
Change-ID: I7d805669476fedabb073790403703798ae5d878e
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add new device id and support for another 20Gb device.
Change-ID: Ib1b61e5bb6201d84953f97cade39a6e3369c2cf2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove a useless message that blathers on whenever a vxlan port is deleted.
Change-ID: If63fb8cf38e56cf433b68e498f11389de51919ba
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don't let the debugfs register read and write commands try to access
outside of the ioremapped space. While we're at it, remove the use of
a misleading constant.
Change-ID: Ifce2893e232c65c7a76c23532c658f298218a81b
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In i40e_ndo_set_vf_port_vlan, we were using the QOS value
inconsistently, sometimes shifting it, sometimes not. Do the shift-and-
or operation correctly, once, and use the result consistently everywhere
in the function.
Change-ID: I46f062f3edc90a8a017ecec9137f4d1ab0ab9e41
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The i40e rx_dropped counter was not showing up in netstat -i.
Add the right counter to be updated with the stats.
Change-ID: I4dd552e9995836099184f9d9a08e90edb591155f
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The arm writeback (arm_wb) code is used for kicking the Tx ring to
make sure any pending work is completed even if interrupts are
disabled. It was running when it didn't need to, and not clearing
the ring->arm_wb state after it was set. This caused Tx hangs
to still occur occasionally when there really was no hang.
Fix this by resetting the variable right after it was used.
Change-ID: I7bf75d552ba9c4bd203d40615213861a24bb5594
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The struct i40e_aqc_get_cee_dcb_cfg_v1_resp was originally defined with
word boundary layout issues, which most compilers deal with by silently
adding padding, making the actual struct larger than designed.
This patch adds an extra byte in fields reserved3 and reserved4 to directly
acknowledge that padding.
Because the struct doesn't actually change in size or layout, this doesn't
constitute a change in the API.
Change-ID: I53fa4741b73fa255621232a85fba000b0e223015
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If a port VLAN is set for a given virtual function (VF) before the VF
driver is loaded then a configuration error results in which the port
VLAN is ignored when the VF driver is subsequently loaded. This causes
the VF's MAC/VLAN filters to not use the correct VLAN filter. This
patch ensures that the port VLAN filter is considered at the right time
during configuration of the VF's MAC/VLAN filters.
Change-ID: I28f404cbc21a4c6d70a7980b87c77f13f06685a4
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The error code sent into i40e_aq_rc_to_posix() are signed values, so we
really need to treat them as such.
Change-ID: I3d1ae0ee9ae0b1b6f5fc424f8b8cc58b0ea93203
Reported-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to the notifier_call callback of a notifier_block, change the
function signature of switchdev add and del operations to:
int switchdev_port_obj_add/del(struct net_device *dev,
enum switchdev_obj_id id, void *obj);
This allows the caller to pass a specific switchdev_obj_* structure
instead of the generic switchdev_obj one.
Drivers implementation of these operations and switchdev have been
changed accordingly.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to the notifier_call callback of a notifier_block, change the
function signature of switchdev dump operation to:
int switchdev_port_obj_dump(struct net_device *dev,
enum switchdev_obj_id id, void *obj,
int (*cb)(void *obj));
This allows the caller to pass and expect back a specific
switchdev_obj_* structure instead of the generic switchdev_obj one.
Drivers implementation of dump operation can now expect this specific
structure and call the callback with it. Drivers have been changed
accordingly.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The net_device associated to a dump operation does not have to be passed
to the callback. switchdev stores it in a superset struct, if needed.
Also some drivers (such as DSA drivers) may not have easy access to it.
This will simplify pushing the callback function down to the drivers.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both new_steering_entry() and existing_steering_entry() return values
based on their success or failure, but currently they fall through
silently. This can make troubleshooting difficult, as we were unable
to tell which one of these two functions returned errors or
specifically what code was returned. This patch remedies that
situation by passing the return codes to err, which is returned by
mlx4_qp_attach_common() itself.
This also addresses a leak in the call to mlx4_bitmap_free() as well.
Signed-off-by: Robb Manes <rmanes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_HPLANCE=m and CONFIG_NET_POLL_CONTROLLER=y:
ERROR: "lance_poll" [drivers/net/ethernet/amd/hplance.ko] undefined!
Add the missing export to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The modular driver supports only one card, just like the built-in
driver.
Note that this limitation is a problem which affects all Nubus card
drivers, because they have to do all their own bus matching, because
Nubus still lacks the necessary driver model support.
Suggested-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some embedded systems the EEPROM does not contain a valid MAC address.
In that case it is better to fallback to a generated mac address and
let init scripts fix the value later.
Reported-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Changed handcoded setup to use eth_hw_addr_random() and to save new address into HW]
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several functions can return negative value in case of error,
so their return type should be fixed as well as type of variables
to which this value is assigned.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the switch to per-CPU interrupts, we lost the ability to set which
CPU was going to receive our RX interrupt, which was now only the CPU on
which the mvneta_open function was run.
We can now assign our queues to their respective CPUs, and make sure only
this CPU is going to handle our traffic.
This also paves the road to be able to change that at runtime, and later on
to support RSS.
[gregory.clement@free-electrons.com]: hardened the CPU hotplug support.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta driver allows to change the default RX queue trough the rxq_def
kernel parameter.
However, the current code doesn't allow to have any value but 0. It is
actively checked for in the driver's probe because the drivers makes a
number of assumption and takes a number of shortcuts in order to just use
that RX queue.
Remove these limitations in order to be able to specify any available
queue.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that our interrupt controller is allowing us to use per-CPU interrupts,
actually use it in the mvneta driver.
This involves obviously reworking the driver to have a CPU-local NAPI
structure, and report for incoming packet using that structure.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CPU_MAP register is duplicated for each CPUs at different addresses,
each instance being at a different address.
However, the code so far was using CONFIG_NR_CPUS to initialise the CPU_MAP
registers for each registers, while the SoCs embed at most 4 CPUs.
This is especially an issue with multi_v7_defconfig, where CONFIG_NR_CPUS
is currently set to 16, resulting in writes to registers that are not
CPU_MAP.
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support for ethtool get time stamp ioctl, which is used by
tcpdump to get the supported time stamp types
eg: tcpdump -i eth5 -J
Time stamp types for eth5 (use option -j to set):
host (Host)
adapter_unsynced (Adapter, not synced with system time)
Adds support for adapter unsynced mode, by adding SIOCSHWTSTAMP support
in driver.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the compilation error with arm allmodconfig, this error
generated due to unavailability of readq() on 32-bit platform which was
found during net-next daily compilation. In the same time, fix all the
hns drivers compilation warnings.
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: zhaungyuzeng <Yisen.zhuang@huawei.com>
Signed-off-by: kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: yankejian <yankejian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to have FEATURES_NEED_QUIESCE defined as we
can simply use NETIF_F_RXCSUM instead as done in other parts
of the driver.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The size of the MAC register dump used to be the size specified by the
reg property in the device tree. Userland has no good way of finding
out that size, and it was not specified consistently for each MAC type,
so ethtool would end up printing junk at the end of the register dump
if the device tree didn't match the size it assumed.
Using the new version numbers indicates unambiguously that the size of
the MAC register dump is dependent only on the MAC type.
Fixes: 5369c71f7c ("net/ibm/emac: fix size of emac dump memory areas")
Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update new health monitored syndromes and their descriptions.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The name refers to syndrome so uset ext_synd instread of ext_sync.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the new flow, we separate the pci initialization and teardown from the
initialization and teardown of the other resources.
init_one calls mlx5_pci_init that handles the pci resources initialization.
It then calls mlx5_load_one to initialize the remainder of the resources.
When removing a device, remove_one is invoked. However, now remove_one
calls mlx5_unload_one to free all the resources except the pci resources.
When mlx5_unload_one returns, mlx5_pci_close is called to free the pci
resources.
The above separation will allow us to implement the pci error handlers and
suspend and resume callbacks.
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some errors did not result with notifying firmware that the page request
could not be fulfilled. Fix this and put the notification logic into a
separate function.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of async command completion, the error code returned should take
into account the command completion status.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cosmetic change.
Do not use the an err variable just to assign and return it.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Used the output mailbox format for input mailbox.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private mlx5 state flag that indicates that the netdev is
opened is set at the beginning of the netdev open flow.
In case an error occured later in the mlx5 netdev open flow, this
flag was not cleared, remaining set although the actual set is
closed.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's possible that while we are waiting for the spinlock, another
entity (that owns the spinlock) has shut down the admin queue.
If we then attempt to use the queue, we will panic.
Add a check for this condition on the receive side. This matches
an existing check on the send queue side.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously to this patch, the hardware was removing
VLAN tags from the inner header of VXLAN packets. The
hardware configuration can be changed to leave the
packet alone since that is what the linux stack
expects for this type of VLAN in VXLAN packet.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In igb_sw_init() the sequence of calls was changed from
igb_init_queue_configuration()
igb_init_interrupt_scheme()
igb_probe_vfs()
to
igb_probe_vfs()
igb_init_queue_configuration()
igb_init_interrupt_scheme()
This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set
during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not
get enabled properly and we run into a NULL pointer if the max_vfs
module parameter is specified (adapter->vf_data does not get allocated,
crash on accessing the structure).
[ 7.419348] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
[ 7.419370] PGD 0
[ 7.419373] Oops: 0002 [#1] SMP
[ 7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
[ 7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
[ 7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 1.6.0 03/07/2013
[...]
[ 7.419431] Call Trace:
[ 7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
[ 7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0
Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling
igb_probe_vfs(). The real interrupt capabilities will be checked during
igb_init_interrupt_scheme() so this is safe to do.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The sync_vsi_filters function can be called directly under RTNL
or through the timer subtask without one. This was causing a deadlock.
If sync_vsi_filters is called from a thread which held the lock,
and in another thread the PROMISC setting got changed we would
be executing the PROMISC change in the thread which already held
the lock alongside the other filter update. The PROMISC change
requires a reset if we are on a VEB, which requires it to be called
under RTNL.
Earlier the driver would call reset for PROMISC change without
checking if we were already under RTNL and would try to grab it
causing a deadlock. This patch changes the flow to see if we are
already under RTNL before trying to grab it.
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the issue of forcing WB too often causing us to not
benefit from NAPI.
Without this patch we were forcing WB/arming interrupt too often taking
away the benefits of NAPI and causing a performance impact.
With this patch we disable force WB in the clean routine for X710
and XL710 adapters. X722 adapters do not enable interrupt to force
a WB and benefit from WB_ON_ITR and hence force WB is left enabled
for those adapters.
For XL710 and X710 adapters if we have less than 4 packets pending
a software Interrupt triggered from service task will force a WB.
This patch also changes the conditions for setting RS bit as described
in code comments. This optimizes when the HW does a tail bump and when
it does a WB. It also optimizes when we do a wmb.
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make sure the Tx checksum encoder knows about GRE protocol and sets the
descriptor flag appropriately.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch modifies the driver timeout logic by issuing a writeback
request via a software interrupt to the hardware the first time the
driver detects a hang. The driver was too aggressive in resetting a hung
queue, so back that off by removing logic to down the netdevice after
too many hangs, and move the function to the service task.
Change-ID: Ife100b9d124cd08cbdb81ab659008c1b9abbedea
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
i40e_get_head needs to be called in multiple files in a further patch,
prepare by moving the function into a header file.
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The function can return negative value.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function can return negative value.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When fixing the TSO support I noticed we just mask ->gso_size with the
MSSMask value and don't care about the consequences.
Provide a .ndo_features_check() method which drops the NETIF_F_TSO
feature for any skb which would exceed the maximum, and thus forces it
to be segmented by software.
Then we can stop the masking in cp_start_xmit(), and just WARN if the
maximum is exceeded, which should now never happen.
Finally, Francois Romieu noticed that we didn't even have the right
value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I fixed TSO. Hardware checksum and scatter/gather also appear to be
working correctly both on real hardware and in QEMU's emulation.
Let's enable them by default and see if anyone screams...
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/ipv4/arp.c
The net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
instead of cloning them. From Daniel Borkmann.
2) When converting classical BPF into eBPF, fix the setting of the
source reg to BPF_REG_X. From Tycho Andersen.
3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
Linus Lussing.
4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.
5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
Roopa Prabhu.
6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.
7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
Florian Westphal.
8) Check DMA mapping errors in bna driver, from Ivan Vecera.
9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
misclearing of interrupt status in TX timeout handler, etc.) from
David Woodhouse.
10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
Hugne.
11) Fix autobind races et al. in netlink code, from Herbert Xu with
help from Tejun Heo and others.
12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.
13) Fix various races in timewait timer and reqsk_queue_hadh_req, from
Eric Dumazet.
14) Fix array overruns in mac80211, from Johannes Berg and Dan
Carpenter.
15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.
16) Fix race between poll_one_napi and napi_disable, from Neil Horman.
17) Fix byte order in geneve tunnel port config, from John W Linville.
18) Fix handling of ARP replies over lightweight tunnels, from Jiri
Benc.
19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
Kok and Roopa Prabhu.
20) Several reference count handling bug fixes in the PHY/MDIO layer
from Russel King.
21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.
22) Fix crash in icmp_route_lookup(), from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net: Fix panic in icmp_route_lookup
net: update docbook comment for __mdiobus_register()
ppp: fix lockdep splat in ppp_dev_uninit()
net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
phy: marvell: add link partner advertised modes
net: fix net_device refcounting
phy: add phy_device_remove()
phy: fixed-phy: properly validate phy in fixed_phy_update_state()
net: fix phy refcounting in a bunch of drivers
of_mdio: fix MDIO phy device refcounting
phy: add proper phy struct device refcounting
phy: fix mdiobus module safety
net: dsa: fix of_mdio_find_bus() device refcount leak
phy: fix of_mdio_find_bus() device refcount leak
ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
ip6_gre: Reduce log level in ip6gre_err() to debug
fib_rules: fix fib rule dumps across multiple skbs
bnx2x: byte swap rss_key to comply to Toeplitz specs
net: revert "net_sched: move tp->root allocation into fw_init()"
lwtunnel: remove source and destination UDP port config option
...
The builds of allmodconfig of avr32 is failing with:
drivers/net/ethernet/via/via-rhine.c:1098:2: error: implicit declaration
of function 'pci_iomap' [-Werror=implicit-function-declaration]
drivers/net/ethernet/via/via-rhine.c:1119:2: error: implicit declaration
of function 'pci_iounmap' [-Werror=implicit-function-declaration]
The generic empty pci_iomap and pci_iounmap is used only if CONFIG_PCI
is not defined and CONFIG_GENERIC_PCI_IOMAP is defined.
Add GENERIC_PCI_IOMAP in the dependency list for VIA_RHINE as we are
getting build failure when CONFIG_PCI and CONFIG_GENERIC_PCI_IOMAP both
are not defined.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 96249d70dd ("IB/core: Guarantee that a local_dma_lkey
is available") allows ULPs that make use of the local dma key to keep
working as before by allocating a DMA MR with local permissions and
converted these consumers to use the MR associated with the PD
rather then device->local_dma_lkey.
ConnectIB has some known issues with memory registration
using the local_dma_lkey (SEND, RDMA, RECV seems to work ok).
Thus don't expose support for it (remove device->local_dma_lkey
setting), and take advantage of the above commit such that no regression
is introduced to working systems.
The local_dma_lkey support will be restored in CX4 depending on FW
capability query.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add a phy_device_remove() function to complement phy_device_register(),
which undoes the effects of phy_device_register() by removing the phy
device from visibility, but not freeing it.
This allows these details to be moved out of the mdio bus code into
the phy code where this action belongs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_phy_find_device() increments the phy struct device refcount, which
we need to properly balance. Add code to network drivers using this
function to ensure that the struct device refcount is correctly
balanced.
For xgene, looking back in the history, we should be able to use
of_phy_connect() with a zero flags argument for the DT case as this is
how the driver used to operate prior to de7b5b3d79 ("net: eth: xgene:
change APM X-Gene SoC platform ethernet to support ACPI").
This leaves the Cavium Thunder BGX unfixed; fixing this driver is a
complicated task, one which the maintainers need to be involved with.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benefit from previously introduced transaction item queue infrastructure
and remove rocker specific transaction memory management.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There will be needed to have switchdev_trans available down in the call
chain, so propagate it instead of trans phase enum. This enum will be
removed anyway. Also, use prepare/commit phase check helpers to get
information about current phase of transaction.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before it disappears completely, move transaction phase enum under
transaction structure and make attr/obj structures a bit cleaner.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now, the memory allocation in prepare/commit state is done separatelly
in each driver (rocker). Introduce the similar mechanism in generic
switchdev code, in form of queue. That can be used not only for memory
allocations, but also for different items. Abort item destruction
is handled as well.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is temporary, name "trans" will be used for something else and
"trans_ph" will eventually disappear.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-09-23
This series contains updates to ixgbe only.
Mark provides all the changes in this series, first clears the destination
location for I2C data initially so that the received data will not be
corrupted by previous attempts. Then reduced the pauses/delays in the
PHY detection when no SFP is present by reducing the number of retires,
once an SFP is detected, the "normal" number of retries in PHY detection
will be used. Added support for X55EM_x SFP+ dual-speed, and fixed 1G and
10G link stability for X550EM_x by configuring the CS4227 correctly by
moving code to ixgbe_setup_mac_link_sfp_x550em(). Added functionality to
reset CS4227, since on some platforms the CS4227 does not initialize
properly. Next reduces the SFP polling rate, due to when an SFP is not
present, the I2C timeouts that result are very costly. So prevent the
SFP polling from being done more than once every two seconds. Added
support for I2C bus MUX. Fixed the setting of RDRXCTL register which
should fall through X540 and 82599, not 82598. In addition, added small
packet padding support in X550 by setting RDRXCTL.PSP when the driver is
in SRIOV mode. Fixed a known hardware issue where the PCI transactions
pending bit sticks high when there are pending transactions, so
workaround the issue by wait and then continue with our reset flow.
Added a new device ID for X550EM device with SFPs. Provided a fix with
the DCA setup, which was suggested by Alex Duyck <aduyck@mirantis.com>,
by making it so that we always set the relaxed ordering bits related to
the DCA registers even if DCA is not enbaled. Then moves the
configuration out of the ixgbe_down() and into ixgbe_configure() before
enabling the transmit and receive rings. This ensures that DCA is
configured correctly before starting the processing of packets.
Fixed VM-to-VM loopback mode which requires that FCRTH be set, but
the datasheets did not specify what the value should be. It has now
been determined that the correct value should be RXPBSIZE - (24*1024).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After a good amount of debugging, I found bnx2x was byte swaping
the 40 bytes of rss_key.
If we byte swap the key, then bnx2x generates hashes matching
MSDN specs as documented in (Verifying the RSS Hash Calculation)
https://msdn.microsoft.com/en-us/library/windows/hardware/ff571021%
28v=vs.85%29.aspx
It is mostly a non issue, unless we want to mix different NIC
in a host, and want consistent hashing among all of them, ie
if they all use the boot time generated rss key, or if some application
is choosing specific tuple(s) so that incoming traffic lands into known
rx queue(s).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of_property_read_u32 instead of of_get_property with return value
checks and endianness conversion.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of_property_read_u32 to read the "clock-frequency" property instead
of using of_get_property with return value checks.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device is set as wakeup capable using proper wakeup API but the
driver misuses IRQF_NO_SUSPEND to set the interrupt as wakeup source
which is incorrect.
This patch removes the use of IRQF_NO_SUSPEND flags replacing it with
enable_irq_wake instead.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the addition of X550em_x SFP+ support, the driver is now
functionally equivalent to what will be the 4.2.1 driver when
released, so change the version to match.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X540 thermal interrupt (IXGBE_EIMS_TS) is not an SDP, so it
doesn't need to be enabled in ixgbe_setup_gpie(). In fact the
value is simply not for the GPIE register at all.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The 82599 and X540 datasheets require that FCRTH be "set" for Tx
switching (VM-to-VM loopback) but it did not previously specify what
the value should be set to. It has now been determined that
the correct value is RXPBSIZE - (24*1024).
This setting is also required for later devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A logic error here results in the adapter_stopped flag only being
cleared when ixgbe_setup_fc returns an error. Correct the logic.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change does two things. First, it makes it so that we always
set the relaxed ordering bits related to the DCA registers even if
DCA is not enabled. Second, it moves the configuration out of the
ixgbe_down function and into the ixgbe_configure function before
enabling the Rx and Tx rings. This ensures that DCA is configured
correctly before starting to process packets.
Thanks to Alex Duyck for this fix.
CC: Alex Duyck <aduyck@mirantis.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add new device ID for X550EM device with SFPs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch skips the PCI transactions pending check in
ixgbe_disable_pcie_master. This is done to addresses a known HW
issue where the PCI transactions pending bit sticks high when there
are pending transactions. HW engineering instructed to workaround
this issue by wait and then continue with our reset flow.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch sets RDRXCTL.PSP when the driver is in SRIOV mode which
enables padding of small packets.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Setting the X550* RDRXCTL register should fall through into X540
and 82599, not 82598.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The timeout path is supposed to release the semaphore, so do that.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Take control of an I2C mux that selects which SFP is attached to
the I2C bus. The control of the mux is captured in the taking and
releasing of the related semaphore. Because only port 1 can control
the mux, port 1 always leaves the mux set to select port 0.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the frequency of polling for SFP modules. Because the
service task sometimes runs at high rates, we can poll for
SFPs too often. When an SFP is not present, the I2C timeouts
that result are very costly. So, prevent SFP polling from
being done more than once every two seconds. To reduce latency,
the poll time is cleared in a couple of cases to permit the
next service task execution to poll the SFP module.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since SFP+ can be used with some X550 devices, permit them to be
detected.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On some hardware platforms, the CS4227 does not initialize properly.
Detect those cases and reset it appropriately.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Configures the CS4227 correctly for both 1G and 10G operation,
by moving the code to ixgbe_setup_mac_link_sfp_x550em(). It
needs to be in this function because we need both the module
type and the speed, and this is the only function in the init
flow that knows the speed. In contrast,
ixgbe_setup_sfp_modules_X550em() does not know the speed, so we
can't do anything useful here. This is a fundamental difference
from the previous flow, and is due to the way the CS4227 is
implemented.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds X550EM_x SFP+ dual-speed support. 82599 fiber link
code was moved from ixgbe_82599.c to ixgbe_common.c for use by
X550EM. SFP MAC link code is added to x550EM.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the number of retries during PHY detection. This reduces
pauses when no SFP is present. Once an SFP is detected, the normal
retry count will be used.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clear the destination location for I2C data initially so that
the received data will not be affected by previous attempts.
This could have returned wrong data in certain retry sequences.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In case the DaVinci Emac is directly connected to a
non-mdio PHY/device, it should be possible to provide
a fixed link configuration in the DT.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-09-22
This series contains updates to e1000, e1000e, igbvf, ixgbe, ixgbevf and
fm10k.
Jacob provides several updates for fm10k, which cleans up comments and
most notably a fix for a corner case issue with the PF/VF mailbox code.
The issue being fm10k_mbx_reset_work clears various states about the
mailbox, but does not clear the Tx FIFO head/tail pointers. We also
are not able to simply clear these pointers, as we would drop
untransmitted messages without error. Also adds support for extra debug
statistics, which provides the ability to see what the PF thinks the
VF mailboxes look like.
Don adds support for SFP+ in X550 and support for SCTP flow director
filters SCTP mask.
Francois Romieu dusts off the e1000 driver and removes some dead calls
to e1000_init_eeprom_params().
Toshiaki Makita provides three patches to enable TSO for stacked VLAN's
on e1000e, igbvf and ixgbevf.
Mark provides the first of several ixgbe updates. First updates the
driver to accept SFP not present error for all devices, since an SFP
can still be inserted. Adds support for SFP insertion interrupt on
X550EM devices with SPFs. Adds I2C combined operations on X550EM
(not X550) devices. Moved the setting of lan_id to before any I2C
eeprom access. Lastly, set the bit bang mode in the hardware when
performing bit banding I2C operations on X550.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We are seeing unexplained TX timeouts under heavy load. Let's try to get
a better idea of what's going on.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The low 16 bits of the 'opts1' field in the TX descriptor are supposed
to still contain the buffer length when the descriptor is handed back to
us. In practice, at least on my hardware, they don't. So stash the
original value of the opts1 field and get the length to unmap from
there.
There are other ways we could have worked out the length, but I actually
want a stash of the opts1 field anyway so that I can dump it alongside
the contents of the descriptor ring when we suffer a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We calculate the value of the opts1 descriptor field in three different
places. With two different behaviours when given an invalid packet to
be checksummed — none of them correct. Sort that out.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending a TSO frame in multiple buffers, we were neglecting to set
the first descriptor up in TSO mode.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a certain amount of staring at the debug output of this driver, I
realised it was lying to me.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an RX interrupt was already received but NAPI has not yet run when
the RX timeout happens, we end up in cp_tx_timeout() with RX interrupts
already disabled. Blindly re-enabling them will cause an IRQ storm.
(This is made particularly horrid by the fact that cp_interrupt() always
returns that it's handled the interrupt, even when it hasn't actually
done anything. If it didn't do that, the core IRQ code would have
detected the storm and handled it, I'd have had a clear smoking gun
backtrace instead of just a spontaneously resetting router, and I'd have
at *least* two days of my life back. Changing the return value of
cp_interrupt() will be argued about under separate cover.)
Unconditionally leave RX interrupts disabled after the reset, and
schedule NAPI to check the receive ring and re-enable them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A deadlock trace is seen in netcp driver with lockup detector enabled.
The trace log is provided below for reference. This patch fixes the
bug by removing the usage of netcp_modules_lock within ndo_ops functions.
ndo_{open/close/ioctl)() is already called with rtnl_lock held. So there
is no need to hold another mutex for serialization across processes on
multiple cores. So remove use of netcp_modules_lock mutex from these
ndo ops functions.
ndo_set_rx_mode() shouldn't be using a mutex as it is called from atomic
context. In the case of ndo_set_rx_mode(), there can be call to this API
without rtnl_lock held from an atomic context. As the underlying modules
are expected to add address to a hardware table, it is to be protected
across concurrent updates and hence a spin lock is used to synchronize
the access. Same with ndo_vlan_rx_add_vid() & ndo_vlan_rx_kill_vid().
Probably the netcp_modules_lock is used to protect the module not being
removed as part of rmmod. Currently this is not fully implemented and
assumes the interface is brought down before doing rmmod of modules.
The support for rmmmod while interface is up is expected in a future
patch set when additional modules such as pa, qos are added. For now
all of the tests such as if up/down, reboot, iperf works fine with this
patch applied.
Deadlock trace seen with lockup detector enabled is shown below for
reference.
[ 16.863014] ======================================================
[ 16.869183] [ INFO: possible circular locking dependency detected ]
[ 16.875441] 4.1.6-01265-gfb1e101 #1 Tainted: G W
[ 16.881176] -------------------------------------------------------
[ 16.887432] ifconfig/1662 is trying to acquire lock:
[ 16.892386] (netcp_modules_lock){+.+.+.}, at: [<c03e8110>]
netcp_ndo_open+0x168/0x518
[ 16.900321]
[ 16.900321] but task is already holding lock:
[ 16.906144] (rtnl_mutex){+.+.+.}, at: [<c053a418>] devinet_ioctl+0xf8/0x7e4
[ 16.913206]
[ 16.913206] which lock already depends on the new lock.
[ 16.913206]
[ 16.921372]
[ 16.921372] the existing dependency chain (in reverse order) is:
[ 16.928844]
-> #1 (rtnl_mutex){+.+.+.}:
[ 16.932865] [<c06023f0>] mutex_lock_nested+0x68/0x4a8
[ 16.938521] [<c04c5758>] register_netdev+0xc/0x24
[ 16.943831] [<c03e65c0>] netcp_module_probe+0x214/0x2ec
[ 16.949660] [<c03e8a54>] netcp_register_module+0xd4/0x140
[ 16.955663] [<c089654c>] keystone_gbe_init+0x10/0x28
[ 16.961233] [<c000977c>] do_one_initcall+0xb8/0x1f8
[ 16.966714] [<c0867e04>] kernel_init_freeable+0x148/0x1e8
[ 16.972720] [<c05f9994>] kernel_init+0xc/0xe8
[ 16.977682] [<c0010038>] ret_from_fork+0x14/0x3c
[ 16.982905]
-> #0 (netcp_modules_lock){+.+.+.}:
[ 16.987619] [<c006eab0>] lock_acquire+0x118/0x320
[ 16.992928] [<c06023f0>] mutex_lock_nested+0x68/0x4a8
[ 16.998582] [<c03e8110>] netcp_ndo_open+0x168/0x518
[ 17.004064] [<c04c48f0>] __dev_open+0xa8/0x10c
[ 17.009112] [<c04c4b74>] __dev_change_flags+0x94/0x144
[ 17.014853] [<c04c4c3c>] dev_change_flags+0x18/0x48
[ 17.020334] [<c053a9fc>] devinet_ioctl+0x6dc/0x7e4
[ 17.025729] [<c04a59ec>] sock_ioctl+0x1d0/0x2a8
[ 17.030865] [<c0142844>] do_vfs_ioctl+0x41c/0x688
[ 17.036173] [<c0142ae4>] SyS_ioctl+0x34/0x5c
[ 17.041046] [<c000ff60>] ret_fast_syscall+0x0/0x54
[ 17.046441]
[ 17.046441] other info that might help us debug this:
[ 17.046441]
[ 17.054434] Possible unsafe locking scenario:
[ 17.054434]
[ 17.060343] CPU0 CPU1
[ 17.064862] ---- ----
[ 17.069381] lock(rtnl_mutex);
[ 17.072522] lock(netcp_modules_lock);
[ 17.078875] lock(rtnl_mutex);
[ 17.084532] lock(netcp_modules_lock);
[ 17.088366]
[ 17.088366] *** DEADLOCK ***
[ 17.088366]
[ 17.094279] 1 lock held by ifconfig/1662:
[ 17.098278] #0: (rtnl_mutex){+.+.+.}, at: [<c053a418>]
devinet_ioctl+0xf8/0x7e4
[ 17.105774]
[ 17.105774] stack backtrace:
[ 17.110124] CPU: 1 PID: 1662 Comm: ifconfig Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 17.118637] Hardware name: Keystone
[ 17.122123] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 17.129862] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 17.137079] [<c05ff450>] (dump_stack) from [<c0068e34>]
(print_circular_bug+0x210/0x330)
[ 17.145161] [<c0068e34>] (print_circular_bug) from [<c006ab7c>]
(validate_chain.isra.35+0xf98/0x13ac)
[ 17.154372] [<c006ab7c>] (validate_chain.isra.35) from [<c006da60>]
(__lock_acquire+0x52c/0xcc0)
[ 17.163149] [<c006da60>] (__lock_acquire) from [<c006eab0>]
(lock_acquire+0x118/0x320)
[ 17.171058] [<c006eab0>] (lock_acquire) from [<c06023f0>]
(mutex_lock_nested+0x68/0x4a8)
[ 17.179140] [<c06023f0>] (mutex_lock_nested) from [<c03e8110>]
(netcp_ndo_open+0x168/0x518)
[ 17.187484] [<c03e8110>] (netcp_ndo_open) from [<c04c48f0>]
(__dev_open+0xa8/0x10c)
[ 17.195133] [<c04c48f0>] (__dev_open) from [<c04c4b74>]
(__dev_change_flags+0x94/0x144)
[ 17.203129] [<c04c4b74>] (__dev_change_flags) from [<c04c4c3c>]
(dev_change_flags+0x18/0x48)
[ 17.211560] [<c04c4c3c>] (dev_change_flags) from [<c053a9fc>]
(devinet_ioctl+0x6dc/0x7e4)
[ 17.219729] [<c053a9fc>] (devinet_ioctl) from [<c04a59ec>]
(sock_ioctl+0x1d0/0x2a8)
[ 17.227378] [<c04a59ec>] (sock_ioctl) from [<c0142844>]
(do_vfs_ioctl+0x41c/0x688)
[ 17.234939] [<c0142844>] (do_vfs_ioctl) from [<c0142ae4>]
(SyS_ioctl+0x34/0x5c)
[ 17.242242] [<c0142ae4>] (SyS_ioctl) from [<c000ff60>]
(ret_fast_syscall+0x0/0x54)
[ 17.258855] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow
control off
[ 17.271282] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:616
[ 17.279712] in_atomic(): 1, irqs_disabled(): 0, pid: 1662, name: ifconfig
[ 17.286500] INFO: lockdep is turned off.
[ 17.290413] Preemption disabled at:[< (null)>] (null)
[ 17.295728]
[ 17.297214] CPU: 1 PID: 1662 Comm: ifconfig Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 17.305735] Hardware name: Keystone
[ 17.309223] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 17.316970] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 17.324194] [<c05ff450>] (dump_stack) from [<c06023b0>]
(mutex_lock_nested+0x28/0x4a8)
[ 17.332112] [<c06023b0>] (mutex_lock_nested) from [<c03e9840>]
(netcp_set_rx_mode+0x160/0x210)
[ 17.340724] [<c03e9840>] (netcp_set_rx_mode) from [<c04c483c>]
(dev_set_rx_mode+0x1c/0x28)
[ 17.348982] [<c04c483c>] (dev_set_rx_mode) from [<c04c490c>]
(__dev_open+0xc4/0x10c)
[ 17.356724] [<c04c490c>] (__dev_open) from [<c04c4b74>]
(__dev_change_flags+0x94/0x144)
[ 17.364729] [<c04c4b74>] (__dev_change_flags) from [<c04c4c3c>]
(dev_change_flags+0x18/0x48)
[ 17.373166] [<c04c4c3c>] (dev_change_flags) from [<c053a9fc>]
(devinet_ioctl+0x6dc/0x7e4)
[ 17.381344] [<c053a9fc>] (devinet_ioctl) from [<c04a59ec>]
(sock_ioctl+0x1d0/0x2a8)
[ 17.388994] [<c04a59ec>] (sock_ioctl) from [<c0142844>]
(do_vfs_ioctl+0x41c/0x688)
[ 17.396563] [<c0142844>] (do_vfs_ioctl) from [<c0142ae4>]
(SyS_ioctl+0x34/0x5c)
[ 17.403873] [<c0142ae4>] (SyS_ioctl) from [<c000ff60>]
(ret_fast_syscall+0x0/0x54)
[ 17.413772] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
udhcpc (v1.20.2) started
Sending discover...
[ 18.690666] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow
control off
Sending discover...
[ 22.250972] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow
control off
[ 22.258721] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 22.265458] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:616
[ 22.273896] in_atomic(): 1, irqs_disabled(): 0, pid: 342, name: kworker/1:1
[ 22.280854] INFO: lockdep is turned off.
[ 22.284767] Preemption disabled at:[< (null)>] (null)
[ 22.290074]
[ 22.291568] CPU: 1 PID: 342 Comm: kworker/1:1 Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 22.300255] Hardware name: Keystone
[ 22.303750] Workqueue: ipv6_addrconf addrconf_dad_work
[ 22.308895] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 22.316643] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 22.323867] [<c05ff450>] (dump_stack) from [<c06023b0>]
(mutex_lock_nested+0x28/0x4a8)
[ 22.331786] [<c06023b0>] (mutex_lock_nested) from [<c03e9840>]
(netcp_set_rx_mode+0x160/0x210)
[ 22.340394] [<c03e9840>] (netcp_set_rx_mode) from [<c04c9d18>]
(__dev_mc_add+0x54/0x68)
[ 22.348401] [<c04c9d18>] (__dev_mc_add) from [<c05ab358>]
(igmp6_group_added+0x168/0x1b4)
[ 22.356580] [<c05ab358>] (igmp6_group_added) from [<c05ad2cc>]
(ipv6_dev_mc_inc+0x4f0/0x5a8)
[ 22.365019] [<c05ad2cc>] (ipv6_dev_mc_inc) from [<c058f0d0>]
(addrconf_dad_work+0x21c/0x33c)
[ 22.373460] [<c058f0d0>] (addrconf_dad_work) from [<c0042850>]
(process_one_work+0x214/0x8d0)
[ 22.381986] [<c0042850>] (process_one_work) from [<c0042f54>]
(worker_thread+0x48/0x4bc)
[ 22.390071] [<c0042f54>] (worker_thread) from [<c004868c>]
(kthread+0xf0/0x108)
[ 22.397381] [<c004868c>] (kthread) from [<c0010038>]
Trace related to incorrect usage of mutex inside ndo_set_rx_mode
[ 24.086066] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:616
[ 24.094506] in_atomic(): 1, irqs_disabled(): 0, pid: 1682, name: ifconfig
[ 24.101291] INFO: lockdep is turned off.
[ 24.105203] Preemption disabled at:[< (null)>] (null)
[ 24.110511]
[ 24.112005] CPU: 2 PID: 1682 Comm: ifconfig Tainted: G W
4.1.6-01265-gfb1e101 #1
[ 24.120518] Hardware name: Keystone
[ 24.124018] [<c00178e4>] (unwind_backtrace) from [<c0013cbc>]
(show_stack+0x10/0x14)
[ 24.131772] [<c0013cbc>] (show_stack) from [<c05ff450>]
(dump_stack+0x84/0xc4)
[ 24.138989] [<c05ff450>] (dump_stack) from [<c06023b0>]
(mutex_lock_nested+0x28/0x4a8)
[ 24.146908] [<c06023b0>] (mutex_lock_nested) from [<c03e9840>]
(netcp_set_rx_mode+0x160/0x210)
[ 24.155523] [<c03e9840>] (netcp_set_rx_mode) from [<c04c483c>]
(dev_set_rx_mode+0x1c/0x28)
[ 24.163787] [<c04c483c>] (dev_set_rx_mode) from [<c04c490c>]
(__dev_open+0xc4/0x10c)
[ 24.171531] [<c04c490c>] (__dev_open) from [<c04c4b74>]
(__dev_change_flags+0x94/0x144)
[ 24.179528] [<c04c4b74>] (__dev_change_flags) from [<c04c4c3c>]
(dev_change_flags+0x18/0x48)
[ 24.187966] [<c04c4c3c>] (dev_change_flags) from [<c053a9fc>]
(devinet_ioctl+0x6dc/0x7e4)
[ 24.196145] [<c053a9fc>] (devinet_ioctl) from [<c04a59ec>]
(sock_ioctl+0x1d0/0x2a8)
[ 24.203803] [<c04a59ec>] (sock_ioctl) from [<c0142844>]
(do_vfs_ioctl+0x41c/0x688)
[ 24.211373] [<c0142844>] (do_vfs_ioctl) from [<c0142ae4>]
(SyS_ioctl+0x34/0x5c)
[ 24.218676] [<c0142ae4>] (SyS_ioctl) from [<c000ff60>]
(ret_fast_syscall+0x0/0x54)
[ 24.227156] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netcp_rxpool_refill() that refill descriptors and attached
buffers to fdq while interrupt is enabled as part of NAPI poll. Doing
it while interrupt is disabled could be beneficial as hardware will
not be starved when CPU is busy with processing interrupt.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netcp_module_probe() doesn't check the return value of
of_parse_phandle() that points to the interface data for the
module and then pass the node ptr to the module which is incorrect.
Check for return value and free the intf_modpriv if there is error.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if netcp_allocate_rx_buf() fails due no descriptors
in the rx free descriptor queue, inside the netcp_rxpool_refill() function
the iterative loop to fill buffers doesn't terminate right away. So modify
the netcp_allocate_rx_buf() to return an error code and use it break the
loop when there is error.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netcp interface is not fully initialized before attach the module
to the interface. For example, the tx pipe/rx pipe is initialized
in ethss module as part of attach(). So until this is complete, the
interface can't be registered. So move registration of interface to
net device outside the current loop that attaches the modules to the
interface.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netcp_core is the first driver that will get initialized and the modules
(ethss, pa etc) will then get initialized. So the code at the end of
netcp_probe() that iterate over the modules is a dead code as the module
list will be always be empty. So remove this code.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On K2HK, sgmii module registers of slave 0 and 1 are mem
mapped to one contiguous block, while those of slave 2
and 3 are mapped to another contiguous block. However,
on K2E and K2L, sgmii module registers of all slaves are
mem mapped to one contiguous block. SGMII APIs expect
slave 0 sgmii base when API is invoked for slave 0 and 1,
and slave 2 sgmii base when invoked for other slaves.
Before this patch, slave 0 sgmii base is always passed to
sgmii API for K2E regardless which slave is the API invoked
for. This patch fixes the problem.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a timer to each rocker switch to do FDB entry cleanup by ageing out
expired entries. The timer scheduling algo is copied from the bridge
driver, for the most part, to keep the firing of the timer to a minimum.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Follow-up patcheset will allow user to change ageing_time, but for now
just hard-code it to a fixed value (the same value used as the default
for the bridge driver).
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
We'll need more info from rocker_port than just pport when we age out fdb
entries, so store rocker_port rather than pport in each fdb entry.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
The entry is touched once when created, and touched again for each update.
The touched time is used to calculate FDB entry age.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Variable can store negative values.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2038576
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_mode can be negative.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2038576
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the bit banging mode in the hardware when performing bit banging
I2C operations on X550. Also control the output enable on both the
clock and data lines.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The lan_id is being set after a previous I2C eeprom access which
makes no sense because it needs to be set before any access. Move
the setting to before the access.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Most I2C accesses take and release semaphores for each access. Now
there is a reason to perform multiple I2C operations under the same
holding of the semaphore, so provide unlocked I2C methods for that
purpose.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Provide I2C combined operations on X550EM, not X550 devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for the SFP insertion interrupt on X550EM devices with
SFPs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When an SFP not present error is returned by the reset_hw method,
accept it and go on, since an SFP can still be inserted. Previously
it was only accepted for 82598 devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Setting ndo_features_check to passthru_features_check allows the driver
to skip the check for multiple tagged TSO packets and enables stacked
VLAN TSO.
Tested with 82599ES.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Setting ndo_features_check to passthru_features_check allows the driver
to skip the check for multiple tagged TSO packets and enables stacked
VLAN TSO.
Tested with I217-LM.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Setting ndo_features_check to passthru_features_check allows the driver
to skip the check for multiple tagged TSO packets and enables stacked
VLAN TSO.
Tested with I350.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The device probe method e1000_probe calls e1000_init_eeprom_params
itself so there's no reason to call it again from e1000_do_write_eeprom
or e1000_do_read_eeprom.
The sentence above assumes that e1000_init_eeprom_params is effective.
e1000_init_eeprom_params depends mostly on hw->mac_type and e1000_probe
bails out early if it can't set mac_type (see e1000_init_hw_struct, then
e1000_set_mac_type), qed.
Btw, if effective, the removed paths would had been deadlock prone when
e1000_eeprom_spi was set:
-> e1000_write_eeprom (takes e1000_eeprom_lock)
-> e1000_do_write_eeprom
-> e1000_init_eeprom_params
-> e1000_read_eeprom (takes e1000_eeprom_lock)
(same narrative with e1000_read_eeprom -> e1000_do_read_eeprom etc.)
As a final note, the candidate deadlock above can't happen in e1000_probe
due to the way eeprom->word_size is set / tested.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a private ethtool flag to enable display of these statistics, which
are generally less useful. However, sometimes it can be useful for
debugging purposes. The most useful portion is the ability to see what
the PF thinks the VF mailboxes look like.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When we connect to the mailbox, we insert a fake disconnect header so
that the code does not see an invalid header and thus instantly error
every time we bring up the mailbox. However, we incorrectly record the
tail and head from the local perspective. Since the remote end shouldn't
have anything for us, add a "create_fake_disconnect_hdr" function which
inverts the TAIL and HEAD fields. This enables us to connect without any
errors of either TAIL or HEAD incorrectness, and prevents creating
extraneous error messages. This is necessary now since mbx_reset_work
does not actually reset the Tx FIFO head and tail pointers, thus head
and tail might not be equivalent on a reconnect.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a corner case issue with the PF/VF mailbox code.
Currently, fm10k_mbx_reset_work clears various state about the mailbox.
However, it does not clear the Tx FIFO head/tail pointers. We can't
simply clear these pointers as we unintentionally drop untransmitted
messages without error.
Doing nothing results in a possible phantom re-transmission of messages,
since we leave tx.head and tx.tail intact, but clear the tx_pulled and
tail_len values. This means that the PF could continuously re-send a
message which triggers a reset in the VF. Upon reset, the VF will
re-receive the same message after a reconnect.
If we reset the tx.head and tx.tail pointers completely, we end up
dropping some messages that were pending before connect. This results in
missing LPORT_MSG_READY bits, and VFs will end up reporting no link.
However, we can resolve both issues by simply incrementing head to
account for the already transmitted messages, before we reset tx_pulled.
We do this via the same logic as fm10k_mbx_head_pull.
We account for the tail_len which includes all data not yet transmitted,
once we account for the acked data which means re-reading the HEAD
variable from the message header. Then, we drop messages until we've
dropped more than the new tx_pulled value. At this point, resetting
tail_len and tx_pulled, but not tx.head and tx.tail will result in
prevention of the phantom message. It also prevents us from dropping
untransmitted messages upon attempting to Tx into a connect or
disconnect header.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>