The computed result of I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES
is used more than three times in function i40e_config_irq_link_list.
Simply declare a local variable to store it to improve readability.
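A minimal sketch of the change, with the local variable name chosen here
purely for illustration:

  /* Hoist the repeated product into a local (illustrative name only). */
  u32 qp_per_vf = I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES;

  for (i = 0; i < qp_per_vf; i++) {
          /* ... walk the VF's queue pairs using qp_per_vf ... */
  }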
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
- When the I2C is busy, the PHY reads are delayed. The firmware will
return EAGAIN in these cases with the expectation that the SW will
trigger the reads again.
- This patch retries the operation for a maximum period of 500ms, as
sketched below.
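A rough sketch of the retry pattern, assuming a hypothetical read wrapper
and an EAGAIN-style status from firmware; this is not the driver's exact
code:

  #include <linux/delay.h>
  #include <linux/errno.h>

  #define PHY_READ_TIMEOUT_MS  500   /* overall budget from the commit */
  #define PHY_READ_RETRY_MS     10   /* illustrative back-off interval */

  static int read_phy_reg_with_retry(struct i40e_hw *hw, u16 reg, u16 *val)
  {
          unsigned int waited = 0;
          int status;

          do {
                  /* hypothetical wrapper around the firmware PHY read */
                  status = i40e_read_phy_reg_wrapper(hw, reg, val);
                  if (status != -EAGAIN)
                          return status;
                  msleep(PHY_READ_RETRY_MS);
                  waited += PHY_READ_RETRY_MS;
          } while (waited < PHY_READ_TIMEOUT_MS);

          return status;
  }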
Signed-off-by: Jayaprakash Shanmugam <jayaprakash.shanmugam@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The find_first_bit function returns the bitmap size passed to it when no
set bit is found. This patch adds a check for that case, since the return
value is used as an index into an array and would otherwise cause an
out-of-bounds access.
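The guard looks roughly like this (the bitmap and table names are
placeholders):

  #include <linux/bitops.h>

  /* find_first_bit() returns 'size' when no bit is set, so check the
   * result before using it as an array index. */
  unsigned int bit = find_first_bit(&mask, size);

  if (bit == size)
          return;                 /* nothing set: avoid out-of-bounds */
  use_entry(&table[bit]);         /* placeholder for the real lookup */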
Detected by CoverityScan, CID 1295969 Out-of-bounds access
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Recently, the kernel gained support for enabling XPS and QoS at the
same time. Thus, we no longer need to worry about the number of
traffic classes when enabling XPS.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Double the number of descriptors we'll bundle into one tail bump when
receiving. Empirical testing has shown that we reduce CPU utilization
and don't appear to reduce throughput or packet rate. 32 seems to be the
sweet spot, as it's half the default polling budget, so we'd essentially
reduce from 4 tail writes when polling down to 2. Increasing this up to
64 appears to have negative impacts as it may become possible that we
don't bump the tail each time we get polled, which could cause a long
delay between returning descriptors to the hardware.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Hardware only fetches descriptors on cachelines of 8, essentially
ignoring the lower 3 bits of the tail register. Thus, it is pointless to
bump tail by an unaligned access as the hardware will ignore some of the
new descriptors we allocated. Thus, it's ideal if we can ensure tail
writes are always aligned to 8.
At first, it seems like we'd already do this, since we allocate
descriptors in batches which are a multiple of 8. Since we'd always
increment by a multiple of 8, it seems like the value should always be
aligned.
However, this ignores allocation failures. If we fail to allocate
a buffer, our tail register will become unaligned. Once it has become
unaligned it will essentially be stuck unaligned until a buffer
allocation happens to fail at the exact amount necessary to re-align it.
We can do better by simply rounding down the number of buffers we're
about to allocate (cleaned_count) such that "next_to_clean +
cleaned_count" is aligned to a multiple of 8.
We do this by calculating how far off that value is and subtracting it
from the cleaned_count. This essentially defers allocation of buffers if
they're going to be ignored by hardware anyways, and re-aligns our
next_to_use and tail values after a failure to allocate a descriptor.
This calculation ensures that we always align the tail writes in a way
the hardware expects and don't unnecessarily allocate buffers which
won't be fetched immediately.
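A sketch of the rounding step, using the field names from the text above
(the driver's actual variables may differ):

  /* Trim cleaned_count so that (next_to_clean + cleaned_count) stays a
   * multiple of 8; the deferred buffers are simply allocated on a later
   * pass, which re-aligns next_to_use and tail after a failure. */
  static u16 i40e_align_alloc_count(u16 next_to_clean, u16 cleaned_count)
  {
          return cleaned_count - ((next_to_clean + cleaned_count) & 0x7);
  }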
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The lrxq thresh value tells hardware to immediately interrupt when there
are fewer than N*64 packets left in the ring.
Counterintuitively, empirical testing has shown that decreasing this
value from 2 to 1, and thus changing from an immediate interrupt at
fewer than 128 descriptors down to 64 descriptors causes a small
increase in the maximum total packets per second we can receive. This
increase occurs even when we're polling with interrupts masked, as the
hardware must still handle interrupts internally even if we've disabled
them in software.
Also reduce the value for any VFs we allocate.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the past we changed driver behavior to not clear the PBA when
re-enabling interrupts. This change was motivated by the flawed belief
that clearing the PBA would cause a lost interrupt if a receive
interrupt occurred while interrupts were disabled.
According to empirical testing this isn't the case. Additionally, the
data sheet specifically says that we should set the CLEARPBA bit when
re-enabling interrupts in a polling setup.
This reverts commit 40d72a5098 ("i40e/i40evf: don't lose interrupts")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ITR register expects to be programmed in units of 2 microseconds.
Because of this, all of the driver's I40E_ITR_* constants are defined in
terms of these 2 microsecond register units.
Unfortunately, the rx_itr_default value is expected to be programmed in
microseconds.
Effectively the driver defaults to an ITR value of half the expected
value (in terms of minimum microseconds between interrupts).
Fix this by changing the default values to be calculated using the
ITR_REG_TO_USEC macro, which indicates that we're converting from the
register units into microseconds.
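Illustratively, the conversion follows directly from the 2 microsecond
register granularity; I40E_ITR_RX_DEF below stands in for the driver's
register-unit default, and the exact macro body should be taken from the
driver header:

  /* Register counts in 2 usec units, so usecs = register value * 2. */
  #define ITR_REG_TO_USEC(itr_reg)   ((itr_reg) << 1)

  /* Program the default in microseconds rather than register units. */
  pf->rx_itr_default = ITR_REG_TO_USEC(I40E_ITR_RX_DEF);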
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Due to the asynchronous way in which MAC filters are added and deleted,
there exists a bug in which a filter is erroneously removed if it is
removed and then re-added in quick succession.
The events are as such:
- filter marked for removal
- same filter is re-added before watchdog that cleans up filters
- we skip re-adding the filter because we have it already in the
list
- watchdog filter cleanup kicks off and filter is removed
So when we were re-adding the same filter, it didn't actually get added
because it already existed in the list, but was marked for removal and
had yet to actually be removed.
This patch fixes the issue by ensuring that when a filter is added, any
matching entry already in our list is no longer marked for removal.
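A simplified sketch of the add path after the fix; the filter-state names
approximate the driver's and the surrounding code is elided:

  /* If the filter already exists but is queued for removal, revive it
   * instead of skipping the add, so the watchdog no longer deletes it. */
  f = i40e_find_filter(vsi, macaddr, vlan);
  if (f) {
          if (f->state == I40E_FILTER_REMOVE)
                  f->state = I40E_FILTER_ACTIVE;
          return f;
  }
  /* ...otherwise allocate and queue a brand-new filter... */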
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch replaces hash_for_each function with hash_for_each_safe
when calling __i40e_del_filter. The hash_for_each_safe function is
the right one to use when iterating over a hash table to safely remove
a hash entry. Otherwise, incorrect values may be read from freed memory.
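The safe-iteration pattern is the standard one from <linux/hashtable.h>;
a generic sketch (element and member names simplified):

  #include <linux/hashtable.h>

  struct i40e_mac_filter *f;
  struct hlist_node *tmp;
  int bkt;

  /* The _safe variant caches the next node, so the current entry can be
   * unlinked and freed inside the loop without a use-after-free. */
  hash_for_each_safe(vsi->mac_filter_hash, bkt, tmp, f, hlist)
          __i40e_del_filter(vsi, f);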
Detected by CoverityScan, CID 1402048 Read from pointer after free
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since we don't yet have more than 32 flags, we'll use a u32 for both the
hw_features and flag field. Should we gain more flags in the future, we
may need to convert to a u64 or separate flags out into two fields.
This was overlooked in the previous commit 2781de2134c4 ("i40e/i40evf:
organize and re-number feature flags"), where the feature flag was not
converted from u64 to u32.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that we've reduced the number of flags, organize similar flags
together and re-number them accordingly.
Since we don't yet have more than 32 flags, we'll use a u32 for both the
hw_features and flag field. Should we gain more flags in the future, we
may need to convert to a u64 or separate flags out into two fields.
One alternative approach considered, but not implemented here, was to
use an enumeration for the flag variables, and create a macro
I40E_FLAG() which used string concatenation to generate BIT_ULL values.
This has the advantage of making the actual bit values compile-time
dynamic so that we do not need to worry about matching the order to the
bit value. However, this does produce a high level of code churn, and
makes it more difficult to read a dumped flags value when debugging.
Change-ID: I8653fff69453cd547d6fe98d29dfa9d8710387d1
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since commit 6a7fded776 ("i40e: Fix RS bit update in Tx path and
disable force WB workaround") we've tried to "optimize" setting the
RS bit based around skb->xmit_more. This same logic was refactored
in commit 1dc8b53879 ("i40e: Reorder logic for coalescing RS bits"),
but ultimately was not functionally changed.
Using skb->xmit_more in this way is incorrect, because in certain
circumstances we may see a large number of skbs in sequence with
xmit_more set. This leads to a performance loss as the hardware does not
write back anything for those packets, which delays the time it takes for
us to respond to the stack's transmit requests. This significantly impacts
UDP performance, especially when layered with multiple devices, such as
bonding, VLANs, and vnet setups.
This was not noticed until now because it is difficult to create a setup
which reproduces the issue. It was discovered in a UDP_STREAM test in
a VM, connected using a vnet device to a bridge, which is connected to
a bonded pair of X710 ports in active-backup mode with a VLAN. These
layered devices seem to compound the number of skbs transmitted at once
by the qdisc. Additionally, the problem can be masked by reducing the
ITR value.
Since the original commit does not provide strong justification for this
RS bit "optimization", revert to the previous behavior of setting the RS
bit every 4th packet.
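Schematically, the previous behavior amounts to a modulo-4 check on the
packet count (names here are illustrative, not the driver's literal code):

  #define RS_PACKET_INTERVAL 4

  /* Request a descriptor writeback (RS bit) on every 4th packet so the
   * hardware reports completions with bounded latency, independent of
   * skb->xmit_more. */
  static bool want_rs_bit(u16 packet_count)
  {
          return (packet_count % RS_PACKET_INTERVAL) == 0;
  }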
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A recent commit 809481484e5d ("i40e/i40evf: support for VF VLAN tag
stripping control") added support for VFs to negotiate the control of
VLAN tag stripping. This should have allowed VFs to disable the feature.
Unfortunately, the flag was set only in netdev->feature flags and not in
netdev->hw_features.
This ultimately caused the stack to assume that it could not change the
flag, which was therefore marked as [fixed] in the ethtool -k output.
Fix this by setting the feature in hw_features first, just as we do for
the PF code. This enables ethtool -K to disable the feature correctly,
and fully enables user control of the VLAN tag stripping feature.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The previous implementation of the LED set/get functions required
entering PHY debug mode in order to prevent simultaneous access to the
PHY from FW and SW. A reset of all ports was an unwanted side effect.
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch implements the PCI error handlers reset_prepare and reset_done.
This allows us to handle a function level reset (FLR). Without this patch
we are unable to perform and recover from an FLR correctly, and this
causes VFs to be unable to recover from an FLR on the PF.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When there is no space for more flow director filters and the user
requests a new one, the filter is rejected by the firmware and
automatically removed from the filter list maintained by the driver.
This behaviour is correct. Afterwards an existing filter can be removed,
making a free slot for the new one. This, however, caused the newly
added filter to be accepted by the firmware yet still removed from the
driver's filter list, so it did not show up in the output of
'ethtool -n <dev_name>'.
This happened because the variable pf->fd_inv, which stores the ID of
the filter to be removed from the list when the firmware refuses to add
the requested filter, was not cleared. As a result, the filter with this
specific ID was constantly removed as soon as it was added to the list,
even though it had been accepted by the firmware and effectively applied
to the NIC.
Fix this by clearing the pf->fd_inv variable after removing the filter
from the list when it is rejected by the firmware.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch causes an error message to be displayed when the NIC detects
insertion of a module that does not meet thermal requirements.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes some code that was accidentally added to
the wrong function with a merge error.
Fixes: c53934c6d1 ("i40e: fix: do not sleep in netdev_ops")
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When using set_bit and friends, we should be operating on actual
bitmaps. Do so, and fix all the locations where we access them.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This register was defined incorrectly. Fix the increment value to 8, and
replace the iterator with _i to make the definition consistent with
other statistics registers.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since I40E_PHY_TYPE_MAX is used as an iterator, usually combined with
some sort of bit-shifting, it should only include actual PHY types and
not error cases. Move it up in the enum declaration so that loops only
iterate across valid PHY types.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Starting with XL710 FW 5.3, PTP L4 was disabled for XL710 due to a bug. The
bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be
re-enabled on those devices with updated firmware.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, when setting up the IRQ for a q_vector, we set an affinity
hint based on the v_idx of that q_vector. That is, a loop iterates over
v_idx, which is an incremental value, and the cpumask is created based
on this value.
This is a problem in systems with multiple logical CPUs per core (like in
simultaneous multithreading (SMT) scenarios). If we disable some logical
CPUs, by turning SMT off for example, we will end up with a sparse
cpu_online_mask, i.e., only the first CPU in a core is online, and
incremental filling in q_vector cpumask might lead to multiple offline
CPUs being assigned to q_vectors.
Example: if we have a system with 8 cores each one containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.
In the general case, when SMT is off the cpu_online_mask has only C bits
set: 0, 1*N, 2*N, ..., (C-1)*N, where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.
Instead, we should only assign hints for CPUs which are online. Even
better, the kernel already provides a function, cpumask_local_spread()
which takes an index and returns a CPU, spreading the interrupts across
local NUMA nodes first, and then remote ones if necessary.
Since we generally have a 1:1 mapping between vectors and CPUs, there
is no real advantage to spreading vectors to local CPUs first. In order
to avoid mismatch of the default XPS hints, we'll pass -1 so that it
spreads across all CPUs without regard to the node locality.
Note that we don't need to change the q_vector->affinity_mask as this is
initialized to cpu_possible_mask, until an actual affinity is set and
then notified back to us.
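A sketch of the hint assignment; irq_num and the surrounding loop are
simplified:

  #include <linux/cpumask.h>
  #include <linux/interrupt.h>

  /* Pick an online CPU for this vector. Passing -1 (no preferred node)
   * spreads across all online CPUs instead of favouring the local NUMA
   * node, keeping the hints in line with the default XPS mapping. */
  unsigned int cpu = cpumask_local_spread(q_vector->v_idx, -1);

  irq_set_affinity_hint(irq_num, cpumask_of(cpu));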
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
By default, our devices do source pruning, that is, they drop receive
packets that have the source MAC matching one of the receive filters.
Unfortunately, this breaks ARP monitoring in channel bonding, as the
bonding driver expects devices to receive ARPs containing their own
source address.
Add an ethtool private flag to control this feature.
Also, remove the netif_running() check when we process our private
flags. It's OK to reset when the device is closed, and in most cases we
need the reset to apply these changes.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a typo in i40e_pf object documentation; num_req_vfs
refers to the number of VFs requested for the PF.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We've had support for setting both a minimum and maximum bandwidth via
.ndo_set_vf_bw since commit 883a9ccbae ("fm10k: Add support for SR-IOV
to driver", 2014-09-20).
Likely because we do not support minimum rates, the declaration
mis-ordered the "unused" parameter, which causes warnings when analyzed
with cppcheck.
Fix this warning by properly declaring the min_rate and max_rate
variables in the declaration and definition (rather than using
"unused"). Also rename "rate" to max_rate so as to clarify that we only
support setting the maximum rate.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don't hard code the function names in the diagnostic output when these
reset related routines fail. Instead, use %s and __func__ so that future
refactors don't need to change the print outs.
Additionally, while we are here, add missing function header comments
for the new reset_prepare and reset_done function handlers.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Correct the backward logic using !net_ratelimit(); see the example
below.
Miscellanea:
o Add a blank line before the error return label
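For reference, net_ratelimit() returns nonzero while printing is still
allowed, so the rate-limited message should be gated on the non-negated
call, e.g.:

  /* placeholder message; the point is the non-negated condition */
  if (net_ratelimit())
          pr_err("example rate-limited error\n");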
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that we have a working MAC/VLAN queue for handling MAC/VLAN messages
from the netdev, replace the default handler for the VF<->PF messages.
This new handler is very similar to the default code, but uses the
MAC/VLAN queue instead of sending the message directly. Unfortunately we
can't easily re-use the default code, so we'll just replace the entire
function.
This ensures that a VF requesting a large number of VLANs or MAC
addresses does not start a reset cycle, as explained in the commit which
introduced the message queue.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Ngai-mint Kwan <ngai-mint.kwan@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Under some circumstances, when dealing with a large number of MAC
address or VLAN updates at once, the fm10k driver, particularly in the
VFs, can overload the mailbox with too many messages.
This results in a mailbox timeout, which causes the driver to initiate
a reset. During the reset, we re-send all the same messages that
originally caused the timeout. This results in a cycle of resets each
triggering a future reset.
To fix or avoid this, we introduce a workqueue item which monitors
a queue of MAC and VLAN requests. These requests are queued to the end
of the list, and we process them as a FIFO periodically.
Initially we only handle requests for the netdev, but we do handle
unicast MAC addresses, multicast MAC addresses, and VLAN update
requests.
A future patch will add support to use this queue for handling MAC
update requests from the VF<->PF mailbox.
The MAC/VLAN work item will keep checking to make sure that each request
does not overflow the mailbox and cause a timeout. If it might, then the
work item will reschedule itself a short time later. This avoids any
reset cycle, since we never send the message if the mailbox is not
ready.
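Schematically, the work item drains the request FIFO only while the
mailbox has room and otherwise reschedules itself; the helper names and
the room check below are illustrative, not the driver's:

  #include <linux/workqueue.h>

  static void fm10k_macvlan_task(struct work_struct *work)
  {
          struct fm10k_intfc *interface =
                  container_of(to_delayed_work(work), struct fm10k_intfc,
                               macvlan_task);
          struct fm10k_macvlan_request *req;

          /* Process requests in FIFO order, never overflowing the mailbox. */
          while ((req = fm10k_dequeue_macvlan_request(interface))) {  /* hypothetical */
                  if (!fm10k_mbx_has_room(interface)) {               /* hypothetical */
                          fm10k_requeue_macvlan_request(interface, req); /* hypothetical */
                          schedule_delayed_work(&interface->macvlan_task,
                                                msecs_to_jiffies(10));
                          return;
                  }
                  fm10k_send_macvlan_request(interface, req);         /* hypothetical */
          }
  }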
As an alternative, we tried increasing the mailbox message FIFO, but
this just delays the problem and results in needless memory waste on the
system. Our new message queue is dynamically allocated, so it only uses
as much memory as it needs. Additionally, it need not be contiguous like
the Tx and Rx FIFOs.
Note that this patch chose to only create a queue for MAC and VLAN
messages, since these are the only messages sent in a large enough
volume to cause the reset loop. Other messages are very unlikely to
overflow the mailbox Tx FIFO so easily.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace the PCI specific legacy power management hooks with the new
generic power management hooks which work properly for both suspend and
hibernate. The new generic system is better and properly handles the
lower level PCIe power management rather than forcing the driver to
handle it.
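The generic pattern being adopted looks roughly like this; the callback
bodies are elided and the wiring shown replaces the legacy .suspend and
.resume PCI hooks:

  #include <linux/pci.h>
  #include <linux/pm.h>

  /* Generic PM callbacks take a struct device; the PCI core now handles
   * the low-level PCIe power-state transitions on our behalf. */
  static int fm10k_suspend(struct device *dev) { /* quiesce */ return 0; }
  static int fm10k_resume(struct device *dev)  { /* restart */ return 0; }

  static SIMPLE_DEV_PM_OPS(fm10k_pm_ops, fm10k_suspend, fm10k_resume);

  static struct pci_driver fm10k_driver = {
          /* ... id_table, probe, remove ... */
          .driver = {
                  .pm = &fm10k_pm_ops,
          },
  };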
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Let's not re-invent the locking wheel. Remove our bitlock and use
a proper spinlock instead.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If we lose PCIe link, such as when an unannounced PFLR event occurs, or
when a device is surprise removed, we currently detach the device and
close the netdev. This unfortunately leaves a lot of things still
active, such as the msix_mbx_pf IRQ, and Tx/Rx resources.
This can cause problems because the register reads will return
potentially invalid values which may result in unknown driver behavior.
Begin the process of resetting using fm10k_prepare_for_reset(), much in
the same way as the suspend and resume cycle does. This will attempt to
shutdown as much as possible, in order to prevent possible issues.
A naive implementation for this has issues, because there are now
multiple flows calling the reset logic and setting a reset bit. This
would cause problems, because the "re-attach" routine might call
fm10k_handle_reset() prior to the reset actually finishing. Instead,
we'll add state bits to indicate which flow actually initiated the
reset.
For the general reset flow, we'll assume that if someone else is
resetting that we do not need to handle it at all, so it does not need
its own state bit. For the suspend case, we will simply issue a warning
indicating that we are attempting to recover from this case when
resuming.
For the detached subtask, we'll simply refuse to re-attach until we've
actually initiated a reset as part of that flow.
Finally, we'll stop attempting to manage the mailbox subtask when we're
detached, since there's nothing we can do if we don't have a PCIe
address.
Overall this produces a much cleaner shutdown and recovery cycle for
a PCIe surprise remove event.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2017-10-02
This series contains updates to i40e and i40evf.
Shannon Nelson fixes an issue where when a machine has more CPUs than
queue pairs, the counting gets a "little funky" and turns off Flow
Director. So to correct it, limit the number of LAN queues initially
allocated to be sure there are some left for Flow Director and other
features.
Lihong cleans up dead code by removing a condition check which cannot
ever be true.
Christophe Jaillet fixes a potential NULL pointer dereference, which
could happen if kzalloc() fails.
Filip corrects the reporting of supported link modes, which was incorrect
for some NICs, and adds support for the 'ethtool -m' command, which
displays information about QSFP+ modules.
Mariusz adds functions to read/write the LED registers to control the
LEDS, instead of accessing the registers directly whenever the LEDs
need to be controlled.
Jake fixes a regression where we introduced a scheduling-while-atomic
bug, by introducing a separate helper function which manages its own
need for the mac_filter_hash_lock. He also cleans up the "PF" parameter
in i40e_vc_disable_vf() since it is never used and is not needed, and
fixes a rare case where it is possible that a reset does not occur when
i40e_vc_disable_vf() is called, by modifying i40e_reset_vf() to return a
bool indicating whether it reset or not, so that i40e_vc_disable_vf()
can wait until a reset actually occurs.
Alan adds the ability for the VF to request more or fewer underlying
allocated queues from the PF. He also fixes the incorrect method for
clearing the vf_states variable with a NULL assignment, when we should
be using atomic bitops since we don't actually want to clear all the
flags, and fixes a resource leak where the PF driver failed to inform
clients of a VF reset because we were incorrectly checking the
I40E_VF_STATE_PRE_ENABLE bit.
Mitch converts i40evf_map_rings_to_vectors() to a void function since
it cannot fail and allows us to clean up the checks for the function
return value.
Scott enables the driver(s) to pass traffic with VLAN tags using the
802.1ad Ethernet protocol.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable i40e to pass traffic with VLAN tags using the 802.1ad Ethernet
protocol ID (0x88a8).
This requires NIC firmware providing version 1.7 of the API. With older
NIC firmware, 802.1ad tagged packets will continue to be dropped.
No VLAN offloads nor RSS are supported for 802.1ad VLANs.
Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently there is a bug in which the PF driver fails to inform clients
of a VF reset which then causes clients to leak resources. The bug
exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE
bit.
When a VF is first initialized we go through a reset to initialize
variables and allocate resources, but we don't want to inform clients of
this first reset since the client isn't fully enabled yet, so we set a
state bit signifying that we're in a "pre-enabled" client state. During
the first reset we should be clearing the bit, allowing all following
resets to notify the client of the reset when the bit is not set. This
patch fixes the issue by negating the 'test_and_clear_bit' check to
accurately reflect the behavior we want.
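The corrected check reads roughly as follows (surrounding reset code
elided; the notify call name may differ slightly in the driver):

  /* The first reset clears PRE_ENABLE and skips the notification; every
   * later reset finds the bit clear and therefore informs the clients. */
  if (!test_and_clear_bit(I40E_VF_STATE_PRE_ENABLE, &vf->vf_states))
          i40e_notify_client_of_vf_reset(pf, abs_vf_id);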
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we inappropriately clear the vf_states variable with a null
assignment. This is problematic because we should be using atomic
bitops on this variable and we don't actually want to clear all the
flags. We should just clear the ones we know we want to clear.
Additionally remove the I40E_VF_STATE_FCOEENA bit because it is no
longer being used.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This function cannot fail, so why is it returning a value? And why are
we checking it? Why shouldn't we just make it void? Why is this commit
message made up of only questions?
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the VF gets a default number of allocated queues from HW on
init and it can choose to enable or disable those allocated queues.
This patch makes it possible for the VF to request more or fewer
underlying allocated queues from the PF.
First the VF negotiates the number of queues it wants that can be
supported by the PF and if successful asks for a reset. During reset
the PF will reallocate the HW queues for the VF and will then remap the
new queues.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is possible, although rare, that we may not reset when
i40e_vc_disable_vf() is called. This can lead to some weird
circumstances with some values not being properly set. Modify
i40e_reset_vf() to return a code indicating whether it reset or not.
Now, i40e_vc_disable_vf() can wait until a reset actually occurs. If the
reset does not occur within a reasonable time frame, we'll display a
warning message.
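A schematic of the wait loop, with the retry count and sleep range chosen
here for illustration:

  int i;

  for (i = 0; i < 20; i++) {
          /* i40e_reset_vf() now reports whether a reset actually ran. */
          if (i40e_reset_vf(vf, false))
                  return;
          usleep_range(10000, 20000);
  }
  dev_warn(&vf->pf->pdev->dev,
           "VF reset did not occur within the expected time\n");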
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace i40e_vc_notify_vf_reset and i40e_reset_vf with a call to
i40e_vc_disable_vf which does this exact thing. This matches similar
code patterns throughout the driver.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It's never used, and the PF structure can be reached from the VF
structure if necessary. Let's just drop the extra unneeded parameter.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When we refactored handling of the PVID in commit 9af52f60b2
("i40e: use (add|rm)_vlan_all_mac helper functions when changing PVID")
we introduced a scheduling while atomic regression.
This occurred because we now held the spinlock across a call to
i40e_reset_vf(), which results in a usleep_range() call that triggers
a scheduling while atomic bug. This was rare as it only occurred if the
user configured a VLAN on a VF and also attempted to reconfigure the VF
from the host system with a port VLAN.
We do need to hold the lock while calling i40e_is_vsi_in_vlan(), but we
should not be holding it while we reset the VF.
We'll fix this by introducing a separate helper function
i40e_vsi_has_vlans which checks whether we have a PVID and whether the
VSI has configured VLANs. This helper function will manage its own need
for the mac_filter_hash_lock.
Then, we can move the acquiring of the spinlock until after we reset the
VF, which ensures that we do not sleep while holding the lock.
Using a separate function like this makes the code more clear and is
easier to read than attempting to release and re-acquire the spinlock
when we reset the VF.
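A sketch of the helper's shape; the lock and field names follow the
commit text, other details are simplified:

  /* Check whether the VSI has VLAN filters configured (and no PVID),
   * taking and releasing the filter-hash lock internally so callers
   * never hold it across a VF reset, which may sleep. */
  static bool i40e_vsi_has_vlans(struct i40e_vsi *vsi)
  {
          bool have_vlans;

          if (vsi->info.pvid)
                  return false;

          spin_lock_bh(&vsi->mac_filter_hash_lock);
          have_vlans = i40e_is_vsi_in_vlan(vsi);
          spin_unlock_bh(&vsi->mac_filter_hash_lock);

          return have_vlans;
  }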
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of accessing the register directly, use the newly added AQC in
order to blink the LEDs. Introduce and utilize a new flag to prevent
excessive API version checking.
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for the 'ethtool -m' command, which displays
information about the (Q)SFP+ module plugged into the NIC's cage.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes incorrect reporting of supported link modes on some NICs.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If 'kzalloc()' fails, a NULL pointer will be dereferenced.
Return an error code (-ENOMEM) instead.
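The fix is the standard allocation check, e.g.:

  buf = kzalloc(sizeof(*buf), GFP_KERNEL);
  if (!buf)
          return -ENOMEM;    /* propagate the failure instead of crashing */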
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>