linux

Author	SHA1	Message	Date
Alexander Duyck	e486bdfd7c	i40e/i40evf: Add txring_txq function to match fm10k and ixgbe This patch adds a txring_txq function which allows us to convert a i40e_ring/i40evf_ring to a netdev_tx_queue structure. This way we can avoid having to make a multi-line function call for all the spots that need access to this. Change-ID: Ic063b71d8b92ea406d2c32e798c8e2b02809d65b Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-09-24 22:22:17 -07:00
Alexander Duyck	64bfd68eae	i40e: Fix Flow Director raw_buf cleanup The Tx cleanup flow was incorrectly assuming it could check for the flow director bits after it had unmapped the buffer. However in this case it results in us trying to free a raw_buf as though it is an sk_buff. To fix this I am moving up the flag test for the FD_SB bit so that when find a non-NULL skb or raw_buf value we then check the flag and use the appropriate call to free the buffer. Change-ID: I6284034ba1ea87c9922e56f6eb3181f7f09bddde Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-09-24 22:15:12 -07:00
Alexander Duyck	841493a3f6	i40e: Limit TX descriptor count in cases where frag size is greater than 16K The i40e driver was incorrectly assuming that we would always be pulling no more than 1 descriptor from each fragment. It is in fact possible for us to end up with the case where 2 descriptors worth of data may be pulled when a frame is larger than one of the pieces generated when aligning the payload to either 4K or pieces smaller than 16K. To adjust for this we just need to make certain to test all the way to the end of the fragments as it is possible for us to span 2 descriptors in the block before us so we need to guarantee that even the last 6 descriptors have enough data to fill a full frame. Change-ID: Ic2ecb4d6b745f447d334e66c14002152f50e2f99 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-09-22 22:33:41 -07:00
Carolyn Wyborny	ffeac83685	i40e: refactor tail_bump check This patch refactors tail bump check. Change-ID: Ide0e19171d67d90cb2b06b8dcd4fa791ae120160 Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-08-19 21:24:28 -07:00
David S. Miller	da54bb13c0	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2016-07-22 This series contains updates to i40e and i40evf. Heinrich Schuchardt found a possible null pointer being dereferenced in i40e_debug_aq(), fixed the issue by doing the variable assignment after we are sure the pointer is not null. Avinash fixed an issue when link was down, we were not showing the correct advertised link modes. Mitch cleans up a useless initializer since the variable is assigned right away. Refactors the receive filter handling to properly track filter adds and deletes so the driver will not lose filters during a reset and up/down cycles. Also added a tracking mechanism so that the driver knows when to enter and leave promiscuous mode. Catherine removes a device id which is not needed (or used). Moves a mutex lock since we need to lock the client list around the i40e_client_release() call to prevent the release from interrupting the client instances while they are being added. Joshua adds Hyper-V specific VF device ids. Amitoj Kaur Chawla cleans up a redundant memset() call before a memcpy(). Stefan Assmann adds the missing link advertise for some x710 NICs. Tushar Dave fixes and issue found on SPARC, where a PF reset clears MAC filters and if a platform-specific MAC address is used, the driver has to explicitly write default MAC address to MAC filters otherwise all incoming traffic destined to the default MAC address will be dropped after reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-25 10:43:07 -07:00
Mitch Williams	88dc9e6fed	i40e/i40evf: remove useless initializer This initializer isn't needed because the variable is assigned right away. Change-ID: I6ce3edb3f4e0364db248a7a0bcc62ca95c01d941 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-22 00:07:03 -07:00
Alexander Duyck	858296c878	i40e/i40evf: Fix i40e_rx_checksum There are a couple of issues I found in i40e_rx_checksum while doing some recent testing. As a result I have found the Rx checksum logic is pretty much broken and returning that the checksum is valid for tunnels in cases where it is not. First the inner types are not the correct values to use to test for if a tunnel is present or not. In addition the inner protocol types are not a bitmask as such performing an OR of the values doesn't make sense. I have instead changed the code so that the inner protocol types are used to determine if we report CHECKSUM_UNNECESSARY or not. For anything that does not end in UDP, TCP, or SCTP it doesn't make much sense to report a checksum offload since it won't contain a checksum anyway. This leaves us with the need to set the csum_level based on some value. For that purpose I am using the tunnel_type field. If the tunnel type is GRENAT or greater then this means we have a GRE or UDP tunnel with an inner header. In the case of GRE or UDP we will have a possible checksum present so for this reason it should be safe to set the csum_level to 1 to indicate that we are reporting the state of the inner header. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-14 23:17:45 -07:00
Alexander Duyck	bf2d1df395	intel: Add support for IPv6 IP-in-IP offload This patch adds support for offloading IPXIP6 type packets that represent either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header. In addition with this change we should also be able to support FOU encapsulated traffic with outer IPv6 headers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 19:25:52 -04:00
Tom Herbert	7e13318daa	net: define gso types for IPx over IPv4 and IPv6 This patch defines two new GSO definitions SKB_GSO_IPXIP4 and SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and NETIF_F_GSO_IPXIP6. These are used to described IP in IP tunnel and what the outer protocol is. The inner protocol can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT are removed (these are both instances of SKB_GSO_IPXIP4). SKB_GSO_IPXIP6 will be used when support for GSO with IP encapsulation over IPv6 is added. Signed-off-by: Tom Herbert <tom@herbertland.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 18:03:15 -04:00
Jesse Brandeburg	ab9ad98eb5	i40evf: refactor receive routine This is part 2 of the Rx refactor series, just including changes to i40evf. This refactor aligns the receive routine with the one in ixgbe which was highly optimized. This reduces the code we have to maintain and allows for (hopefully) more readable and maintainable RX hot path. In order to do this: - consolidate the receive path into a single function that doesn't use packet split but does use pages for Rx buffers. - remove the old _1buf routine - consolidate several routines into helper functions - remove VF ethtool control over packet split - remove priv_flags interface since it is unused Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 22:42:58 -07:00
Jesse Brandeburg	19b85e677d	i40evf: Drop packet split receive routine As part of preparation for the rx-refactor, remove the packet split receive routine and ancillary code. Some of the split related context set up code stays in i40e_virtchnl_pf.c in case an older VF driver tries to load and still wants to use packet split. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 22:31:23 -07:00
Jesse Brandeburg	f8a952cb40	i40e/i40evf: Refactor tunnel interpretation Refactor the interpretation of a tunnel. This removes some code and lets us start using the hardware's parsing. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 18:24:10 -07:00
Alexander Duyck	1c7b4a23d1	i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM This patch makes it so that i40e and i40evf can use GSO_PARTIAL to support segmentation for frames with checksums enabled in outer headers. As a result we can now send data over these types of tunnels at over 20Gb/s versus the 12Gb/s that was previously possible on my system. The advantage with the i40e parts is that this offload is mostly transparent as the hardware still deals with the inner and/or outer IPv4 headers so the IP ID is still incrementing for both when this offload is performed. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-01 17:05:16 -07:00
Jesse Brandeburg	a149f2c323	i40e/i40evf: Only offload VLAN tag if enabled The driver was offloading the VLAN tag into the skb any time there was a VLAN tag and the hardware stripping was enabled. Just check to make sure it's enabled before put_tag. Change-Id: Ife95290c06edd9a616393b38679923938b382241 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-27 13:05:47 -07:00
Alexander Duyck	577389a5db	i40e/i40evf: Add support for IPIP and SIT offloads Looking over the documentation it turns out enabling IPIP and SIT offloads for i40e is pretty straightforward. As such I decided to enable them with this patch. In my testing I am seeing an improvement of 8 to 10 Gb/s for IPIP and SIT tunnels with this offload enabled. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-26 03:29:49 -07:00
David S. Miller	1602f49b58	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts were two cases of simple overlapping changes, nothing serious. In the UDP case, we need to add a hlist_add_tail_rcu() to linux/rculist.h, because we've moved UDP socket handling away from using nulls lists. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-23 18:51:33 -04:00
Alexander Duyck	3f3f7cb875	i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet This patch addresses a bug introduced based on my interpretation of the XL710 datasheet. Specifically section 8.4.1 states that "A single transmit packet may span up to 8 buffers (up to 8 data descriptors per packet including both the header and payload buffers)." It then later goes on to say that each segment for a TSO obeys the previous rule, however it then refers to TSO header and the segment payload buffers. I believe the actual limit for fragments with TSO and a skbuff that has payload data in the header portion of the buffer is actually only 7 fragments as the skb->data portion counts as 2 buffers, one for the TSO header, and one for a segment payload buffer. Fixes: `2d37490b82` ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check") Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-13 19:59:23 -07:00
Jesse Brandeburg	1f15d66712	i40e/i40evf: Faster RX via avoiding FCoE As it turns out, calling into other files from hot path hurts performance a lot. In this case the majority of the time we call "check FCoE" and the packet is not FCoE, but this call was taking 5% of our total cycles spent on receive. Change-ID: I080552c26e7060bc7b78504dc2763f6f0b3d8c76 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 18:26:23 -07:00
Jesse Brandeburg	84b079928a	i40e/i40evf: Drop unused tx_ring argument Some of the tx_ring arguments can be deleted since they are not used. Change-ID: I99275b0f191d7f63ec2f05061919904940c36f31 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 18:15:17 -07:00
Jesse Brandeburg	d1bd743b5b	i40e/i40evf: Move stack var deeper A local variable could move down inside the context where it is used. Change-ID: I9caba9e1eacf921037077f2665cbce83fd8e95d6 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 18:10:57 -07:00
Alexander Duyck	24d41e5e2c	i40e/i40evf: Fix TSO checksum pseudo-header adjustment With IPv4 and IPv6 now using the same format for checksums based on the length of the frame we need to update the i40e and i40evf drivers so that they correctly account for lengths greater than or equal to 64K. With this patch the driver should now correctly update checksums for frames up to 16776960 in length which should be more than large enough for all possible TSO frames in the near future. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 20:34:51 -07:00
Jesse Brandeburg	4ea623922d	i40e/i40evf: Fix casting in transmit code Simple cast to fix a sparse warning. Fixes: commit `5453205cd0` ("i40e/i40evf: Enable support for SKB_GSO_UDP_TUNNEL_CSUM") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 02:05:34 -07:00
Alexander Duyck	a619afe814	i40e/i40evf: Add support for bulk free in Tx cleanup This patch enables bulk Tx clean for skbs. In order to enable it we need to pass the napi_budget value as that is used to determine if we are truly running in NAPI mode or if we are simply calling the routine from netpoll with a budget of 0. In order to avoid adding too many more variables I thought it best to pass the VSI directly in a fashion similar to what we do on igb and ixgbe with the q_vector. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 01:58:53 -07:00
Alexander Duyck	f2edaaaa39	i40e/i40evf: Fix handling of boolean logic in polling routines In the polling routines for i40e and i40evf we were using bitwise operators to avoid the side effects of the logical operators, specifically the fact that if the first case is true with "\|\|" we skip the second case, or if it is false with "&&" we skip the second case. This fixes an earlier patch that converted the bitwise operators over to the logical operators and instead replaces the entire thing with just an if statement since it should be more readable what we are trying to do this way. Fixes: `1a36d7fadd` ("i40e/i40evf: use logical operators, not bitwise") Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 01:54:12 -07:00
Alexander Duyck	5c4654daf2	i40e/i40evf: Allow up to 12K bytes of data per Tx descriptor instead of 8K From what I can tell the practical limitation on the size of the Tx data buffer is the fact that the Tx descriptor is limited to 14 bits. As such we cannot use 16K as is typically used on the other Intel drivers. However artificially limiting ourselves to 8K can be expensive as this means that we will consume up to 10 descriptors (1 context, 1 for header, and 9 for payload, non-8K aligned) in a single send. I propose that we can reduce this by increasing the maximum data for a 4K aligned block to 12K. We can reduce the descriptors used for a 32K aligned block by 1 by increasing the size like this. In addition we still have the 4K - 1 of space that is still unused. We can use this as a bit of extra padding when dealing with data that is not aligned to 4K. By aligning the descriptors after the first to 4K we can improve the efficiency of PCIe accesses as we can avoid using byte enables and can fetch full TLP transactions after the first fetch of the buffer. This helps to improve PCIe efficiency. Below is the results of testing before and after with this patch: Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % U us/KB us/KB Before: 87380 16384 16384 10.00 33682.24 20.27 -1.00 0.592 -1.00 After: 87380 16384 16384 10.00 34204.08 20.54 -1.00 0.590 -1.00 So the net result of this patch is that we have a small gain in throughput due to a reduction in overhead for putting together the frame. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 01:28:44 -07:00
Alexander Duyck	3bc67973e8	i40e/i40evf: Move Tx checksum closer to TSO On all of the other Intel drivers we place checksum close to TSO as they have a significant amount in common and it can help to reduce the decision tree for how to handle the frame as the first check in TSO is to see if checksumming is offloaded, and if it is not we can skip _BOTH_ TSO and Tx checksum offload based on a single check. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 23:30:19 -08:00
Alexander Duyck	2d37490b82	i40e/i40evf: Rewrite logic for 8 descriptor per packet check This patch is meant to rewrite the logic for how we determine if we can transmit the frame or if it needs to be linearized. The previous code for this function was using a mix of division and modulus division as a part of computing if we need to take the slow path. Instead I have replaced this by simply working with a sliding window which will tell us if the frame would be capable of causing a single packet to span several descriptors. The logic for the scan is fairly simple. If any given group of 6 fragments is less than gso_size - 1 then it is possible for us to have one byte coming out of the first fragment, 6 fragments, and one or more bytes coming out of the last fragment. This gives us a total of 8 fragments which exceeds what we can allow so we send such frames to be linearized. Arguably the use of modulus might be more exact as the approach I propose may generate some false positives. However the likelihood of us taking much of a hit for those false positives is fairly low, and I would rather not add more overhead in the case where we are receiving a frame composed of 4K pages. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 23:27:05 -08:00
Alexander Duyck	4ec441df25	i40e/i40evf: Break up xmit_descriptor_count from maybe_stop_tx In an upcoming patch I would like to have access to the descriptor count used for the data portion of the frame. For this reason I am splitting up the descriptor count function from the function that stops the ring. Also in order to try and reduce unnecessary duplication of code I am moving the slow-path portions of the code out of being inline calls so that we can just jump to them and process them instead of having to build them into each function that calls them. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 23:23:51 -08:00
Alexander Duyck	5453205cd0	i40e/i40evf: Enable support for SKB_GSO_UDP_TUNNEL_CSUM The XL722 has support for providing the outer UDP tunnel checksum on transmits. Make use of this feature to support segmenting UDP tunnels with outer checksums enabled. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 14:46:23 -08:00
Alexander Duyck	fad57330b6	i40e/i40evf: Clean-up Rx packet checksum handling This is mostly a minor clean-up for the Rx checksum path in order to avoid some of the unnecessary conditional checks that were being applied. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 14:38:56 -08:00
Alexander Duyck	529f1f652e	i40e/i40evf: Add exception handling for Tx checksum Add exception handling to the Tx checksum path so that we can handle cases of TSO where the frame is bad, or Tx checksum where we didn't recognize a protocol Drop I40E_TX_FLAGS_CSUM as it is unused, move the CHECKSUM_PARTIAL check into the function itself so that we can decrease indent. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 11:05:58 -08:00
Alexander Duyck	475b4205aa	i40e/i40evf: Do not write to descriptor unless we complete This patch defers writing to the Tx descriptor bits until we know we have successfully completed a given operation. So for example we defer updating the tunnelling portion of the context descriptor until we have fully identified the type. The advantage to this approach is that we can assemble values as we go instead of having to try and kludge everything together all at once. As a result we can significantly clean up the tunneling configuration for instance as we can just do a pointer walk and do the math for the distance between each set of points. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 11:00:56 -08:00
Alexander Duyck	a3fd9d8876	i40e/i40evf: Handle IPv6 extension headers in checksum offload This patch adds support for IPv6 extension headers in setting up the Tx checksum. Without this patch extension headers would cause IPv6 traffic to fail as the transport protocol could not be identified. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:49:42 -08:00
Alexander Duyck	a0064728f8	i40e/i40evf: Add support for IPv4 encapsulated in IPv6 This patch fixes two issues. First was the fact that iphdr(skb)->protocl was being used to test for the outer transport protocol. This completely breaks IPv6 support. Second was the fact that we cleared the flag for v4 going to v6, but we didn't take care of txflags going the other way. As such we would have the v6 flag still set even if the inner header was v4. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:45:12 -08:00
Alexander Duyck	b96b78f2b7	i40e/i40evf: Replace header pointers with unions of pointers in Tx checksum path The Tx checksum path was maintaining a set of 3 pointers and two lengths in order to prepare the packet for being checksummed. The thing is we only really needed 2 pointers, and the lengths that were being maintained can easily be computed. As such we can replace the IPv4 and IPv6 header pointers with one single union that represents both, or a generic pointer to the start of the network header. For the L4 headers we can do the same with TCP and a generic pointer to the start of the transport header. The length of the TCP header is obtained by simply multiplying doff by 4, and the network header length can be obtained by subtracting the network header pointer from the transport header pointer. While I was at it I renamed l4_hdr to l4_proto to make it a bit more clear and less likely to be confused with l4.hdr which is the transport header pointer. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:40:22 -08:00
Alexander Duyck	c777019af1	i40e/i40evf: Consolidate all header changes into TSO function This patch goes through and pulls all of the spots where we were updating either the TCP or IP checksums in the TSO and checksum path into the TSO function. The general idea here is that we should only be updating the header after we verify we have completed a skb_cow_head check to verify the head is writable. One other advantage to doing this is that it makes things much more obvious. For example, in the case of IPv6 there was one spot where the offset of the IPv4 header checksum was being updated which is obviously incorrect. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:37:15 -08:00
Alexander Duyck	c49a7bc330	i40e/i40evf: Factor out L4 header and checksum from L3 bits in TSO path This patch makes it so that the L4 header offsets and such can be ignored when dealing with the L3 checksum and length update. This is done making use of two things. First we can just use the offset from the L4 header to the start of the packet to determine the L4 offset, and from that we can then make use of the data offset to determine the full length of the headers. As far as adjusting the checksum to remove the length we can simply add the inverse of the length instead of having to recompute the entire pseudo-header without the length. In the case of an IPv6 header this should be significantly cheaper since we can make use of a value we already needed instead of having to read the source and destination address out of the packet. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:34:05 -08:00
Alexander Duyck	03f9d6a59f	i40e/i40evf: Use u64 values instead of casting them in TSO function Instead of casing u32 values to u64 it makes more sense to just start out with u64 values in the first place. This way we don't need to create a mess with all of the casts needed to populate a 64b value. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:30:55 -08:00
Alexander Duyck	a9c9a81f58	i40e/i40evf: Drop outer checksum offload that was not requested The i40e and i40evf drivers contained code for inserting an outer checksum on UDP tunnels. The issue however is that the upper levels of the stack never requested such an offload and it results in possible errors. In addition the same logic was being applied to the Rx side where it was attempting to validate the outer checksum, but the logic there was incorrect in that it was testing for the resultant sum to be equal to the header checksum instead of being equal to 0. Since this code is so massively flawed, and doing things that we didn't ask for it to do I am just dropping it, and will bring it back later to use as an offload for SKB_GSO_UDP_TUNNEL_CSUM which can make use of such a feature. As far as the Rx feature I am dropping it completely since it would need to be massively expanded and applied to IPv4 and IPv6 checksums for all parts, not just the one that supports Tx checksum offload for the outer. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-18 10:27:45 -08:00
Mitch Williams	16fd08b859	i40e/i40evf: avoid atomics In the case where we have a page fully used by receive data, we need to release the page fully to the stack. Instead of calling get_page (which increments the page count) followed by free_page (which decrements the page count), just donate our reference to the stack. Although this donation is not tax deductible, it does allow us to avoid two very expensive atomic operations that reverse each other. Change-ID: If70739792d5748995fc175ec92ac2171ed4ad8fc Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 23:11:07 -08:00
Anjali Singhai Jain	dd353109e4	i40e: Add a SW workaround for lost interrupts This patch adds a workaround for cases where we might have interrupts that got lost but WB happened. If that happens without this patch we will see a tx_timeout. To work around it, this patch goes ahead and reschedules NAPI in that situation, if NAPI is not already scheduled. We also add a counter in ethtool to keep track of when we detect a case of tx_lost_interrupt. Note: napi_reschedule() can be safely called from process/service_task context and is done in other drivers as well without an issue. Change-ID: I00f98f1ce3774524d9421227652bef20fcbd0d20 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 22:44:00 -08:00
Mitch Williams	1a36d7fadd	i40e/i40evf: use logical operators, not bitwise Mr. Spock would certainly raise an eyebrow to see us using bitwise operators, when we should clearly be relying on logic. Fascinating. Change-ID: Ie338010c016f93e9faa2002c07c90b15134b7477 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 15:26:48 -08:00
Mitch Williams	f16704e5e8	i40e/i40evf: use pages correctly in Rx Refactor the packet split Rx code to properly use half-pages for receives. The previous code was doing way more mapping and unmapping than it needed to, and wasn't properly using half-pages. Increment the page use count each time we give a half-page to an skb, knowing that the stack will probably process and release the page before we need it again. Only free and reallocate pages if the count shows that both half-pages are in use. Add counters to track reallocations and page reuse. Change-ID: I534b299196036b64be82b4861a0a4036310a8f22 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 15:22:22 -08:00
Jesse Brandeburg	dd1a5df872	i40e/i40evf: use __GFP_NOWARN The i40e and i40evf drivers now cleanly handle allocation failures and can avoid kernel log spew from the memory allocator when allocations fail, so set __GFP_NOWARN on Rx buffer alloc. Change-ID: Ic9e1b83c495e2a3ef6b069ba7fb6e52ce134cd23 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 15:15:37 -08:00
Jesse Brandeburg	c2e245ab1e	i40e/i40evf: try again after failure This is the "Don't Give Up" patch. Previously the driver could fail an allocation, and then possibly stall a queue forever, by never coming back to continue receiving or allocating buffers. With this patch, the driver will keep polling trying to allocate receive buffers until it succeeds. This should keep all receive queues running even in the face of memory pressure. Also update copyright year in file header. Change-ID: I2b103d1ce95b9831288a7222c3343ffa1988b81b Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 14:43:48 -08:00
Jesse Brandeburg	40d72a5098	i40e/i40evf: don't lose interrupts While re-enabling interrupts the driver would clear all pending causes. This meant that if an interrupt was generated while the driver was cleaning or polling with interrupts disabled, then that interrupt was lost. This could cause a queue to become dead, especially for receive. Refactored the enable_icr0 function in order to allow it to be decided by the caller whether the CLEARPBA (clear pending events) bit will be set while re-enabling the interrupt. Also update copyright year in file headers. Change-ID: Ic1db100a05e13c98919057696db147a258ca365a Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 14:39:07 -08:00
Anjali Singhai Jain	ecc6a239e8	i40e: Refactor force_wb and WB_ON_ITR functionality code Now that the Force-WriteBack functionality in X710/XL710 devices has been moved out of the clean routine and into the service task, we need to make sure WriteBack-On-ITR is separated out since it is still called from clean. In the X722 devices, Force-WriteBack implies WriteBack-On-ITR but without the interrupt, which put the driver into a missed interrupt scenario and a potential tx-timeout report. With this patch, we break the two functions out, and call the appropriate ones at the right place. This will avoid creating missed interrupt like scenarios for X722 devices. Also update copyright year in file headers. Change-ID: Iacbde39f95f332f82be8736864675052c3583a40 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 14:31:12 -08:00
Shannon Nelson	e9f6563d7b	i40e: do TSO only if CHECKSUM_PARTIAL is set Don't bother trying to set up a TSO if the skb->ip_summed is not set to CHECKSUM_PARTIAL. Change-ID: I6495b3568e404907a2965b48cf3e2effa7c9ab55 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 08:20:45 -08:00
Jesse Brandeburg	3578fa0a8c	i40e: fix bug in dma sync Driver was using an offset based off a DMA handle while mapping and unmapping using sync_single_range_for[cpu\|device], where it should be using DMA handle (returned from alloc_coherent) and the offset of the memory to be sync'd. Change-ID: I208256565b1595ff0e9171ab852de06b997917c6 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Nelson, Shannon <shannon.nelson@intel.com> Reviewed-by: Williams, Mitch A <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-17 08:16:18 -08:00
Anjali Singhai Jain	a3d772a392	i40e: fix write-back-on-itr to work with legacy itr We were not doing write-back on interrupt throttle for Legacy case in X722. This patch fixes that, so we do WB_ON_ITR for Legacy as well. Plus the issue that we should still be setting NO_ITR if we are touching the DYN_CTLN register since we do not want to change ITR setting here. Change-ID: I5db8491ee1544118a389db839cecc93e1bbc480e Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-16 18:50:15 -08:00
Anjali Singhai Jain	f6d83d1376	i40evf: add new write-back mode Add write-back on interrupt throttle rate timer expiration support for the i40evf driver, when running on X722 devices. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-03 22:01:01 -08:00
Anjali Singhai Jain	857942fd1a	i40e: Fix Rx hash reported to the stack by our driver If the driver calls skb_set_hash even with a zero hash, that indicates to the stack that the hash calculation is offloaded in hardware. So the Stack doesn't do a SW hash which is required for load balancing if the user decides to turn of rx-hashing on our device. This patch fixes the path so that we do not call skb_set_hash if the feature is disabled. Change-ID: Ic4debfa4ff91b5a72e447348a75768ed7a2d3e1b Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-01-08 01:30:11 -08:00
Jesse Brandeburg	4eeb1fff27	i40e: trivial fixes 1) remove duplicate include of tcp.h 2) put an ampersand at the end of a line instead of the beginning 3) remove a useless dev_info 4) match declaration of function to the implementation 5) repair incorrect comment 6) correct whitespace 7) remove unused define Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-03 02:23:39 -08:00
Mitch Williams	44cdb791ae	i40e/i40evf: use logical operator We shouldn't be using a bitwise operator here; it's not a bitwise operation. Use a logical operator instead. Why doesn't c have a logical-or-and-assign operator? Change-ID: Id84f3ca884910bed7073c84b1e16a102e958d0de Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-03 02:23:38 -08:00
Kiran Patil	a42e7a369e	i40e: Fix memory leaks, sideband filter programming This patch fixes the memory leak which would be seen otherwise when user programs flow-director filter using ethtool (sideband filter programming). When ethtool is used to program flow directory filter, 'raw_buf' gets allocated and it is supposed to be freed as part of queue cleanup. But check of 'tx_buffer->skb' was preventing it from being freed. Change-ID: Ief4f0a1a32a653180498bf6e987c1b4342ab8923 Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-03 02:23:31 -08:00
Kiran Patil	9c6c12595b	i40e: Detection and recovery of TX queue hung logic moved to service_task from tx_timeout This patch contains following changes: - detection and recovery logic (issue SW interrupt) has been moved to service_task from timeout function. - added some more debug info from tx_timeout. Logic to detect and recover TX queue hung is now two step process: - service_task detects TX queue hung and sets a bit(hung_detected) if it was not set. - if bit was set (means this is back-back hung condition detected), issue SW interrupt and clear the bit. - napi_poll clears the bit unconditionally since it cleans TX/RX queues. Change-ID: Ieed03a48927c845a988b3ff375090bf37caeb903 Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-03 02:23:31 -08:00
Mitch Williams	0dd438d8ad	i40evf: allocate ring structs dynamically Instead of awkwardly keeping a fixed array of pointers in the adapter struct and then allocating ring structs individually, just keep a single pointer and allocate a single blob for the arrays. This simplifies code, shrinks the adapter structure, and future-proofs the driver by not limiting the number of rings we can handle. Change-ID: I31334ff911a6474954232cfe4bc98ccca3c769ff Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-01 22:56:30 -08:00
Jesse Brandeburg	b74118f083	i40e/i40evf: prefetch skb data on transmit Issue a prefetch for data early in the transmit path. This should not be generally needed for Tx traffic, but it helps immensely for pktgen workloads and should help for forwarding workloads as well. Change-ID: Iefee870c20599e0c4240e1d8637e4f16b625f83a Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-01 22:29:50 -08:00
Anjali Singhai Jain	6a7fded776	i40e/i40evf: Fix RS bit update in Tx path and disable force WB workaround This patch fixes the issue of forcing WB too often causing us to not benefit from NAPI. Without this patch we were forcing WB/arming interrupt too often taking away the benefits of NAPI and causing a performance impact. With this patch we disable force WB in the clean routine for X710 and XL710 adapters. X722 adapters do not enable interrupt to force a WB and benefit from WB_ON_ITR and hence force WB is left enabled for those adapters. For XL710 and X710 adapters if we have less than 4 packets pending a software Interrupt triggered from service task will force a WB. This patch also changes the conditions for setting RS bit as described in code comments. This optimizes when the HW does a tail bump amd when it does a WB. It also optimizes when we do a wmb. Change-ID: Id831e1ae7d3e2ec3f52cd0917b41ce1d22d75d9d Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-01 22:29:04 -08:00
Anjali Singhai Jain	164c9f5463	i40e/i40evf: Add a stat to track how many times we have to do a force WB When in NAPI with interrupts disabled, the HW needs to be forced to do a write back on TX if the number of descriptors pending are less than a cache line. This stat helps keep track of how many times we get into this situation. Change-ID: I76c1bcc7ebccd6bffcc5aa33bfe05f2fa1c9a984 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-11-25 10:05:56 -08:00
Shannon Nelson	9c883bd3eb	i40e/i40evf: remove unused tunnel parameter Code was moved into a separate function some time ago. Change-ID: Icabbe71ce05cf5d716d3e1152cdd9cd41d11bcb5 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-11-25 10:05:54 -08:00
Eric Dumazet	93f93a4404	net: move skb_mark_napi_id() into core networking stack We would like to automatically provide busy polling support to all NAPI drivers, without them having to implement anything. skb_mark_napi_id() can be called from napi_gro_receive() and napi_get_frags(). Few drivers are still calling skb_mark_napi_id() because they use netif_receive_skb(). They should eventually call napi_gro_receive() instead. I will leave this to drivers maintainers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:41 -05:00
Jesse Brandeburg	ee2319cf17	i40e/i40evf: adjust interrupt throttle less frequently The adaptive ITR (interrupt throttle rate) algorithm was adjusting the hardware's interrupt rate too frequently. This caused a lot of variation in the interrupt rate for fairly constant workloads. Change the code to have a counter and adjust only once every N number of interrupts. Change-ID: I0460f1f86571037484eca5aca36ac4d889cb8389 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-19 15:35:12 -07:00
Jesse Brandeburg	c56625d597	i40e/i40evf: change dynamic interrupt thresholds The dynamic algorithm, while now working, doesn't have good performance in 40G mode. One part of this patch addresses the high CPU utilization of some small streaming workloads that the driver should reduce CPU in. It also changes the minimum ITR that the dynamic algorithm will settle on, causing our minimum latency to go from 12us to about 14us, when using adaptive mode. It also changes the BULK interrupt rate to allow maximum throughput on a 40Gb connection with a single thread of transmit, clamping interrupt rate to 8000 for TX makes single thread traffic go too slow. The new ULTRA bulk setting is introduced and is used when the Rx packet rate on this queue exceeds 40000 packets per second. This value of 40000 was chosen because the automatic tuning of minimum ITR=20us means that a single queue can't quite achieve that many packets per second from a round-robin test. Change-ID: Icce8faa128688ca5fd2c4229bdd9726877a92ea2 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-19 15:35:04 -07:00
Jesse Brandeburg	51cc6d9fcc	i40e/i40evf: fix bug in throttle rate math The driver was using a value expressed in 2us increments for the divisor to figure out our bytes/usec values. Fix the usecs variable to contain a value in microseconds. Change-ID: I5c20493103c295d6f201947bb908add7040b7c41 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-19 15:34:30 -07:00
Jesse Brandeburg	8f5e39ce92	i40e/i40evf: refactor IRQ enable function This change moves a multi-line register setting into a function which simplifies reading the flow of the enable function. This also fixes a bug where the enable function was enabling the interrupt twice while trying to update the two interrupt throttle rate thresholds for Rx and Tx. Change-ID: Ie308f9d0d48540204590cb9d7a5a7b1196f959bb Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-19 15:33:57 -07:00
Jesse Brandeburg	32b3e08fff	drivers/net/intel: use napi_complete_done() As per Eric Dumazet's previous patches: (see commit (`24d2e4a507`) - tg3: use napi_complete_done()) Quoting verbatim: Using napi_complete_done() instead of napi_complete() allows us to use /sys/class/net/ethX/gro_flush_timeout GRO layer can aggregate more packets if the flush is delayed a bit, without having to set too big coalescing parameters that impact latencies. </end quote> Tested configuration: low latency via ethtool -C ethx adaptive-rx off rx-usecs 10 adaptive-tx off tx-usecs 15 workload: streaming rx using netperf TCP_MAERTS igb: MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1176930056 1475.36 797726 16384.00 71905 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1175182320 50476.00 23282 16384.00 71816 i40e: Hard to test because the traffic is incoming so fast (24Gb/s) that GRO always receives 87kB, even at the highest interrupt rate. Other drivers were only compile tested. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-16 04:33:46 -07:00
Alexander Duyck	8b65035905	i40e/i40evf: Drop useless "IN_NETPOLL" flag The code in i40e and i40evf is using an "IN_NETPOLL" flag that has never added any value due to the fact that the Rx clean-up is handled in NAPI. As such the flag was set, the queue was scheduled via NAPI, and then polled from the netpoll controller and if any Rx packets were processed the were processed in the wrong context. In addition the flag itself just added an unneeded conditional to the hot-path so it can safely be dropped and save us a few instructions. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-16 04:28:57 -07:00
Alexander Duyck	c67caceb86	i40e/i40evf: Fix handling of napi budget The polling routine for i40e was rounding up the budget for Rx cleanup to 1. This is incorrect as the netpoll poll call is expecting no Rx to be processed as the budget passed was 0. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-16 04:26:33 -07:00
Jesse Brandeburg	6995b36c0f	i40e/i40evf: clean up some code Add missings spaces after declarations, remove another __func__ use, remove uncessary braces, remove unneeded breaks, and useless returns, and generally fix up some code. Change-ID: Ie715d6b64976c50e1c21531685fe0a2bd38c4244 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-08 16:19:55 -07:00
Anjali Singhai Jain	2fc3d7152a	i40e/i40evf: Add a stat to keep track of linearization count Keep track of how many times we ask the stack to linearize the skb because the HW cannot handle skbs with more than 8 frags per segment/single packet. Change-ID: If455452060963a769bbe6112cba952e79e944b52 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-07 14:35:45 -07:00
Jiang Liu	27ca275350	i40evf: Use numa_mem_id() to better support memoryless node Function i40e_clean_rx_irq() tries to reuse memory pages allocated from the nearest node. To better support memoryless node, use numa_mem_id() instead of numa_node_id() to get the nearest node with memory. This change should only affect performance. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-03 19:02:53 -07:00
Jesse Brandeburg	0deda86836	i40e/i40evf: fix Tx hang workaround code The arm writeback (arm_wb) code is used for kicking the Tx ring to make sure any pending work is completed even if interrupts are disabled. It was running when it didn't need to, and not clearing the ring->arm_wb state after it was set. This caused Tx hangs to still occur occasionally when there really was no hang. Fix this by resetting the variable right after it was used. Change-ID: I7bf75d552ba9c4bd203d40615213861a24bb5594 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-30 05:21:26 -07:00
Kiran Patil	b03a8c1f4c	i40e/i40evf: refactor tx timeout logic This patch modifies the driver timeout logic by issuing a writeback request via a software interrupt to the hardware the first time the driver detects a hang. The driver was too aggressive in resetting a hung queue, so back that off by removing logic to down the netdevice after too many hangs, and move the function to the service task. Change-ID: Ife100b9d124cd08cbdb81ab659008c1b9abbedea Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-28 17:38:27 -07:00
Anjali Singhai Jain	b1f3366b86	i40evf: Use the correct defines to match the VF registers Use CTLN1 instead of CTLN for the VF relative register space. Change-ID: Iefba63faf0307af55fec8dbb64f26059f7d91318 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-26 15:02:59 -07:00
Anjali Singhai Jain	527274c78e	i40e/i40evf: Add TX/RX outer UDP checksum support for X722 X722 supports offloading of outer UDP TX and RX checksum for tunneled packets. This patch exposes the support and leaves it enabled by default. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-05 16:53:45 -07:00
Anjali Singhai Jain	8e0764b4d6	i40e/i40evf: Add support for writeback on ITR feature for X722 X722 fixes an issue from X710 where TX descriptor WB would not happen if the interrupts were disabled. In order for the write backs to happen a bit needs to be set in the dynamic interrupt control register called WB_ON_ITR. With this feature, the SW driver need not arm SW interrupts to work around the issue in X710. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-05 16:53:45 -07:00
Jesse Brandeburg	41a1d04b9d	i40e: use BIT and BIT_ULL macros Use macros for abstracting (1 << foo) to BIT(foo) and (1ULL << foo64) to BIT_ULL(foo64) in order to match better with kernel requirements. NOTE: the adminq_cmd.h file was not modified on purpose because of the dependency upon firmware for that file. Change-ID: I73ee2e48c880d671948aad19bd53ca6b2ac558fc Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-07-23 05:33:55 -07:00
Carolyn Wyborny	de32e3efd5	i40e/i40evf: Fix and refactor dynamic ITR code This patch changes the switch statement for dynamic interrupt throttling and adds a default case. With this patch, we check the latency setting instead of the current ITR settings and the included refactor improves performance. Without this patch, the ITR setting would never change dynamically, and there was no default. Change-ID: Idb5a8a14c7109ec47c90f6e94bd43baa17d7ee37 Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-07-23 05:13:08 -07:00
Jesse Brandeburg	489ce7a463	i40e/i40evf: improve Tx performance with a small tweak Add a prefetch for the next Tx descriptor to be used when we know there are more coming. Change-ID: Ibb9acab11d508eec2db7da795df74debc16eeacb Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-07-14 18:26:40 -07:00
Mitch Williams	67c818a1d5	i40evf: fix panic during MTU change Down was requesting queue disables, but then exited immediately without waiting for the queues to actually disable. This could allow any function called after i40evf_down to run immediately, including i40evf_up, and causes a memory leak. Removing the whole reinit_locked function is the best way to go about this, and allows for the driver to handle the state changes by requesting reset from the periodic timer. Also, add a couple WARN_ONs in slow path to help us recognize if we re-introduce this issue or missed any cases. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-06-26 02:51:31 -07:00
David S. Miller	941742f497	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-06-08 20:06:56 -07:00
Anjali Singhai Jain	30520831f0	i40e/i40evf: Fix mixed size frags and linearization This patch fixes a bug where the i40e Tx queue will hang if this skb is passed to the driver. With mixed size fragments while using TSO there was a corner case where we needed to linearize but we were not. This was seen with iSCSI traffic and could be reproduced with a frag list that looks like this: num_frags = 17, gso_segs = 17, hdr_len = 66, skb_shinfo(skb)->gso_size = 1448 size = 3002, j = 1, frag_size = 2936, num_frags = 17 size = 4268, j = 1, frag_size = 4096, num_frags = 16 size = 5534, j = 1, frag_size = 4096, num_frags = 15 size = 5352, j = 1, frag_size = 4096, num_frags = 14 size = 5170, j = 1, frag_size = 4096, num_frags = 13 size = 3468, j = 1, frag_size = 2576, num_frags = 12 size = 750, j = 1, frag_size = 112, num_frags = 11 size = 862, j = 2, frag_size = 112, num_frags = 10 size = 974, j = 3, frag_size = 112, num_frags = 9 size = 1126, j = 4, frag_size = 152, num_frags = 8 size = 1330, j = 5, frag_size = 204, num_frags = 7 size = 1534, j = 6, frag_size = 204, num_frags = 6 size = 356, j = 1, frag_size = 204, num_frags = 5 size = 560, j = 2, frag_size = 204, num_frags = 4 size = 764, j = 3, frag_size = 204, num_frags = 3 size = 968, j = 4, frag_size = 204, num_frags = 2 size = 1140, j = 5, frag_size = 172, num_frags = 1 result: linearize = 0, j = 6 Change-ID: I79bb1aeab0af255fe2ce28e93672a85d85bf47e8 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-06-04 20:06:06 -07:00
Jesse Brandeburg	335075989f	i40e/i40evf: remove time_stamp member The driver doesn't use the time_stamp member to determine if there is a tx_hang any more. There really isn't any point to the variable at all so just remove it. It was left over from a previous tx_hang design. Change-ID: I4c814827e1bcb46e45118fe37acdcfa814fb62a0 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-28 03:46:03 -07:00
Jesse Brandeburg	3e587cf3c1	i40e/i40evf: force inline transmit functions Inlining these functions gives us about 15% more 64 byte packets per second when using pktgen. 13.3 million to 15 million with a single queue. Also fix the function names in i40evf to i40evf not i40e while we are touching the function header. Change-ID: I3294ae9b085cf438672b6db5f9af122490ead9d0 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-28 03:42:22 -07:00
Jesse Brandeburg	8f6a2b05c6	i40evf: skb->xmit_more support Eric added support for skb->xmit_more in i40e, this ports that into i40evf as well. Support skb->xmit_more in i40evf is straightforward; we need to move around i40e_maybe_stop_tx() call to correctly test netif_xmit_stopped() before taking the decision to not kick the NIC. Change-ID: Idddda6a2e4a7ab335631c91ced51f55b25eb8468 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-28 03:41:30 -07:00
Greg Rose	6b02a174c1	i40e/i40evf: Remove unneeded TODO There's no need for a counter so remove the TODO comment. Change-ID: I3321dda04934c4f5fda9b279ab666192bda44214 Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-28 03:38:43 -07:00
Anjali Singhai Jain	89232c3bf7	i40e/i40evf: Add ATR support for tunneled TCP/IPv4/IPv6 packets. Without this, RSS would have done inner header load balancing. Now we can get the benefits of ATR for tunneled packets to better align TX and RX queues with the right core/interrupt. Change-ID: I07d0e0a192faf28fdd33b2f04c32b2a82ff97ddd Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-28 03:37:31 -07:00
françois romieu	4ffd3c730e	net: batch of last_rx update avoidance in ethernet drivers. None of those drivers uses last_rx for its own needs. See `4dc89133f4` ("net: add a comment on netdev->last_rx") for reference. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Zhangfei Gao <zhangfei.gao@linaro.org> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-14 17:38:17 -04:00
Alexander Duyck	67317166dd	i40e/i40evf: Use dma_rmb where appropriate Update i40e and i40evf to use dma_rmb. This should improve performance by decreasing the barrier overhead on strong ordered architectures. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 14:25:26 -04:00
Greg Rose	31eaaccff0	i40e/i40evf: Set Ethernet protocol correctly when Tx VLAN offloads are disabled If transmit VLAN HW offloads are disabled then the network stack sends up an skb with the protocol set to 8021q. In that case to get the correct checksum offloads we have to reset the skb protocol to the encapsulated ethertype. Change-ID: I903d78533de09b1c5d3ec695ee1990dd0fa5dd0d Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-03 03:54:30 -07:00
Jesse Brandeburg	8b6ed9c202	i40e/i40evf: fix bug when skb allocation fails If the skb allocation fails we should not continue using the skb pointer. Breaking out at the point of failure means that at the next RX interrupt the driver will try the allocation again. Change-ID: Iefaad69856ced7418bfd92afe55322676341f82e Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-03 03:54:29 -07:00
Anjali Singhai Jain	818f2e7b25	i40evf: Fix Outer UDP RX checksum code Inner protocol being UDP should not stop us from verifying Outer UDP checksum correctness. If the Outer protocol is not UDP (NVGRE) we should not be doing a UDP checksum check. If the packet has zero checksum, skip checksum check. Change-ID: Ie7f153feb276a59f66a54a0938901b2c0a8100fa Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <james.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-03 03:54:28 -07:00
Jesse Brandeburg	97bf75f169	i40e/i40evf: fix accidental write to ITR registers Fix a bug introduced in the force writeback code, where the interrupt rate was set to 0 (maximum) by accident. The driver must correctly set the NOITR fields to avoid ITR update as a side effect of triggering the software interrupt. Change-ID: I290851ae04ef3811c43aab5ee33242029f26c1a3 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-09 05:20:46 -07:00
Jesse Brandeburg	016890b994	i40e/i40evf: enable prefetch of Tx descriptors during cleanup Performance can be improved a bit by imitating ixgbe and using prefetch to get us the next Tx descriptor. Change-ID: Ice7ffd4cd0ce87c35295059bdb7972a7f53723aa Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-07 07:01:05 -08:00
Anjali Singhai Jain	4599120466	i40e/i40evf: Simplify tunnel selection logic Use l4_tunnel type generically to keep code flow simple. Change-ID: Ic52287e3b1ca4204e6b6e13431890c1a6ae9c422 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-07 07:00:13 -08:00
Shannon Nelson	37a2973a05	i40e/i40evf: Refactor i40e_debug_aq and make some functions static A sparse complaint in i40e_debug_aq in a funky buffer write goes away by straightening out the code out to something less convoluted. Also fix some other sparse warnings while we are at it, making some functions static and using NULL instead of 0. Change-ID: I93907534fe1f1f675830774b3d14ecf1c6ffc9a0 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-07 04:59:40 -08:00
David S. Miller	71a83a6db6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/rocker/rocker.c The rocker commit was two overlapping changes, one to rename the ->vport member to ->pport, and another making the bitmask expression use '1ULL' instead of plain '1'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 21:16:48 -05:00
Anjali Singhai Jain	f8faaa40b4	i40e/i40evf: Add missing packet types for VXLAN encapsulated packet types We were missing a few packet types for VXLAN offload. This patch fixes that. Change-ID: I4b23aa0b08e40ed49d0df6c49a5ed9f2009b44ce Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-03 01:07:27 -08:00
Anjali Singhai	85e76d0312	i40evf: TCP/IPv6 over Vxlan Tx checksum offload fix We were checking the outer Protocol flags and deciding the flow for inner header. This patch fixes that. This fixes the Tx checksum offload for TCP/IPv6 over vxlan. Change-ID: I837aaea921d34f71b24c2bc32aaadea5001ddf78 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-02-26 04:54:01 -08:00

1 2 3 4

176 Commits