linux

Author	SHA1	Message	Date
Jacob Keller	e5f36ad14c	igb: check for Tx timestamp timeouts during watchdog The igb driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an igb_ptp_tx_hang() function similar to the already existing igb_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:03:17 -07:00
Jacob Keller	c3b8f85ec2	igb: add statistic indicating number of skipped Tx timestamps The igb driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 01:02:05 -07:00
Jacob Keller	74344e32fc	igb: avoid permanent lock of *_PTP_TX_IN_PROGRESS The igb driver uses a state bit lock to avoid handling more than one Tx timestamp request at once. This is required because hardware is limited to a single set of registers for Tx timestamps. The state bit lock is not properly cleaned up during igb_xmit_frame_ring() if the transmit fails such as due to DMA or TSO failure. In some hardware this results in blocking timestamps until the service task times out. In other hardware this results in a permanent lock of the timestamp bit because we never receive an interrupt indicating the timestamp occurred, since indeed the packet was never transmitted. Fix this by checking for DMA and TSO errors in igb_xmit_frame_ring() and properly cleaning up after ourselves when these occur. Reported-by: Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:53:48 -07:00
Arnd Bergmann	000ba1f2eb	igb: mark PM functions as __maybe_unused The new wake function is only used by the suspend/resume handlers that are defined in inside of an #ifdef, which can cause this harmless warning: drivers/net/ethernet/intel/igb/igb_main.c:7988:13: warning: 'igb_deliver_wake_packet' defined but not used [-Wunused-function] Removing the #ifdef, instead using a __maybe_unused annotation simplifies the code and avoids the warning. Fixes: `b90fa87635` ("igb: Enable reading of wake up packet") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 00:51:36 -07:00
Kim Tatt Chuah	b90fa87635	igb: Enable reading of wake up packet Currently, in igb_resume(), igb driver ignores the Wake Up Status (WUS) and Wake Up Packet Memory (WUPM) registers. This patch enables the igb driver to read the WUPM if the controller was woken by a wake up packet that is not more than 128 bytes long (maximum WUPM size), then pass it up the kernel network stack. Signed-off-by: Kim Tatt Chuah <kim.tatt.chuah@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:45 -07:00
Yury Kylulin	4827cc3779	igb/igbvf: Add VF MAC filter request capabilities Add functionality for the VF to request up to 3 additional MAC filters. This is done using existing E1000_VF_SET_MAC_ADDR message, but with additional message info - E1000_VF_MAC_FILTER_CLR to clear all unicast MAC filters previously set for this VF and E1000_VF_MAC_FILTER_ADD to add MAC filter. Additional filters can be added only in case if administrator did not set VF MAC explicitly and allowed to change default MAC to the VF. Due to the limited number of RAR entries reserve at least 3 MAC filters for the PF. If SRIOV is supported by the NIC after this change RAR entries starting from 1 to (RAR MAX ENTRIES - NUM SRIOV VFS) will be used for PF and VF MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:44 -07:00
Yury Kylulin	83c21335c8	igb: improve MAC filter handling Using the work which was done for ixgbe driver by Jacob Keller commit `5d7daa35b9` ("ixgbe: improve mac filter handling") and Alexander Duyck commit `0f079d2283` ("ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses") and out-of-tree igb driver add functionality to manage (add and delete) MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-20 16:32:44 -07:00
Alexander Duyck	3a1eb6d10c	igb/ixgbe: Fix typo in igb_build_skb and/or ixgbe_build_skb code comment There was a typo that I had left in the code comments for the igb and ixgbe functions that enabled build_skb support. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:55:55 -07:00
Alexander Duyck	b1bb2eb0a0	igb: Re-add support for build_skb in igb This reverts commit `f9d40f6a99` ("igb: Revert support for build_skb in igb") and adds a few changes to update it to work with the latest version of igb. We are now able to revert the removal of this due to the fact that with the recent changes to the page count and the use of DMA_ATTR_SKIP_CPU_SYNC we can make the pages writable so we should not be invalidating the additional data added when we call build_skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	e014272672	igb: Break out Rx buffer page management At this point we have 2 to 3 paths that can be taken depending on what Rx modes are enabled. In order to better support that and improve the maintainability I am breaking out the common bits from those paths and making them into their own functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	e3cdf68d4a	igb: Add support for padding packet With the size of the frame limited we can now write to an offset within the buffer instead of having to write at the very start of the buffer. The advantage to this is that it allows us to leave padding room for things like supporting XDP in the future. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	8649aaef40	igb: Add support for using order 1 pages to receive large frames This patch adds support for using 3K buffers in order 1 pages the same way we were using 2K buffers in 4K pages. We are reserving 1K of room for now to have space available for future headroom and tailroom when we enable build_skb support. One side effect of this patch is that we can end up using a larger buffer if jumbo frames is enabled. The impact shouldn't be too great, but it could hurt small packet performance for UDP workloads if jumbo frames is enabled as the truesize of frames will be larger. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	3456fd5342	igb: Use page_address offset from page instead of masking virtual address Update the handling of page addresses so that we always refer to them using a void pointer, and try to use the consistent name of va indicating we are working with a virtual address. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	cfbc871c21	igb: Limit maximum frame Rx based on MTU In order to support the use of build_skb going forward it will be necessary to place a maximum limit on the amount of data we can receive when jumbo frames is not enabled. In order to do this I am adding a new upper limit for receive based on the size of a 2K buffer minus padding. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	7cc6fd4c60	igb: Don't bother clearing Tx buffer_info in igb_clean_tx_ring In the case of the Tx rings we need to only clear the Tx buffer_info when we are resetting the rings. Ideally we do this when we configure the ring to bring it back up instead of when we are taking it down in order to avoid dirtying pages we don't need to. In addition we don't need to clear the Tx descriptor ring since we will fully repopulate it when we begin transmitting frames and next_to_watch can be cleared to prevent the ring from being cleaned beyond that point instead of needing to touch anything in the Tx descriptor ring. Finally with these changes we can avoid having to reset the skb member of the Tx buffer_info structure in the cleanup path since the skb will always be associated with the first buffer which has next_to_watch set. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	d2bead576e	igb: Clear Rx buffer_info in configure instead of clean This change makes it so that instead of going through the entire ring on Rx cleanup we only go through the region that was designated to be cleaned up and stop when we reach the region where new allocations should start. In addition we can avoid having to perform a memset on the Rx buffer_info structures until we are about to start using the ring again. By deferring this we can avoid dirtying the cache any more than we have to which can help to improve the time needed to bring the interface down and then back up again in a reset or suspend/resume cycle. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	7ec0116c91	igb: Use length to determine if descriptor is done This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. In addition I have updated the code so that we only reset the Rx descriptor length for descriptor zero when resetting a ring instead of having to do a memset with 0 over the entire ring. By doing this we can save some time on initialization. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:44 -07:00
Alexander Duyck	7bd1759282	igb: Add support for DMA_ATTR_WEAK_ORDERING Since we are already using DMA attributes in igb for Rx there is no reason why we can't also apply DMA_ATTR_WEAK_ORDERING which is needed on some platforms to improve performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-17 12:11:43 -07:00
Tobias Klauser	4a7c972644	net: Remove usage of net_device last_rx member The network stack no longer uses the last_rx member of struct net_device since the bonding driver switched to use its own private last_rx in commit `9f24273837` ("bonding: use last_arp_rx in slave_last_rx()"). However, some drivers still (ab)use the field for their own purposes and some driver just update it without actually using it. Previously, there was an accompanying comment for the last_rx member added in commit `4dc89133f4` ("net: add a comment on netdev->last_rx") which asked drivers not to update is, unless really needed. However, this commend was removed in commit `f8ff080dac` ("bonding: remove useless updating of slave->dev->last_rx"), so some drivers added later on still did update last_rx. Remove all usage of last_rx and switch three drivers (sky2, atp and smc91c92_cs) which actually read and write it to use their own private copy in netdev_priv. Compile-tested with allyesconfig and allmodconfig on x86 and arm. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Mirko Lindner <mlindner@marvell.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-18 17:22:49 -05:00
David S. Miller	02ac5d1487	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two AF_* families adding entries to the lockdep tables at the same time. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-11 14:43:39 -05:00
Alexander Duyck	2976db8018	mm: rename __page_frag functions to __page_frag_cache, drop order from drain This patch does two things. First it goes through and renames the __page_frag prefixed functions to __page_frag_cache so that we can be clear that we are draining or refilling the cache, not the frags themselves. Second we drop the order parameter from __page_frag_cache_drain since we don't actually need to pass it since all fragments are either order 0 or must be a compound page. Link: http://lkml.kernel.org/r/20170104023954.13451.5678.stgit@localhost.localdomain Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-01-10 18:31:55 -08:00
stephen hemminger	bc1f44709c	net: make ndo_get_stats64 a void function The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-08 17:51:44 -05:00
Todd Fujinaka	9474933caf	igb: close/suspend race in netif_device_detach Similar to ixgbe, when an interface is part of a namespace it is possible that igb_close() may be called while __igb_shutdown() is running which ends up in a double free WARN and/or a BUG in free_msi_irqs(). Extend the rtnl_lock() to protect the call to netif_device_detach() and igb_clear_interrupt_scheme() in __igb_shutdown() and check for netif_device_present() to avoid calling igb_clear_interrupt_scheme() a second time in igb_close(). Also extend the rtnl lock in igb_resume() to netif_device_attach(). Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-01-06 02:18:51 -08:00
Guilherme G Piccoli	69b97cf6db	igb: re-assign hw address pointer on reset after PCI error Whenever the igb driver detects the result of a read operation returns a value composed only by F's (like 0xFFFFFFFF), it will detach the net_device, clear the hw_addr pointer and warn to the user that adapter's link is lost - those steps happen on igb_rd32(). In case a PCI error happens on Power architecture, there's a recovery mechanism called EEH, that will reset the PCI slot and call driver's handlers to reset the adapter and network functionality as well. We observed that once hw_addr is NULL after the error is detected on igb_rd32(), it's never assigned back, so in the process of resetting the network functionality we got a NULL pointer dereference in both igb_configure_tx_ring() and igb_configure_rx_ring(). In order to avoid such bug, this patch re-assigns the hw_addr value in the slot_reset handler. Reported-by: Anthony H Thai <ahthai@us.ibm.com> Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com> Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-01-06 02:18:51 -08:00
Cao jin	629823b872	igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr When running as guest, under certain condition, it will oops as following. writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr is NULL. While other register access won't oops kernel because they use wr32/rd32 which have a defense against NULL pointer. [ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal) error received: id=0101 [ 141.225523] igb 0000:01:00.1: PCIe Bus Error: severity=Uncorrected (Fatal), type=Unaccessible, id=0101(Unregistered Agent ID) [ 141.299442] igb 0000:01:00.1: broadcast error_detected message [ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now detached [ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now detached [ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset [ 143.465994] igb 0000:01:00.1: broadcast slot_reset message [ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002) [ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002) [ 145.312078] igb 0000:01:00.1: broadcast resume message [ 145.322211] BUG: unable to handle kernel paging request at 0000000000003818 [ 145.361275] IP: [<ffffffffa02fd38d>] igb_configure_tx_ring+0x14d/0x280 [igb] [ 145.400048] PGD 0 [ 145.438007] Oops: 0002 [#1] SMP A similar issue & solution could be found at: http://patchwork.ozlabs.org/patch/689592/ Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-01-06 02:18:50 -08:00
Alexander Duyck	bd4171a5d4	igb: update code to better handle incrementing page count Update the driver code so that we do bulk updates of the page reference count instead of just incrementing it by one reference at a time. The advantage to doing this is that we cut down on atomic operations and this in turn should give us a slight improvement in cycles per packet. In addition if we eventually move this over to using build_skb the gains will be more noticeable. Link: http://lkml.kernel.org/r/20161110113616.76501.17072.stgit@ahduyck-blue-test.jf.intel.com Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Keguang Zhang <keguang.zhang@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-14 16:04:08 -08:00
Alexander Duyck	5be5955425	igb: update driver to make use of DMA_ATTR_SKIP_CPU_SYNC The ARM architecture provides a mechanism for deferring cache line invalidation in the case of map/unmap. This patch makes use of this mechanism to avoid unnecessary synchronization. A secondary effect of this change is that the portion of the page that has been synchronized for use by the CPU should be writable and could be passed up the stack (at least on ARM). The last bit that occurred to me is that on architectures where the sync_for_cpu call invalidates cache lines we were prefetching and then invalidating the first 128 bytes of the packet. To avoid that I have moved the sync up to before we perform the prefetch and allocate the skbuff so that we can actually make use of it. Link: http://lkml.kernel.org/r/20161110113611.76501.98897.stgit@ahduyck-blue-test.jf.intel.com Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Keguang Zhang <keguang.zhang@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-14 16:04:08 -08:00
David S. Miller	2745529ac7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Couple conflicts resolved here: 1) In the MACB driver, a bug fix to properly initialize the RX tail pointer properly overlapped with some changes to support variable sized rings. 2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix overlapping with a reorganization of the driver to support ACPI, OF, as well as PCI variants of the chip. 3) In 'net' we had several probe error path bug fixes to the stmmac driver, meanwhile a lot of this code was cleaned up and reorganized in 'net-next'. 4) The cls_flower classifier obtained a helper function in 'net-next' called __fl_delete() and this overlapped with Daniel Borkamann's bug fix to use RCU for object destruction in 'net'. It also overlapped with Jiri's change to guard the rhashtable_remove_fast() call with a check against tc_skip_sw(). 5) In mlx4, a revert bug fix in 'net' overlapped with some unrelated changes in 'net-next'. 6) In geneve, a stale header pointer after pskb_expand_head() bug fix in 'net' overlapped with a large reorganization of the same code in 'net-next'. Since the 'net-next' code no longer had the bug in question, there was nothing to do other than to simply take the 'net-next' hunks. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 12:29:53 -05:00
Alexander Duyck	516165a1e2	igb/igbvf: Don't use lco_csum to compute IPv4 checksum In the case of IPIP and SIT tunnel frames the outer transport header offset is actually set to the same offset as the inner transport header. This results in the lco_csum call not doing any checksum computation over the inner IPv4/v6 header data. In order to account for that I am updating the code so that we determine the location to start the checksum ourselves based on the location of the IPv4 header and the length. Fixes: `e10715d3e9` ("igb/igbvf: Add support for GSO partial") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-01 15:41:35 -05:00
Jarod Wilson	91c527a556	ethernet/intel: use core min/max MTU checking e100: min_mtu 68, max_mtu 1500 - remove e100_change_mtu entirely, is identical to old eth_change_mtu, and no longer serves a purpose. No need to set min_mtu or max_mtu explicitly, as ether_setup() will already set them to 68 and 1500. e1000: min_mtu 46, max_mtu 16110 e1000e: min_mtu 68, max_mtu varies based on adapter fm10k: min_mtu 68, max_mtu 15342 - remove fm10k_change_mtu entirely, does nothing now i40e: min_mtu 68, max_mtu 9706 i40evf: min_mtu 68, max_mtu 9706 igb: min_mtu 68, max_mtu 9216 - There are two different "max" frame sizes claimed and both checked in the driver, the larger value wasn't relevant though, so I've set max_mtu to the smaller of the two values here to retain identical behavior. igbvf: min_mtu 68, max_mtu 9216 - Same issue as igb duplicated ixgb: min_mtu 68, max_mtu 16114 - Also remove pointless old == new check, as that's done in dev_set_mtu ixgbe: min_mtu 68, max_mtu 9710 ixgbevf: min_mtu 68, max_mtu dependent on hardware/firmware - Some hw can only handle up to max_mtu 1504 on a vf, others 9710 CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-18 11:34:18 -04:00
Todd Fujinaka	0742337c19	igb: bump version to igb-5.4.0 Bump igb version to match other igb drivers. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-09-27 19:00:52 -07:00
Moshe Shemesh	79aab093a0	net: Update API for VF vlan protocol 802.1ad support Introduce new rtnl UAPI that exposes a list of vlans per VF, giving the ability for user-space application to specify it for the VF, as an option to support 802.1ad. We adjusted IP Link tool to support this option. For future use cases, the new UAPI supports multiple vlans. For now we limit the list size to a single vlan in kernel. Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward compatibility with older versions of IP Link tool. Add a vlan protocol parameter to the ndo_set_vf_vlan callback. We kept 802.1Q as the drivers' default vlan protocol. Suitable ip link tool command examples: Set vf vlan protocol 802.1ad: ip link set eth0 vf 1 vlan 100 proto 802.1ad Set vf to VST (802.1Q) mode: ip link set eth0 vf 1 vlan 100 proto 802.1Q Or by omitting the new parameter ip link set eth0 vf 1 vlan 100 Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:01:26 -04:00
Gangfeng Huang	0e71def252	igb: add support of RX network flow classification This patch is meant to allow for RX network flow classification to insert and remove Rx filter by ethtool. Ethtool interface has it's own rules manager Show all filters: $ ethtool -n eth0 4 RX rings available Total 2 rules Signed-off-by: Ruhao Gao <ruhao.gao@ni.com> Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-08-18 22:27:47 -07:00
Linus Torvalds	c8d0267efd	PCI changes for the v4.8 merge window: Enumeration Move ecam.h to linux/include/pci-ecam.h (Jayachandran C) Add parent device field to ECAM struct pci_config_window (Jayachandran C) Add generic MCFG table handling (Tomasz Nowicki) Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki) Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki) Resource management Add devm_request_pci_bus_resources() (Bjorn Helgaas) Unify pci_resource_to_user() declarations (Bjorn Helgaas) Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas) Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas) Make PCI I/O space optional on ARM32 (Bjorn Helgaas) Ignore write combining when mapping I/O port space (Bjorn Helgaas) Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas) Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas) Support I/O resources when parsing host bridge resources (Jayachandran C) Add helpers to request/release memory and I/O regions (Johannes Thumshirn) Use pci_(request\|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn) Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5)) Add generic pci_bus_claim_resources() (Lorenzo Pieralisi) Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi) Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi) Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya) Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu) PCI device hotplug Allow additional bus numbers for hotplug bridges (Keith Busch) Ignore interrupts during D3cold (Lukas Wunner) Power management Enforce type casting for pci_power_t (Andy Shevchenko) Don't clear d3cold_allowed for PCIe ports (Mika Westerberg) Put PCIe ports into D3 during suspend (Mika Westerberg) Power on bridges before scanning new devices (Mika Westerberg) Runtime resume bridge before rescan (Mika Westerberg) Add runtime PM support for PCIe ports (Mika Westerberg) Remove redundant check of pcie_set_clkpm (Shawn Lin) Virtualization Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra) Add DMA alias quirk for Adaptec 3805 (Alex Williamson) Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake) Add ACS quirk for Solarflare SFC9220 (Edward Cree) MSI Fix PCI_MSI dependencies (Arnd Bergmann) Add pci_msix_desc_addr() helper (Christoph Hellwig) Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig) Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig) Provide sensible IRQ vector alloc/free routines (Christoph Hellwig) Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig) Error Handling Bind DPC to Root Ports as well as Downstream Ports (Keith Busch) Remove DPC tristate module option (Keith Busch) Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg) Generic host bridge driver Select IRQ_DOMAIN (Arnd Bergmann) Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi) ACPI host bridge driver Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki) Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki) Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki) Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki) Altera host bridge driver Check link status before retrain link (Ley Foon Tan) Poll for link up status after retraining the link (Ley Foon Tan) Axis ARTPEC-6 host bridge driver Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann) Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel) Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel) Intel VMD host bridge driver Use lock save/restore in interrupt enable path (Jon Derrick) Select device dma ops to override (Keith Busch) Initialize list item in IRQ disable (Keith Busch) Use x86_vector_domain as parent domain (Keith Busch) Separate MSI and MSI-X vector sharing (Keith Busch) Marvell Aardvark host bridge driver Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni) Add Aardvark PCI host controller driver (Thomas Petazzoni) Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni) Microsoft Hyper-V host bridge driver Fix interrupt cleanup path (Cathy Avery) Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov) Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov) NVIDIA Tegra host bridge driver Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren) Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren) Use lower-case hex consistently for register definitions (Thierry Reding) Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding) Stop setting pcibios_min_mem (Thierry Reding) Renesas R-Car host bridge driver Drop gen2 dummy I/O port region (Bjorn Helgaas) TI DRA7xx host bridge driver Fix return value in case of error (Christophe JAILLET) Xilinx AXI host bridge driver Fix return value in case of error (Christophe JAILLET) Miscellaneous Make bus_attr_resource_alignment static (Ben Dooks) Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks) MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven) Make host bridge drivers explicitly non-modular (Paul Gortmaker) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXoNRtAAoJEFmIoMA60/r8LMkP/3kiNh21QFS6RZGOaDft5/Py n14Zo0w51avspxoI3iyDlBd5q/SssMqi+2c6Ko/fh2D2xMxJgmQOjdMDrIGARxGA qEHk/5IoXquY2/GcptmCk3ap66cJ6kTovS4OPrb73m3fPuknFwFwdzExq22XHbnI crPya6xwQxPLc54VpY/TsgW8E+EKZd/3FW9wuzzNHXrXmTILyhBQzQAA0K470GMx wEXU6kc3M/XhRuF1zjV9/O+H/xguwfnbTpZLvd2NAF6uXKZoRytEHHtNnVqu1hoe UPpDS2xq32pMNbGxGqBetCdIbkY/hWOufmckHI7Yu2OfXBYyHBYMG2je1+nMPkOV WiFhhrchGt5KnEMUwXPS4ROqnSZVpZBl1Fd4s10GhUYkoE2HNKJXta398H9FR1jj 4NEVSi4mSX/+CkaoIN3lXYiaf9P0wv4Wppve4Scr30+VnLjJhm7Vw5La7v12oo6x otrJ/g98AkmnbuUdLeWBUS/+TOcdPjZYbw52rqBsbOOjFm51Zcj6D7kf5WcTypQy HzbvygSVabcioWehUG1uudC8pdJmQlUGx1aES/iu+mZEae4cuUFALu6hDBD9IYnZ 5JdwjVzI0UItEwT3rQt3t4xiAqHADQ0NAVNJVCeREdoy/YQpSoTWGXIpyqCZ1yCm aBykjRsxbKQXlhVeIxuc =NVxu -----END PGP SIGNATURE----- Merge tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Highlights: - ARM64 support for ACPI host bridges - new drivers for Axis ARTPEC-6 and Marvell Aardvark - new pci_alloc_irq_vectors() interface for MSI-X, MSI, legacy INTx - pci_resource_to_user() cleanup (more to come) Detailed summary: Enumeration: - Move ecam.h to linux/include/pci-ecam.h (Jayachandran C) - Add parent device field to ECAM struct pci_config_window (Jayachandran C) - Add generic MCFG table handling (Tomasz Nowicki) - Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki) - Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki) Resource management: - Add devm_request_pci_bus_resources() (Bjorn Helgaas) - Unify pci_resource_to_user() declarations (Bjorn Helgaas) - Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas) - Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas) - Make PCI I/O space optional on ARM32 (Bjorn Helgaas) - Ignore write combining when mapping I/O port space (Bjorn Helgaas) - Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas) - Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas) - Support I/O resources when parsing host bridge resources (Jayachandran C) - Add helpers to request/release memory and I/O regions (Johannes Thumshirn) - Use pci_(request\|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn) - Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5)) - Add generic pci_bus_claim_resources() (Lorenzo Pieralisi) - Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi) - Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi) - Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya) - Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu) PCI device hotplug: - Allow additional bus numbers for hotplug bridges (Keith Busch) - Ignore interrupts during D3cold (Lukas Wunner) Power management: - Enforce type casting for pci_power_t (Andy Shevchenko) - Don't clear d3cold_allowed for PCIe ports (Mika Westerberg) - Put PCIe ports into D3 during suspend (Mika Westerberg) - Power on bridges before scanning new devices (Mika Westerberg) - Runtime resume bridge before rescan (Mika Westerberg) - Add runtime PM support for PCIe ports (Mika Westerberg) - Remove redundant check of pcie_set_clkpm (Shawn Lin) Virtualization: - Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra) - Add DMA alias quirk for Adaptec 3805 (Alex Williamson) - Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake) - Add ACS quirk for Solarflare SFC9220 (Edward Cree) MSI: - Fix PCI_MSI dependencies (Arnd Bergmann) - Add pci_msix_desc_addr() helper (Christoph Hellwig) - Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig) - Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig) - Provide sensible IRQ vector alloc/free routines (Christoph Hellwig) - Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig) Error Handling: - Bind DPC to Root Ports as well as Downstream Ports (Keith Busch) - Remove DPC tristate module option (Keith Busch) - Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg) Generic host bridge driver: - Select IRQ_DOMAIN (Arnd Bergmann) - Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi) ACPI host bridge driver: - Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki) - Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki) - Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki) - Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki) Altera host bridge driver: - Check link status before retrain link (Ley Foon Tan) - Poll for link up status after retraining the link (Ley Foon Tan) Axis ARTPEC-6 host bridge driver: - Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann) - Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel) - Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel) Intel VMD host bridge driver: - Use lock save/restore in interrupt enable path (Jon Derrick) - Select device dma ops to override (Keith Busch) - Initialize list item in IRQ disable (Keith Busch) - Use x86_vector_domain as parent domain (Keith Busch) - Separate MSI and MSI-X vector sharing (Keith Busch) Marvell Aardvark host bridge driver: - Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni) - Add Aardvark PCI host controller driver (Thomas Petazzoni) - Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni) Microsoft Hyper-V host bridge driver: - Fix interrupt cleanup path (Cathy Avery) - Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov) - Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov) NVIDIA Tegra host bridge driver: - Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren) - Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren) - Use lower-case hex consistently for register definitions (Thierry Reding) - Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding) - Stop setting pcibios_min_mem (Thierry Reding) Renesas R-Car host bridge driver: - Drop gen2 dummy I/O port region (Bjorn Helgaas) TI DRA7xx host bridge driver: - Fix return value in case of error (Christophe JAILLET) Xilinx AXI host bridge driver: - Fix return value in case of error (Christophe JAILLET) Miscellaneous: - Make bus_attr_resource_alignment static (Ben Dooks) - Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks) - MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven) - Make host bridge drivers explicitly non-modular (Paul Gortmaker)" * tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (125 commits) PCI: xgene: Make explicitly non-modular PCI: thunder-pem: Make explicitly non-modular PCI: thunder-ecam: Make explicitly non-modular PCI: tegra: Make explicitly non-modular PCI: rcar-gen2: Make explicitly non-modular PCI: rcar: Make explicitly non-modular PCI: mvebu: Make explicitly non-modular PCI: layerscape: Make explicitly non-modular PCI: keystone: Make explicitly non-modular PCI: hisi: Make explicitly non-modular PCI: generic: Make explicitly non-modular PCI: designware-plat: Make it explicitly non-modular PCI: artpec6: Make explicitly non-modular PCI: armada8k: Make explicitly non-modular PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency PCI: Add ACS quirk for Solarflare SFC9220 arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700 PCI: aardvark: Add Aardvark PCI host controller driver dt-bindings: add DT binding for the Aardvark PCIe controller PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC values ...	2016-08-02 17:12:29 -04:00
Andrew Lunn	64f2525ca4	igb: Only DMA sync frame length On some platforms, syncing a buffer for DMA is expensive. Rather than sync the whole 2K receive buffer, only synchronise the length of the frame, which will typically be the MTU, or a much smaller TCP ACK. For an IMX6Q, this gives around 6% increased TCP receive performance, which is cache operations bound and reduces CPU load for TCP transmit. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 13:59:24 -07:00
Jacob Keller	8646f7b4cd	igb: call igb_ptp_suspend during suspend/resume cycle Properly stop the extra workqueue items and ensure that we resume cleanly. This is better than using igb_ptp_init and igb_ptp_stop since these functions destroy the PHC device, which will cause other problems if we do so. Since igb_ptp_reset now re-schedules the work-queue item we don't need an equivalent igb_ptp_resume in the resume workflow. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 11:14:31 -07:00
Jacob Keller	4f3ce71bb8	igb: re-use igb_ptp_reset in igb_ptp_init Modify igb_ptp_init to take advantage of igb_ptp_reset, and remove duplicated work that was occurring in both igb_ptp_reset and igb_ptp_init. In total, resetting the TSAUXC register, and resetting the system time both happen in igb_ptp_reset already. igb_ptp_reset now also takes care of starting the delayed work item for overflow checks, as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:56:11 -07:00
Johannes Thumshirn	56d766d64c	ethernet/intel: Use pci_(request\|release)_mem_regions Now that we do have pci_request_mem_regions() and pci_release_mem_regions() at hand, use it in the Intel ethernet drivers. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: David S. Miller <davem@davemloft.net>	2016-06-23 11:48:58 -05:00
Alexander Duyck	bf2d1df395	intel: Add support for IPv6 IP-in-IP offload This patch adds support for offloading IPXIP6 type packets that represent either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header. In addition with this change we should also be able to support FOU encapsulated traffic with outer IPv6 headers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 19:25:52 -04:00
Tom Herbert	7e13318daa	net: define gso types for IPx over IPv4 and IPv6 This patch defines two new GSO definitions SKB_GSO_IPXIP4 and SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and NETIF_F_GSO_IPXIP6. These are used to described IP in IP tunnel and what the outer protocol is. The inner protocol can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT are removed (these are both instances of SKB_GSO_IPXIP4). SKB_GSO_IPXIP6 will be used when support for GSO with IP encapsulation over IPv6 is added. Signed-off-by: Tom Herbert <tom@herbertland.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 18:03:15 -04:00
Alexander Duyck	e10715d3e9	igb/igbvf: Add support for GSO partial This patch adds support for partial GSO segmentation in the case of tunnels. Specifically with this change the driver an perform segmentation as long as the frame either has IPv6 inner headers, or we are allowed to mangle the IP IDs on the inner header. This is needed because we will not be modifying any fields from the start of the start of the outer transport header to the start of the inner transport header as we are treating them like they are just a block of IP options. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:26:37 -07:00
Jacob Keller	8008f68cb8	igb: make igb_update_pf_vlvf static Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:39:59 -07:00
Jacob Keller	a51d8c217b	igb: use BIT() macro or unsigned prefix For bitshifts, we should make use of the BIT macro when possible, and ensure that other bitshifts are marked as unsigned. This helps prevent signed bitshift errors, and ensures similar style. Make use of GENMASK and the unsigned postfix where BIT() isn't appropriate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:39:47 -07:00
Florian Westphal	4d0e965732	drivers: replace dev->trans_start accesses with dev_trans_start a trans_start struct member exists twice: - in struct net_device (legacy) - in struct netdev_queue Instead of open-coding dev->trans_start usage to obtain the current trans_start value, use dev_trans_start() instead. This is not exactly the same, as dev_trans_start also considers the trans_start values of the netdev queues owned by the device and provides the most recent one. For legacy devices this doesn't matter as dev_trans_start can cope with netdev trans_start values of 0 (they are ignored). This is a prerequisite to eventual removal of dev->trans_start. Cc: linux-rdma@vger.kernel.org Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-04 14:16:47 -04:00
Arika Chen	d99e366fc9	Revert "igb: Fix a deadlock in igb_sriov_reinit" This reverts commit `3eb14ea8d9` ("igb: Fix a deadlock in igb_sriov_reinit") It is the same as commit `f468adc944` ("igb: missing rtnl_unlock in igb_sriov_reinit()") There is no rtnl_lock() in igb_resume before, rtnl_unlock will cause a deadlock. Signed-off-by: Arika Chen <arika.chen@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 21:02:11 -07:00
John Holland	806ffb1d50	igb: allow setting MAC address on i211 using a device tree blob The Intel i211 LOM PCIe Ethernet controllers' iNVM operates as an OTP and has no external EEPROM interface [1]. The following allows the driver to pickup the MAC address from a device tree blob when CONFIG_OF has been enabled. [1] http://www.intel.com/content/www/us/en/embedded/products/networking/i211-ethernet-controller-datasheet.html Signed-off-by: John Holland <jotihojr@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 12:38:36 -07:00
Alexander Duyck	7f0ba84560	igb: Add support for bulk Tx cleanup & cleanup boolean logic This patch enables bulk free in Tx cleanup for igb and cleans up the boolean logic in the polling routines for igb in the hopes of avoiding any mix-ups similar to what occurred with i40e and i40evf. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 12:26:36 -07:00
Alexander Duyck	415cd2a645	igb: Fix sparse warning about passing __beXX into leXX_to_cpup We were casting the addr as __beXX and then passing it into le32_to_cpu because the device expects the MAC address to be in network order even though the register set is little endian. Instead of casting it as __beXX we can just cast it as __leXX in order to maintain consistency since the region of memory is already in little endian order as far as we are concerned. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-06 12:16:07 -07:00
Linus Torvalds	1200b6809d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ...	2016-03-19 10:05:34 -07:00
Joonsoo Kim	fe896d1878	mm: introduce page reference manipulation functions The success of CMA allocation largely depends on the success of migration and key factor of it is page reference count. Until now, page reference is manipulated by direct calling atomic functions so we cannot follow up who and where manipulate it. Then, it is hard to find actual reason of CMA allocation failure. CMA allocation should be guaranteed to succeed so finding offending place is really important. In this patch, call sites where page reference is manipulated are converted to introduced wrapper function. This is preparation step to add tracepoint to each page reference manipulation function. With this facility, we can easily find reason of CMA allocation failure. There is no functional change in this patch. In addition, this patch also converts reference read sites. It will help a second step that renames page._count to something else and prevents later attempt to direct access to it (Suggested by Andrew). Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Stefan Assmann	46eafa59e1	igb: call ndo_stop() instead of dev_close() when running offline selftest Calling dev_close() causes IFF_UP to be cleared which will remove the interfaces routes and some addresses. That's probably not what the user intended when running the offline selftest. Besides this does not happen if the interface is brought down before the test, so the current behaviour is inconsistent. Instead call the net_device_ops ndo_stop function directly and avoid touching IFF_UP at all. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-24 15:54:39 -08:00
Corinna Vinschen	030f9f5264	igb: Fix VLAN tag stripping on Intel i350 Problem: When switching off VLAN offloading on an i350, the VLAN interface gets unusable. For testing, set up a VLAN on an i350 and some remote machine, e.g.: $ ip link add link eth0 name eth0.42 type vlan id 42 $ ip addr add 192.168.42.1/24 dev eth0.42 $ ip link set dev eth0.42 up Offloading is switched on by default: $ ethtool -k eth0 \| grep vlan-offload rx-vlan-offload: on tx-vlan-offload: on $ ping -c 3 -I eth0.42 192.168.42.2 [...works as usual...] Now switch off VLAN offloading and try again: $ ethtool -K eth0 rxvlan off Actual changes: rx-vlan-offload: off tx-vlan-offload: off [requested on] $ ping -c 3 -I eth0.42 192.168.42.2 PING 192.168.42.2 (192.168.42.2) from 192.168.42.1 eth0.42: 56(84) bytes of da ta. --- 192.168.42.2 ping statistics --- 3 packets transmitted, 0 received, 100% packet loss, time 1999ms I can only reproduce it on an i350, the above works fine on a 82580. While inspecting the igb source, I came across the code in igb_set_vmolr which sets the E1000_VMOLR_STRVLAN/E1000_DVMOLR_STRVLAN flags once and for all, and in all of the igb code there's no other place where the STRVLAN is set or cleared. Thus, VLAN stripping is enabled in igb unconditionally, independently of the offloading setting. I compared that to the latest Intel igb-5.3.3.5 driver from http://sourceforge.net/projects/e1000/ which in fact sets and clears the STRVLAN flag independently from igb_set_vmolr in its own function igb_set_vf_vlan_strip, depending on the vlan settings. So I included the STRVLAN handling from the igb-5.3.3.5 driver into our current igb driver and tested the above scenario again. This time ping still works after switching off VLAN offloading. Tested on i350, with and without addtional VFs, as well as on 82580 successfully. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-24 15:50:40 -08:00
Alexander Duyck	6e03370088	igb: Add support for generic Tx checksums This patch adds support for generic Tx checksums to the igb driver. It turns out this is actually pretty easy after going over the datasheet as we were doing a number of steps we didn't need to. In order to perform a Tx checksum for an L4 header we need to fill in the following fields in the Tx descriptor: MACLEN (maximum of 127), retrieved from: skb_network_offset() IPLEN (maximum of 511), retrieved from: skb_checksum_start_offset() - skb_network_offset() TUCMD.L4T indicates offset and if checksum or crc32c, based on: skb->csum_offset The added advantage to doing this is that we can support inner checksum offloads for tunnels and MPLS while still being able to transparently insert VLAN tags. I also took the opportunity to clean-up many of the feature flag configuration bits to make them a bit more consistent between drivers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-24 15:32:42 -08:00
Todd Fujinaka	c883de9fd7	igb: rename igb define to be more generic E1000_MRQC_ENABLE_RSS_4Q enables 4 and 8 queues depending on the part so rename to be generic. Similarly, E1000_MRQC_ENABLE_VMDQ_RSS_2Q has no numeric meaning so rename to be more generic. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-24 15:29:16 -08:00
Todd Fujinaka	5e350b9260	igb: enable WoL for OEM devices regardless of EEPROM setting Override EEPROM settings for specific OEM devices. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-24 15:14:32 -08:00
Takuma Ueba	b72f3f7200	igb: When GbE link up, wait for Remote receiver status condition I210 device IPv6 autoconf test sometimes fails, because DAD NS for link-local is not transmitted. This packet is silently dropped. This problem is seen only GbE environment. igb_watchdog_task link up detection continues to the following process. The following cases are observed: 1.PHY 1000BASE-T Status Register Remote receiver status bit is NG. (NG status becomes OK after about 200 - 700ms) 2.In this case, the transfer packet is silently dropped. 1000BASE-T Status register [Expected]: 0x3800 or 0x7800 [problem occurred]: 0x2800 or 0x6800 Frequency of occurrence: approx 1/10 - 1/40 observed In order to avoid this problem, wait until 1000BASE-T Status register "Remote receiver status OK" After applying this patch, at least 400 runs succeed with no problems. Signed-off-by: Takuma Ueba <t.ueba11@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-24 14:56:42 -08:00
Alexander Duyck	bf456abb9b	igb: Add workaround for VLAN tag stripping on 82576 There was a workaround partially implemented for the 82576 that is needed in order for VLAN tag stripping to function correctly. The original code had side effects that would make it so the workaround was active on all MACs. I have updated the code so that the workaround is enabled, but limited to the 82576, or activated if we exceed the available unicast addresses. The workaround has a side effect of mirroring all of the traffic outgoing from the VFs back to the PF. As such it is not recommended to use the 82576 in promiscuous mode as it will take a performance hit, though this is now consistent with the performance as seen on the out-of-tree igb driver. I also limited the scope of the UTA bits all being set to only when the VMOLR register is enabled. This should limit the effects of the UTA register so that we don't pick up any excess traffic unless promiscuous mode has been enabled on the PF, whereas before the PF would have ended up in something equivalent to unicast promiscuous mode with VLAN filtering otherwise. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:52:33 -08:00
Alexander Duyck	268f9d33a9	igb: Enable use of "bridge fdb add" to set unicast table entries This change makes it so that we can use the bridge utility to add a FDB entry for the PF to an igb port. By doing this we can enable the VFs to talk to virtual ports residing on top of the PF. In addition this should also address issues with MACVLANs trying to reside on top of the PF as well as they would have had similar issues when added to the PF with SR-IOV enabled. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:49:25 -08:00
Alexander Duyck	9c2f186e45	igb: Drop unnecessary checks in transmit path This patch drops several checks that we dropped from ixgbe some ago. It should not be possible for us to be called with either of the conditional statements returning true so we can just drop them from the hot-path. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:46:15 -08:00
Alexander Duyck	16903caa33	igb: Add support for VLAN promiscuous with SR-IOV and NTUPLE This change fixes things so that we can fully support SR-IOV or the recently added NTUPLE filtering while allowing support for VLAN promiscuous mode. By making this change we are able to support possible scenarios such as SR-IOV with the PF connected to a Linux bridge hosting other VMs. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:43:06 -08:00
Alexander Duyck	a15d92598a	igb: Clean-up configuration of VF port VLANs This patch is meant to clean-up the configuration of the VF port based VLAN configuration. The original logic was a bit muddled and had some undesirable side effects such as VLANs being either completely stripped from the port or VLANs being left when they shouldn't be. The idea behind this code is to avoid any events such as spurious spoof notifications when we are removing one VLAN tag and replacing it with another. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:39:57 -08:00
Alexander Duyck	8b77c6b20f	igb: Merge VLVF configuration into igb_vfta_set This change makes it so that we can merge the configuration of the VLVF registers into the setting of the VFTA register. By doing this we simplify the logic and make use of similar functionality that we have already added for ixgbe making it easier to maintain both drivers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:36:52 -08:00
Alexander Duyck	5982a5565a	igb: Always enable VLAN 0 even if 8021q is not loaded This patch makes it so that we always add VLAN 0. This is important as we need to guarantee the PF can receive untagged frames in the case of SR-IOV being enabled but VLAN filtering not being enabled in the kernel. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:33:42 -08:00
Alexander Duyck	d3836f8e25	igb: Do not factor VLANs into RLPML calculation The RLPML registers already take the size of VLAN headers into account when determining the maximum packet length. This is called out in EAS documents for several parts including the 82576 and the i350. As such we can drop the addition of size to the value programmed into the RLPML registers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:30:33 -08:00
Alexander Duyck	45693bcb00	igb: Allow asymmetric configuration of MTU versus Rx frame size Since the igb driver is using page based receive there is no point in limiting the Rx capabilities of the device. The driver can receive 9K jumbo frames at all times. The only changes needed due to MTU changes are updates for the FIFO sizes and flow-control watermarks. Update the maximum frame size to reflect the 9.5K limitation of the hardware, and replace all instances of max_frame_size with MAX_JUMBO_FRAME_SIZE when referring to an Rx FIFO or frame. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:27:21 -08:00
Alexander Duyck	c3278587e7	igb: clean up code for setting MAC address Drop a bunch of hand written byte swapping code in favor of just doing the byte swapping ourselves. The registers are little endian registers storing a big endian value so if we read the MAC address array as little endian then we will get the CPU registers into the proper layout. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:21:05 -08:00
Shota Suzuki	37a5d163fb	igb: Unpair the queues when changing the number of queues By the commit `72ddef0506` ("igb: Fix oops caused by missing queue pairing"), the IGB_FLAG_QUEUE_PAIRS flag can now be set when changing the number of queues by "ethtool -L", but it is never cleared unless the igb driver is reloaded. This patch clears it if queue pairing becomes unnecessary as a result of "ethtool -L". Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:14:50 -08:00
Shota Suzuki	ceb2775998	igb: Remove unnecessary flag setting in igb_set_flag_queue_pairs() If VFs are enabled (max_vfs >= 1), both max_rss_queues and adapter->rss_queues are set to 2 in the case of e1000_82576. In this case, IGB_FLAG_QUEUE_PAIRS is always set in the default block as a result of fall-through, thus setting it in the e1000_82576 block is not necessary. Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-02-15 16:09:39 -08:00
Tom Herbert	53692b1de4	sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC The SCTP checksum is really a CRC and is very different from the standards 1's complement checksum that serves as the checksum for IP protocols. This offload interface is also very different. Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these differences. The term CSUM should be reserved in the stack to refer to the standard 1's complement IP checksum. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-15 16:49:58 -05:00
Jarod Wilson	7b06a69095	igb: improve handling of disconnected adapters Clean up array_rd32 so that it uses igb_rd32 the same as rd32, per the suggestion of Alexander Duyck, and use io_addr in more places, so that we don't have the need to call E1000_REMOVED (which simply looks for a null hw_addr) nearly as much. Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-12 23:51:26 -08:00
Jan Beulich	be06998f96	igb: fix NULL derefs due to skipped SR-IOV enabling The combined effect of commits `6423fc3416` ("igb: do not re-init SR-IOV during probe") and `ceee3450b3` ("igb: make sure SR-IOV init uses the right number of queues") causes VFs no longer getting set up, leading to NULL pointer dereferences due to the adapter's ->vf_data being NULL while ->vfs_allocated_count is non-zero. The first commit not only neglected the side effect of igb_sriov_reinit() that the second commit tried to account for, but also that of setting IGB_FLAG_HAS_MSIX, without which igb_enable_sriov() is effectively a no-op. Calling igb_{,re}set_interrupt_capability() as done here seems to address this, but I'm not sure whether this is better than sinply reverting the other two commits. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-12 23:19:37 -08:00
Jarod Wilson	73bf8048d7	igb: don't unmap NULL hw_addr I've got a startech thunderbolt dock someone loaned me, which among other things, has the following device in it: 08:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03) This hotplugs just fine (kernel 4.2.0 plus a patch or two here): [ 863.020315] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.2.18-k [ 863.020316] igb: Copyright (c) 2007-2014 Intel Corporation. [ 863.028657] igb 0000:08:00.0: enabling device (0000 -> 0002) [ 863.062089] igb 0000:08:00.0: added PHC on eth0 [ 863.062090] igb 0000:08:00.0: Intel(R) Gigabit Ethernet Network Connection [ 863.062091] igb 0000:08:00.0: eth0: (PCIe:2.5Gb/s:Width x1) e8:ea:6a:00:1b:2a [ 863.062194] igb 0000:08:00.0: eth0: PBA No: 000200-000 [ 863.062196] igb 0000:08:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s) [ 863.064889] igb 0000:08:00.0 enp8s0: renamed from eth0 But disconnecting it is another story: [ 1002.807932] igb 0000:08:00.0: removed PHC on enp8s0 [ 1002.807944] igb 0000:08:00.0 enp8s0: PCIe link lost, device now detached [ 1003.341141] ------------[ cut here ]------------ [ 1003.341148] WARNING: CPU: 0 PID: 199 at lib/iomap.c:43 bad_io_access+0x38/0x40() [ 1003.341149] Bad IO access at port 0x0 () [ 1003.342767] Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi igb dca firewire_ohci firewire_core crc_itu_t rfcomm ctr ccm arc4 iwlmvm mac80211 fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter bnep dm_mirror dm_region_hash dm_log dm_mod coretemp x86_pkg_temp_thermal intel_powerclamp kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg [ 1003.342793] ansi_cprng aesni_intel hp_wmi aes_x86_64 iTCO_wdt lrw iTCO_vendor_support ppdev gf128mul sparse_keymap glue_helper ablk_helper cryptd snd_hda_codec_realtek snd_hda_codec_generic microcode snd_hda_intel uvcvideo iwlwifi snd_hda_codec videobuf2_vmalloc videobuf2_memops snd_hda_core videobuf2_core snd_hwdep btusb v4l2_common btrtl snd_seq btbcm btintel videodev cfg80211 snd_seq_device rtsx_pci_ms bluetooth pcspkr input_leds i2c_i801 media parport_pc memstick rfkill sg lpc_ich snd_pcm 8250_fintek parport joydev snd_timer snd soundcore hp_accel ie31200_edac mei_me lis3lv02d edac_core input_polldev mei hp_wireless shpchp tpm_infineon sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables autofs4 xfs libcrc32c sd_mod sr_mod cdrom rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci [ 1003.342822] nouveau ahci libahci mxm_wmi e1000e xhci_pci hwmon ptp drm_kms_helper pps_core xhci_hcd ttm wmi video ipv6 [ 1003.342839] CPU: 0 PID: 199 Comm: kworker/0:2 Not tainted 4.2.0-2.el7_UNSUPPORTED.x86_64 #1 [ 1003.342840] Hardware name: Hewlett-Packard HP ZBook 15 G2/2253, BIOS M70 Ver. 01.07 02/26/2015 [ 1003.342843] Workqueue: pciehp-3 pciehp_power_thread [ 1003.342844] ffffffff81a90655 ffff8804866d3b48 ffffffff8164763a 0000000000000000 [ 1003.342846] ffff8804866d3b98 ffff8804866d3b88 ffffffff8107134a ffff8804866d3b88 [ 1003.342847] ffff880486f46000 ffff88046c8a8000 ffff880486f46840 ffff88046c8a8098 [ 1003.342848] Call Trace: [ 1003.342852] [<ffffffff8164763a>] dump_stack+0x45/0x57 [ 1003.342855] [<ffffffff8107134a>] warn_slowpath_common+0x8a/0xc0 [ 1003.342857] [<ffffffff810713c6>] warn_slowpath_fmt+0x46/0x50 [ 1003.342859] [<ffffffff8133719e>] ? pci_disable_msix+0x3e/0x50 [ 1003.342860] [<ffffffff812f6328>] bad_io_access+0x38/0x40 [ 1003.342861] [<ffffffff812f6567>] pci_iounmap+0x27/0x40 [ 1003.342865] [<ffffffffa0b728d7>] igb_remove+0xc7/0x160 [igb] [ 1003.342867] [<ffffffff8132189f>] pci_device_remove+0x3f/0xc0 [ 1003.342869] [<ffffffff81433426>] __device_release_driver+0x96/0x130 [ 1003.342870] [<ffffffff814334e3>] device_release_driver+0x23/0x30 [ 1003.342871] [<ffffffff8131b404>] pci_stop_bus_device+0x94/0xa0 [ 1003.342872] [<ffffffff8131b3ad>] pci_stop_bus_device+0x3d/0xa0 [ 1003.342873] [<ffffffff8131b3ad>] pci_stop_bus_device+0x3d/0xa0 [ 1003.342874] [<ffffffff8131b516>] pci_stop_and_remove_bus_device+0x16/0x30 [ 1003.342876] [<ffffffff81333f5b>] pciehp_unconfigure_device+0x9b/0x180 [ 1003.342877] [<ffffffff81333a73>] pciehp_disable_slot+0x43/0xb0 [ 1003.342878] [<ffffffff81333b6d>] pciehp_power_thread+0x8d/0xb0 [ 1003.342885] [<ffffffff810881b2>] process_one_work+0x152/0x3d0 [ 1003.342886] [<ffffffff8108854a>] worker_thread+0x11a/0x460 [ 1003.342887] [<ffffffff81088430>] ? process_one_work+0x3d0/0x3d0 [ 1003.342890] [<ffffffff8108ddd9>] kthread+0xc9/0xe0 [ 1003.342891] [<ffffffff8108dd10>] ? kthread_create_on_node+0x180/0x180 [ 1003.342893] [<ffffffff8164e29f>] ret_from_fork+0x3f/0x70 [ 1003.342894] [<ffffffff8108dd10>] ? kthread_create_on_node+0x180/0x180 [ 1003.342895] ---[ end trace 65a77e06d5aa9358 ]--- Upon looking at the igb driver, I see that igb_rd32() attempted to read from hw_addr and failed, so it set hw->hw_addr to NULL and spit out the message in the log output above, "PCIe link lost, device now detached". Well, now that hw_addr is NULL, the attempt to call pci_iounmap is obviously not going to go well. As suggested by Mark Rustad, do something similar to what ixgbe does, and save a copy of hw_addr as adapter->io_addr, so we can still call pci_iounmap on it on teardown. Additionally, for consistency, make the pci_iomap call assignment directly to io_addr, so map and unmap match. Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-12 23:09:54 -08:00
Jesse Brandeburg	32b3e08fff	drivers/net/intel: use napi_complete_done() As per Eric Dumazet's previous patches: (see commit (`24d2e4a507`) - tg3: use napi_complete_done()) Quoting verbatim: Using napi_complete_done() instead of napi_complete() allows us to use /sys/class/net/ethX/gro_flush_timeout GRO layer can aggregate more packets if the flush is delayed a bit, without having to set too big coalescing parameters that impact latencies. </end quote> Tested configuration: low latency via ethtool -C ethx adaptive-rx off rx-usecs 10 adaptive-tx off tx-usecs 15 workload: streaming rx using netperf TCP_MAERTS igb: MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1176930056 1475.36 797726 16384.00 71905 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1175182320 50476.00 23282 16384.00 71816 i40e: Hard to test because the traffic is incoming so fast (24Gb/s) that GRO always receives 87kB, even at the highest interrupt rate. Other drivers were only compile tested. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-16 04:33:46 -07:00
Arnd Bergmann	40c9b0796d	net: igb: avoid using timespec We want to deprecate the use of 'struct timespec' on 32-bit architectures, as it is will overflow in 2038. The igb driver uses it to read the current time, and can simply be changed to use ktime_get_real_ts64() instead. Because of hardware limitations, there is still an overflow in year 2106, which we cannot really avoid, but this documents the overflow. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: intel-wired-lan@lists.osuosl.org Reviewed-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-05 03:16:42 -07:00
Stefan Assmann	cbfe360a15	igb: assume MSI-X interrupts during initialization In igb_sw_init() the sequence of calls was changed from igb_init_queue_configuration() igb_init_interrupt_scheme() igb_probe_vfs() to igb_probe_vfs() igb_init_queue_configuration() igb_init_interrupt_scheme() This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not get enabled properly and we run into a NULL pointer if the max_vfs module parameter is specified (adapter->vf_data does not get allocated, crash on accessing the structure). [ 7.419348] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 [ 7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb] [ 7.419370] PGD 0 [ 7.419373] Oops: 0002 [#1] SMP [ 7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio [ 7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153 [ 7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 1.6.0 03/07/2013 [...] [ 7.419431] Call Trace: [ 7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb] [ 7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0 Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling igb_probe_vfs(). The real interrupt capabilities will be checked during igb_init_interrupt_scheme() so this is safe to do. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-28 17:48:34 -07:00
David S. Miller	0d36938bb8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-08-27 21:45:31 -07:00
Michal Hocko	2f064f3485	mm: make page pfmemalloc check more robust Commit `c48a11c7ad` ("netvm: propagate page->pfmemalloc to skb") added checks for page->pfmemalloc to __skb_fill_page_desc(): if (page->pfmemalloc && !page->mapping) skb->pfmemalloc = true; It assumes page->mapping == NULL implies that page->pfmemalloc can be trusted. However, __delete_from_page_cache() can set set page->mapping to NULL and leave page->index value alone. Due to being in union, a non-zero page->index will be interpreted as true page->pfmemalloc. So the assumption is invalid if the networking code can see such a page. And it seems it can. We have encountered this with a NFS over loopback setup when such a page is attached to a new skbuf. There is no copying going on in this case so the page confuses __skb_fill_page_desc which interprets the index as pfmemalloc flag and the network stack drops packets that have been allocated using the reserves unless they are to be queued on sockets handling the swapping which is the case here and that leads to hangs when the nfs client waits for a response from the server which has been dropped and thus never arrive. The struct page is already heavily packed so rather than finding another hole to put it in, let's do a trick instead. We can reuse the index again but define it to an impossible value (-1UL). This is the page index so it should never see the value that large. Replace all direct users of page->pfmemalloc by page_is_pfmemalloc which will hide this nastiness from unspoiled eyes. The information will get lost if somebody wants to use page->index obviously but that was the case before and the original code expected that the information should be persisted somewhere else if that is really needed (e.g. what SLAB and SLUB do). [akpm@linux-foundation.org: fix blooper in slub] Fixes: `c48a11c7ad` ("netvm: propagate page->pfmemalloc to skb") Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.com> Debugged-by: Jiri Bohac <jbohac@suse.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Acked-by: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> [3.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-21 14:30:10 -07:00
Todd Fujinaka	ceee3450b3	igb: make sure SR-IOV init uses the right number of queues Recent changes to igb_probe_vfs() could lead to the PF holding onto all of the queues. Reorder igb_probe_vfs() to be before gb_init_queue_configuration() and add some more error checking. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:07 -07:00
Jia-Ju Bai	42ad1a03b4	igb: Fix a memory leak in igb_probe In error handling code of igb_probe, the memory adapter->shadow_vfta allocated by kcalloc in igb_sw_init is not freed. So when register_netdev or igb_init_i2c is failed, a memory leak will occur. This patch adds kfree to fix it. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:06 -07:00
Jia-Ju Bai	3eb14ea8d9	igb: Fix a deadlock in igb_sriov_reinit When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock acquired by rtnl_lock() is not released, which causes a deadlock. This patch adds rtnl_unlock() in error handling to fix it. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:05 -07:00
Alex Williamson	c23d92b80e	igb: Teardown SR-IOV before unregister_netdev() When the .remove() callback for a PF is called, SR-IOV support for the device is disabled, which requires unbinding and removing the VFs. The VFs may be in-use either by the host kernel or userspace, such as assigned to a VM through vfio-pci. In this latter case, the VFs may be removed either by shutting down the VM or hot-unplugging the devices from the VM. Unfortunately in the case of a Windows 2012 R2 guest, hot-unplug is broken due to the ordering of the PF driver teardown. Disabling SR-IOV prior to unregister_netdev() avoids this issue. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:05 -07:00
Stefan Assmann	6423fc3416	igb: do not re-init SR-IOV during probe During driver probing the following code path is triggered. igb_probe ->igb_sw_init ->igb_probe_vfs ->igb_pci_enable_sriov ->igb_sriov_reinit Doing the SR-IOV re-init is not necessary during probing since we're starting from scratch. Here we can call igb_enable_sriov() right away. Running igb_sriov_reinit() during igb_probe() also seems to cause occasional packet loss on some onboard 82576 NICs. Reproduced on Dell and HP servers with onboard 82576 NICs. Example: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01) Subsystem: Dell Device [1028:0481] Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:04 -07:00
Vasily Averin	f468adc944	igb: missing rtnl_unlock in igb_sriov_reinit() Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:03 -07:00
Shota Suzuki	72ddef0506	igb: Fix oops caused by missing queue pairing When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is set if adapter->rss_queues exceeds half of max_rss_queues in igb_init_queue_configuration(). On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of queues exceeds half of max_combined in igb_set_channels() when changing the number of queues by "ethtool -L". In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size of adapter->msix_entries[], an overflow can occur in igb_set_interrupt_capability(), which in turn leads to an oops. Fix this problem as follows: - When changing the number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver. - When increasing the size of q_vector, reallocate it appropriately. (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.) Another possible way to fix this problem is to cap the queues at its initial number, which is the number of the initial online cpus. But this is not the optimal way because we cannot increase queues when another cpu becomes online. Note that before commit `cd14ef54d2` ("igb: Change to use statically allocated array for MSIx entries"), this problem did not cause oops but just made the number of queues become 1 because of entering msi_only mode in igb_set_interrupt_capability(). Fixes: `907b783579` ("igb: Add ethtool support to configure number of channels") CC: stable <stable@vger.kernel.org> Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-08-18 14:06:03 -07:00
Todd Fujinaka	6fb469023c	igb: bump version to igb-5.3.0 Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-07-23 05:10:50 -07:00
Alexander Duyck	f56e7bba22	igb: Pull timestamp from fragment before adding it to skb This change makes it so that we pull the timestamp from the fragment before we add it to the skb. By doing this we can avoid a possible issue in which the fragment can possibly be less than IGB_RX_HDR_LEN due to the timestamp being pulled after the copybreak check. While making this change I realized we could also pull the rest of the igb_pull_tail function into igb_add_rx_frag since in the case of igb, unlike ixgbe, we are able to unmap the entire buffer before calling add_rx_frag so merging the two allows for sharing of code between the two merged functions. Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-07-17 19:59:06 -07:00
Todd Fujinaka	73cd63598d	igb: bump version of igb to 5.2.18 Bump version of igb to igb-5.2.18 Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-06-26 02:36:38 -07:00
David S. Miller	b04096ff33	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Four minor merge conflicts: 1) qca_spi.c renamed the local variable used for the SPI device from spi_device to spi, meanwhile the spi_set_drvdata() call got moved further up in the probe function. 2) Two changes were both adding new members to codel params structure, and thus we had overlapping changes to the initializer function. 3) 'net' was making a fix to sk_release_kernel() which is completely removed in 'net-next'. 4) In net_namespace.c, the rtnl_net_fill() call for GET operations had the command value fixed, meanwhile 'net-next' adjusted the argument signature a bit. This also matches example merge resolutions posted by Stephen Rothwell over the past two days. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-13 14:31:43 -04:00
Alexander Duyck	2ee52ad496	igb: Don't use NETDEV_FRAG_PAGE_MAX_SIZE in descriptor calculation This change updates igb so that it will correctly perform the descriptor count calculation. Previously it was taking NETDEV_FRAG_PAGE_MAX_SIZE into account with isn't really correct since a different value is used to determine the size of the pages used for TCP. That is actually determined by SKB_FRAG_PAGE_ORDER. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-12 10:39:26 -04:00
Toshiaki Makita	2439fc4d71	igb: Fix NULL assignment to incorrect variable in igb_reset_q_vector adapter->tx_ring is set to NULL where rx_ring should be. Fixes: `5536d2102a` ("igb: Combine q_vector and ring allocation into a single function") Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-07 05:11:29 -07:00
Toshiaki Makita	c0a06ee185	igb: Fix oops on changing number of rings When changing the number of rings by ethtool -L, q_vectors are reused, which causes oops because of uninitialized pointers. - When an rx is reused as a tx, q_vector->rx.ring is not set to NULL, which misleads igb_poll() to determine that it has an rx ring although it actually points to the tx ring. - When a tx is reused as an rx, q_vector->rx.ring->skb (q_vector->ring[0].skb) has a value that was used as tx_stats before. Fix these problems by zeroing it out on reuseing it. Fixes: `02ef6e1d0b` ("igb: Fix queue allocation method to accommodate changing during runtime") Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-07 05:08:32 -07:00
Todd Fujinaka	8cfb879d1b	igb: simplify and clean up igb_enable_mas() igb_enable_mas() should only be called for the 82575 and has no clear return so changing it to void. Also simplify the odd conditional expression. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-05-04 01:17:47 -07:00
Toshiaki Makita	1abbc98a8a	igb: Enable TSO for stacked vlan As datasheets for igb (I210, I350, 82576, etc.) say, maclen can be from 14 to 127, which is enough for reasonable number of vlan tags. My netperf test showed I350's TSO works pretty fine with multiple vlans. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:25 -07:00
Todd Fujinaka	f28ea083a3	igb: use netif_carrier_off earlier when bringing if down Use netif_carrier_off() first, since that will prevent the stack from queuing more packets to this IF. This operation is fast, and should behave much nicer when trying to bring down an interface under load. Reported-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-20 17:45:12 -07:00
Alexander Graf	6ddbc4cf1f	igb: Indicate failure on vf reset for empty mac address Commit `5ac6f91d` changed the igb driver to expose a zero (empty) mac address to the VF on reset rather than a random one. However, that behavioral change also requires igbvf driver changes which can be hard especially when we want to talk to proprietary guest OSs. Looking at the code previous to the commit in Linux that made igbvf work with empty mac addresses (`8d56b6d`), we can see that on reset failure the driver will try to generate a new mac address with both the old and the new code. Furthermore, ixgbe does send reset failure when it detects an empty mac address (`35055928c`). So I think it's safe to make igb behave the same. With this patch I can successfully run a Windows 8.1 guest with an empty mac address and an assigned igbvf device that has no mac address set by the host. If anyone is aware of a guest driver that chokes on NACK returns of VF RESET commands, please speak up. Signed-off-by: Alexander Graf <agraf@suse.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:23 -08:00
Richard Cochran	720db4ffd0	igb: enable auxiliary PHC functions for the i210 The i210 device offers a number of special PTP Hardware Clock features on the Software Defined Pins (SDPs). This patch adds support for two of the possible functions, namely time stamping external events, and periodic output signals. The assignment of PHC functions to the four SDP can be freely chosen by the user. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:19 -08:00
Richard Cochran	00c65578b4	igb: enable internal PPS for the i210 The i210 device can produce an interrupt on the full second. This patch allows using this interrupt to generate an internal PPS event for adjusting the kernel system time. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:19 -08:00
Richard Cochran	61d7f75f45	igb: refactor time sync interrupt handling The code that handles the time sync interrupt is repeated in three different places. This patch refactors the identical code blocks into a single helper function. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:18 -08:00
Alexander Duyck	95dd44b4f3	igb: Clean-up page reuse code This patch cleans up the page reuse code getting it into a state where all the workarounds needed are in place as well as cleaning up a few minor oversights such as using __free_pages instead of put_page to drop a locally allocated page. It also cleans up how we clear the descriptor status bits. Previously they were zeroed as a part of clearing the hdr_addr. However the hdr_addr is a 64 bit field and 64 bit writes can be a bit more expensive on on 32 bit systems. Since we are no longer using the header split feature the upper 32 bits of the address no longer need to be cleared. As a result we can just clear the status bits and leave the length and VLAN fields as-is which should provide more information in debugging. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:17 -08:00
Jiri Pirko	df8a39defa	net: rename vlan_tx_* helpers since "tx" is misleading there The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:51:08 -05:00

1 2 3 4 5 ...

402 Commits