linux

Author	SHA1	Message	Date
Krzysztof Majzerowicz-Jaszcz	887a79f4a8	e1000: e1000_ethertool.c coding style fixes Fixed many errors/warnings and checks in e1000_ethtool.c reported by checkpatch.pl. Suggestions from Joe Perches and Alexander Duyck applied as well Signed-off-by: Krzysztof Majzerowicz-Jaszcz <cristos@vipserv.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-06 03:26:30 -07:00
Eric Dumazet	cfecec56ae	mlx4: only pull headers into skb head Use the new fancy eth_get_headlen() to pull exactly the headers into skb->head. This speeds up GRE traffic (or more generally tunneled traffuc), as GRO can aggregate up to 17 MSS per GRO packet instead of 8. (Pulling too much data was forcing GRO to keep 2 frags per MSS) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 22:35:27 -07:00
Colin Ian King	126859b992	mISDN: remove DSP_NEVER_DEFINED and adjust code identation The DSP_NEVER_DEFINED #ifdef is confusing, it slips in an extra } which is not required because the previous code is indented incorrectly. Correct the identation and remove the extraneous DSP_NEVER_DEFINED Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 22:20:21 -07:00
Jiri Pirko	cea6aeb697	bonding: add slave netlink policy and put slave-related ops together Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 21:44:03 -07:00
David S. Miller	3aff50170a	Merge branch 'tcp' Eric Dumazet says: ==================== tcp: deduplicate TCP_SKB_CB(skb)->when TCP_SKB_CB(skb)->when has different meaning in output and input paths. In output path, it contains a timestamp. In input path, it contains an ISN, chosen by tcp_timewait_state_process() Its usage in output path is obsolete after usec timestamping. Lets simplify and clean this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:39 -07:00
Eric Dumazet	7faee5c0d5	tcp: remove TCP_SKB_CB(skb)->when After commit `740b0f1841` ("tcp: switch rtt estimations to usec resolution"), we no longer need to maintain timestamps in two different fields. TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:33 -07:00
Eric Dumazet	04317dafd1	tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn TCP_SKB_CB(skb)->when has different meaning in output and input paths. In output path, it contains a timestamp. In input path, it contains an ISN, chosen by tcp_timewait_state_process() Lets add a different name to ease code comprehension. Note that 'when' field will disappear in following patch, as skb_mstamp already contains timestamp, the anonymous union will promptly disappear as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:33 -07:00
David S. Miller	2ba38943ba	Merge branch 'eth_get_headlen' Alexander Duyck says: ==================== net: Drop get_headlen functions in favor of generic function This series replaces the igb_get_headlen and ixgbe_get_headlen functions with a generic function named eth_get_headlen. I have done some performance testing on ixgbe with 258 byte frames since the calls are only used on frames larger than 256 bytes and have seen no significant difference in CPU utilization. v2: renamed __skb_get_poff to skb_get_poff renamed ___skb_get_poff to __skb_get_poff ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:47:12 -07:00
Alexander Duyck	8496e3382e	ixgbe: use new eth_get_headlen interface Update ixgbe to drop the ixgbe_get_headlen function in favor of eth_get_headlen. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:47:03 -07:00
Alexander Duyck	24cd23d3d2	igb: use new eth_get_headlen interface Update igb to drop the igb_get_headlen function in favor of eth_get_headlen. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:47:02 -07:00
Alexander Duyck	56193d1bce	net: Add function for parsing the header length out of linear ethernet frames This patch updates some of the flow_dissector api so that it can be used to parse the length of ethernet buffers stored in fragments. Most of the changes needed were to __skb_get_poff as it needed to be updated to support sending a linear buffer instead of a skb. I have split __skb_get_poff into two functions, the first is skb_get_poff and it retains the functionality of the original __skb_get_poff. The other function is __skb_get_poff which now works much like __skb_flow_dissect in relation to skb_flow_dissect in that it provides the same functionality but works with just a data buffer and hlen instead of needing an skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:47:02 -07:00
David S. Miller	2c048e6462	Merge branch 'timestamping' Alexander Duyck says: ==================== This change makes it so that the core path for the phy timestamping logic is shared between skb_tx_tstamp and skb_complete_tx_timestamp. In addition it provides a means of using the same skb clone type path in non phy timestamping drivers. The main motivation for this is to enable non-phy drivers to be able to manipulate tx timestamp skbs for such things as putting them in lists or setting aside buffer in the context block. v2: Incorporated suggested changes from Willem de Bruijn and Eric Dumazet dropped uneeded comment restored order of hwtstamp vs swtstamp added destructor for skb Dropped usage of skb_complete_tx_timestamp as a kfree_skb w/ destructor v3: Updated destructor handling and dealt with socket reference counting issues v4: Split out combining destructors into a separate patch ====================	2014-09-05 17:43:54 -07:00
Alexander Duyck	82eabd9eb2	net: merge cases where sock_efree and sock_edemux are the same function Since sock_efree and sock_demux are essentially the same code for non-TCP sockets and the case where CONFIG_INET is not defined we can combine the code or replace the call to sock_edemux in several spots. As a result we can avoid a bit of unnecessary code or code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Alexander Duyck	62bccb8cdb	net-timestamp: Make the clone operation stand-alone from phy timestamping The phy timestamping takes a different path than the regular timestamping does in that it will create a clone first so that the packets needing to be timestamped can be placed in a queue, or the context block could be used. In order to support these use cases I am pulling the core of the code out so it can be used in other drivers beyond just phy devices. In addition I have added a destructor named sock_efree which is meant to provide a simple way for dropping the reference to skb exceptions that aren't part of either the receive or send windows for the socket, and I have removed some duplication in spots where this destructor could be used in place of sock_edemux. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Alexander Duyck	37846ef018	net-timestamp: Merge shared code between phy and regular timestamping This change merges the shared bits that exist between skb_tx_tstamp and skb_complete_tx_timestamp. By doing this we can avoid the two diverging as there were already changes pushed into skb_tx_tstamp that hadn't made it into the other function. In addition this resolves issues with the fact that skb_complete_tx_timestamp was included in linux/skbuff.h even though it was only compiled in if phy timestamping was enabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Eric Dumazet	d546c62154	ipv4: harden fnhe_hashfun() Lets make this hash function a bit secure, as ICMP attacks are still in the wild. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:40:33 -07:00
Willem de Bruijn	18a47e6d8a	net-timestamp: fix allocation error in test A buffer is incorrectly zeroed to the length of the pointer. If cfg_payload_len < sizeof(void *) this can overwrites unrelated memory. The buffer contents are never read, so no need to zero. Fixes: `8fe2f761ca` ("net-timestamp: expand documentation") Reported-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:31:03 -07:00
Dan Carpenter	b1c849276b	hyperv: NULL dereference on error We try to call free_netvsc_device(net_device) when "net_device" is NULL. It leads to an Oops. Fixes: `f90251c8a6` ('hyperv: Increase the buffer length for netvsc_channel_cb()') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:29:22 -07:00
David S. Miller	a77f9a282a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-09-04 This series contains updates to i40e, i40evf, ixgbe and ixgbevf. Catherine adds dual speed module support to i40e. Updates i40e to allow the user to change link settings when the link is down. Serey renames i40e_ndo_set_vf_spoofck() to i40e_ndo_set_vf_spookchk() to be more consistent with what is defined in netdev and removes a unnecessary variable assignment. Jesse makes a malicious driver detection warning only print if extended driver string is enabled for i40e. Fixes a panic under traffic load when resetting or if/whenever there was a Tx-timeout because we were enabling the Tx queue to early. Anjali fixes an issue when PF reset fails, where we were trying to restart the admin queue which has not been setup at that point. This resolves an occasional kernel panic when PF reset fails for some reason. Ethan Zhao replaces the use of a local i40e_vfs_are_assigned() with the global kernel pci_vfs_assigned() for i40e. Alex cleans up the FDB handling for ixgbe. This change makes it so that the behavior for FDB handling is consistent between both the SR-IOV and non-SR-IOV cases. The main change is that we perform bounds checking on the number of SR-IOV addresses regardless of if SR-IOV is enabled or not as we can only support a certain number of addresses in the hardware. Emil extends the pending Tx work check to the VF interfaces, where the driver initiates a reset of the interface on link loss with pending Tx work in order to clear the rings. Introduces a delay for 82599 VFs of at least 500 usecs to make sure the VFLINKS value is correct, since this bit tends to flap when a DA or SFP+ cable is disconnected. Jacob adds code comments in ixgbe to make it more obvious that we are resetting features based on the fact that we do not have MSI-X enabled, and cannot use the previous settings. Also resolves a kernel NULL pointer dereference by limiting the combined total of MACVLAN and SR-IOV VFs, since the hardware has a limited number of pools available (64). Previously, no checks were in place to limit the number of accelerated MACVLAN devices based on the number of pools, which would be ok since there was already a limit for these well below the number of available pools. However, SR-IOV uses the very same pools, therefore we need to ensure that the total number of pools does not exceed the number of pools available in the hardware. v2: - clean up code comment in patch 5 by replacing "an" with "auto negotiation" based on feedback from Sergei Shtylyov - removed un-necessary parenthesis around function call in patch 8 based on feedback from Sergei Shtylyov ====================	2014-09-05 17:21:06 -07:00
Daniel Mack	c2b32e580c	net: ethernet: cpsw: improve interrupt lookup logic in cpsw_probe() Simplify the interrupt resource lookup code in cpsw_probe() by the following: * Only look at the first member of the resource. As the driver only works for DT-enabled platforms anyway, a resource of type IORESOURCE_IRQ will only contain one single entry (res->start == res->end), so there is no need for the iteration. * Add a bounds check to avoid overflows if we are passed more than ARRAY_SIZE(priv->irqs_table) resources. * Assign 'ret' with the return value of devm_request_irq() so that cpsw_probe() returns the appropriate error code. * If devm_request_irq() fails, report the error code in the log message. Signed-off-by: Daniel Mack <zonque@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:17:13 -07:00
Eric Dumazet	caa415270c	ipv4: fix a race in update_or_create_fnhe() nh_exceptions is effectively used under rcu, but lacks proper barriers. Between kzalloc() and setting of nh->nh_exceptions(), we need a proper memory barrier. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `4895c771c7` ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:15:50 -07:00
Andy Zhou	29abe2fda5	l2tp: fix missing line continuation This syntax error was covered by L2TP_REFCNT_DEBUG not being set by default. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:19:53 -07:00
David S. Miller	f35d2a5f8d	Merge branch 'amd-xgbe-next' Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2014-09-03 The following series of patches includes fixes/updates to the driver. - Query the device for the actual speed mode (KR/KX) rather than trying to track it - Update parallel detection logic to support KR mode - Fix new warnings from checkpatch in the amd-xgbe and amd-xgbe-phy driver This patch series is based on net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:11:25 -07:00
Lendacky, Thomas	b73c798b17	amd-xgbe-phy: Checkpatch driver fixes This patch contains fixes identified by checkpatch when run with the strict option. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:11:21 -07:00
Lendacky, Thomas	a2ea14d772	amd-xgbe: Checkpatch driver fixes This patch contains fixes identified by checkpatch when run with the strict option. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:11:20 -07:00
Lendacky, Thomas	e6f0562ff4	amd-xgbe-phy: Enhance parallel detection to support KR speed Add support to allow parallel detection to work in KR speed. With both speed modes of KX and KR supported, KX must be checked first. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:11:20 -07:00
Lendacky, Thomas	e3eec4e793	amd-xgbe-phy: Check device for current speed mode (KR/KX) Since device resets can change the current mode it's possible to think the device is in a different mode than it actually is. Rather than trying to determine every place that is needed to set/save the current mode, be safe and check the devices actual mode when needed rather than trying to track it. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:11:20 -07:00
David S. Miller	e4cf0b756c	Merge branch 'r8152-next' Hayes Wang says: ==================== r8152: random MAC address If the interface has invalid MAC address, it couldn't be used. In order to let it work normally, give a random one. v3: Remove ether_addr_copy(dev->perm_addr, dev->dev_addr); v2: Use "%pM" format specifier for printing a MAC address. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:17:39 -07:00
hayeswang	179bb6d7f0	r8152: use eth_hw_addr_random If the hw doesn't have a valid MAC address, give a random one and set it to the hw. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:17:00 -07:00
hayeswang	8ba789ab13	r8152: change the location of rtl8152_set_mac_address Exchange the location of rtl8152_set_mac_address() and set_ethernet_addr(). Then, the set_ethernet_addr() could set the MAC address by calling rtl8152_set_mac_address() later. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:17:00 -07:00
David S. Miller	b52b727594	Merge branch 'rx_copybreak' Govindarajulu Varadarajan says: ==================== enic: Add support for rx_copybreak The following series implements rx_copybreak. dma_map_single()/dma_unmap_single() is more expensive than alloc_skb & memcpy for smaller packets. By doing this we can reuse the dma buff which is already mapped. This is very useful when iommu is on. The default skb copybreak value is 256. When iommu is on, we can go much higher than 256. All the drivers that supports rx_copybreak provides module parameter to change this value. Since module parameter is the least preferred way for changing driver values, this series adds ethtool support for setting rx_copybreak. v4: Validate tunable length in ethtool_get_tunable, not in driver implemented function. Loose tunable_ops array for each tunable type. Define one function and let the driver use switch case for each type. Use double underscore for data type in UAPI headers. Use const qualifier where possible. v3: Add tunable namespace to ethtool. Use new ethtool cmd ETHTOOL_S/GTUNABLE to set/get rx_copybreak from userspace. v2: Add new ethtool_cmd for DMA buffer parameters, instead of adding new members to existing ethtool_ringparam. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:12:32 -07:00
Govindarajulu Varadarajan	d4ad30b182	enic: Add tunable_ops support for rx_copybreak This patch adds support for setting/getting rx_copybreak using generic ethtool tunable. Defines enic_get_tunable() & enic_set_tunable() to get/set rx_copybreak. As of now, these two function supports only rx_copybreak. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:12:20 -07:00
Govindarajulu Varadarajan	f0db9b0734	ethtool: Add generic options for tunables This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting tunable values from driver. Add get_tunable and set_tunable to ethtool_ops. Driver implements these functions for getting/setting tunable value. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:12:20 -07:00
Govindarajulu Varadarajan	a03bb56e67	enic: implement rx_copybreak Calling dma_map_single()/dma_unmap_single() is quite expensive compared to copying a small packet. So let's copy short frames and keep the buffers mapped. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:12:20 -07:00
Daniel Borkmann	e020836d95	dev_ioctl: remove dev_load() CAP_SYS_MODULE message Marcel reported to see the following message when autoloading is being triggered when adding nlmon device: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-nlmon instead. This false-positive happens despite with having correct capabilities set, e.g. through issuing `ip link del dev nlmon` more than once on a valid device with name nlmon, but Marcel has also seen it on creation time when no nlmon module is previously compiled-in or loaded as module and the device name equals a link type name (e.g. nlmon, vxlan, team). Stephen says: The netdev module alias is a hold over from the past. For normal devices, people used to create a alias eth0 to and point it to the type of network device used, that was back in the bad old ISA days before real discovery. Also, the tunnels create module alias for the control device and ip used to use this to autoload the tunnel device. The message is bogus and should just be removed, I also see it in a couple of other cases where tap devices are renamed for other usese. As mentioned in `8909c9ad8f` ("net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules"), we nevertheless still might want to leave the old autoloading behaviour in place as it could break old scripts, so for now, lets just remove the log message as Stephen suggests. Reference: http://thread.gmane.org/gmane.linux.kernel/1105168 Reported-by: Marcel Holtmann <marcel@holtmann.org> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:04:40 -07:00
Daniel Borkmann	60a3b2253c	net: bpf: make eBPF interpreter images read-only With eBPF getting more extended and exposure to user space is on it's way, hardening the memory range the interpreter uses to steer its command flow seems appropriate. This patch moves the to be interpreted bytecode to read-only pages. In case we execute a corrupted BPF interpreter image for some reason e.g. caused by an attacker which got past a verifier stage, it would not only provide arbitrary read/write memory access but arbitrary function calls as well. After setting up the BPF interpreter image, its contents do not change until destruction time, thus we can setup the image on immutable made pages in order to mitigate modifications to that code. The idea is derived from commit `314beb9bca` ("x86: bpf_jit_comp: secure bpf jit against spraying attacks"). This is possible because bpf_prog is not part of sk_filter anymore. After setup bpf_prog cannot be altered during its life-time. This prevents any modifications to the entire bpf_prog structure (incl. function/JIT image pointer). Every eBPF program (including classic BPF that are migrated) have to call bpf_prog_select_runtime() to select either interpreter or a JIT image as a last setup step, and they all are being freed via bpf_prog_free(), including non-JIT. Therefore, we can easily integrate this into the eBPF life-time, plus since we directly allocate a bpf_prog, we have no performance penalty. Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual inspection of kernel_page_tables. Brad Spengler proposed the same idea via Twitter during development of this patch. Joint work with Hannes Frederic Sowa. Suggested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:02:48 -07:00
Florian Fainelli	4a804c0163	net: systemport: update UMAC_CMD only when link is detected When we bring the interface down, phy_stop() will schedule the PHY state machine to call our link adjustment callback. By the time we do so, we may have clock gated off the SYSTEMPORT hardware block, and this will cause bus errors to happen in bcm_sysport_adj_link(): Make sure that we only touch the UMAC_CMD register when there is an actual link. This is safe to do for two reasons: - updating the Ethernet MAC registers only make sense when a physical link is present - the PHY library state machine first set phydev->link = 0 before invoking phydev->adjust_link in the PHY_HALTED case This is a similar fix to the GENET one: `c677ba8b3c` ("net: bcmgenet: update UMAC_CMD only when link is detected"). Fixes: `80105befdb` ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-04 23:13:58 -07:00
Hannes Frederic Sowa	a9fe8e2994	ipv4: implement igmp_qrv sysctl to tune igmp robustness variable As in IPv6 people might increase the igmp query robustness variable to make sure unsolicited state change reports aren't lost on the network. Add and document this new knob to igmp code. RFCs allow tuning this parameter back to first IGMP RFC, so we also use this setting for all counters, including source specific multicast. Also take over sysctl value when upping the interface and don't reuse the last one seen on the interface. Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-04 22:26:14 -07:00
Hannes Frederic Sowa	2f711939d2	ipv6: add sysctl_mld_qrv to configure query robustness variable This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query robustness variable. It specifies how many retransmit of unsolicited mld retransmit should happen. Admins might want to tune this on lossy links. Also reset mld state on interface down/up, so we pick up new sysctl settings during interface up event. IPv6 certification requests this knob to be available. I didn't make this knob netns specific, as it is mostly a setting in a physical environment and should be per host. Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-04 22:26:14 -07:00
Jacob Keller	aac2f1bf14	ixgbe: limit combined total of macvlan and SR-IOV VFs Hardware has a limited number of pools available (64). Previously, no checks were in place to limit the number of accelerated macvlan devices based on the number of pools. Normally this would be ok, because there was already a limit for these well below the number of available pools. However, SR-IOV uses the very same pools. Therefor, we need to ensure that the total number of pools (number of VFs plus the number of non-VF pools in use for accelerated macvlans) does not exceed the number of pools available in hardware. This patch resolves a kernel NULL pointer dereference caused by the following commands: $modprobe ixgbe max_vfs=63 $ethtool -K eth2 l2-fwd-offload on $ip link add link eth2 macvlan0 type macvlan $ip link set dev macvlan0 up [ 992.950080] BUG: unable to handle kernel NULL pointer dereference at 0000000000000056 [ 992.951109] IP: [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.951684] PGD 22a80e067 PUD 232e9b067 PMD 0 [ 992.952389] Oops: 0000 [#1] SMP [ 992.953014] Modules linked in: nfsd lockd nfs_acl exportfs auth_rpcgss oid_registry sunrpc bridge stp llc vhost_net macvtap macvlan vhost tun kvm_intel kvm ioatdma ixgbe mdio igb dca [ 992.956042] CPU: 2 PID: 11928 Comm: ifconfig Not tainted 3.16.0-rc6-net-next-07-29-2014-FCoE+ #1 [ 992.956915] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014 [ 992.957791] task: ffff8804341c0000 ti: ffff8801d7dc8000 task.ti: ffff8801d7dc8000 [ 992.958660] RIP: 0010:[<ffffffffa003b71e>] [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.959613] RSP: 0018:ffff8801d7dcbbb8 EFLAGS: 00010286 [ 992.960093] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 [ 992.960575] RDX: ffff880232eb7000 RSI: 0000000000000000 RDI: ffff88022dc05800 [ 992.961059] RBP: ffff8801d7dcbbd8 R08: 0000000000000000 R09: 0000000000000000 [ 992.961541] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88022ec20980 [ 992.962023] R13: ffff880232eb7000 R14: 0000000000000001 R15: 0000000000000001 [ 992.962508] FS: 00007fab264887a0(0000) GS:ffff880237640000(0000) knlGS:0000000000000000 [ 992.963378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 992.963858] CR2: 0000000000000056 CR3: 000000022a939000 CR4: 00000000001427e0 [ 992.964340] Stack: [ 992.964806] ffff88022ec28840 ffff88022ec20980 ffff88022dc05800 ffff880232eb7000 [ 992.965976] ffff8801d7dcbc28 ffffffffa003bae8 ffff8801d7dcbbe8 0000000000000400 [ 992.967147] 000000000000000d ffff88022ec20980 ffff88022ec20000 ffff88022dc05800 [ 992.968319] Call Trace: [ 992.968795] [<ffffffffa003bae8>] ixgbe_fwd_ring_up+0x88/0x280 [ixgbe] [ 992.969284] [<ffffffffa0041d83>] ixgbe_fwd_add+0x173/0x220 [ixgbe] [ 992.969767] [<ffffffffa015056c>] macvlan_open+0x1bc/0x230 [macvlan] [ 992.970256] [<ffffffff816b8de7>] __dev_open+0xd7/0x150 [ 992.970735] [<ffffffff816b8bd7>] __dev_change_flags+0xa7/0x170 [ 992.971220] [<ffffffff816b8ccb>] dev_change_flags+0x2b/0x70 [ 992.971703] [<ffffffff817471b2>] devinet_ioctl+0x602/0x6d0 [ 992.972184] [<ffffffff81748168>] inet_ioctl+0x78/0x90 [ 992.972666] [<ffffffff816a143b>] sock_do_ioctl+0x2b/0x70 [ 992.973146] [<ffffffff816a14ed>] sock_ioctl+0x6d/0x260 [ 992.973627] [<ffffffff811ad3b4>] do_vfs_ioctl+0x84/0x540 [ 992.974109] [<ffffffff811a4c81>] ? final_putname+0x21/0x50 [ 992.974593] [<ffffffff818725d5>] ? sysret_check+0x22/0x5d [ 992.975073] [<ffffffff811ad901>] SyS_ioctl+0x91/0xa0 [ 992.975550] [<ffffffff818725a9>] system_call_fastpath+0x16/0x1b [ 992.976026] Code: ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 f3 4c 89 6d f8 4c 8b a7 08 02 00 00 <44> 0f b6 6e 56 44 03 af 14 02 00 00 4c 89 e7 e8 5e f2 ff ff be [ 992.982261] RIP [<ffffffffa003b71e>] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.983212] RSP <ffff8801d7dcbbb8> [ 992.983681] CR2: 0000000000000056 [ 992.984248] ---[ end trace 9f54802b5cc3638b ]--- Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:33 -07:00
Jacob Keller	eec66731de	ixgbe: add comment noting recalculation of queues Since we previously called ixgbe_set_num_queues just prior to attempting to set our interrupt scheme, it may be non obvious why we have to call it again inside the function. Add a comment which helps make it more obvious that we are resetting features based on the fact that we do not have MSI-X enabled, and cannot use the previous settings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:33 -07:00
Emil Tantilov	b8a2ca19bc	ixgbevf: introduce delay for checking VFLINKS on 82599 VFLINKS.LINKUP bit tends to flap when a DA or SFP+ cable is disconnected. It can take up to 500 usecs for the LINKUP bit to be correct. This patch resolves the issue by introducing a delay for 82599 VFs of at least 500 usecs to make sure the VFLINKS value is correct. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:32 -07:00
Emil Tantilov	07923c17b1	ixgbe: reset interface on link loss with pending Tx work from the VF ixgbe initiates a reset of the interface on link loss with pending Tx work in order to clear the rings. This patch extends the pending Tx work check to the VF interfaces with the same purpose. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:32 -07:00
Alexander Duyck	bcfd3432d1	ixgbe: Cleanup FDB handling code This change makes it so that the behavior for FDB handling is consistent between both the SR-IOV and non-SR-IOV cases. The main change here is that we perform bounds checking on the number of SR-IOV addresses regardless of if SR-IOV is enabled or not as we can only support a certain number of addresses in the hardware. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:32 -07:00
Ethan Zhao	c24817b6ba	i40e: use global pci_vfs_assigned() to replace local i40e_vfs_are_assigned() There is global funcion pci_vfs_assigned(), so use it instead of composing local one. Signed-off-by: Ethan Zhao <ethan.kernel@gmail.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:31 -07:00
Catherine Sullivan	e966d5c612	i40e/i40evf: Bump i40e/i40evf versions Bump i40e version to 1.0.11 and i40evf version to 1.0.5. Change-ID: I63a60fa2efe82aae87a8a3095f43218db57d46ce Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com>	2014-09-04 01:38:31 -07:00
Jesse Brandeburg	32b5b81170	i40e: fix panic due to too-early Tx queue enable This fixes the panic under traffic load when resetting. This issue could also show up if/whenever there is a Tx-timeout. Change-ID: Ie393a1f17fd5d962e56fc3bfe784899ef25402f5 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:31 -07:00
Anjali Singhai Jain	a316f651c7	i40e: Fix an issue when PF reset fails We shouldn't restart Admin queue subtask if PF reset fails since we do not have the AQ setup at that point. This patch makes sure we disable AQ clean subtask when PF reset fails. This will resolve an occasional kernel panic when PF reset fails for some reason. Change-ID: I11a747773362a8c5c0ad7a10cd34be0bda8eb9e8 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:30 -07:00
Jesse Brandeburg	faf3297861	i40e: make warning less verbose The driver is un-necessarily printing a warning that is only marginally useful to the user. Make the warning only print if extended driver string printing is enabled, other messages related to a reset event will still continue to print. Change-ID: I5e8beca6516a2f176cd2e72b0ac2b3b909e6c953 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:30 -07:00
Catherine Sullivan	9412851629	i40e: Tell OS link is going down when calling set_phy_config Since we don't seem to be getting an LSE telling us link is going down during set_phy_config (but we do get an LSE telling us we are coming back up), fake one for the OS and tell them link is going down. Also do an atomic restart no matter what because there are times the user may want to end with link up even if they started with link down (like if they accidentally set it to a speed that can't link and are trying to fix it). Change-ID: I0a642af9c1d0feb67bce741aba1a9c33bd349ed6 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-04 01:38:29 -07:00

1 2 3 4 5 ...

468622 Commits