linux

Author	SHA1	Message	Date
Ido Shamay	b1b6b4da78	net/mlx4_en: Add mlx4_en_get_cqe helper This function derives the base address of the CQE from the CQE size, and calculates the real CQE context segment in it from the factor (this is like before). Before this change the code used the factor to calculate the base address of the CQE as well. The factor indicates in which segment of the cqe stride the cqe information is located. For 32-byte strides, the segment is 0, and for 64 byte strides, the segment is 1 (bytes 32..63). Using the factor was ok as long as we had only 32 and 64 byte strides. However, with larger strides, the factor is zero, and so cannot be used to calculate the base of the CQE. The helper uses the same method of CQE buffer pulling made by all other components that reads the CQE buffer (mlx4_ib driver and libmlx4). Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:30:11 -04:00
Ido Shamay	43c816c67a	net/mlx4_core: Cache line EQE size support Enable mlx4 interrupt handler to work with EQE stride feature, The feature may be enabled when cache line is bigger than 64B. The EQE size will then be the cache line size, and the context segment resides in [0-31] offset. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:30:10 -04:00
Ido Shamay	77507aa249	net/mlx4_core: Enable CQE/EQE stride support This feature is intended for archs having cache line larger then 64B. Since our CQE/EQEs are generally 64B in those systems, HW will write twice to the same cache line consecutively, causing pipe locks due to he hazard prevention mechanism. For elements in a cyclic buffer, writes are consecutive, so entries smaller than a cache line should be avoided, especially if they are written at a high rate. Reduce consecutive writes to same cache line in CQs/EQs, by allowing the driver to increase the distance between entries so that each will reside in a different cache line. Until the introduction of this feature, there were two types of CQE/EQE: 1. 32B stride and context in the [0-31] segment 2. 64B stride and context in the [32-63] segment This feature introduces two additional types: 3. 128B stride and context in the [0-31] segment (128B cache line) 4. 256B stride and context in the [0-31] segment (256B cache line) Modify the mlx4_core driver to query the device for the CQE/EQE cache line stride capability and to enable that capability when the host cache line size is larger than 64 bytes (supported cache lines are 128B and 256B). The mlx4 IB driver and libmlx4 need not be aware of this change. The PF context behaviour is changed to require this change in VF drivers running on such archs. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:30:10 -04:00
Sabrina Dubroca	54003f119c	net: fix sparse warnings in SNMP_UPD_PO_STATS(_BH) ptr used to be a non __percpu pointer (result of a this_cpu_ptr assignment, `7d720c3e4f` ("percpu: add __percpu sparse annotations to net")). Since `d25398df59` ("net: avoid reloads in SNMP_UPD_PO_STATS"), that's no longer the case, SNMP_UPD_PO_STATS uses this_cpu_add and ptr is now __percpu. Silence sparse warnings by preserving the original type and annotation, and remove the out-of-date comment. warning: incorrect type in initializer (different address spaces) expected unsigned long long ptr got unsigned long long [noderef] <asn:3><noident> warning: incorrect type in initializer (different address spaces) expected void const [noderef] <asn:3>__vpp_verify got unsigned long long <noident> warning: incorrect type in initializer (different address spaces) expected void const [noderef] <asn:3>__vpp_verify got unsigned long long <noident> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:22:31 -04:00
David S. Miller	fb5690d245	Merge branch 'fou-next' Tom Herbert says: ==================== net: foo-over-udp (fou) This patch series implements foo-over-udp. The idea is that we can encapsulate different IP protocols in UDP packets. The rationale for this is that networking devices such as NICs and switches are usually implemented with UDP (and TCP) specific mechanims for processing. For instance, many switches and routers will implement a 5-tuple hash for UDP packets to perform Equal Cost Multipath Routing (ECMP) or RSS (on NICs). Many NICs also only provide rudimentary checksum offload (basic TCP and UDP packet), with foo-over-udp we may be able to leverage these NICs to offload checksums of tunneled packets (using checksum unnecessary conversion and eventually remote checksum offload) An example encapsulation of IPIP over FOU is diagrammed below. As illustrated, the packet overhead for FOU is the 8 byte UDP header. +------------------+ \| IPv4 hdr \| +------------------+ \| UDP hdr \| +------------------+ \| IPv4 hdr \| +------------------+ \| TCP hdr \| +------------------+ \| TCP payload \| +------------------+ Conceptually, FOU should be able to encapsulate any IP protocol. The FOU header (UDP hdr.) is essentially an inserted header between the IP header and transport, so in the case of TCP or UDP encapsulation the pseudo header would be based on the outer IP header and its length field must not include the UDP header. * Receive In this patch set the RX path for FOU is implemented in a new fou module. To enable FOU for a particular protocol, a UDP-FOU socket is opened to the port to receive FOU packets. The socket is mapped to the IP protocol for the packets. The XFRM mechanism used to receive encapsulated packets (udp_encap_rcv) for the port. Upon reception, the UDP is removed and packet is reinjected in the stack for the corresponding protocol associated with the socket (return -protocol from udp_encap_rcv function). GRO is provided with the appropriate fou_gro_receive and fou_gro_complete. These routines need to know the encapsulation protocol so we save that in udp_offloads structure with the port and pass it in the napi_gro_cb structure. * TX This patch series implements FOU transmit encapsulation for IPIP, GRE, and SIT. This done by some common infrastructure in ip_tunnel including an ip_tunnel_encap to perform FOU encapsulation and common configuration to enable FOU on IP tunnels. FOU is configured on existing tunnels and does not create any new interfaces. The transmit and receive paths are independent, so use of FOU may be assymetric between tunnel endpoints. * Configuration The fou module using netlink to configure FOU receive ports. The ip command can be augmented with a fou subcommand to support this. e.g. to configure FOU for IPIP on port 5555: ip fou add port 5555 ipproto 4 GRE, IPIP, and SIT have been modified with netlink commands to configure use of FOU on transmit. The "ip link" command will be augmented with an encap subcommand (for supporting various forms of secondary encapsulation). For instance, to configure an ipip tunnel with FOU on port 5555: ip link add name tun1 type ipip \ remote 192.168.1.1 local 192.168.1.2 ttl 225 \ encap fou encap-sport auto encap-dport 5555 * Notes - This patch set does not implement GSO for FOU. The UDP encapsulation code assumes TEB, so that will need to be reimplemented. - When a packet is received through FOU, the UDP header is not actually removed for the skbuf, pointers to transport header and length in the IP header are updated (like in ESP/UDP RX). A side effect is the IP header will now appear to have an incorrect checksum by an external observer (e.g. tcpdump), it will be off by sizeof UDP header. If necessary we could adjust the checksum to compensate. - Performance results are below. My expectation is that FOU should entail little overhead (clearly there is some work to do :-) ). Optimizing UDP socket lookup for encapsulation ports should help significantly. - I really don't expect/want devices to have special support for any of this. Generic checksum offload mechanisms (NETIF_HW_CSUM and use of CHECKSUM_COMPLETE) should be sufficient. RSS and flow steering is provided by commonly implemented UDP hashing. GRO/GSO seem fairly comparable with LRO/TSO already. * Performance Ran netperf TCP_RR and TCP_STREAM tests across various configurations. This was performed on bnx2x and I disabled TSO/GSO on sender to get fair comparison for FOU versus non-FOU. CPU utilization is reported for receive in TCP_STREAM. GRE IPv4, FOU, UDP checksum enabled TCP_STREAM 24.85% CPU utilization 9310.6 Mbps TCP_RR 94.2% CPU utilization 155/249/460 90/95/99% latencies 1.17018e+06 tps IPv4, FOU, UDP checksum disabled TCP_STREAM 31.04% CPU utilization 9302.22 Mbps TCP_RR 94.13% CPU utilization 154/239/419 90/95/99% latencies 1.17555e+06 tps IPv4, no FOU TCP_STREAM 23.13% CPU utilization 9354.58 Mbps TCP_RR 90.24% CPU utilization 156/228/360 90/95/99% latencies 1.18169e+06 tps IPIP FOU, UDP checksum enabled TCP_STREAM 24.13% CPU utilization 9328 Mbps TCP_RR 94.23 149/237/429 90/95/99% latencies 1.19553e+06 tps FOU, UDP checksum disabled TCP_STREAM 29.13% CPU utilization 9370.25 Mbps TCP_RR 94.13% CPU utilization 149/232/398 90/95/99% latencies 1.19225e+06 tps No FOU TCP_STREAM 10.43% CPU utilization 5302.03 Mbps TCP_RR 51.53% CPU utilization 215/324/475 90/95/99% latencies 864998 tps SIT FOU, UDP checksum enabled TCP_STREAM 30.38% CPU utilization 9176.76 Mbps TCP_RR 96.9% CPU utilization 170/281/581 90/95/99% latencies 1.03372e+06 tps FOU, UDP checksum disabled TCP_STREAM 39.6% CPU utilization 9176.57 Mbps TCP_RR 97.14% CPU utilization 167/272/548 90/95/99% latencies 1.03203e+06 tps No FOU TCP_STREAM 11.2% CPU utilization 4636.05 Mbps TCP_RR 59.51% CPU utilization 232/346/489 90/95/99% latencies 813199 tps v2: - Removed encap IP tunnel ioctls, configuration is done by netlink only. - Don't export fou_create and fou_destroy, they are currently intended to be called within fou module only. - Filled on tunnel netlink structures and functions for new values. v3: - Fixed change logs for some of the patches. - Remove inline from fou_gro_receive and fou_gro_complete, let compiler decide on these. v4: - Don't need to cast void in fou_from_sock - Removed incorrest htons for port in fou_destroy - Some minor cleanup for readability ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:40 -04:00
Tom Herbert	4565e9919c	gre: Setup and TX path for gre/UDP foo-over-udp encapsulation Added netlink attrs to configure FOU encapsulation for GRE, netlink handling of these flags, and properly adjust MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	473ab820dd	ipip: Setup and TX path for ipip/UDP foo-over-udp encapsulation Add netlink handling for IP tunnel encapsulation parameters and and adjustment of MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	14909664e4	sit: Setup and TX path for sit/UDP foo-over-udp encapsulation Added netlink handling of IP tunnel encapulation paramters, properly adjust MTU for encapsulation. Added ip_tunnel_encap call to ipip6_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	5632848653	net: Changes to ip_tunnel to support foo-over-udp encapsulation This patch changes IP tunnel to support (secondary) encapsulation, Foo-over-UDP. Changes include: 1) Adding tun_hlen as the tunnel header length, encap_hlen as the encapsulation header length, and hlen becomes the grand total of these. 2) Added common netlink define to support FOU encapsulation. 3) Routines to perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	afe93325bc	fou: Add GRO support Implement fou_gro_receive and fou_gro_complete, and populate these in the correponsing udp_offloads for the socket. Added ipproto to udp_offloads and pass this from UDP to the fou GRO routine in proto field of napi_gro_cb structure. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:31 -04:00
Tom Herbert	23461551c0	fou: Support for foo-over-udp RX path This patch provides a receive path for foo-over-udp. This allows direct encapsulation of IP protocols over UDP. The bound destination port is used to map to an IP protocol, and the XFRM framework (udp_encap_rcv) is used to receive encapsulated packets. Upon reception, the encapsulation header is logically removed (pointer to transport header is advanced) and the packet is reinjected into the receive path with the IP protocol indicated by the mapping. Netlink is used to configure FOU ports. The configuration information includes the port number to bind to and the IP protocol corresponding to that port. This should support GRE/UDP (http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02), as will as the other IP tunneling protocols (IPIP, SIT). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:31 -04:00
Tom Herbert	ce3e02867e	net: Export inet_offloads and inet6_offloads Want to be able to use these in foo-over-udp offloads, etc. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:31 -04:00
John Fastabend	4e2840eee6	net: sched: cls_u32: rcu can not be last node tc_u32_sel 'sel' in tc_u_knode expects to be the last element in the structure and pads the structure with tc_u32_key fields for each key. kzalloc(sizeof(n) + s->nkeyssizeof(struct tc_u32_key), GFP_KERNEL) CC: Eric Dumazet <edumazet@google.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:05:45 -04:00
Eric Dumazet	ab34f64808	net: sched: use __skb_queue_head_init() where applicable pfifo_fast and htb use skb lists, without needing their spinlocks. (They instead use the standard qdisc lock) We can use __skb_queue_head_init() instead of skb_queue_head_init() to be consistent. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:32:10 -04:00
David S. Miller	0ce77802f3	Merge branch 'bnx2x-next' Yuval Mintz says: ==================== bnx2x: Support new Multi-function modes This patch series adds support for 2 new Multi-function modes - Unified Fabric Port [UFP] as well as nic partitioning 1.5 [NPAR1.5]. With the addition of the new multi-function modes, the series also revises some of the storage-related multi-function macros. [Do notice this series has several small issues with checkpatch] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:31:13 -04:00
Yuval Mintz	83bad206f7	bnx2x: Add a fallback multi-function mode NPAR1.5 When using new Multi-function modes it's possible that due to incompatible configuration management FW will fallback into an existing mode. Notice that at the moment this fallback is exactly the same as the already existing switch-independent multi-function mode, but we still use existing infrastructure to hold this information [in case some small differences will arise in the future]. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:31:08 -04:00
Yuval Mintz	7609647e25	bnx2x: New multi-function mode: UFP Add support for a new multi-function mode based on the Unified Fabric Port system specifications. Support includes configuration of: 1. Outer vlan tags. 2. Bandwidth settings. 3. Virtual link enable/disable. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:31:08 -04:00
Dmitry Kravkov	2e98ffc21c	bnx2x: Changes with storage & MAC macros Rearrange macros to query for storage-only modes in different MF environment. Improves the readibility and maintainability of the code. E.g.: - if (IS_MF_STORAGE_SD(bp) \|\| IS_MF_FCOE_AFEX(bp)) + if (IS_MF_STORAGE_ONLY(bp)) In addition, this removes the need for bnx2x_is_valid_ether_addr(). Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:31:08 -04:00
David S. Miller	77f4f6220a	Merge branch 'fec-next' Florian Fainelli says: ==================== net: phy: Broadcom BCM7xxx PHY workaround update This patch sets the change to of_phy_connect() that you have seen before, this time with the full context of why it is useful and applicable here. Due to some design decision, the internal PHY on Broadcom BCM7xxx chips is not entirely self contained and does not report its internal revision through MII_PHYSID2, that is left to external PHY designs. This forces us to get the PHY revision from the GENET and SF2 switch drivers because those two peripherals integrate such a PHY and do contain the PHY revision in their registers. The approach taken here is hopefully easy to extend to similar needs for other chips/ as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:13 -04:00
Florian Fainelli	d8ebfed3f1	net: phy: bcm7xxx: utilize PHY revision in config_init Now that the GENET and SF2 drivers have been updated to communicate us what is the revision of the BCM7xxx integrated PHY, utilize that information in the config_init() callback to call into the appropriate workaround function based on our revision. While at it, we also print the revision and patch level to help debug new chips. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	aa9aef77c7	net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver The integrated BCM7xxx PHY contains no useful revision information in its MII_PHYSID2 bits 3:0, that information is instead contained in the SWITCH_REG_PHY_REVISION register. Read this register, store its value, and return it by implementing the dsa_switch::get_phy_flags() callback accordingly. The register layout is already matching what the BCM7xxx PHY driver is expecting to find. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	6819563e64	net: dsa: allow switch drivers to specify phy_device::dev_flags Some switch drivers (e.g: bcm_sf2) may have to communicate specific workarounds or flags towards the PHY device driver. Allow switches driver to be delegated that task by introducing a get_phy_flags() callback which will do just that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	487320c541	net: bcmgenet: communicate integrated PHY revision to PHY driver The integrated BCM7xxx PHY contains no useful revision information in its MII_PHYSID2 bits 3:0, that information is instead contained in the GENET hardware block. We already read the GENET 32-bit revision register, so store the integrated PHY revision in the driver private structure, and then communicate this revision value to the PHY driver by overriding the phy_flags value. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	80780a54ec	net: bcmgenet: remove PHY_BRCM_100MBPS_WAR Now that we have removed the need for the PHY_BRCM_100MBPS_WAR flag, we can remove it from the GENET driver and the broadcom shared header file. The PHY driver checks the PHY supported bitmask instead. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	e18556ee3b	net: phy: bcm7xxx: do not use PHY_BRCM_100MBPS_WAR There is no need for the PHY driver to check PHY_BRCM_100MBPS_WAR since that is redundant with checking the PHY device supported features. Get rid of that workaround flag. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	bb7d93496f	net: phy: broadcom: add helper for PHY revision and patch level The Broadcom BCM7xxx internal PHYs do not contain any useful revision information in the low 4-bits of their MII_PHYSID2 (MII register 3) which could allow us to properly identify them. As a result, we need the actual hardware block integrating these PHYs: GENET or the SF2 switch to tell us what revision they are built with. To assist with that, add two helper macros for fetching the the PHY revision and patch level from the struct phy_device::dev_flags. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Florian Fainelli	2f63715138	of: mdio: honor flags passed to of_phy_connect Commit `f9a8f83b04` ("net: phy: remove flags argument from phy_{attach, connect, connect_direct}") removed the flags argument to the PHY library calls to: phy_{attach,connect,connect_direct}. Most Device Tree aware drivers call of_phy_connect() with the flag argument set to 0, but some of them might want to set a different value there in order for the PHY driver to key a specific behavior based on the phy_device::phy_flags value. Allow such drivers to set custom phy_flags as part of the of_phy_connect() call since of_phy_connect() does start the PHY state machine, it will call into the PHY driver config_init() callback which is usually where a specific phy_flags value is important. Fixes: `f9a8f83b04` ("net: phy: remove flags argument from phy_{attach, connect, connect_direct}") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Eric Dumazet	2e4e441071	net: add alloc_skb_with_frags() helper Extract from sock_alloc_send_pskb() code building skb with frags, so that we can reuse this in other contexts. Intent is to use it from tcp_send_rcvq(), tcp_collapse(), ... We also want to replace some skb_linearize() calls to a more reliable strategy in pathological cases where we need to reduce number of frags. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:25:23 -04:00
Eric Dumazet	cb93471acc	tcp: do not fake tcp headers in tcp_send_rcvq() Now we no longer rely on having tcp headers for skbs in receive queue, tcp repair do not need to build fake ones. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:04:13 -04:00
David S. Miller	3ff6425961	Merge branch 'udp-tunnel-common' Andy Zhou says: ==================== Refactor vxlan and l2tp to use new common UDP tunnel APIs This patch series add a few more UDP tunnel APIs and refactoring current UDP tunnel based protocols, vxlan and l2tp to make use of the new APIs. The added APIs are setup_udp_tunnel_sock(), udp_tunnel_xmit_skb() and udp_tunnel_sock_release(). Those implementation logics already exist in current vxlan and l2tp implementation. Move them to common APIs to reduce code duplications. Also split udp_tunnel.c into net/ipv4/udp_tunnel.c and net/ipv6/ip6_udp_tunnel.c to maintain proper IP protocol separation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:46 -04:00
Andy Zhou	c8fffcea0a	l2tp: Refactor l2tp core driver to make use of the common UDP tunnel functions Simplify l2tp implementation using common UDP tunnel APIs. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Andy Zhou	acbf74a763	vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions. Simplify vxlan implementation using common UDP tunnel APIs. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Andy Zhou	6a93cc9052	udp-tunnel: Add a few more UDP tunnel APIs Added a few more UDP tunnel APIs that can be shared by UDP based tunnel protocol implementation. The main ones are highlighted below. setup_udp_tunnel_sock() configures UDP listener socket for receiving UDP encapsulated packets. udp_tunnel_xmit_skb() and upd_tunnel6_xmit_skb() transmit skb using UDP encapsulation. udp_tunnel_sock_release() closes the UDP tunnel listener socket. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Andy Zhou	fd384412e1	udp_tunnel: Seperate ipv6 functions into its own file. Add ip6_udp_tunnel.c for ipv6 UDP tunnel functions to avoid ifdefs in udp_tunnel.c Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
David S. Miller	79ba2b4c5d	Merge branch 'fec-next' Frank Li says: ==================== net: fec: add interrupt coalescence improve error handle when parse queue number. add interrupt coalescence feature. Change from v2 to v3 - add error check in fec_enet_set_coalesce - fix a run time warning to get clock rate in interrupt - fix commit message use TKT number Change from v1 to v2 - fix indention - use errata number instead of TKT ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:36:54 -04:00
Fugang Duan	37d6017b84	net: fec: Workaround for imx6sx enet tx hang when enable three queues When enable three queues on imx6sx enet, and then do tx performance test with iperf tool, after some time running, tx hang. Found that: If uDMA is running, software set TDAR may cause tx hang. If uDMA is in idle, software set TDAR don't cause tx hang. There is a TDAR race condition for mutliQ when the software sets TDAR and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). This will cause the udma_tx and udma_tx_arbiter state machines to hang. The issue exist at i.MX6SX enet IP. So, the Workaround is checking TDAR status four time, if TDAR cleared by hardware and then write TDAR, otherwise don't set TDAR. The patch is only one Workaround for the issue ERR007885. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:36:50 -04:00
Fugang Duan	73e7228941	net:fec: increase DMA queue number when enable interrupt coalesce, 8 BD is not enough. Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:36:50 -04:00
Fugang Duan	d851b47b22	net: fec: add interrupt coalescence feature support i.MX6 SX support interrupt coalescence feature By default, init the interrupt coalescing frame count threshold and timer threshold. Supply the ethtool interfaces as below for user tuning to improve enet performance: rx_max_coalesced_frames rx_coalesce_usecs tx_max_coalesced_frames tx_coalesce_usecs Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:36:50 -04:00
Frank Li	b7bd75cf53	net: fec: refine error handle of parser queue number from DT check tx and rx queue seperately. fix typo, "Invalidate" and "fail". change pr_err to pr_warn. Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:36:49 -04:00
Alexei Starovoitov	709f6c58d4	sparc: bpf_jit: add SKF_AD_PKTTYPE support to JIT commit `233577a220` ("net: filter: constify detection of pkt_type_offset") allows us to implement simple PKTTYPE support in sparc JIT Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:34:40 -04:00
Frank Li	bf3c228d36	net: fec: fix build error at m68k platform reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout `4d494cdc92` make.cross ARCH=m68k m5272c3_defconfig make.cross ARCH=m68k drivers/net/ethernet/freescale/fec.h:262:0: warning: "FEC_R_DES_START" redefined #define FEC_R_DES_START(X) ((X == 1) ? FEC_R_DES_START_1 : \ ^ drivers/net/ethernet/freescale/fec.h:158:0: note: this is the location of the previous definition #define FEC_R_DES_START 0x3d0 /* Receive descriptor ring */ ^ drivers/net/ethernet/freescale/fec.h:265:0: warning: "FEC_X_DES_START" redefined #define FEC_X_DES_START(X) ((X == 1) ? FEC_X_DES_START_1 : \ ... Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 16:50:14 -04:00
John Fastabend	9f6c38e70b	net: sched: cls_cgroup need tcf_exts_init in all cases This ensures the tcf_exts_init() is called for all cases. Fixes: `952313bd62` ("net: sched: cls_cgroup use RCU") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 16:26:39 -04:00
David S. Miller	2d9d65fa44	Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch Following patches adds recirculation and hash action to OVS. First patch removes pointer to stack object. Next three patches does code restructuring which is required for last patch. Recirculation implementation is changed, according to comments from David Miller, to avoid using recursive calls in OVS. It is using queue to record recirc action and deferred recirc is executed at the end of current actions execution. v1-v2: Changed subsystem name in subject to openvswitch v2-v3: Added patch to remove pkt_key pointer from skb->cb. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 16:21:48 -04:00
John Fastabend	e1f93eb06c	net: sched: cls_fw: add missing tcf_exts_init call in fw_change() When allocating a new structure we also need to call tcf_exts_init to initialize exts. A follow up patch might be in order to remove some of this code and do tcf_exts_assign(). With this we could remove the tcf_exts_init/tcf_exts_change pattern for some of the classifiers. As part of the future tcf_actions RCU series this will need to be done. For now fix the call here. Fixes `e35a8ee599` ("net: sched: fw use RCU") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
John Fastabend	d14cbfc88f	net: sched: cls_cgroup fix possible memory leak of 'new' tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master head: `54996b529a` commit: `c7953ef230` [625/646] net: sched: cls_cgroup use RCU net/sched/cls_cgroup.c:130 cls_cgroup_change() warn: possible memory leak of 'new' net/sched/cls_cgroup.c:135 cls_cgroup_change() warn: possible memory leak of 'new' net/sched/cls_cgroup.c:139 cls_cgroup_change() warn: possible memory leak of 'new' Fixes: `c7953ef230` ("net: sched: cls_cgroup use RCU") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
John Fastabend	a96366bf26	net: sched: cls_u32 add missing rcu_assign_pointer and annotation Add missing rcu_assign_pointer and missing annotation for ht_up in cls_u32.c Caught by kbuild bot, >> net/sched/cls_u32.c:378:36: sparse: incorrect type in initializer (different address spaces) net/sched/cls_u32.c:378:36: expected struct tc_u_hnode ht net/sched/cls_u32.c:378:36: got struct tc_u_hnode [noderef] <asn:4>ht_up >> net/sched/cls_u32.c:610:54: sparse: incorrect type in argument 4 (different address spaces) net/sched/cls_u32.c:610:54: expected struct tc_u_hnode ht net/sched/cls_u32.c:610:54: got struct tc_u_hnode [noderef] <asn:4>ht_up >> net/sched/cls_u32.c:684:18: sparse: incorrect type in assignment (different address spaces) net/sched/cls_u32.c:684:18: expected struct tc_u_hnode [noderef] <asn:4>ht_up net/sched/cls_u32.c:684:18: got struct tc_u_hnode [assigned] ht >> net/sched/cls_u32.c:359:18: sparse: dereference of noderef expression Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
John Fastabend	80aab73de4	net: sched: fix unsued cpu variable kbuild test robot reported an unused variable cpu in cls_u32.c after the patch below. This happens when PERF and MARK config variables are disabled Fix this is to use separate variables for perf and mark and define the cpu variable inside the ifdef logic. Fixes: `459d5f626d` ("net: sched: make cls_u32 per cpu")' Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
WANG Cong	69301eaa7f	net_sched: fix a null pointer dereference in tcindex_set_parms() This patch fixes the following crash: [ 42.199159] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 42.200027] IP: [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526 [ 42.200027] PGD d2319067 PUD d4ffe067 PMD 0 [ 42.200027] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 42.200027] CPU: 0 PID: 541 Comm: tc Not tainted 3.17.0-rc4+ #603 [ 42.200027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 42.200027] task: ffff8800d22d2670 ti: ffff8800ce790000 task.ti: ffff8800ce790000 [ 42.200027] RIP: 0010:[<ffffffff817e3fc4>] [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526 [ 42.200027] RSP: 0018:ffff8800ce793898 EFLAGS: 00010202 [ 42.200027] RAX: 0000000000000001 RBX: ffff8800d1786498 RCX: 0000000000000000 [ 42.200027] RDX: ffffffff82114ec8 RSI: ffffffff82114ec8 RDI: ffffffff82114ec8 [ 42.200027] RBP: ffff8800ce793958 R08: 00000000000080d0 R09: 0000000000000001 [ 42.200027] R10: ffff8800ce7939a0 R11: 0000000000000246 R12: ffff8800d017d238 [ 42.200027] R13: 0000000000000018 R14: ffff8800d017c6a0 R15: ffff8800d1786620 [ 42.200027] FS: 00007f4e24539740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 [ 42.200027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.200027] CR2: 0000000000000018 CR3: 00000000cff38000 CR4: 00000000000006f0 [ 42.200027] Stack: [ 42.200027] ffff8800ce0949f0 0000000000000000 0000000200000003 ffff880000000000 [ 42.200027] ffff8800ce7938b8 ffff8800ce7938b8 0000000600000007 0000000000000000 [ 42.200027] ffff8800ce7938d8 ffff8800ce7938d8 0000000600000007 ffff8800ce0949f0 [ 42.200027] Call Trace: [ 42.200027] [<ffffffff817e4169>] tcindex_change+0xdb/0xee [ 42.200027] [<ffffffff817c16ca>] tc_ctl_tfilter+0x44d/0x63f [ 42.200027] [<ffffffff8179d161>] rtnetlink_rcv_msg+0x181/0x194 [ 42.200027] [<ffffffff8179cf9d>] ? rtnl_lock+0x17/0x19 [ 42.200027] [<ffffffff8179cfe0>] ? __rtnl_unlock+0x17/0x17 [ 42.200027] [<ffffffff817ee296>] netlink_rcv_skb+0x49/0x8b [ 43.462494] [<ffffffff8179cfc2>] rtnetlink_rcv+0x23/0x2a [ 43.462494] [<ffffffff817ec8df>] netlink_unicast+0xc7/0x148 [ 43.462494] [<ffffffff817ed413>] netlink_sendmsg+0x5cb/0x63d [ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224 [ 43.462494] [<ffffffff817757b8>] __sock_sendmsg_nosec+0x25/0x27 [ 43.462494] [<ffffffff81778165>] sock_sendmsg+0x57/0x71 [ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4 [ 43.462494] [<ffffffff81152c06>] ? might_fault+0xa0/0xa4 [ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4 [ 43.462494] [<ffffffff817838fd>] ? verify_iovec+0x69/0xb7 [ 43.462494] [<ffffffff817784f8>] ___sys_sendmsg+0x21d/0x2bb [ 43.462494] [<ffffffff81009db3>] ? native_sched_clock+0x35/0x37 [ 43.462494] [<ffffffff8109ab53>] ? sched_clock_local+0x12/0x72 [ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224 [ 43.462494] [<ffffffff8109ada4>] ? sched_clock_cpu+0xa0/0xb9 [ 43.462494] [<ffffffff810aee37>] ? __lock_acquire+0x5fe/0xde4 [ 43.462494] [<ffffffff8119f570>] ? rcu_read_lock_held+0x36/0x38 [ 43.462494] [<ffffffff8119f75a>] ? __fcheck_files.isra.7+0x4b/0x57 [ 43.462494] [<ffffffff8119fbf2>] ? __fget_light+0x30/0x54 [ 43.462494] [<ffffffff81779012>] __sys_sendmsg+0x42/0x60 [ 43.462494] [<ffffffff81779042>] SyS_sendmsg+0x12/0x1c [ 43.462494] [<ffffffff819d24d2>] system_call_fastpath+0x16/0x1b 'p->h' could be NULL while 'cp->h' is always update to date. Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-By: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:20:09 -04:00
WANG Cong	44b75e4317	net_sched: fix memory leak in cls_tcindex Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-By: John Fastabend <john.r.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:19:23 -04:00
Andy Zhou	971427f353	openvswitch: Add recirc and hash action. Recirc action allows a packet to reenter openvswitch processing. currently openvswitch lookup flow for packet received and execute set of actions on that packet, with help of recirc action we can process/modify the packet and recirculate it back in openvswitch for another pass. OVS hash action calculates 5-tupple hash and set hash in flow-key hash. This can be used along with recirculation for distributing packets among different ports for bond devices. For example: OVS bonding can use following actions: Match on: bond flow; Action: hash, recirc(id) Match on: recirc-id == id and hash lower bits == a; Action: output port_bond_a Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 23:28:14 -07:00

1 2 3 4 5 ...

469873 Commits