linux

Author	SHA1	Message	Date
Lendacky, Thomas	bd8255d8ba	amd-xgbe: Prepare for supporting PCI devices Update the driver framework to separate out platform/ACPI specific code from general code during device initialization. This will allow for the introduction of PCI device support. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:45 -04:00
Lendacky, Thomas	4b8acdf5fe	amd-xgbe: Update how to determine DMA channel status Tx and Rx DMA channel status determination is different depending on the version of the hardware. Update the channel status processing code to account for the change. Also, reduce the timeout value used when stopping the channels. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:45 -04:00
Lendacky, Thomas	e5a20b9072	amd-xgbe: Support for 64-bit management counter registers Add support for reading all management counter registers as 64-bit values. The indication of whether to read the high 32-bits to form a 64-bit value is indicated in the version data. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:44 -04:00
Lendacky, Thomas	b03a4a6fb3	amd-xgbe: Prepare for a new PCS register access method Prepare the code to be able to support accessing of the PCS registers in a new way, while maintaining the current access method. Provide a version specific field that indicates the method to use. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:44 -04:00
Lendacky, Thomas	1bf40ada62	amd-xgbe: Add support for clause 37 auto-negotiation Add support to be able to use clause 37 auto-negotiation. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:44 -04:00
Lendacky, Thomas	a64def4161	amd-xgbe: Prepare for introduction of clause 37 autoneg Prepare for the future introduction of clause 37 auto-negotiation by updating the current auto-negotiation related functions to identify them as clause 73 functions. Move interrupt enablement to the enable/disable auto-negotiation functions. Update what will be common routines to check for the current type of AN and process accordingly. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:44 -04:00
Lendacky, Thomas	e57f7a3fea	amd-xgbe: Prepare for working with more than one type of phy Prepare the code to be able to work with more than one type of phy by adding additional callable functions into the phy interface and removing phy specific settings/functions from non-phy related files. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:43 -04:00
Lendacky, Thomas	43e0dcf708	amd-xgbe: Perform priority-based hardware FIFO allocation Allocate the FIFO across the hardware Rx queues based on the priority of the queues, giving more FIFO resources to queues with a higher priority. If PFC is active but not enabled for a queue, then fewer resources can be allocated to the queue. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:43 -04:00
Lendacky, Thomas	586e3cfb26	amd-xgbe: Prepare for priority-based FIFO allocation Currently, the Rx and Tx fifos are evenly allocated between the hardware queues of the device. As more queues are instantiated, the fifo memory needs to be able to be allocated based on queue priority. This allows for higher priority queues to have more fifo memory than lower priority queues. Prepare for this by modifying the current fifo calculation to assign the fifo queue allocation in an array that is then used to program the hardware. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:43 -04:00
Lendacky, Thomas	d9682c90cf	amd-xgbe: Fix formatting of PCS register dump Fix the length value used for the PCS register dump so that the full value can be displayed. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:48:42 -04:00
David S. Miller	4fb7450683	Merge branch 'uid-routing' Lorenzo Colitti says: ==================== net: inet: Support UID-based routing This patchset adds support for per-UID routing. It allows the administrator to configure rules such as: ip rule add uidrange 100-200 lookup 123 This functionality has been in use by all Android devices since 5.0. It is primarily used to impose per-app routing policies (on Android, every app has its own UID) without having to resort to rerouting packets in iptables, which breaks getsockname() and MTU/MSS calculation, and generally disrupts end-to-end connectivity. This patch series is similar to the code currently used on Android, but has better correctness and performance because it stores the UID in the socket instead of calling sock_i_uid. This avoids contention on sk->sk_callback_lock, and makes it possible to correctly route a socket on which userspace has called close(), for which sock_i_uid will return 0. Changes from v1: - Don't set the UID in sk_clone_lock, it's already set by sock_copy. - For packets originated by kernel sockets, don't use the socket UID. This is the UID that created the namespace, but it might not be mapped in the namespace at all. Instead, use UID 0 in the namespace, which is less surprising and consistent with what happens in the root namespace. - Fix UID routing of IPv4 and IPv6 SYN_RECV sockets. - Fix UID routing of received IPv6 redirects. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:45:24 -04:00
Lorenzo Colitti	e2d118a1cb	net: inet: Support UID-based routing in IP protocols. - Use the UID in routing lookups made by protocol connect() and sendmsg() functions. - Make sure that routing lookups triggered by incoming packets (e.g., Path MTU discovery) take the UID of the socket into account. - For packets not associated with a userspace socket (e.g., ping replies), use UID 0 inside the user namespace corresponding to the network namespace the socket belongs to. This allows all namespaces to apply routing and iptables rules to kernel-originated traffic in that namespace by matching UID 0. This is better than using the UID of the kernel socket that is sending the traffic, because the UID of kernel sockets created at namespace creation time (e.g., the per-processor ICMP and TCP sockets) is the UID of the user that created the socket, which might not be mapped in the namespace. Tested: compiles allnoconfig, allyesconfig, allmodconfig Tested: https://android-review.googlesource.com/253302 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:45:23 -04:00
Lorenzo Colitti	622ec2c9d5	net: core: add UID to flows, rules, and routes - Define a new FIB rule attribute, FRA_UID_RANGE, to describe a range of UIDs. - Define a RTA_UID attribute for per-UID route lookups and dumps. - Support passing these attributes to and from userspace via rtnetlink. The value INVALID_UID indicates no UID was specified. - Add a UID field to the flow structures. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:45:23 -04:00
Lorenzo Colitti	86741ec254	net: core: Add a UID field to struct sock. Protocol sockets (struct sock) don't have UIDs, but most of the time, they map 1:1 to userspace sockets (struct socket) which do. Various operations such as the iptables xt_owner match need access to the "UID of a socket", and do so by following the backpointer to the struct socket. This involves taking sk_callback_lock and doesn't work when there is no socket because userspace has already called close(). Simplify this by adding a sk_uid field to struct sock whose value matches the UID of the corresponding struct socket. The semantics are as follows: 1. Whenever sk_socket is non-null: sk_uid is the same as the UID in sk_socket, i.e., matches the return value of sock_i_uid. Specifically, the UID is set when userspace calls socket(), fchown(), or accept(). 2. When sk_socket is NULL, sk_uid is defined as follows: - For a socket that no longer has a sk_socket because userspace has called close(): the previous UID. - For a cloned socket (e.g., an incoming connection that is established but on which userspace has not yet called accept): the UID of the socket it was cloned from. - For a socket that has never had an sk_socket: UID 0 inside the user namespace corresponding to the network namespace the socket belongs to. Kernel sockets created by sock_create_kern are a special case of #1 and sk_uid is the user that created them. For kernel sockets created at network namespace creation time, such as the per-processor ICMP and TCP sockets, this is the user that created the network namespace. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:45:22 -04:00
David S. Miller	0d53072aa4	Merge branch 'dsa-mv88e6xxx-port-operation-refine' Vivien Didelot says: ==================== net: dsa: mv88e6xxx: refine port operations The Marvell chips have one internal SMI device per port, containing a set of registers used to configure a port's link, STP state, default VLAN or addresses database, etc. This patchset creates port files to implement the port operations as described in datasheets, and extend the chip ops structure with them. Patches 1 to 6 implement accessors for port's STP state, port based VLAN map, default FID, default VID, and 802.1Q mode. Patches 7 to 11 implement the port's MAC setup of link state, duplex mode, RGMII delay and speed, all accessed through port's register 0x01. The new port's MAC setup code is used to re-implement the adjust_link code and correctly force the link down before changing any of the MAC settings, as requested by the datasheets. The port's MAC accessors use values compatible with struct phy_device (e.g. DUPLEX_FULL) and extend them when needed (e.g. SPEED_MAX). Changes in v2: - Strictly use new _UNFORCED values instead of re-using _UNKNOWN ones. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:40:01 -04:00
Vivien Didelot	d78343d2d7	net: dsa: mv88e6xxx: setup port's MAC Now that we have setters to configure the port's MAC, use them to refactor the port setup and adjust_link code. Note that port's MAC speed, duplex or RGMII delay must not be changed unless the port's link is forced down. So wrap all that in a mv88e6xxx_port_setup_mac function. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:40:00 -04:00
Vivien Didelot	96a2b40c7b	net: dsa: mv88e6xxx: add port's MAC speed setter While the two bits for link, duplex or RGMII delays are used the same way on chips supporting the said feature, the two bits for speed have different meaning for most of the chips out there. Speed value is stored in bits 1:0, 0x3 means unforce (normal detection). Some chips reuse values for alternative speeds when bit 12 is set. Newer chips with speed > 1Gbps reuse value 0x3 thus need a new bit 13. Here are the values to write in register 0x1 to (un)force speed: | Speed | 88E6065 | 88E6185 | 88E6352 | 88E6390 | 88E6390X | | ------- | ------- | ------- | ------- | ------- | -------- | | 10 | 0x0000 | 0x0000 | 0x0000 | 0x2000 | 0x2000 | | 100 | 0x0001 | 0x0001 | 0x0001 | 0x2001 | 0x2001 | | 200 | 0x0002 | NA | 0x1001 | 0x3001 | 0x3001 | | 1000 | NA | 0x0002 | 0x0002 | 0x2002 | 0x2002 | | 2500 | NA | NA | NA | 0x3003 | 0x3003 | | 10000 | NA | NA | NA | NA | 0x2003 | | unforce | 0x0003 | 0x0003 | 0x0003 | 0x0000 | 0x0000 | This patch implements a generic mv88e6xxx_port_set_speed() function used by chip-specific wrappers to filter supported ports and speeds. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:40:00 -04:00
Vivien Didelot	a0a0f6229b	net: dsa: mv88e6xxx: add port's RGMII delay setter Some chips such as 88E6352 and 88E6390 can be programmed to add delays to RXCLK for IND inputs or to GTXCLK for OUTD outputs when port is in RGMII mode. Add a port function to program such delays according to the provided PHY interface mode. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:40:00 -04:00
Vivien Didelot	7f1ae07b51	net: dsa: mv88e6xxx: add port duplex setter Similarly to port's link, add a setter to force port's half duplex, full duplex or let normal duplex detection occur. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:40:00 -04:00
Vivien Didelot	08ef7f1022	net: dsa: mv88e6xxx: add port link setter Most of the chips have port register control bits to force the port's link up, down, or let normal link detection occur. Implement such an operation to use it later when setting duplex, etc. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:40:00 -04:00
Vivien Didelot	385a0995cc	net: dsa: mv88e6xxx: add port 802.1Q mode setter Add port functions to set the port 802.1Q mode. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:39:59 -04:00
Vivien Didelot	77064f37b9	net: dsa: mv88e6xxx: add port PVID accessors Add port functions to access the ports default VID. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:39:59 -04:00
Vivien Didelot	b4e48c500e	net: dsa: mv88e6xxx: add port FID accessors Add functions to port files to access the ports default FID. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:39:59 -04:00
Vivien Didelot	5a7921f46d	net: dsa: mv88e6xxx: add port vlan map setter Add a port function to access the Port Based VLAN Map register. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:39:59 -04:00
Vivien Didelot	e28def3329	net: dsa: mv88e6xxx: add port state setter Add the port STP state setter to the port files. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:39:58 -04:00
Vivien Didelot	18abed211c	net: dsa: mv88e6xxx: add port files The Marvell switches contain one internal SMI device per port, called "Port Registers". Depending on the model, the addresses of these devices start from 0x0, 0x8 or 0x10. Start moving Port Registers specific code to their own files. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-04 14:39:58 -04:00
Simon Horman	5976c5f45c	net/sched: cls_flower: Support matching on SCTP ports Support matching on SCTP ports in the same way that matching on TCP and UDP ports is already supported. Example usage: tc qdisc add dev eth0 ingress tc filter add dev eth0 protocol ip parent ffff: \ flower indev eth0 ip_proto sctp dst_port 80 \ action drop Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 16:26:39 -04:00
Elad Raz	5e5f89e70b	mlxsw: pci: Fix the FW ready mask length The system-status register is actually 16-bit wide and not 8 bit-wide. Fixes: `233fa44bd6` ("mlxsw: pci: Implement reset done check") Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 16:21:27 -04:00
David S. Miller	a799126864	Merge branch 'ip-recvfragsize-cmsg' Willem de Bruijn says: ==================== ip: add RECVFRAGSIZE cmsg On IP datagrams and raw sockets, when packets arrive fragmented, expose the largest received fragment size through a new cmsg. Protocols implemented on top of these sockets may use this, for instance, to inform peers to lower MSS on platforms that silently allow send calls to exceed PMTU and cause fragmentation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:41:12 -04:00
Willem de Bruijn	dbd1759e6a	ipv6: on reassembly, record frag_max_size IP6CB and IPCB have a frag_max_size field. In IPv6 this field is filled in when packets are reassembled by the connection tracking code. Also fill in when reassembling in the input path, to expose it through cmsg IPV6_RECVFRAGSIZE in all cases. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:41:11 -04:00
Willem de Bruijn	0cc0aa614b	ipv6: add IPV6_RECVFRAGSIZE cmsg When reading a datagram or raw packet that arrived fragmented, expose the maximum fragment size if recorded to allow applications to estimate receive path MTU. At this point, the field is only recorded when ipv6 connection tracking is enabled. A follow-up patch will record this field also in the ipv6 input path. Tested using the test for IP_RECVFRAGSIZE plus ip netns exec to ip addr add dev veth1 fc07::1/64 ip netns exec from ip addr add dev veth0 fc07::2/64 ip netns exec to ./recv_cmsg_recvfragsize -6 -u -p 6000 & ip netns exec from nc -q 1 -u fc07::1 6000 < payload Both with and without enabling connection tracking ip6tables -A INPUT -m state --state NEW -p udp -j LOG Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:41:11 -04:00
Willem de Bruijn	70ecc24841	ipv4: add IP_RECVFRAGSIZE cmsg The IP stack records the largest fragment of a reassembled packet in IPCB(skb)->frag_max_size. When reading a datagram or raw packet that arrived fragmented, expose the value to allow applications to estimate receive path MTU. Tested: Sent data over a veth pair of which the source has a small mtu. Sent data using netcat, received using a dedicated process. Verified that the cmsg IP_RECVFRAGSIZE is returned only when data arrives fragmented, and in that case matches the veth mtu. ip link add veth0 type veth peer name veth1 ip netns add from ip netns add to ip link set dev veth1 netns to ip netns exec to ip addr add dev veth1 192.168.10.1/24 ip netns exec to ip link set dev veth1 up ip link set dev veth0 netns from ip netns exec from ip addr add dev veth0 192.168.10.2/24 ip netns exec from ip link set dev veth0 up ip netns exec from ip link set dev veth0 mtu 1300 ip netns exec from ethtool -K veth0 ufo off dd if=/dev/zero bs=1 count=1400 2>/dev/null > payload ip netns exec to ./recv_cmsg_recvfragsize -4 -u -p 6000 & ip netns exec from nc -q 1 -u 192.168.10.1 6000 < payload using github.com/wdebruij/kerneltools/blob/master/tests/recvfragsize.c Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:41:11 -04:00
David S. Miller	1b99e5e854	Merge branch 'stmmac-OXNAS' Neil Armstrong says: ==================== net: stmmac: Add OXNAS DWMAC Glue This patchset adds support for the Synopsys DWMAC Gigabit Ethernet controller Glue layer of the Oxford Semiconductor OX820 SoC. Changes since v2 at http://lkml.kernel.org/r/20161031105345.16711-1-narmstrong@baylibre.com : - Disable/Unprepare clock if regmap read fails in oxnas_dwmac_init Changes since v1 at https://patchwork.kernel.org/patch/9388231/ : - Split dt-bindings in a separate patch - Add IP version in the dt-bindings compatible - Check return of clk_prepare_enable() - use get_stmmac_bsp_priv() helper - hardwire setup values in oxnas_dwmac_init() Changes since RFC at https://patchwork.kernel.org/patch/9387257 : - Drop init/exit callbacks - Implement proper remove and PM callback - Call init from probe - Disable/Unprepare clock if stmmac probe fails ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:31:34 -04:00
Neil Armstrong	52b6c5c218	dt-bindings: net: Add OXNAS DWMAC Bindings Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:31:33 -04:00
Neil Armstrong	5ed7414062	net: stmmac: Add OXNAS Glue Driver Add Synopsys Designware MAC Glue layer for the Oxford Semiconductor OX820. Acked-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:31:33 -04:00
David S. Miller	d3bc29a4b3	Merge branch 'diag-raw-fixes' Cyrill Gorcunov says: ==================== net: Fixes for raw diag sockets handling Hi! Here are a few fixes for raw-diag sockets handling: missing sock_put call and jump for exiting from nested cycle. I made patches for iproute2 as well so will send them out soon. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:25:27 -04:00
Cyrill Gorcunov	9999370fae	net: ip, raw_diag -- Use jump for exiting from nested loop I managed to miss that sk_for_each is called under "for" cycle so need to use goto here to return matching socket. CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Ahern <dsa@cumulusnetworks.com> CC: Andrey Vagin <avagin@openvz.org> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:25:26 -04:00
Cyrill Gorcunov	cd05a0eca8	net: ip, raw_diag -- Fix socket leaking for destroy request In raw_diag_destroy the helper raw_sock_get returns with sock_hold call, so we have to put it then. CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Ahern <dsa@cumulusnetworks.com> CC: Andrey Vagin <avagin@openvz.org> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 15:25:26 -04:00
Govindarajulu Varadarajan	17197236d6	enic: set skb->hash type properly Driver sets the skb l4/l3 hash based on NIC_CFG_RSS_HASH_TYPE_, which is bit mask. This is wrong. Hw actually provides us enum. Use CQ_ENET_RQ_DESC_RSS_TYPE_ to set l3 and l4 hash type. Fixes: `bf751ba802` ("driver/net: enic: record q_number and rss_hash for skb") Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:32:53 -04:00
Philippe Reynes	f7a5537cd2	net: 3com: typhoon: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Reviewed-by: David Dillow <dave@thedillows.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:26:51 -04:00
Tom Herbert	1913540a13	ila: Fix crash caused by rhashtable changes commit `ca26893f05` ("rhashtable: Add rhlist interface") added a field to rhashtable_iter so that length became 56 bytes and would exceed the size of args in netlink_callback (which is 48 bytes). The netlink diag dump function has already been allocating an iter structure and storing the pointer to it in the args of netlink_callback. ila_xlat also uses rhashtable_iter but is still putting that directly in the arg block. Now since rhashtable_iter size is increased we are overwriting beyond the structure. The next field happens to be cb_mutex pointer in netlink_sock and hence the crash. Fix is to alloc the rhashtable_iter and save it as pointer in arg. Tested: modprobe ila ./ip ila add loc 3333:0:0:0 loc_match 2222:0:0:1, ./ip ila list # NO crash now Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:26:02 -04:00
Cyrill Gorcunov	3de864f8c9	net: ip, diag -- Adjust raw_abort to use unlocked __udp_disconnect While preparing patches for killing raw sockets via diag netlink interface I noticed that my runs are stuck: | [root@pcs7 ~]# cat /proc/`pidof ss`/stack | [<ffffffff816d1a76>] __lock_sock+0x80/0xc4 | [<ffffffff816d206a>] lock_sock_nested+0x47/0x95 | [<ffffffff8179ded6>] udp_disconnect+0x19/0x33 | [<ffffffff8179b517>] raw_abort+0x33/0x42 | [<ffffffff81702322>] sock_diag_destroy+0x4d/0x52 which has not been the case before. I narrowed it down to the commit | commit `286c72deab` | Author: Eric Dumazet <edumazet@google.com> | Date: Thu Oct 20 09:39:40 2016 -0700 | | udp: must lock the socket in udp_disconnect() where we start locking the socket for different reason. So the raw_abort escaped the renaming and we have to fix this typo using __udp_disconnect instead. Fixes: `286c72deab` ("udp: must lock the socket in udp_disconnect()") CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Ahern <dsa@cumulusnetworks.com> CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> CC: James Morris <jmorris@namei.org> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Patrick McHardy <kaber@trash.net> CC: Andrey Vagin <avagin@openvz.org> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:25:16 -04:00
Woojung Huh	cc89c323a3	lan78xx: Use irq_domain for phy interrupt from USB Int. EP To fully utilize phylib with interrupts rather than handling some of the phy stuff in the MAC driver, create an irq_domain for the USB interrupt EP of the phy interrupt and pass the irq number to phy_connect_direct() instead of PHY_IGNORE_INTERRUPT. Idea comes from drivers/gpio/gpio-dl2.c Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:23:16 -04:00
Eric Dumazet	2331ccc5b3	tcp: enhance tcp collapsing As Ilya Lesokhin suggested, we can collapse two skbs at retransmit time even if the skb at the right has fragments. We simply have to use more generic skb_copy_bits() instead of skb_copy_from_linear_data() in tcp_collapse_retrans() Also need to guard this skb_copy_bits() in case there is nothing to copy, otherwise skb_put() could panic if left skb has frags. Tested: Used following packetdrill test // Establish a connection. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 8> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> +.100 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 +0 write(4, ..., 200) = 200 +0 > P. 1:201(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 201:401(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 401:601(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 601:801(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 801:1001(200) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1001:1101(100) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1101:1201(100) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1201:1301(100) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1301:1401(100) ack 1 +.100 < . 1:1(0) ack 1 win 257 <nop,nop,sack 1001:1401> // Check that TCP collapse works : +0 > P. 1:1001(1000) ack 1 Reported-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:21:36 -04:00
Philippe Reynes	b646cf299e	net: 3c509: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:17:09 -04:00
Philippe Reynes	e19b7883ef	net: 3c59x: use new api ethtool_{get|set}_link_ksettings The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:17:09 -04:00
Philippe Reynes	bc8ee596af	net: mii: add generic function to support ksetting support The old ethtool api (get_settings and set_settings) has generic mii functions mii_ethtool_sset and mii_ethtool_gset. To support the new ethtool api ({get|set}_link_ksettings), we add two generic mii functions, mii_ethtool_{get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:17:09 -04:00
David S. Miller	55454e9e5c	Merge branch 'mlx4-XDP-tx-refactor' Tariq Toukan says: ==================== mlx4 XDP TX refactor This patchset refactors the XDP forwarding case, so that its dedicated transmit queues are managed in a complete separation from the other regular ones. It also adds ethtool counters for XDP cases. Series generated against net-next commit: `22ca904ad7` genetlink: fix error return code in genl_register_family() Thanks, Tariq. v3: * Exposed per ring counters. v2: * Added ethtool counters. * Rebased, now patch 2 reverts Brenden's fix, as the bug no longer exists: `958b3d396d` ("net/mlx4_en: fixup xdp tx irq to match rx") * Updated commit message of patch 2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:07:12 -04:00
Tariq Toukan	15fca2c8eb	net/mlx4_en: Add ethtool statistics for XDP cases XDP statistics are reported in ethtool, in total and per ring, as follows: - xdp_drop: the number of packets dropped by xdp. - xdp_tx: the number of packets forwarded by xdp. - xdp_tx_full: the number of times an xdp forward failed due to a full tx xdp ring. In addition, all packets that are dropped/forwarded by XDP are no longer accounted in rx_packets/rx_bytes of the ring, so that they count traffic that is passed to the stack. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:07:11 -04:00
Tariq Toukan	67f8b1dcb9	net/mlx4_en: Refactor the XDP forwarding rings scheme Separately manage the two types of TX rings: regular ones, and XDP. Upon an XDP set, do not borrow regular TX rings and convert them into XDP ones, but allocate new ones, unless we hit the max number of rings. Which means that in systems with smaller #cores we will not consume the current TX rings for XDP, while we are still in the num TX limit. XDP TX rings counters are not shown in ethtool statistics. Instead, XDP counters will be added to the respective RX rings in a downstream patch. This has no performance implications. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 15:07:11 -04:00
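
The speed table in commit 96a2b40c7b (values to write to port register 0x1 to force or unforce a speed) can be read as a per-chip lookup. The following is only a minimal sketch of that reading in Python, not the driver's code: the names `CHIPS`, `SPEED_REG_VALUES` and `speed_reg_value` are hypothetical, the values are copied from the commit message table, and `None` stands for the table's "NA" entries.

```python
# Illustrative lookup only; all names here are hypothetical, not the driver's API.
# Values are copied from the table in commit 96a2b40c7b; None stands for "NA".
CHIPS = ("88E6065", "88E6185", "88E6352", "88E6390", "88E6390X")

SPEED_REG_VALUES = {
    10:        (0x0000, 0x0000, 0x0000, 0x2000, 0x2000),
    100:       (0x0001, 0x0001, 0x0001, 0x2001, 0x2001),
    200:       (0x0002, None,   0x1001, 0x3001, 0x3001),
    1000:      (None,   0x0002, 0x0002, 0x2002, 0x2002),
    2500:      (None,   None,   None,   0x3003, 0x3003),
    10000:     (None,   None,   None,   None,   0x2003),
    "unforce": (0x0003, 0x0003, 0x0003, 0x0000, 0x0000),
}


def speed_reg_value(chip, speed):
    """Return the table value for (chip, speed); raise ValueError if unsupported."""
    if chip not in CHIPS:
        raise ValueError("unknown chip: %r" % (chip,))
    row = SPEED_REG_VALUES.get(speed)
    if row is None:
        raise ValueError("speed not in table: %r" % (speed,))
    value = row[CHIPS.index(chip)]
    if value is None:
        # The table marks this chip/speed combination as NA.
        raise ValueError("%s does not support speed %r" % (chip, speed))
    return value
```

For example, `speed_reg_value("88E6390X", 10000)` returns `0x2003`, while `speed_reg_value("88E6185", 200)` raises `ValueError` because the table lists that combination as NA.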

634562 Commits