linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-26 22:21:42 +00:00

Author	SHA1	Message	Date
Roopa Prabhu	9c03b282ba	trace: events: add a few neigh tracepoints The goal here is to trace neigh state changes covering all possible neigh update paths. Plus have a specific trace point in neigh_update to cover flags sent to neigh_update. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:33:39 -08:00
David S. Miller	9e8ccd8957	Merge branch 'net-phy-add-and-use-genphy_c45_an_config_an' Heiner Kallweit says: ==================== net: phy: add and use genphy_c45_an_config_an This series adds genphy_c45_an_config_an() and uses it in the marvell10g diver. In addition patch 4 aligns the aneg configuration with what is done in genphy_config_aneg(). v2: - in patch 2 changed function name to genphy_c45_an_config_aneg - in patch 3 add a comment regarding 1000BaseT vendor registers v3: - rebase patch 3 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:27:00 -08:00
Heiner Kallweit	3ce2a027ae	net: phy: marvell10g: check for newly set aneg Even if the advertisement registers content didn't change, we may have just switched to aneg, and therefore have to trigger an aneg restart. This matches the behavior of genphy_config_aneg(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:52 -08:00
Andrew Lunn	3de97f3c63	net: phy: marvell10g: use genphy_c45_an_config_aneg Use new function genphy_c45_config_aneg() in mv3310_config_aneg(). v2: - add a comment regarding 1000BaseT vendor registers v3: - rebased Signed-off-by: Andrew Lunn <andrew@lunn.ch> [hkallweit1@gmail.com: patch splitted] Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:52 -08:00
Andrew Lunn	9a5dc8af44	net: phy: add genphy_c45_an_config_aneg C45 configuration of 10/100 and multi-giga bit auto negotiation advertisement is standardized. Configuration of 1000Base-T however appears to be vendor specific. Move the generic code out of the Marvell driver into the common phy-c45.c file. v2: - change function name to genphy_c45_an_config_aneg Signed-off-by: Andrew Lunn <andrew@lunn.ch> [hkallweit1@gmail.com: use new helper linkmode_adv_to_mii_10gbt_adv_t and split patch] Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:35 -08:00
Heiner Kallweit	744e458aeb	net: phy: add helper linkmode_adv_to_mii_10gbt_adv_t Add a helper linkmode_adv_to_mii_10gbt_adv_t(), similar to linkmode_adv_to_mii_adv_t. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:34 -08:00
David S. Miller	885e631959	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-02-16 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong. 2) test all bpf progs in alu32 mode, from Jiong. 3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin. 4) support for IP encap in lwt bpf progs, from Peter. 5) remove XDP_QUERY_XSK_UMEM dead code, from Jan. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 22:56:34 -08:00
Andrii Nakryiko	5aab392c55	tools/libbpf: support bigger BTF data sizes While it's understandable why kernel limits number of BTF types to 65535 and size of string section to 64KB, in libbpf as user-space library it's too restrictive. E.g., pahole converting DWARF to BTF type information for Linux kernel generates more than 3 million BTF types and more than 3MB of strings, before deduplication. So to allow btf__dedup() to do its work, we need to be able to load bigger BTF sections using btf__new(). Singed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-16 18:47:18 -08:00
Peter Oskolkov	9d6b3584a7	selftests: bpf: test_lwt_ip_encap: add negative tests. As requested by David Ahern: - add negative tests (no routes, explicitly unreachable destinations) to exercize error handling code paths; - do not exit on test failures, but instead print a summary of passed/failed tests at the end. Future patches will add TSO and VRF tests. Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-16 18:41:44 -08:00
Alexandre Torgue	f186a82b10	net: stmmac: use correct define to get rx timestamp on GMAC4 In dwmac4_wrback_get_rx_timestamp_status we looking for a RX timestamp. For that receive descriptors are handled and so we should use defines related to receive descriptors. It'll no change the functional behavior as RDES3_RDES1_VALID=TDES3_RS1V=BIT(26) but it makes code easier to read. Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 18:13:58 -08:00
Dan Carpenter	d0edde8d29	atm: clean up vcc_seq_next() It's confusing to call PTR_ERR(v). The PTR_ERR() function is basically a fancy cast to long so it makes you wonder, was IS_ERR() intended? But that doesn't make sense because vcc_walk() doesn't return error pointers. This patch doesn't affect runtime, it's just a cleanup. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 18:12:22 -08:00
Guillaume Nault	4057765f2d	sock: consistent handling of extreme SO_SNDBUF/SO_RCVBUF values

SO_SNDBUF and SO_RCVBUF (and their *BUFFORCE version) may overflow or
underflow their input value. This patch aims at providing explicit
handling of these extreme cases, to get a clear behaviour even with
values bigger than INT_MAX / 2 or lower than INT_MIN / 2.

For simplicity, only SO_SNDBUF and SO_SNDBUFFORCE are described here,
but the same explanation and fix apply to SO_RCVBUF and SO_RCVBUFFORCE
(with 'SNDBUF' replaced by 'RCVBUF' and 'wmem_max' by 'rmem_max').

Overflow of positive values
===========================

When handling SO_SNDBUF or SO_SNDBUFFORCE, if 'val' exceeds
INT_MAX / 2, the buffer size is set to its minimum value because
'val * 2' overflows, and max_t() considers that it's smaller than
SOCK_MIN_SNDBUF. For SO_SNDBUF, this can only happen with
net.core.wmem_max > INT_MAX / 2.

SO_SNDBUF and SO_SNDBUFFORCE are actually designed to let users probe
for the maximum buffer size by setting an arbitrary large number that
gets capped to the maximum allowed/possible size. Having the upper
half of the positive integer space to potentially reduce the buffer
size to its minimum value defeats this purpose.

This patch caps the base value to INT_MAX / 2, so that bigger values
don't overflow and keep setting the buffer size to its maximum.

Underflow of negative values
============================

For negative numbers, SO_SNDBUF always considers them bigger than
net.core.wmem_max, which is bounded by [SOCK_MIN_SNDBUF, INT_MAX].
Therefore such values are set to net.core.wmem_max and we're back to
the behaviour of positive integers described above (return maximum
buffer size if wmem_max <= INT_MAX / 2, return SOCK_MIN_SNDBUF
otherwise).

However, SO_SNDBUFFORCE behaves differently. The user value is
directly multiplied by two and compared with SOCK_MIN_SNDBUF. If
'val * 2' doesn't underflow or if it underflows to a value smaller
than SOCK_MIN_SNDBUF then buffer size is set to its minimum value.
Otherwise the buffer size is set to the underflowed value.

This patch treats negative values passed to SO_SNDBUFFORCE as null, to
prevent underflows. Therefore negative values now always set the buffer
size to its minimum value.

Even though SO_SNDBUF behaves inconsistently by setting buffer size to
the maximum value when passed a negative number, no attempt is made to
modify this behaviour. There may exist some programs that rely on using
negative numbers to set the maximum buffer size. Avoiding overflows
because of extreme net.core.wmem_max values is the most we can do here.

Summary of altered behaviours
=============================

val      : user-space value passed to setsockopt()
val_uf   : the underflowed value resulting from doubling val when
           val < INT_MIN / 2
wmem_max : short for net.core.wmem_max
val_cap  : min(val, wmem_max)
min_len  : minimal buffer length (that is, SOCK_MIN_SNDBUF)
max_len  : maximal possible buffer length, regardless of wmem_max (that
           is, INT_MAX - 1)
^^^^     : altered behaviour

SO_SNDBUF:
+-------------------------+-------------+------------+----------------+
| CONDITION               | OLD RESULT  | NEW RESULT | COMMENT        |
+-------------------------+-------------+------------+----------------+
| val < 0 &&              |             |            | No overflow,   |
| wmem_max <= INT_MAX/2   | wmem_max*2  | wmem_max*2 | keep original  |
|                         |             |            | behaviour      |
+-------------------------+-------------+------------+----------------+
| val < 0 &&              |             |            | Cap wmem_max   |
| INT_MAX/2 < wmem_max    | min_len     | max_len    | to prevent     |
|                         |             | ^^^^^^^    | overflow       |
+-------------------------+-------------+------------+----------------+
| 0 <= val <= min_len/2   | min_len     | min_len    | Ordinary case  |
+-------------------------+-------------+------------+----------------+
| min_len/2 < val &&      | val_cap*2   | val_cap*2  | Ordinary case  |
| val_cap <= INT_MAX/2    |             |            |                |
+-------------------------+-------------+------------+----------------+
| min_len < val &&        |             |            | Cap val_cap    |
| INT_MAX/2 < val_cap     | min_len     | max_len    | again to       |
| (implies that           |             | ^^^^^^^    | prevent        |
| INT_MAX/2 < wmem_max)   |             |            | overflow       |
+-------------------------+-------------+------------+----------------+

SO_SNDBUFFORCE:
+------------------------------+---------+---------+------------------+
| CONDITION                    | BEFORE  | AFTER   | COMMENT          |
|                              | PATCH   | PATCH   |                  |
+------------------------------+---------+---------+------------------+
| val < INT_MIN/2 &&           | min_len | min_len | Underflow with   |
| val_uf <= min_len            |         |         | no consequence   |
+------------------------------+---------+---------+------------------+
| val < INT_MIN/2 &&           | val_uf  | min_len | Set val to 0 to  |
| val_uf > min_len             |         | ^^^^^^^ | avoid underflow  |
+------------------------------+---------+---------+------------------+
| INT_MIN/2 <= val < 0         | min_len | min_len | No underflow     |
+------------------------------+---------+---------+------------------+
| 0 <= val <= min_len/2        | min_len | min_len | Ordinary case    |
+------------------------------+---------+---------+------------------+
| min_len/2 < val <= INT_MAX/2 | val*2   | val*2   | Ordinary case    |
+------------------------------+---------+---------+------------------+
| INT_MAX/2 < val              | min_len | max_len | Cap val to       |
|                              |         | ^^^^^^^ | prevent overflow |
+------------------------------+---------+---------+------------------+

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 18:09:54 -08:00
David S. Miller	f2281c245d	Merge tag 'mlx5-updates-2019-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Support Mellanox BlueField SmartNIC (mlx5-updates-2019-02-15) Bodong Wang says, BlueField device is a multi-core ARM processor in a highly integrated system on chip coupled with the ConnectX interconnect controller. BlueField device can be presented in one out of two modes: - SEPARATED_HOST: ARM processors as a separated and orthogonal host like any other external host in the multi-host virtualization model. - EMBEDDED_CPU: ARM processors as Embedded CPU (EC) and part of the external hosts virtualization model. While existing driver already supports the device on separated_host mode, this patch series focus on the functionalities of embedded_cpu mode. On embedded_cpu mode, BlueField device exposes regular network controller PCI function in the BlueField host(e.g, x86). However, a separate PCI function called Embedded CPU Physical Function(ECPF) is also added to the ARM host side, where standard Linux distributions is able to run on the ARM cores. Depends on the NV configuration from firmware, ECPF can be the e-switch manager and firmware pages supplier. If ECPF is configured as e-switch manager and page supplier, it will take over the responsibilities from the PF on BlueField host includes: - Owns, controls and manages all e-switch parts, and takes e-switch traffic by default. It also should perform ENABLE_HCA for the host PF just like a PF does for its VFs. - Provides and manages the ICM host memory required for the HCA to store various contexts for itself, the PF and VFs belong the e-switch it manages. The PF on BlueField host side is still responsible for: - Control its own permanent MAC. - PCI and SRIOV configurations and perform ENABLE_HCA for its VFs. The ECPF can also retrieve information about the external host it controls, like host identifier, PCI BDF and number of virtual functions. As these parameters may be changed dynamically, an event will be triggered to the driver on ECPF side. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 12:11:17 -08:00
David S. Miller	bb015f2216	Merge branch 's390-next' Julian Wiedmann says: ==================== s390/qeth: updates 2019-02-15 please apply a few more qeth patches to net-next. Along with some smaller improvements, this revamps our code for the SW statistics that are exposed through ETHTOOL_GSTATS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:30 -08:00
Julian Wiedmann	8024cc9e85	s390/qeth: split out OSN netdev ops Rather than special-casing OSN in a number of places, just give this device type its own netdev_ops structure. When setting up the OSN net_device, also skip the handling of the various HW offloads (eg TSO). The device shouldn't be advertising any of them, and the OSN code paths in qeth don't have support for them. In particular RX VLAN filtering is not supported, so don't hook up those callbacks in the netdev_ops. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:30 -08:00
Julian Wiedmann	1b4d5e1c61	s390/qeth: add support for ETHTOOL_GRINGPARAM Implement a trivial callback that exposes the queue sizes. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:30 -08:00
Julian Wiedmann	b0abc4f5df	s390/qeth: overhaul ethtool statistics Accumulate per-TX queue statistics, and increase their size to 64 bit. Don't bother with enabling/disabling the statistics, the overhead is negligible. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	d896ac62d0	s390/qeth: move ethtool code into its own file Most of this is self-contained code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	4326b5b461	s390/qeth: reduce ethtool statistics Counting the number of function calls and the time spent in functions is best left to proper tracing facilities. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	bb92d3f866	s390/qeth: use a static Output Queue array qeth dynamically allocates an array for storing pointers to its Output Queue structures. Switch this to a static array - we are currently limited to 4 Output Queues, so shrinking the qeth_qdio_info struct by just a few bytes doesn't justify the additional complexity. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	0aa35a3689	s390/qeth: allow manual recovery when device is SOFTSETUP Once a qeth ccwgroup device is set online, it's also armed for internal recovery. So allow for testing that code path via sysfs, regardless of whether the interface is up or down. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Florian Fainelli	ff326d3cdf	selftests: forwarding: Add some missing configuration symbols For the forwarding selftests to work, we need network namespaces when using veth/vrf otherwise ping/ping6 commands like these: ip vrf exec vveth0 /bin/ping 192.0.2.2 -c 10 -i 0.1 -w 5 will fail because network namespaces may not be enabled. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:32:22 -08:00
Paolo Abeni	1490ed2abc	net/ipv6: prefer rcu_access_pointer() over rcu_dereference() rt6_cache_allowed_for_pmtu() checks for rt->from presence, but it does not access the RCU protected pointer. We can use rcu_access_pointer() and clean-up the code a bit. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:25:26 -08:00
Colin Ian King	59e6158aca	mlxsw: core: fix spelling mistake "temprature" -> "temperature" There is a spelling mistake in several dev_err messages, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:16:52 -08:00
Bodong Wang	c96692fb8f	net/mlx5: E-Switch, Allow transition to offloads mode for ECPF Currently, the e-switch driver requires going to legacy mode before changing to the offloads mode. This makes sense for regular case as the legacy mode is done by creating VFs. However, it's problematic when ECPF is the eswitch manager. In such case, ECPF will control the vports on peer host including the peer PF and VFs. But ECPF doesn't need and shall not create VFs as the VFs are created in the peer PF host. Grant ECPF the ability to change from none to the offloads mode. Note that currently the only way to go back to none mode is by unloading the ECPF driver. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	a3888f33db	net/mlx5: E-Switch, Load/unload VF reps according to event from host PF When host PF changes the number of VFs, the ECPF esw driver will get a FW event. It should query the number of VFs enabled by host PF and update the VF reps accordingly. Note that host PF can't change the number of VFs dynamically, it has to reset the number of VFs to 0 before changing to a new positive number. The host event is registered when driver is moving to switchdev mode, and it's the last step to do in esw_offloads_init. It's unregistered and the work queue is flushed when driver quits from switchdev mode. In this way, the host event and devlink command are serialized. When driver is enabling switchdev mode, pay attention to the following two facts: 1. Host PF must not have VF initialized as the flow table in ECPF has ENCAP enabled as default. Such flow table can't be created with existing initialized VFs. 2. ECPF doesn't know how many VFs the host PF will enable, ECPF offloads flow steering shall create the flow table/groups based on the max number of VFs possibly supported by host PF. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	81cd229c29	net/mlx5: E-Switch, Consider ECPF vport depends on eswitch ownership ECPF connects to the eswitch through vport 0xfffe. ECPF may or may not be the eswitch manager depending on firmware configuration. 1. If ECPF is eswitch manager: ECPF will take over the eswitch manager responsibility. A rep of the host PF shall be created at the ECPF side for the eswitch manager to control. 2. If ECPF is not eswitch manager: host PF will be the eswitch manager, ECPF acts similar as a VF to the host PF. Host PF will be aware of the ECPF vport presence and control it's rep. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	5ae5162066	net/mlx5: E-Switch, Assign a different position for uplink rep and vport In offloads mode, the current implementation puts the uplink representor at index zero of the vport reps array. It is not "natural" to place it at index 0 since we want to put the representor for vport 0 at index 0 with the introduction of SmartNIC. A separate patch will handle the case whether a rep is needed for vport 0 (PF vport). So, we want to have a different placeholder for uplink vport and representor. It was placed at the end of vport and rep array. Since vport number can no longer act as an index into the vport or representors arrays, use functions to map vport numbers to indices when accessing the vports or representors arrays, and vice versa. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	f8e8fa0262	net/mlx5: E-Switch, Centralize repersentor reg/unreg to eswitch driver Eswitch has two users: IB and ETH. They both register repersentors when mlx5 interface is added, and unregister the repersentors when mlx5 interface is removed. Ideally, each driver should only deal with the entities which are unique to itself. However, current IB and ETH drivers have to perform the following eswitch operations: 1. When registering, specify how many vports to register. This number is the same for both drivers which is the total available vport numbers. 2. When unregistering, specify the number of registered vports to do unregister. Also, unload the repersentors which are already loaded. It's unnecessary for eswitch driver to hands out the control of above operations to individual driver users, as they're not unique to each driver. Instead, such operations should be centralized to eswitch driver. This consolidates eswitch control flow, and simplified IB and ETH driver. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	29d9fd7d5a	net/mlx5: E-Switch, Support load/unload reps of specific vport types Currently the driver loads and unloads all reps in an unbreakable group. However, with ECPF, the reps of special vports such as uplink and host PF should always be loaded in switchdev mode where the reps for VFs will be loaded on-demand and unloaded on no-demand. This is a pre-step for that change. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	f121e0ea95	net/mlx5: E-Switch, Add state to eswitch vport representors Currently the eswitch vport reps have a valid indicator, which is set on register and unset on unregister. However, a rep can be loaded or not loaded when doing unregister, current driver checks if the vport of that rep is enabled as a flag to imply the rep is loaded. However, for ECPF, this is not valid as the host PF will enable the vports for its VFs instead. Add three states: {unregistered, registered, loaded}, with the following state changes across different operations: create: (none) -> unregistered reg: unregistered -> registered load: registered -> loaded unload: loaded -> registered unreg: registered -> unregistered Note that the state shall only be updated inside eswitch driver rather than individual drivers such as ETH or IB. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	879c8f84e3	net/mlx5: E-Switch, Use getter and iterator to access vport/rep With only PF and VF, it is sufficient to have the vport/rep array index as the vport number. This is because PF and VF vports numbers are consecutive serial numbers. In downstream patches with introducing of ECPF and UPLINK vports, it's not consecutive any more. Use getter to get specific vport/rep, and use iterator to traversal a list of vport/rep. This hides the translation between array index and vport number, and provides flexibility of using different translation mechanism in the future. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Suggested-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	c9b99abcf2	net/mlx5: E-Switch, Split VF and special vports for offloads mode When driver is entering offloads mode, there are two major tasks to do: initialize flow steering and create representors. Flow steering should make sure enough flow table/group spaces are reserved for all reps. Representors will be created in a group, all or none. With the introduction of ECPF, flow steering should still reserve the same spaces. But, the representors are not always loaded/unloaded in a single piece. Once ECPF is in offloads mode, it will get the number of VF changing event from host PF. In such scenario, only the VF reps should be loaded/unloaded, not the reps for special vports (such as the uplink vport). Thus, when entering offloads mode, driver should specify the total number of reps, and the number of VF reps separately. When leaving offloads mode, the cleanup should use the information self-contained in eswitch such as number of VFs. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	eca8cc3895	net/mlx5: E-Switch, Refactor offloads flow steering init/cleanup E-switch offloads mode initialize/cleanup multiple steering related entities (flow table/group). Refactor these operations to internal helper functions for better block design. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	cbc44e76bf	net/mlx5: E-Switch, Properly refer to host PF vport as other vport Commands referring to vports use the following scheme: 1. When referring to my own vport, put 0 in vport and 0 in other_vport. 2. When referring to another vport, put the vport number of the referred vport and put 1 in other_vport. It was assumed that driver is accessing other vport when vport number is greater than 0. With the above scheme, the case that ECPF eswitch manager is trying to access host PF vport will fall over with scheme 1 as the vport number is 0. This is apparently wrong as driver is trying to refer other vport. As such usage can only happen in the eswitch context, change relevant functions to provide other vport input properly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:56 -08:00
Bodong Wang	a1b3839ac4	net/mlx5: E-Switch, Properly refer to the esw manager vport In SmartNIC mode, the eswitch manager is not necessarily the PF (vport 0). Use a helper function to get the correct eswitch manager vport number and cache on the eswitch instance for fast reference. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:56 -08:00
Bodong Wang	86b39a66b7	net/mlx5: Correctly set LAG mode for ECPF When bonding is added, driver assumes that it's RoCE LAG if no VF is enabled. This is not enough for ECPF as the VF is enabled in host PF side. LAG should only choose RoCE mode when both slave devices meet conditions below: 1. E-Switch offloads mode is NONE. 2. No VF is enabled. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:56 -08:00
Saeed Mahameed	259fae5a2c	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merge mlx5-next shared branched into net-next, From Bodong Wang: 1) Introduction of ECPF (Embedded CPU Physical Function), and low level bits for mlx5 SmartNic capabilities support. 2) Vport enumeration refactoring that affect mlx5_ib and mlx5_core From Aya Levin, 3) Add support for 50Gbps per lane link modes in the Port Type and Speed register (PTYS) 4) Refactor low level query functions for PTYS register 5) Add support for 50Gbps per lane link modes to mlx5_ib Note: due to a change in API in mlx5/core and a later patch from net-next, a fixup was squashed with this merge commit that replaces FDB_UPLINK_VPORT with MLX5_VPORT_UPLINK which exists only in upstream net-next. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 16:45:31 -08:00
Peter Oskolkov	b251f9f63a	bpf: make LWTUNNEL_BPF dependent on INET Lightweight tunnels are L3 constructs that are used with IP/IP6. For example, lwtunnel_xmit is called from ip_output.c and ip6_output.c only. Make the dependency explicit at least for LWT-BPF, as now they call into IP routing. V2: added "Reported-by" below. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-16 01:06:30 +01:00
David S. Miller	3313da8188	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The netfilter conflicts were rather simple overlapping changes. However, the cls_tcindex.c stuff was a bit more complex. On the 'net' side, Cong is fixing several races and memory leaks. Whilst on the 'net-next' side we have Vlad adding the rtnl-ness support. What I've decided to do, in order to resolve this, is revert the conversion over to using a workqueue that Cong did, bringing us back to pure RCU. I did it this way because I believe that either Cong's races don't apply with how Vlad did things, or Cong will have to implement the race fix slightly differently. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 12:38:38 -08:00
Linus Torvalds	24f0a48743	Merge tag 'for-linus-20190215' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - Ensure we insert into the hctx dispatch list, if a request is marked as DONTPREP (Jianchao) - NVMe pull request, single missing unlock on error fix (Keith) - MD pull request, single fix for a potentially data corrupting issue (Nate) - Floppy check_events regression fix (Yufen) * tag 'for-linus-20190215' of git://git.kernel.dk/linux-block: md/raid1: don't clear bitmap bits on interrupted recovery. floppy: check_events callback should not return a negative number nvme-pci: add missing unlock for reset error blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue	2019-02-15 09:12:28 -08:00
Linus Torvalds	ae3fa8bd73	Merge tag 'for-5.0/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix bug in DM crypt's sizing of its block integrity tag space, resulting in less memory use when DM crypt layers on DM integrity. - Fix a long-standing DM thinp crash consistency bug that was due to improper handling of FUA. This issue is specific to writes that fill an entire thinp block which needs to be allocated. * tag 'for-5.0/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: fix bug where bio that overwrites thin block ignores FUA dm crypt: don't overallocate the integrity tag space	2019-02-15 08:50:48 -08:00
Linus Torvalds	dfeae33798	Merge tag 'mmc-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "A couple of MMC fixes intended for v5.0-rc7. MMC core: - Fix deadlock bug for block I/O requests MMC host: - sunxi: Disable broken HS-DDR mode for H5 by default - sunxi: Avoid unsupported speed modes declared via DT - meson-gx: Restore interrupt name" * tag 'mmc-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: meson-gx: fix interrupt name mmc: block: handle complete_work on separate workqueue mmc: sunxi: Filter out unsupported modes declared in the device tree mmc: sunxi: Disable HS-DDR mode for H5 eMMC controller by default	2019-02-15 08:45:28 -08:00
Linus Torvalds	545aabcbdc	drm core, i915, amd, imx, vkms fixes Merge tag 'drm-fixes-2019-02-15-1' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Usual pull request, little larger than I'd like but nothing too strange in it. Willy found a bug in the lease ioctl calculations, but it's a drm master only ioctl which makes it harder to mess with. i915: - combo phy programming fix - opregion version check fix for VBT RVDA lookup - gem mmap ioctl race fix - fbdev hpd during suspend fix - array size bounds check fix in pmu amdgpu: - Vega20 psp fix - Add vrr range to debugfs for freesync debugging sched: - Scheduler race fix vkms: - license header fixups imx: - Fix CSI register offsets for i.MX51 and i.MX53. - Fix delayed page flip completion events on i.MX6QP due to unexpected behaviour of the PRE when issuing NOP buffer updates to the same buffer address. - Stop throwing errors for plane updates on disabled CRTCs when a userspace process is killed while a plane update is pending. - Add missing of_node_put cleanup in imx_ldb_bind" * tag 'drm-fixes-2019-02-15-1' of git://anongit.freedesktop.org/drm/drm: drm: Use array_size() when creating lease drm/amdgpu/psp11: TA firmware is optional (v3) drm/i915/opregion: rvda is relative from opregion base in opregion 2.1+ drm/i915/opregion: fix version check drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set drm/i915: Block fbdev HPD processing during suspend drm/i915/pmu: Fix enable count array size and bounds checking drm/i915/cnl: Fix CNL macros for Voltage Swing programming drm/i915/icl: combo port vswing programming changes per BSPEC drm/vkms: Fix license inconsistent drm/amd/display: Expose connector VRR range via debugfs drm/sched: Always trace the dependencies we wait on, to fix a race. gpu: ipu-v3: pre: don't trigger update if buffer address doesn't change gpu: ipu-v3: Fix CSI offsets for imx53 drm/imx: imx-ldb: add missing of_node_puts gpu: ipu-v3: Fix i.MX51 CSI control registers offset drm/imx: ignore plane updates on disabled crtcs	2019-02-15 08:20:33 -08:00
Linus Torvalds	2aba322074	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a crash on resume in the ccree driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: ccree - fix resume race condition on init	2019-02-15 08:11:43 -08:00
Linus Torvalds	6e7bd3b549	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix MAC address setting in mac80211 pmsr code, from Johannes Berg. 2) Probe SFP modules after being attached, from Russell King. 3) Byte ordering bug in SMC rx_curs_confirmed code, from Ursula Braun. 4) Revert some r8169 changes that are causing regressions, from Heiner Kallweit. 5) Fix spurious connection timeouts in netfilter nat code, from Florian Westphal. 6) SKB leak in tipc, from Hoang Le. 7) Short packet checksum issue in mlx4, similar to a previous mlx5 change, from Saeed Mahameed. The issue is that whilst padding bytes are usually zero, it is not guaranteed and the hardware doesn't take the padding bytes into consideration when generating the checksum. 8) Fix various races in cls_tcindex, from Cong Wang. 9) Need to set stream ext to NULL before freeing in SCTP code, from Xin Long. 10) Fix locking in phy_is_started, from Heiner Kallweit. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits) net: ethernet: freescale: set FEC ethtool regs version net: hns: Fix object reference leaks in hns_dsaf_roce_reset() mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs net: phy: fix potential race in the phylib state machine net: phy: don't use locking in phy_is_started selftests: fix timestamping Makefile net: dsa: bcm_sf2: potential array overflow in bcm_sf2_sw_suspend() net: fix possible overflow in __sk_mem_raise_allocated() dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit net: phy: fix interrupt handling in non-started states sctp: set stream ext to NULL after freeing it in sctp_stream_outq_migrate sctp: call gso_reset_checksum when computing checksum in sctp_gso_segment net/mlx5e: XDP, fix redirect resources availability check net/mlx5: Fix a compilation warning in events.c net/mlx5: No command allowed when command interface is not ready net/mlx5e: Fix NULL pointer derefernce in set channels error flow netfilter: nft_compat: use-after-free when deleting targets team: avoid complex list operations in team_nl_cmd_options_set() net_sched: fix two more memory leaks in cls_tcindex net_sched: fix a memory leak in cls_tcindex ...	2019-02-15 08:00:11 -08:00
Linus Torvalds	02d7504089	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal fix from Eric Biederman: "Just a single patch that restores PTRACE_EVENT_EXIT functionality that was accidentally broken by last week's fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal: Restore the stop PTRACE_EVENT_EXIT	2019-02-15 07:56:24 -08:00
Andrey Ignatov	789f6bab84	libbpf: Introduce bpf_object__btf Add new accessor for bpf_object to get opaque struct btf * from it. struct btf * is needed for all operations with BTF and it's present in bpf_object. The only thing missing is a way to get it. Example use-case is to get BTF key_type_id and value_type_id for a map in bpf_object. It can be done with btf__get_map_kv_tids() but that function requires struct btf *. Similar API can be added for struct btf_ext but no use-case for it yet. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-15 15:20:54 +01:00
Andrey Ignatov	1a11a4c74f	libbpf: Introduce bpf_map__resize Add bpf_map__resize() to change max_entries for a map. Quite often necessary map size is unknown at compile time and can be calculated only at run time. Currently the following approach is used to do so: * bpf_object__open_buffer() to open Elf file from a buffer; * bpf_object__find_map_by_name() to find relevant map; * bpf_map__def() to get map attributes and create struct bpf_create_map_attr from them; * update max_entries in bpf_create_map_attr; * bpf_create_map_xattr() to create new map with updated max_entries; * bpf_map__reuse_fd() to replace the map in bpf_object with newly created one. And after all this bpf_object can finally be loaded. The map will have new size. It 1) is quite a lot of steps; 2) doesn't take BTF into account. For "2)" even more steps should be made and some of them require changes to libbpf (e.g. to get struct btf * from bpf_object). Instead the whole problem can be solved by introducing simple bpf_map__resize() API that checks the map and sets new max_entries if the map is not loaded yet. So the new steps are: * bpf_object__open_buffer() to open Elf file from a buffer; * bpf_object__find_map_by_name() to find relevant map; * bpf_map__resize() to update max_entries. That's much simpler and works with BTF. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-15 15:20:42 +01:00
Jan Sokolowski	f8ebfaf668	net: bpf: remove XDP_QUERY_XSK_UMEM enumerator Commit `c9b47cc1fa` ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") moved the umem query code to the AF_XDP core, and therefore removed the need to query the netdevice for a umem. This patch removes XDP_QUERY_XSK_UMEM and all code that implements that behavior, which is just dead code. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-15 15:14:22 +01:00

813559 Commits