linux

Author	SHA1	Message	Date
Sasha Neftin	109f599663	igc: Remove the 'igc_read_mac_addr_base' method Remove the redundant 'igc_read_mac_addr_base' method and use the 'igc_read_mac_addr' method directly instead. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 17:14:38 -08:00
Konstantin Khlebnikov	0f9e980bf5	e1000e: fix cyclic resets at link up with active tx I'm seeing series of e1000e resets (sometimes endless) at system boot if something generates tx traffic at this time. In my case this is netconsole who sends message "e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames" from e1000e itself. As result e1000_watchdog_task sees used tx buffer while carrier is off and start this reset cycle again. [ 17.794359] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 17.794714] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready [ 22.936455] e1000e 0000:02:00.0 eth1: changing MTU from 1500 to 9000 [ 23.033336] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 26.102364] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 27.174495] 8021q: 802.1Q VLAN Support v1.8 [ 27.174513] 8021q: adding VLAN 0 to HW filter on device eth1 [ 30.671724] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation [ 30.898564] netpoll: netconsole: local port 6666 [ 30.898566] netpoll: netconsole: local IPv6 address 2a02:6b8:0:80b:beae:c5ff:fe28:23f8 [ 30.898567] netpoll: netconsole: interface 'eth1' [ 30.898568] netpoll: netconsole: remote port 6666 [ 30.898568] netpoll: netconsole: remote IPv6 address 2a02:6b8:b000:605c:e61d:2dff:fe03:3790 [ 30.898569] netpoll: netconsole: remote ethernet address b0:a8:6e:f4:ff:c0 [ 30.917747] console [netcon0] enabled [ 30.917749] netconsole: network logging started [ 31.453353] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 34.185730] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 34.321840] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 34.465822] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 34.597423] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 34.745417] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 34.877356] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 35.005441] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 35.157376] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 35.289362] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 35.417441] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames [ 37.790342] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None This patch flushes tx buffers only once when carrier is off rather than at each watchdog iteration. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 17:09:39 -08:00
Sasha Neftin	439c71f7d2	igc: Remove unneeded code Remove the 'igc_get_link_up_info_base method' from igc_base.c file. Use the 'igc_get_speed_and_duplex_copper' method directly and reduce the code redundancy. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 17:05:35 -08:00
Sasha Neftin	55fdbeaa2d	igc: Remove unused code Remove unused igc_adv_data_desc definition from igc_base.h file. Descriptors definition will be added per demand. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 17:00:52 -08:00
Jeff Kirsher	979eff22c9	e1000e: fix a missing check for return value The change is based on the issue found by Kangjie Lu <kjlu@umn.edu> where we not checking the return value of a register read/write which could result in a NULL pointer dereference if the read/write fails. Since we are only trying to disable the far-end loopback, if the read and write of register fails, we do not want to bail out of the function. We just want to log that it failed to disable and continue on. CC: Sasha Neftin <sasha.neftin@intel.com> CC: Kangjie Lu <kjlu@umn.edu> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:55 -08:00
Jacob Keller	ea888b03e3	fm10k: TRIVIAL cleanup of extra spacing in function comment The function comment for fm10k_iov_msg_msix_pf has an extra space in a sentence, which is unnecessary. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:55 -08:00
Jiri Kosina	2242281d69	ixgbe: remove magic constant in ixgbe_reset_hw_82599() ixgbe_reset_hw_82599() resets the value of hw->mac.num_rar_entries to pre-defined value of 128. Let's get rid of that hardcoded literal, and use IXGBE_82599_RAR_ENTRIES instead, the same way the normal initialization path does. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:54 -08:00
Sasha Neftin	a8890c38ab	igc: Fix code redundancy Remove redundant igc_check_for_link_base code and replace it with an igc_check_for_copper_link method. Fix duplication of IGC_ADVTXD_PAYLEN_SHIFT mask declaration. Remove obsolete IGC_SCVPC register definition. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:54 -08:00
Mike Rapoport	facd86390b	docs/networking: fix formatting of Intel drivers documentation The documentation of Intel drivers is missing the heading adornment for document titles. This causes the generated html to have TOC entries from these documents to appear as top level TOC entries: * Linux* Base Driver for Intel(R) Ethernet Network Connection * Contents * Identifying Your Adapter * Command Line Parameters * AutoNeg * Duplex ... Add overline heading adornment to document titles. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:54 -08:00
Sasha Neftin	803cc52323	igc: Remove unreachable code from igc_phy.c file Address community comment. Remove the unreachable code leads to the static checker warning. PHY functionality will be added later per demand. Reported by Dan Carpenter. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:54 -08:00
Kai-Heng Feng	59f58708c5	e1000e: Exclude device from suspend direct complete optimization e1000e sets different WoL settings in system suspend callback and runtime suspend callback. The suspend direct complete optimization leaves e1000e in runtime suspended state with wrong WoL setting during system suspend. To fix this, we need to disable suspend direct complete optimization to let e1000e always use suspend callback to set correct WoL during system suspend. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-02-05 16:08:54 -08:00
Russell King	bf2fa12593	net: marvell: mvpp2: fix lack of link interrupts Sven Auhagen reports that if he changes a SFP+ module for a SFP module on the Macchiatobin Single Shot, the link does not come back up. For Sven, it is as easy as: - Insert a SFP+ module connected, and use ping6 to verify link is up. - Remove SFP+ module - Insert SFP 1000base-X module use ping6 to verify link is up: Link up event did not trigger and the link is down but that doesn't show the problem for me. Locally, this has been reproduced by: - Boot with no modules. - Insert SFP+ module, confirm link is up. - Replace module with 25000base-X module. Confirm link is up. - Set remote end down, link is reported as dropped at both ends. - Set remote end up, link is reported up at remote end, but not local end due to lack of link interrupt. Fix this by setting up both GMAC and XLG interrupts for port 0, but only unmasking the appropriate interrupt according to the current mode set in the mac_config() method. However, only do the mask/unmask dance when we are really changing the link mode to avoid missing any link interrupts. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-05 10:40:28 -08:00
Russell King	4a4cec7257	net: marvell: mvpp2: use phy_interface_mode_is_8023z() helper Use the phy_interface_mode_is_8023z() helper for detecting interface modes that use 802.3z serial encoding. This is equivalent to testing for both 1000base-X and 2500base-X. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-05 10:40:28 -08:00
David S. Miller	7194d92b23	Merge branch 'nixge-Fixed-link-support' Moritz Fischer says: ==================== nixge: Fixed-link support This series adds fixed-link support to nixge. The first patch corrects the binding to correctly reflect hardware that does not come with MDIO cores instantiated. The second patch adds fixed link support to the driver. The third patch updates the binding document with the now optional (formerly required) phy-handle property and references the fixed-link docs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-05 10:34:34 -08:00
Moritz Fischer	baaac2fb0d	dt-bindings: net: Add fixed-link support Update device-tree binding with fixed-link support. With fixed-link support the formerly required property 'phy-handle' is now optional if 'fixed-link' child is present. Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-05 10:34:34 -08:00
Moritz Fischer	8dc0ae90ad	net: nixge: Add support for fixed-link configurations Add support for fixed-link configurations to nixge driver. Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-05 10:34:34 -08:00
Moritz Fischer	dd648818da	net: nixge: Make mdio child node optional Make MDIO child optional and only instantiate the MDIO bus if the child is actually present. There are currently no (in-tree) users of this binding; all (out-of-tree) users use overlays that get shipped together with the FPGA images that contain the IP. This will significantly increase maintainabilty of future revisions of this IP. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-05 10:34:34 -08:00
Daniel Borkmann	90d304b7f7	Merge branch 'bpf-riscv-jit' Björn Töpel says: ==================== This v2 series adds an RV64G BPF JIT to the kernel. At the moment the RISC-V Linux port does not support CONFIG_HAVE_KPROBES (Patrick Stählin sent out an RFC last year), which means that CONFIG_BPF_EVENTS is not supported. Thus, no tests involving BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT, BPF_PROG_TYPE_KPROBE and BPF_PROG_TYPE_RAW_TRACEPOINT passes. The implementation does not support "far branching" (>4KiB). Test results: # modprobe test_bpf test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed] # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier ... Summary: 761 PASSED, 507 SKIPPED, 2 FAILED Note that "test_verifier" was run with one build with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise many of the the tests that require unaligned access were skipped. CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.unknown align' 0 No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.unknown align' 59 The two failing test_verifier tests are: "ld_abs: vlan + abs, test 1" "ld_abs: jump around ld_abs" This is due to that "far branching" involved in those tests. All tests where done on QEMU emulator version 3.1.50 (v3.1.0-688-g8ae951fbc106). I'll test it on real hardware, when I get access to it. I'm routing this patch via bpf-next/netdev mailing list (after a conversation with Palmer at FOSDEM), mainly because the other JITs went that path. Again, thanks for all the comments! Cheers, Björn v1 -> v2: * Added JMP32 support. (Daniel) * Add RISC-V to Documentation/sysctl/net.txt. (Daniel) * Fixed seen_call() asymmetry. (Daniel) * Fixed broken bpf_flush_icache() range. (Daniel) * Added alignment annotations to some selftests. RFCv1 -> v1: * Cleaned up the Kconfig and net/Makefile. (Christoph) * Removed the entry-stub and squashed the build/config changes to be part of the JIT implementation. (Christoph) * Simplified the register tracking code. (Daniel) * Removed unused macros. (Daniel) * Added myself as maintainer and updated documentation. (Daniel) * Removed HAVE_EFFICIENT_UNALIGNED_ACCESS. (Christoph, Palmer) * Added tail-calls and cleaned up the code. ==================== Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 17:03:02 +01:00
Björn Töpel	e2c6f50e48	selftests/bpf: add "any alignment" annotation for some tests RISC-V does, in-general, not have "efficient unaligned access". When testing the RISC-V BPF JIT, some selftests failed in the verification due to misaligned access. Annotate these tests with the F_NEEDS_EFFICIENT_UNALIGNED_ACCESS flag. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:56:10 +01:00
Björn Töpel	e8cb0167ae	bpf, doc: add RISC-V JIT to BPF documentation Update Documentation/networking/filter.txt and Documentation/sysctl/net.txt to mention RISC-V. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:56:10 +01:00
Björn Töpel	8a9e0aff88	MAINTAINERS: add RISC-V BPF JIT maintainer Add Björn Töpel as RISC-V BPF JIT maintainer. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:56:10 +01:00
Björn Töpel	2353ecc6f9	bpf, riscv: add BPF JIT for RV64G This commit adds a BPF JIT for RV64G. The JIT is a two-pass JIT, and has a dynamic prolog/epilogue (similar to the MIPS64 BPF JIT) instead of static ones (e.g. x86_64). At the moment the RISC-V Linux port does not support CONFIG_HAVE_KPROBES, which means that CONFIG_BPF_EVENTS is not supported. Thus, no tests involving BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT, BPF_PROG_TYPE_KPROBE and BPF_PROG_TYPE_RAW_TRACEPOINT passes. The implementation does not support "far branching" (>4KiB). Test results: # modprobe test_bpf test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed] # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier ... Summary: 761 PASSED, 507 SKIPPED, 2 FAILED Note that "test_verifier" was run with one build with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise many of the the tests that require unaligned access were skipped. CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.unknown align' 0 No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.unknown align' 59 The two failing test_verifier tests are: "ld_abs: vlan + abs, test 1" "ld_abs: jump around ld_abs" This is due to that "far branching" involved in those tests. All tests where done on QEMU (QEMU emulator version 3.1.50 (v3.1.0-688-g8ae951fbc106)). Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:56:10 +01:00
Daniel Borkmann	31de389707	Merge branch 'bpf-btf-dedup' Andrii Nakryiko says: ==================== This patch series adds BTF deduplication algorithm to libbpf. This algorithm allows to take BTF type information containing duplicate per-compilation unit information and reduce it to equivalent set of BTF types with no duplication without loss of information. It also deduplicates strings and removes those strings that are not referenced from any BTF type (and line information in .BTF.ext section, if any). Algorithm also resolves struct/union forward declarations into concrete BTF types across multiple compilation units to facilitate better deduplication ratio. If undesired, this resolution can be disabled through specifying corresponding options. When applied to BTF data emitted by pahole's DWARF->BTF converter, it reduces the overall size of .BTF section by about 65x, from about 112MB to 1.75MB, leaving only 29247 out of initial 3073497 BTF type descriptors. Algorithm with minor differences and preliminary results before FUNC/FUNC_PROTO support is also described more verbosely at: https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html v1->v2: - rebase on latest bpf-next - err_log/elog -> pr_debug - btf__dedup, btf__get_strings, btf__get_nr_types listed under 0.0.2 version ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:52:58 +01:00
Andrii Nakryiko	9c65112744	selftests/btf: add initial BTF dedup tests This patch sets up a new kind of tests (BTF dedup tests) and tests few aspects of BTF dedup algorithm. More complete set of tests will come in follow up patches. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:52:57 +01:00
Andrii Nakryiko	d5caef5b56	btf: add BTF types deduplication algorithm This patch implements BTF types deduplication algorithm. It allows to greatly compress typical output of pahole's DWARF-to-BTF conversion or LLVM's compilation output by detecting and collapsing identical types emitted in isolation per compilation unit. Algorithm also resolves struct/union forward declarations into concrete BTF types representing referenced struct/union. If undesired, this resolution can be disabled through specifying corresponding options. Algorithm itself and its application to Linux kernel's BTF types is described in details at: https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:52:57 +01:00
Andrii Nakryiko	69eaab04c6	btf: extract BTF type size calculation This pre-patch extracts calculation of amount of space taken by BTF type descriptor for later reuse by btf_dedup functionality. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-05 16:52:57 +01:00
Linus Walleij	5468e82f70	net: phy: fixed-phy: Drop GPIO from fixed_phy_add() All users of the fixed_phy_add() pass -1 as GPIO number to the fixed phy driver, and all users of fixed_phy_register() pass -1 as GPIO number as well, except for the device tree MDIO bus. Any new users should create a proper device and pass the GPIO as a descriptor associated with the device so delete the GPIO argument from the calls and drop the code looking requesting a GPIO in fixed_phy_add(). In fixed phy_register(), investigate the "fixed-link" node and pick the GPIO descriptor from "link-gpios" if this property exists. Move the corresponding code out of of_mdio.c as the fixed phy code anyways requires OF to be in use. Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 18:33:36 -08:00
Tonghao Zhang	fc9c5a4a5a	net/mlx5: Fix code style issue in mlx driver Add the tab before '}' and keep the code style consistent. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 18:01:56 -08:00
Stanislav Fomichev	a8a1f7d09c	libbpf: fix libbpf_print With the recent print rework we now have the following problem: pr_{warning,info,debug} expand to __pr which calls libbpf_print. libbpf_print does va_start and calls __libbpf_pr with va_list argument. In __base_pr we again do va_start. Because the next argument is a va_list, we don't get correct pointer to the argument (and print noting in my case, I don't know why it doesn't crash tbh). Fix this by changing libbpf_print_fn_t signature to accept va_list and remove unneeded calls to va_start in the existing users. Alternatively, this can we solved by exporting __libbpf_pr and changing __pr macro to (and killing libbpf_print): { if (__libbpf_pr) __libbpf_pr(level, "libbpf: " fmt, ##__VA_ARGS__) } Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 17:45:31 -08:00
Santosh Shilimkar	fd261ce6a3	rds: rdma: update rdma transport for tos For RDMA transports, RDS TOS is an extension of IB QoS(Annex A13) to provide clients the ability to segregate traffic flows for different type of data. RDMA CM abstract it for ULPs using rdma_set_service_type(). Internally, each traffic flow is represented by a connection with all of its independent resources like that of a normal connection, and is differentiated by service type. In other words, there can be multiple qp connections between an IP pair and each supports a unique service type. The feature has been added from RDSv4.1 onwards and supports rolling upgrades. RDMA connection metadata also carries the tos information to set up SL on end to end context. The original code was developed by Bang Nguyen in downstream kernel back in 2.6.32 kernel days and it has evolved over period of time. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>	2019-02-04 14:59:13 -08:00
Santosh Shilimkar	56dc8bce9f	rds: add transport specific tos_map hook RDMA transport maps user tos to underline virtual lanes(VL) for IB or DSCP values. RDMA CM transport abstract thats for RDS. TCP transport makes use of default priority 0 and maps all user tos values to it. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>	2019-02-04 14:59:13 -08:00
Santosh Shilimkar	3eb450367d	rds: add type of service(tos) infrastructure RDS Service type (TOS) is user-defined and needs to be configured via RDS IOCTL interface. It must be set before initiating any traffic and once set the TOS can not be changed. All out-going traffic from the socket will be associated with its TOS. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>	2019-02-04 14:59:12 -08:00
Santosh Shilimkar	d021fabf52	rds: rdma: add consumer reject For legacy protocol version incompatibility with non linux RDS, consumer reject reason being used to convey it to peer. But the choice of reject reason value as '1' was really poor. Anyway for interoperability reasons with shipping products, it needs to be supported. For any future versions, properly encoded reject reason should to be used. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>	2019-02-04 14:59:11 -08:00
Santosh Shilimkar	cdc306a5c9	rds: make v3.1 as compat version Mark RDSv3.1 as compat version and add v4.1 version macro's. Subsequent patches enable TOS(Type of Service) feature which is tied with v4.1 for RDMA transport. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>	2019-02-04 14:59:11 -08:00
David S. Miller	d3ab9df53e	Merge branch 'sh_eth-implement-simple-RX-checksum-offload' Sergei Shtylyov says: ==================== sh_eth: implement simple RX checksum offload Here's a set of 7 patches against DaveM's 'net-next.git' repo. I'm implemeting the simple RX checksum offload (like was done for the 'ravb' driver by Simon Horman); it has been only tested on the R8A7740 and R8A77980 SoCs, the other SoCs should just work (according to their manuals)... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	997feb11b8	sh_eth: offload RX checksum on SH7763 The SH7763 SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MACs... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	06240e1b52	sh_eth: offload RX checksum on SH7734 The SH7734 SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MACs... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	0da843adee	sh_eth: offload RX checksum on R8A77980 The R-Car V3H (R8A77980) SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MAC... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	040c16fd59	sh_eth: offload RX checksum on R8A7740 The R-Mobile A1 (R8A7740) SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MAC... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	48132cd0c6	sh_eth: offload RX checksum on R7S72100 The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MACs... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	f8e022db50	sh_eth: RX checksum offload support Add support for the RX checksum offload. This is enabled by default and may be disabled and re-enabled using 'ethtool': # ethtool -K eth0 rx off # ethtool -K eth0 rx on Some Ether MACs provide a simple checksumming scheme which appears to be completely compatible with CHECKSUM_COMPLETE: sum of all packet data after the L2 header is appended to packet data; this may be trivially read by the driver and used to update the skb accordingly. The same checksumming scheme is implemented in the EtherAVB MACs and now supported by the 'ravb' driver. In terms of performance, throughput is close to gigabit line rate with the RX checksum offload both enabled and disabled. The 'perf' output, however, appears to indicate that significantly less time is spent in do_csum() -- this is as expected. Test results with RX checksum offload enabled: ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 TCP MAERTS TEST to 192.168.2.4 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 10.01 933.93 [ perf record: Woken up 8 times to write data ] [ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ] ~/netperf-2.2pl4# perf report Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763 Overhead Command Shared Object Symbol 9.44% netperf [kernel.kallsyms] [k] __arch_copy_to_user 7.75% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 6.31% swapper [kernel.kallsyms] [k] default_idle_call 5.89% swapper [kernel.kallsyms] [k] arch_cpu_idle 4.37% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 4.02% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq 2.52% netperf [kernel.kallsyms] [k] preempt_count_sub 1.81% netperf [kernel.kallsyms] [k] tcp_recvmsg 1.80% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irqres 1.78% netperf [kernel.kallsyms] [k] preempt_count_add 1.36% netperf [kernel.kallsyms] [k] __tcp_transmit_skb 1.20% netperf [kernel.kallsyms] [k] __local_bh_enable_ip 1.10% netperf [kernel.kallsyms] [k] sh_eth_start_xmit Test results with RX checksum offload disabled: ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 TCP MAERTS TEST to 192.168.2.4 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 10.01 932.04 [ perf record: Woken up 14 times to write data ] [ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ] ~/netperf-2.2pl4# perf report Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796 Overhead Command Shared Object Symbol 7.00% swapper [kernel.kallsyms] [k] do_csum 3.94% swapper [kernel.kallsyms] [k] sh_eth_poll 3.83% ksoftirqd/0 [kernel.kallsyms] [k] do_csum 3.23% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 2.87% netperf [kernel.kallsyms] [k] __arch_copy_to_user 2.86% swapper [kernel.kallsyms] [k] arch_cpu_idle 2.13% swapper [kernel.kallsyms] [k] default_idle_call 2.12% ksoftirqd/0 [kernel.kallsyms] [k] sh_eth_poll 2.02% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.84% swapper [kernel.kallsyms] [k] __softirqentry_text_start 1.64% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 1.53% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq 1.32% netperf [kernel.kallsyms] [k] preempt_count_sub 1.27% swapper [kernel.kallsyms] [k] __pi___inval_dcache_area 1.22% swapper [kernel.kallsyms] [k] check_preemption_disabled 1.01% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore The above results collected on the R-Car V3H Starter Kit board. Based on the commit `4d86d38186` ("ravb: RX checksum offload")... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Sergei Shtylyov	2c2ab5af7d	sh_eth: rename sh_eth_cpu_data::hw_checksum Commit `62e04b7e0e` ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed the field to 'hw_checksum' for the Ether DMAC "intelligent checksum", however some Ether MACs implement a simpler checksumming scheme, so that name now seems misleading. Rename that field to 'csmr' as the "intelligent checksum" is always controlled by the CSMR register. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-04 13:31:00 -08:00
Alexei Starovoitov	1728b11110	Merge branch 'libbpf-btf_ext' Yonghong Song says: ==================== This patch set exposed a few functions in libbpf. All these newly added API functions are helpful for JIT based bpf compilation where .BTF and .BTF.ext are available as in-memory data blobs. Patch #1 exposed several btf_ext__* API functions which are used to handle .BTF.ext ELF sections. Patch #2 refactored the function bpf_map_find_btf_info() and exposed API function btf__get_map_kv_tids() to retrieve the map key/value type id's generated by bpf program through BPF_ANNOTATE_KV_PAIR macro. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 12:48:37 -08:00
Yonghong Song	96408c4344	tools/bpf: implement libbpf btf__get_map_kv_tids() API function Currently, to get map key/value type id's, the macro BPF_ANNOTATE_KV_PAIR(<map_name>, <key_type>, <value_type>) needs to be defined in the bpf program for the corresponding map. During program/map loading time, the local static function bpf_map_find_btf_info() in libbpf.c is implemented to retrieve the key/value type ids given the map name. The patch refactored function bpf_map_find_btf_info() to create an API btf__get_map_kv_tids() which includes the bulk of implementation for the original function. The API btf__get_map_kv_tids() can be used by bcc, a JIT based bpf compilation system, which uses the same BPF_ANNOTATE_KV_PAIR to record map key/value types. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 12:48:36 -08:00
Yonghong Song	b8dcf8d149	tools/bpf: expose functions btf_ext__* as API functions The following set of functions, which manipulates .BTF.ext section, are exposed as API functions: . btf_ext__new . btf_ext__free . btf_ext__reloc_func_info . btf_ext__reloc_line_info . btf_ext__func_info_rec_size . btf_ext__line_info_rec_size These functions are useful for JIT based bpf codegen, e.g., bcc, to manipulate in-memory .BTF.ext sections. The signature of function btf_ext__reloc_func_info() is also changed to be the same as its definition in btf.c. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 12:48:36 -08:00
Stanislav Fomichev	7e8a590377	selftests/bpf: use localhost in tcp_{server,client}.py Bind and connect to localhost. There is no reason for this test to use non-localhost interface. This lets us run this test in a network namespace. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-02-04 21:29:27 +01:00
Heiko Carstens	ecc15f113c	s390: bpf: fix JMP32 code-gen Commit `626a5f66da` ("s390: bpf: implement jitting of JMP32") added JMP32 code-gen support for s390. However it triggers the warning below due to some unusual gotos in the original s390 bpf jit code. Add a couple of additional "is_jmp32" initializations to fix this. Also fix the wrong opcode for the "llilf" instruction that was introduced with the same commit. arch/s390/net/bpf_jit_comp.c: In function 'bpf_jit_insn': arch/s390/net/bpf_jit_comp.c:248:55: warning: 'is_jmp32' may be used uninitialized in this function [-Wmaybe-uninitialized] _EMIT6(op1 | reg(b1, b2) << 16 | (rel & 0xffff), op2 | mask); \ ^ arch/s390/net/bpf_jit_comp.c:1211:8: note: 'is_jmp32' was declared here bool is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32; Fixes: `626a5f66da` ("s390: bpf: implement jitting of JMP32") Cc: Jiong Wang <jiong.wang@netronome.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:45:09 -08:00
Alexei Starovoitov	9fa3b47304	Merge branch 'change-libbpf-print-api' Yonghong Song says: ==================== These are patches responding to my comments for Magnus's patch (https://patchwork.ozlabs.org/patch/1032848/). The goal is to make pr_* macros available to other C files than libbpf.c, and to simplify API function libbpf_set_print(). Specifically, Patch #1 used global functions to facilitate pr_* macros in the header files so they are available in different C files. Patch #2 removes the global function libbpf_print_level_available() which is added in Patch 1. Patch #3 simplified libbpf_set_print() which takes only one print function with a debug level argument among others. Changelogs: v3 -> v4: . rename libbpf internal header util.h to libbpf_util.h . rename libbpf internal function libbpf_debug_print() to libbpf_print() v2 -> v3: . bailed out earlier in libbpf_debug_print() if __libbpf_pr is NULL . added missing LIBBPF_DEBUG level check in libbpf.c __base_pr(). v1 -> v2: . Renamed global function libbpf_dprint() to libbpf_debug_print() to be more expressive. . Removed libbpf_dprint_level_available() as it is used only once in btf.c and we can remove it by optimizing for common cases. ==================== Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:59 -08:00
Yonghong Song	6f1ae8b662	tools/bpf: simplify libbpf API function libbpf_set_print() Currently, the libbpf API function libbpf_set_print() takes three function pointer parameters for warning, info and debug printout respectively. This patch changes the API to have just one function pointer parameter and the function pointer has one additional parameter "debugging level". So if the debug level is increased in the future, the function signature won't change. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:59 -08:00
Yonghong Song	9d100a19ff	tools/bpf: print out btf log at LIBBPF_WARN level Currently, the btf log is allocated and printed out in case of error at LIBBPF_DEBUG level. Such logs from kernel are very important for debugging. For example, bpf syscall BPF_PROG_LOAD command can get verifier logs back to user space. In function load_program() of libbpf.c, the log buffer is allocated unconditionally and printed out at pr_warning() level. Let us do the similar thing here for btf. Allocate buffer unconditionally and print out error logs at pr_warning() level. This can reduce one global function and optimize for common situations where pr_warning() is activated either by default or by user supplied debug output function. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-04 09:40:58 -08:00
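
The single-callback design described in commit 6f1ae8b662 above (one print function that receives the level as an argument, instead of three separate function pointers) can be illustrated with a minimal, self-contained sketch. Every name here (`demo_print_level`, `demo_print_fn_t`, `demo_set_print`, `demo_print`, `demo_quiet_printer`) is hypothetical and chosen for illustration only; this is not libbpf's header or its exact signature, just the general shape of the pattern.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical names throughout; this is NOT libbpf's actual API. */
enum demo_print_level {
	DEMO_WARN,
	DEMO_INFO,
	DEMO_DEBUG,
};

/* One callback type: the level travels as an argument, so adding a
 * new level later does not change the setter's signature. */
typedef int (*demo_print_fn_t)(enum demo_print_level level,
			       const char *fmt, va_list args);

static demo_print_fn_t demo_pr; /* NULL means printing is disabled */

/* Install (or clear, with NULL) the single print callback. */
void demo_set_print(demo_print_fn_t fn)
{
	demo_pr = fn;
}

/* Internal entry point that pr_*-style macros could call. */
int demo_print(enum demo_print_level level, const char *fmt, ...)
{
	va_list args;
	int ret;

	if (!demo_pr)
		return 0; /* bail out early when no callback is set */

	va_start(args, fmt);
	ret = demo_pr(level, fmt, args);
	va_end(args);
	return ret;
}

/* Example user callback: drop debug output, send the rest to stderr. */
int demo_quiet_printer(enum demo_print_level level,
		       const char *fmt, va_list args)
{
	if (level == DEMO_DEBUG)
		return 0;
	return vfprintf(stderr, fmt, args);
}
```

A caller would install the callback once with `demo_set_print(demo_quiet_printer)` and then filter by level inside the callback, rather than registering one function per level.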


812468 Commits