Commit Graph

1091314 Commits

Author SHA1 Message Date
Horatiu Vultur
f3d8e0a9c2 net: lan966x: Add support for PTP_PF_EXTTS
Extend the PTP programmable pins to also implement the PTP_PF_EXTTS
function. The PTP pin can be configured to capture only on the rising
edge of the PPS signal. Once an event is seen, an interrupt is
generated and the local time counter is saved.
The interrupt is shared between all the pins.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 12:03:18 +01:00
Horatiu Vultur
2b7ff2588e net: lan966x: Add support for PTP_PF_PEROUT
Lan966x has 8 PTP programmable pins, where the last pin is hardcoded to
be used by PHC0, which does the frame timestamping. All the rest of the
PTP pins can be shared between the PHCs and can have different functions
like perout or extts. For now add support for PTP_PF_PEROUT.
The HW is not able to support absolute start time but can use the nsec
for phase adjustment when generating PPS.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 12:03:18 +01:00
Horatiu Vultur
3adc11e5fc net: lan966x: Add registers used to configure the PTP pin
Add registers that are used to configure the PTP pins. These registers
are used to enable the interrupts per PTP pin and to set the waveform
generated by the pin.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 12:03:18 +01:00
Horatiu Vultur
77f2accb50 net: lan966x: Change the PTP pin used to read/write the PHC.
To read/write a value to a PHC, it is required to use a PTP pin.
Currently pin 5 is used, but change to pin 7 as it is the last pin.
All the other pins will have different functions.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 12:03:18 +01:00
Horatiu Vultur
c1a519919d dt-bindings: net: lan966x: Extend with the ptp external interrupt.
Extend dt-bindings for lan966x with ptp external interrupt. This is
generated when an external 1pps signal is received on the ptp pin.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 12:03:17 +01:00
David S. Miller
a1bde8c92d Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-04-26

This series contains updates to ice driver only.

Ivan Vecera removes races related to VF message processing by changing
mutex_trylock() call to mutex_lock() and moving additional operations
to occur under mutex.

Petr Oros increases wait time after firmware flash as current time is
not sufficient.

Jake resolves a use-after-free issue for mailbox snapshot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:58:39 +01:00
David S. Miller
124de27101 Merge branch 'mptcp-MP_FAIL-timeout'
Mat Martineau says:

====================
mptcp: Timeout for MP_FAIL response

When one peer sends an infinite mapping to coordinate fallback from
MPTCP to regular TCP, the other peer is expected to send a packet with
the MPTCP MP_FAIL option to acknowledge the infinite mapping. Rather
than leave the connection in some half-fallback state, this series adds
a timeout after which the infinite mapping sender will reset the
connection.

Patch 1 adds a fallback self test.

Patches 2-5 make use of the MPTCP socket's retransmit timer to reset the
MPTCP connection if no MP_FAIL was received.

Patches 6 and 7 extend the self test to check MP_FAIL-related MIBs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:55 +01:00
Geliang Tang
53f368bfff selftests: mptcp: print extra msg in chk_csum_nr
When multiple checksum errors occur in chk_csum_nr(), print the
number of errors as an extra message.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
Geliang Tang
1f7d325f7d selftests: mptcp: check MP_FAIL response mibs
This patch extends chk_fail_nr to check the MP_FAIL response mibs.

Add a new argument invert for chk_fail_nr to allow it to check the
MP_FAIL TX and RX mibs from the opposite direction.

When the infinite map is received before the MP_FAIL response, the
response will be lost. A '-' can be added into fail_tx or fail_rx to
represent that MP_FAIL response TX or RX can be lost when doing the
checks.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
Geliang Tang
49fa1919d6 mptcp: reset subflow when MP_FAIL doesn't respond
This patch adds a new msk->flags bit MPTCP_FAIL_NO_RESPONSE, then reuses
sk_timer to trigger a check if we have not received a response from the
peer after sending MP_FAIL. If the peer doesn't respond properly, reset
the subflow.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
Geliang Tang
9c81be0dbc mptcp: add MP_FAIL response support
This patch adds a new struct member mp_fail_response_expect in struct
mptcp_subflow_context to support MP_FAIL response. In the special case
of a single subflow with a checksum error and contiguous data, an MP_FAIL
is sent in response to another MP_FAIL.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
Geliang Tang
4293248c67 mptcp: add data lock for sk timers
mptcp_data_lock() needs to be held when manipulating the msk
retransmit_timer or the sk sk_timer. This patch adds the data
lock for both timers.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
Geliang Tang
bcf3cf93f6 mptcp: use mptcp_stop_timer
Use the helper mptcp_stop_timer() instead of using sk_stop_timer() to
stop icsk_retransmit_timer directly.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:53 +01:00
Geliang Tang
b6e074e171 selftests: mptcp: add infinite map testcase
Add the single subflow test case for MP_FAIL, to test the infinite
mapping case. Use the test_linkfail value to make 128KB test files.

Add a new function reset_with_fail(), which uses 'iptables' and 'tc
action pedit' rules to produce the bit flips that trigger the checksum
failures. Set validate_checksum to enable checksums for the MP_FAIL
tests without passing the '-C' argument. Set check_invert flag to
enable the invert bytes check for the output data in check_transfer().
Instead of the file mismatch error, this test prints out the inverted
bytes.

Add a new function pedit_action_pkts() to get the number of packets
edited by the tc pedit actions. Print this number to the output.

Also add the needed kernel config options to the selftests config file.

Suggested-by: Davide Caratti <dcaratti@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:53 +01:00
Wan Jiabing
a5f3aed588 wil6210: simplify if-if to if-else
Use if and else instead of if(A) and if (!A).

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220424094552.105466-1-wanjiabing@vivo.com
2022-04-27 10:29:22 +03:00
Wan Jiabing
7471f7d273 ath10k: simplify if-if to if-else
Use if and else instead of if(A) and if (!A).

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220424094522.105262-1-wanjiabing@vivo.com
2022-04-27 10:28:42 +03:00
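The two "simplify if-if to if-else" commits above apply the same mechanical refactor. As a minimal sketch of the pattern only (written in Python rather than the drivers' C, with hypothetical function names that do not come from either driver):

```python
def label_before(a: bool) -> str:
    """Original shape: two independent tests of the same condition."""
    result = "unset"
    if a:
        result = "A"
    if not a:
        result = "not A"
    return result


def label_after(a: bool) -> str:
    """Simplified shape: one test, with the complement in the else branch."""
    if a:
        result = "A"
    else:
        result = "not A"
    return result
```

The rewrite is only equivalent when the first branch does not change the tested condition, which holds in this sketch.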
Wen Gong
66721bb4bb ath11k: read country code from SMBIOS for WCN6855/QCA6390
This reads the country code from SMBIOS and sends it to the firmware.
The firmware then indicates the regulatory domain info for that
country code, and ath11k uses this info.

dmesg:
[ 1242.637173] ath11k_pci 0000:02:00.0: chip_id 0x2 chip_family 0xb board_id 0xff soc_id 0x400c0200
[ 1242.637176] ath11k_pci 0000:02:00.0: fw_version 0x110b09e5 fw_build_timestamp 2021-06-22 09:32 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HSP.1.1-02533-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
[ 1242.637253] ath11k_pci 0000:02:00.0: worldwide regdomain setting from SMBIOS
[ 1242.637259] ath11k_pci 0000:02:00.0: bdf variant name not found.
[ 1242.637261] ath11k_pci 0000:02:00.0: SMBIOS bdf variant name not set.
[ 1242.637263] ath11k_pci 0000:02:00.0: DT bdf variant name not set.
[ 1242.927543] ath11k_pci 0000:02:00.0: set current country pdev id 0 alpha2 00

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3

Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220421023501.32167-1-quic_wgong@quicinc.com
2022-04-27 10:28:19 +03:00
Hari Chandrakanthan
161c64de23 ath11k: disable spectral scan during spectral deinit
When ath11k modules are removed using rmmod with spectral scan enabled,
a crash is observed. A different crash trace is observed for each crash.

Send spectral scan disable WMI command to firmware before cleaning
the spectral dbring in the spectral_deinit API to avoid this crash.

Call trace from one of the crashes observed:
[ 1252.880802] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 1252.882722] pgd = 0f42e886
[ 1252.890955] [00000008] *pgd=00000000
[ 1252.893478] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1253.093035] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.89 #0
[ 1253.115261] Hardware name: Generic DT based system
[ 1253.121149] PC is at ath11k_spectral_process_data+0x434/0x574 [ath11k]
[ 1253.125940] LR is at 0x88e31017
[ 1253.132448] pc : [<7f9387b8>]    lr : [<88e31017>]    psr: a0000193
[ 1253.135488] sp : 80d01bc8  ip : 00000001  fp : 970e0000
[ 1253.141737] r10: 88e31000  r9 : 970ec000  r8 : 00000080
[ 1253.146946] r7 : 94734040  r6 : a0000113  r5 : 00000057  r4 : 00000000
[ 1253.152159] r3 : e18cb694  r2 : 00000217  r1 : 1df1f000  r0 : 00000001
[ 1253.158755] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 1253.165266] Control: 10c0383d  Table: 5e71006a  DAC: 00000055
[ 1253.172472] Process swapper/0 (pid: 0, stack limit = 0x60870141)
[ 1253.458055] [<7f9387b8>] (ath11k_spectral_process_data [ath11k]) from [<7f917fdc>] (ath11k_dbring_buffer_release_event+0x214/0x2e4 [ath11k])
[ 1253.466139] [<7f917fdc>] (ath11k_dbring_buffer_release_event [ath11k]) from [<7f8ea3c4>] (ath11k_wmi_tlv_op_rx+0x1840/0x29cc [ath11k])
[ 1253.478807] [<7f8ea3c4>] (ath11k_wmi_tlv_op_rx [ath11k]) from [<7f8fe868>] (ath11k_htc_rx_completion_handler+0x180/0x4e0 [ath11k])
[ 1253.490699] [<7f8fe868>] (ath11k_htc_rx_completion_handler [ath11k]) from [<7f91308c>] (ath11k_ce_per_engine_service+0x2c4/0x3b4 [ath11k])
[ 1253.502386] [<7f91308c>] (ath11k_ce_per_engine_service [ath11k]) from [<7f9a4198>] (ath11k_pci_ce_tasklet+0x28/0x80 [ath11k_pci])
[ 1253.514811] [<7f9a4198>] (ath11k_pci_ce_tasklet [ath11k_pci]) from [<8032227c>] (tasklet_action_common.constprop.2+0x64/0xe8)
[ 1253.526476] [<8032227c>] (tasklet_action_common.constprop.2) from [<803021e8>] (__do_softirq+0x130/0x2d0)
[ 1253.537756] [<803021e8>] (__do_softirq) from [<80322610>] (irq_exit+0xcc/0xe8)
[ 1253.547304] [<80322610>] (irq_exit) from [<8036a4a4>] (__handle_domain_irq+0x60/0xb4)
[ 1253.554428] [<8036a4a4>] (__handle_domain_irq) from [<805eb348>] (gic_handle_irq+0x4c/0x90)
[ 1253.562321] [<805eb348>] (gic_handle_irq) from [<80301a78>] (__irq_svc+0x58/0x8c)

Tested-on: QCN6122 hw1.0 AHB WLAN.HK.2.6.0.1-00851-QCAHKSWPL_SILICONZ-1

Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/1649396345-349-1-git-send-email-quic_haric@quicinc.com
2022-04-27 10:27:55 +03:00
Manikanta Pubbisetty
33b67a4b4e ath11k: Update WBM idle ring HP after FW mode on
Currently, WBM idle ring HP is updated well before the shadow
configuration is sent to the FW. Any update to the shadow
registers before the FW mode ON request is not reflected
in the actual HW registers, so the device fails to come up.
Send the FW mode ON QMI request before the WBM idle ring HP update
to fix this problem.

Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00573-QCAMSLSWPLZ-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00192-QCAHKSWPL_SILICONZ-1

Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220406094107.17878-12-quic_mpubbise@quicinc.com
2022-04-27 10:25:59 +03:00
Manikanta Pubbisetty
95959d702e ath11k: WMI changes to support WCN6750
WCN6750 is a single PDEV non-DBS chip which supports 2G, 5G and 6G bands.
It is a single LMAC device which can be hooked to any of the 2G/5G/6G bands.
Add WMI changes to support WCN6750.

Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00573-QCAMSLSWPLZ-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00192-QCAHKSWPL_SILICONZ-1

Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220406094107.17878-11-quic_mpubbise@quicinc.com
2022-04-27 10:25:59 +03:00
Manikanta Pubbisetty
b6f6301041 ath11k: Do not put HW in DBS mode for WCN6750
Though WCN6750 is a single PDEV device, it is not a
DBS solution. So, do not put HW in DBS mode for WCN6750.

Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00573-QCAMSLSWPLZ-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-00192-QCAHKSWPL_SILICONZ-1

Signed-off-by: Manikanta Pubbisetty <quic_mpubbise@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220406094107.17878-10-quic_mpubbise@quicinc.com
2022-04-27 10:25:59 +03:00
Guo Zhengkui
8c783024d6 rtlwifi: btcoex: fix if == else warning
Fix the following coccicheck warning:

drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtc8821a1ant.c:1604:2-4:
WARNING: possible condition with no effect (if == else).

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220425031725.5808-1-guozhengkui@vivo.com
2022-04-27 08:04:00 +03:00
Hamid Zamani
21947f3a74 brcmfmac: use ISO3166 country code and 0 rev as fallback on brcmfmac43602 chips
This uses ISO3166 country code and 0 rev on brcmfmac43602 chips.
Without this patch 80 MHz width is not selected on 5 GHz channels.

Commit a21bf90e92 ("brcmfmac: use ISO3166 country code and 0 rev as
fallback on some devices") provides a way to specify chips for using the
fallback case.

Before commit 151a7c12c4 ("Revert "brcmfmac: use ISO3166 country code
and 0 rev as fallback"") brcmfmac43602 devices worked correctly and, for
this specific case, 80 MHz width was selected.

Signed-off-by: Hamid Zamani <hzamani.cs91@gmail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220423111237.60892-1-hzamani.cs91@gmail.com
2022-04-27 08:03:39 +03:00
Alexander Wetzel
746285cf81 rtl818x: Prevent using not initialized queues
Using non-existent queues can panic the kernel with rtl8180/rtl8185 cards.
Ignore the skb priority for those cards, they only have one tx queue. Pierre
Asselin (pa@panix.com) reported the kernel crash in the Gentoo forum:

https://forums.gentoo.org/viewtopic-t-1147832-postdays-0-postorder-asc-start-25.html

He also confirmed that this patch fixes the issue. In summary this happened:

After updating wpa_supplicant from 2.9 to 2.10 the kernel crashed with a
"divide error: 0000" when connecting to an AP. Control port tx now tries to
use IEEE80211_AC_VO for the priority, which wpa_supplicant starts to use in
2.10.

Since only the rtl8187se part of the driver supports QoS, the priority
of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185
cards.

rtl8180 is then unconditionally reading out the priority and finally crashes on
drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544 without this
patch:
	idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries

"ring->entries" is zero for rtl8180/rtl8185 cards, tx_ring[2] never got
initialized.

Cc: stable@vger.kernel.org
Reported-by: pa@panix.com
Tested-by: pa@panix.com
Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220422145228.7567-1-alexander@wetzel-home.de
2022-04-27 08:02:46 +03:00
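The failure described in the rtl818x commit above can be modelled in a few lines. This is a minimal Python sketch, not the driver code: the `Ring` class, `next_slot`, `pick_ring_index` and the `supports_qos` flag are hypothetical names, and mapping non-QoS cards to ring 0 is an assumption. Only the index expression and the zero `entries` value of the uninitialized ring come from the commit message.

```python
from dataclasses import dataclass, field

IEEE80211_AC_BE = 2  # priority value quoted in the commit message


@dataclass
class Ring:
    """Hypothetical stand-in for a tx ring; entries == 0 means never initialized."""
    entries: int = 0
    idx: int = 0
    queue: list = field(default_factory=list)


def next_slot(ring: Ring) -> int:
    # Same shape as the expression quoted from dev.c; raises when entries == 0.
    return (ring.idx + len(ring.queue)) % ring.entries


def pick_ring_index(priority: int, supports_qos: bool) -> int:
    # Assumed fix: cards with a single tx queue ignore the skb priority.
    return priority if supports_qos else 0


# In this model only ring 0 is initialized on a non-QoS card.
rings = [Ring(entries=16)] + [Ring() for _ in range(3)]

try:
    next_slot(rings[IEEE80211_AC_BE])
    crashed = False
except ZeroDivisionError:
    crashed = True  # Python analogue of the kernel's "divide error: 0000"

safe_slot = next_slot(rings[pick_ring_index(IEEE80211_AC_BE, supports_qos=False)])
```

Indexing the ring by priority hits the modulo by zero, while ignoring the priority on single-queue cards keeps the lookup on an initialized ring.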
Kevin Lo
fc6234d7e2 rtw88: use the correct bit in the REG_HCI_OPT_CTRL register
Write the BIT_USB_SUS_DIS bit rather than BIT_BT_DIG_CLK_EN to the
REG_HCI_OPT_CTRL register for fixing failure to PCIe power on.

Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/YmLAzuyPr0P4Y6BP@ns.kevlo.org
2022-04-27 07:58:00 +03:00
Andrejs Cainikovs
562354ab9f mwifiex: Add SD8997 SDIO-UART firmware
With a recent change it is now possible to detect the strapping
option on SD8997, which allows picking the correct firmware
for either SDIO-SDIO or SDIO-UART.

This commit enables SDIO-UART firmware on SD8997.

Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220422090313.125857-3-andrejs.cainikovs@toradex.com
2022-04-27 07:56:31 +03:00
Andrejs Cainikovs
255ca28a65 mwifiex: Select firmware based on strapping
Some WiFi/Bluetooth modules might have different host connection
options, using either SDIO for both WiFi and Bluetooth,
or SDIO for WiFi and UART for Bluetooth. It is possible to detect
whether a module has an SDIO-SDIO or SDIO-UART connection by reading
its host strap register.

This change introduces a way to automatically select the appropriate
firmware depending on the connection method, and removes the need
for symlinking or overwriting the original firmware file with the
required one.

Host strap register used in this commit comes from the NXP driver [1]
hosted at Code Aurora.

[1] https://source.codeaurora.org/external/imx/linux-imx/tree/drivers/net/wireless/nxp/mxm_wifiex/wlan_src/mlinux/moal_sdio_mmc.c?h=rel_imx_5.4.70_2.3.2&id=688b67b2c7220b01521ffe560da7eee33042c7bd#n1274

Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220422090313.125857-2-andrejs.cainikovs@toradex.com
2022-04-27 07:56:31 +03:00
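The selection logic described in the mwifiex commit above could be sketched as follows. This is an illustrative Python model only: the register mask, the register values and the firmware file names are all hypothetical placeholders, not the values used by the driver or the NXP reference code.

```python
from enum import Enum


class Strap(Enum):
    """Host connection options named in the commit message."""
    SDIO_SDIO = "sdio-sdio"
    SDIO_UART = "sdio-uart"


# Hypothetical register encoding and file names, for illustration only.
STRAP_MASK = 0x1
STRAP_VALUES = {0x0: Strap.SDIO_SDIO, 0x1: Strap.SDIO_UART}
FIRMWARE = {
    Strap.SDIO_SDIO: "example_sdio_sdio.bin",
    Strap.SDIO_UART: "example_sdio_uart.bin",
}


def select_firmware(strap_register: int) -> str:
    """Map a raw strap register value to a firmware file name."""
    strap = STRAP_VALUES[strap_register & STRAP_MASK]
    return FIRMWARE[strap]
```

Keeping the mapping in a table means a new strapping option would only need a new table entry, which is one reason such a lookup can replace symlinking firmware files by hand.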
Jens Axboe
5a1e99b61b io_uring: check reserved fields for recv/recvmsg
We should check unused fields for non-zero values and return -EINVAL if
they are set, making it consistent with other opcodes.

Fixes: aa1fa28fc7 ("io_uring: add support for recvmsg()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-26 20:48:37 -06:00
Jens Axboe
588faa1ea5 io_uring: check reserved fields for send/sendmsg
We should check unused fields for non-zero values and return -EINVAL if
they are set, making it consistent with other opcodes.

Fixes: 0fa03c624d ("io_uring: add support for sendmsg()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-26 20:48:31 -06:00
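The check added by the two io_uring commits above follows a common pattern: reject a request when a field the opcode does not use is non-zero. A minimal Python sketch, assuming a heavily reduced and hypothetical `Sqe` structure (the field names `unused_a` and `unused_b` are placeholders, not the real submission queue entry layout):

```python
import errno
from dataclasses import dataclass


@dataclass
class Sqe:
    """Hypothetical, heavily reduced submission entry."""
    fd: int
    length: int
    unused_a: int = 0  # placeholder for a field this opcode does not use
    unused_b: int = 0  # second placeholder field


def prep_send(sqe: Sqe) -> int:
    # Reject requests that set fields the opcode does not define.
    if sqe.unused_a or sqe.unused_b:
        return -errno.EINVAL
    return 0
```

Rejecting non-zero reserved fields up front keeps them available for later extensions, since no existing caller can depend on garbage values being ignored.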
Martin Blumenstingl
71cffebf63 net: dsa: lantiq_gswip: Don't set GSWIP_MII_CFG_RMII_CLK
Commit 4b5923249b ("net: dsa: lantiq_gswip: Configure all remaining
GSWIP_MII_CFG bits") added all known bits in the GSWIP_MII_CFGp
register. It helped bring this register into a well-defined state so the
driver has to rely less on the bootloader to do things right.
Unfortunately it also sets the GSWIP_MII_CFG_RMII_CLK bit without any
possibility to configure it. Upon further testing it turns out that all
boards which are supported by the GSWIP driver in OpenWrt which use an
RMII PHY have a dedicated oscillator on the board which provides the
50MHz RMII reference clock.

Don't set the GSWIP_MII_CFG_RMII_CLK bit (but keep the code which always
clears it) to fix support for the Fritz!Box 7362 SL in OpenWrt. This is
a board with two Atheros AR8030 RMII PHYs. With the "RMII clock" bit set
the MAC also generates the RMII reference clock whose signal then
conflicts with the signal from the oscillator on the board. This results
in a constant cycle of the PHY detecting link up/down (and as a result
of that: the two ports using the AR8030 PHYs are not working).

At the time of writing this patch there's no known board where the MAC
(GSWIP) has to generate the RMII reference clock. If needed this can be
implemented in future by providing a device-tree flag so the
GSWIP_MII_CFG_RMII_CLK bit can be toggled per port.

Fixes: 4b5923249b ("net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits")
Tested-by: Jan Hoffmann <jan@3e8.eu>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Link: https://lore.kernel.org/r/20220425152027.2220750-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:32:52 -07:00
Sebastian Andrzej Siewior
6510ea973d net: Use this_cpu_inc() to increment net->core_stats
The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes
netdev_core_stats_alloc() to return a per-CPU pointer.
netdev_core_stats_alloc() will allocate memory on its first invocation
which breaks on PREEMPT_RT because it requires non-atomic context for
memory allocation.

This can be avoided by enabling preemption in netdev_core_stats_alloc()
assuming the caller always disables preemption.

It might be better to replace local_inc() with this_cpu_inc() now that
dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does
not rely on already disabled preemption. This results in fewer
instructions on x86-64:
local_inc:
|          incl %gs:__preempt_count(%rip)  # __preempt_count
|          movq    488(%rdi), %rax # _1->core_stats, _22
|          testq   %rax, %rax      # _22
|          je      .L585   #,
|          add %gs:this_cpu_off(%rip), %rax        # this_cpu_off, tcp_ptr__
|  .L586:
|          testq   %rax, %rax      # _27
|          je      .L587   #,
|          incq (%rax)            # _6->a.counter
|  .L587:
|          decl %gs:__preempt_count(%rip)  # __preempt_count

this_cpu_inc(), this patch:
|         movq    488(%rdi), %rax # _1->core_stats, _5
|         testq   %rax, %rax      # _5
|         je      .L591   #,
| .L585:
|         incq %gs:(%rax) # _18->rx_dropped

Use unsigned long as type for the counter. Use this_cpu_inc() to
increment the counter. Use a plain read of the counter.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/YmbO0pxgtKpCw4SY@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:32:30 -07:00
Marcel Ziswiler
b1190d5175 net: stmmac: dwmac-imx: comment spelling fix
Fix spelling in comment.

Fixes: 94abdad697 ("net: ethernet: dwmac: add ethernet glue logic for NXP imx8 chip")
Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Link: https://lore.kernel.org/r/20220425154856.169499-1-marcel@ziswiler.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:28:31 -07:00
Bjorn Helgaas
e39f63fe0d net: remove comments that mention obsolete __SLOW_DOWN_IO
The only remaining definitions of __SLOW_DOWN_IO (for alpha and ia64) do
nothing, and the only mentions in networking are in comments.  Remove these
mentions.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:09:24 -07:00
Bjorn Helgaas
dac173db11 net: wan: atp: remove unused eeprom_delay()
atp.h is included only by atp.c, which does not use eeprom_delay().  Remove
the unused definition.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:09:23 -07:00
Jakub Kicinski
c706b2b5ed net: tls: fix async vs NIC crypto offload
When NIC takes care of crypto (or the record has already
been decrypted) we forget to update darg->async. ->async
is supposed to mean whether record is async capable on
input and whether record has been queued for async crypto
on output.

Reported-by: Gal Pressman <gal@nvidia.com>
Fixes: 3547a1f9d9 ("tls: rx: use async as an in-out argument")
Tested-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20220425233309.344858-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:08:49 -07:00
Russell King (Oracle)
fae4630840 net: dsa: mt753x: fix pcs conversion regression
Daniel Golle reports that the conversion of mt753x to phylink PCS caused
an oops as below.

The problem is with the placement of the PCS initialisation, which
occurs after mt7531_setup() has been called. However, buried in this
function is a call to set up the CPU port, which requires the PCS
structure to be already set up.

Fix this by changing the initialisation order.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
Mem abort info:
  ESR = 0x96000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005
  CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=0000000046057000
[0000000000000020] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 96000005 [#1] SMP
Modules linked in:
CPU: 0 PID: 32 Comm: kworker/u4:1 Tainted: G S 5.18.0-rc3-next-20220422+ #0
Hardware name: Bananapi BPI-R64 (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mt7531_cpu_port_config+0xcc/0x1b0
lr : mt7531_cpu_port_config+0xc0/0x1b0
sp : ffffffc008d5b980
x29: ffffffc008d5b990 x28: ffffff80060562c8 x27: 00000000f805633b
x26: ffffff80001a8880 x25: 00000000000009c4 x24: 0000000000000016
x23: ffffff8005eb6470 x22: 0000000000003600 x21: ffffff8006948080
x20: 0000000000000000 x19: 0000000000000006 x18: 0000000000000000
x17: 0000000000000001 x16: 0000000000000001 x15: 02963607fcee069e
x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101
x11: ffffffc037302000 x10: 0000000000000870 x9 : ffffffc008d5b800
x8 : ffffff800028f950 x7 : 0000000000000001 x6 : 00000000662b3000
x5 : 00000000000002f0 x4 : 0000000000000000 x3 : ffffff800028f080
x2 : 0000000000000000 x1 : ffffff800028f080 x0 : 0000000000000000
Call trace:
 mt7531_cpu_port_config+0xcc/0x1b0
 mt753x_cpu_port_enable+0x24/0x1f0
 mt7531_setup+0x49c/0x5c0
 mt753x_setup+0x20/0x31c
 dsa_register_switch+0x8bc/0x1020
 mt7530_probe+0x118/0x200
 mdio_probe+0x30/0x64
 really_probe.part.0+0x98/0x280
 __driver_probe_device+0x94/0x140
 driver_probe_device+0x40/0x114
 __device_attach_driver+0xb0/0x10c
 bus_for_each_drv+0x64/0xa0
 __device_attach+0xa8/0x16c
 device_initial_probe+0x10/0x20
 bus_probe_device+0x94/0x9c
 deferred_probe_work_func+0x80/0xb4
 process_one_work+0x200/0x3a0
 worker_thread+0x260/0x4c0
 kthread+0xd4/0xe0
 ret_from_fork+0x10/0x20
Code: 9409e911 937b7e60 8b0002a0 f9405800 (f9401005)
---[ end trace 0000000000000000 ]---

Reported-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Fixes: cbd1f243bc ("net: dsa: mt7530: partially convert to phylink_pcs")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/E1nj6FW-007WZB-5Y@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:08:31 -07:00
Eric Dumazet
68822bdf76 net: generalize skb freeing deferral to per-cpu lists
Logic added in commit f35f821935 ("tcp: defer skb freeing after socket
lock is released") helped bulk TCP flows to move the cost of skb
frees outside of the critical section where the socket lock was held.

But for RPC traffic, or hosts with RFS enabled, the solution is far from
being ideal.

For RPC traffic, recvmsg() has to return to user space right after
skb payload has been consumed, meaning that BH handler has no chance
to pick the skb before recvmsg() thread. This issue is more visible
with BIG TCP, as more RPCs fit in one skb.

For RFS, even if BH handler picks the skbs, they are still picked
from the cpu on which user thread is running.

Ideally, it is better to free the skbs (and associated page frags)
on the cpu that originally allocated them.

This patch removes the per socket anchor (sk->defer_list) and
instead uses a per-cpu list, which will hold more skbs per round.

This new per-cpu list is drained at the end of net_rx_action(),
after incoming packets have been processed, to lower latencies.

In normal conditions, skbs are added to the per-cpu list with
no further action. In the (unlikely) cases where the cpu does not
run net_rx_action() handler fast enough, we use an IPI to raise
NET_RX_SOFTIRQ on the remote cpu.

Also, we do not bother draining the per-cpu list from dev_cpu_dead().
This is because skbs in this list have no requirement on how fast
they should be freed.

Note that we can add in the future a small per-cpu cache
if we see any contention on sd->defer_lock.

Tested on a pair of hosts with 100Gbit NIC, RFS enabled,
and /proc/sys/net/ipv4/tcp_rmem[2] tuned to 16MB to work around
page recycling strategy used by NIC driver (its page pool capacity
being too small compared to number of skbs/pages held in sockets
receive queues).

Note that this tuning was only done to demonstrate worse
conditions for skb freeing for this particular test.
These conditions can happen in more general production workload.

10 runs of one TCP_STREAM flow

Before:
Average throughput: 49685 Mbit.

Kernel profiles on cpu running user thread recvmsg() show high cost for
skb freeing related functions (*)

    57.81%  [kernel]       [k] copy_user_enhanced_fast_string
(*) 12.87%  [kernel]       [k] skb_release_data
(*)  4.25%  [kernel]       [k] __free_one_page
(*)  3.57%  [kernel]       [k] __list_del_entry_valid
     1.85%  [kernel]       [k] __netif_receive_skb_core
     1.60%  [kernel]       [k] __skb_datagram_iter
(*)  1.59%  [kernel]       [k] free_unref_page_commit
(*)  1.16%  [kernel]       [k] __slab_free
     1.16%  [kernel]       [k] _copy_to_iter
(*)  1.01%  [kernel]       [k] kfree
(*)  0.88%  [kernel]       [k] free_unref_page
     0.57%  [kernel]       [k] ip6_rcv_core
     0.55%  [kernel]       [k] ip6t_do_table
     0.54%  [kernel]       [k] flush_smp_call_function_queue
(*)  0.54%  [kernel]       [k] free_pcppages_bulk
     0.51%  [kernel]       [k] llist_reverse_order
     0.38%  [kernel]       [k] process_backlog
(*)  0.38%  [kernel]       [k] free_pcp_prepare
     0.37%  [kernel]       [k] tcp_recvmsg_locked
(*)  0.37%  [kernel]       [k] __list_add_valid
     0.34%  [kernel]       [k] sock_rfree
     0.34%  [kernel]       [k] _raw_spin_lock_irq
(*)  0.33%  [kernel]       [k] __page_cache_release
     0.33%  [kernel]       [k] tcp_v6_rcv
(*)  0.33%  [kernel]       [k] __put_page
(*)  0.29%  [kernel]       [k] __mod_zone_page_state
     0.27%  [kernel]       [k] _raw_spin_lock

After patch:
Average throughput: 73076 Mbit.

Kernel profiles on cpu running user thread recvmsg() look better:

    81.35%  [kernel]       [k] copy_user_enhanced_fast_string
     1.95%  [kernel]       [k] _copy_to_iter
     1.95%  [kernel]       [k] __skb_datagram_iter
     1.27%  [kernel]       [k] __netif_receive_skb_core
     1.03%  [kernel]       [k] ip6t_do_table
     0.60%  [kernel]       [k] sock_rfree
     0.50%  [kernel]       [k] tcp_v6_rcv
     0.47%  [kernel]       [k] ip6_rcv_core
     0.45%  [kernel]       [k] read_tsc
     0.44%  [kernel]       [k] _raw_spin_lock_irqsave
     0.37%  [kernel]       [k] _raw_spin_lock
     0.37%  [kernel]       [k] native_irq_return_iret
     0.33%  [kernel]       [k] __inet6_lookup_established
     0.31%  [kernel]       [k] ip6_protocol_deliver_rcu
     0.29%  [kernel]       [k] tcp_rcv_established
     0.29%  [kernel]       [k] llist_reverse_order

v2: kdoc issue (kernel bots)
    do not defer if (alloc_cpu == smp_processor_id()) (Paolo)
    replace the sk_buff_head with a single-linked list (Jakub)
    add a READ_ONCE()/WRITE_ONCE() for the lockless read of sd->defer_list

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220422201237.416238-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26 17:05:59 -07:00
Linus Torvalds
46cf2c613f Merge tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:

 - Fix some register offsets on Intel Alderlake

 - Fix the order of the UFS and SDC pins on Qualcomm SM6350

 - Fix a build error in Mediatek Moore.

 - Fix a pin function table in the Sunplus SP7021.

 - Fix some Kconfig and static keywords on the Samsung Tesla FSD SoC.

 - Fix up the EOI function for edge triggered IRQs and keep the block
   clock enabled for level IRQs in the STM32 driver.

 - Fix some bits and order in the Rockchip RK3308 driver.

 - Handle the error path in the Pistachio driver probe() properly.

* tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: pistachio: fix use of irq_of_parse_and_map()
  pinctrl: stm32: Keep pinctrl block clock enabled when LEVEL IRQ requested
  pinctrl: rockchip: sort the rk3308_mux_recalced_data entries
  pinctrl: rockchip: fix RK3308 pinmux bits
  pinctrl: stm32: Do not call stm32_gpio_get() for edge triggered IRQs in EOI
  pinctrl: Fix an error in pin-function table of SP7021
  pinctrl: samsung: fix missing GPIOLIB on ARM64 Exynos config
  pinctrl: mediatek: moore: Fix build error
  pinctrl: qcom: sm6350: fix order of UFS & SDC pins
  pinctrl: alderlake: Fix register offsets for ADL-N variant
  pinctrl: samsung: staticize fsd_pin_ctrl
2022-04-26 16:34:11 -07:00
Alexei Starovoitov
d54d06a4c4 Merge branch 'Teach libbpf to "fix up" BPF verifier log'
Andrii Nakryiko says:

====================

This patch set teaches libbpf to enhance BPF verifier log with human-readable
and relevant information about failed CO-RE relocation. Patch #9 is the main
one with the new logic. See relevant commit messages for some more details.

All the other patches are either fixing various bugs detected
while working on this feature, most prominently a bug with libbpf not handling
CO-RE relocations for SEC("?...") programs, or are refactoring libbpf
internals to allow for easier reuse of CO-RE relo lookup and formatting logic.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-04-26 15:41:47 -07:00
Andrii Nakryiko
ea4128eb43 selftests/bpf: Add libbpf's log fixup logic selftests
Add tests validating that libbpf is indeed patching up BPF verifier log
with CO-RE relocation details. Also test partial and full truncation
scenarios.

This test might be a bit fragile due to changing BPF verifier log
format. If that proves to be frequently breaking, we can simplify tests
or remove the truncation subtests. But for now it seems useful to test
it in those conditions that are otherwise rarely occurring in practice.

Also test CO-RE relo failure in a subprog as that exercises subprogram CO-RE
relocation mapping logic which doesn't work out of the box without extra
relo storage previously done only for gen_loader case.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-11-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
9fdc4273b8 libbpf: Fix up verifier log for unguarded failed CO-RE relos
Teach libbpf to post-process BPF verifier log on BPF program load
failure and detect known error patterns to provide user with more
context.

Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While failing CO-RE relocation is expected, it is
expected to be properly guarded in BPF code such that BPF verifier
always eliminates BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when user failed to take such precautions,
BPF verifier provides the best log it can:

  123: (85) call unknown#195896080
  invalid func unknown#195896080

Such incomprehensible log error is due to libbpf "poisoning" BPF
instruction that corresponds to failed CO-RE relocation by replacing it
with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
"bad relo" if you squint hard enough).

Luckily, libbpf has all the necessary information to look up CO-RE
relocation that failed and provide more human-readable description of
what's going on:

  5: <invalid CO-RE relocation>
  failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)

This hopefully makes it much easier to understand what's wrong with
user's BPF program without googling magic constants.

This BPF verifier log fixup is set up to be extensible and is going to be
used for at least one other upcoming feature of libbpf in follow-up patches.
Libbpf is parsing lines of BPF verifier log starting from the very end.
Currently it processes up to 10 lines of the log looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially for log_level=2 verbosity level). Actual verification error
should normally be found in last few lines, so this should work
reliably.
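
The tail scan described above can be sketched roughly as follows. This
is an illustrative Python model, not libbpf's actual C code: the function
name find_poisoned_insn() and both regular expressions are assumptions
modeled on the two log lines quoted earlier in this message.

```python
import re

MAX_TAIL_LINES = 10        # the commit message says "up to 10 lines"
POISON_IMM = 0xBAD2310     # 195896080 in the decimal form the verifier prints

# Regexes modeled on the two quoted log lines (illustrative assumptions).
CALL_RE = re.compile(r"^(\d+): \(85\) call unknown#(\d+)$")
INVALID_RE = re.compile(r"^invalid func unknown#(\d+)$")


def find_poisoned_insn(log):
    """Return the insn index of a poisoned call near the end of the log, else None."""
    lines = log.rstrip("\n").split("\n")
    tail = lines[-MAX_TAIL_LINES:]          # never look further back than this
    for i in range(len(tail) - 1, 0, -1):   # walk from the last line backwards
        bad = INVALID_RE.match(tail[i])
        if not bad or int(bad.group(1)) != POISON_IMM:
            continue
        call = CALL_RE.match(tail[i - 1])   # the offending insn is printed just before
        if call and int(call.group(2)) == POISON_IMM:
            return int(call.group(1))
    return None
```

Bounding the scan to the tail keeps the cost independent of the total
log size, which is the point made above about log_level=2 output.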

If libbpf needs to expand log beyond available log_buf_size, it
truncates the end of the verifier log. Given verifier log normally ends
with something like:

  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

... truncating this on program load error isn't too bad (end user can
always increase log size, if they need to get the complete log).
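
A minimal sketch of that trade-off, assuming a fixed-capacity buffer
modeled as a Python string. patch_log() and its parameters are
hypothetical names for illustration; libbpf's real implementation works
on a C char buffer.

```python
def patch_log(log, start, end, replacement, capacity):
    """Replace log[start:end] with replacement, keeping at most capacity chars.

    If the patched text no longer fits, the END of the log is dropped,
    so the (more useful) patched error line survives.
    """
    patched = log[:start] + replacement + log[end:]
    return patched[:capacity]


original = "AAAA\nBAD\nprocessed 2 insns\n"
# Expand the 3-character "BAD" marker into a longer human-readable line.
result = patch_log(original, 5, 8, "<invalid CO-RE relocation>", 40)
print(result)
```

With a capacity of 40 the replacement line is kept whole and only the
trailing "processed ..." statistics line is cut short.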

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
14032f2644 libbpf: Simplify bpf_core_parse_spec() signature
Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as
an input instead of requiring callers to decompose them into type_id,
relo, spec_str, etc. This makes using and reusing this helper easier.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
b58af63aab libbpf: Refactor CO-RE relo human description formatting routine
Refactor how CO-RE relocation is formatted. Now it dumps human-readable
representation, currently used by libbpf in either debug or error
message output during CO-RE relocation resolution process, into provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in next patch
for providing better error message for BPF verifier rejecting BPF
program due to unguarded failed CO-RE relocation.

It also gets rid of annoying "stitching" of libbpf_print() calls, which
was the only place where we did this.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
185cfe837f libbpf: Record subprog-resolved CO-RE relocations unconditionally
Previously, libbpf recorded CO-RE relocations with insns_idx resolved
according to finalized subprog locations (which are appended at the end
of entry BPF program) to simplify the job of light skeleton generator.

This is necessary because once subprogs' instructions are appended to
main entry BPF program all the subprog instruction indices are shifted
and that shift is different for each entry (main) BPF program, so it's
generally impossible to map final absolute insn_idx of the finalized BPF
program to their original locations inside subprograms.

This information is now going to be used not only during light skeleton
generation, but also to map absolute instruction index to subprog's
instruction and its corresponding CO-RE relocation. So start recording
these relocations always, not just when obj->gen_loader is set.
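
To illustrate why the shift differs per entry program, here is a small
Python model. The layout tuples and locate_insn() are hypothetical and
only mimic the idea; they are not libbpf data structures.

```python
def locate_insn(abs_idx, layout):
    """Map an absolute insn index to (program name, index inside that program).

    layout is a list of (name, start_offset, length) tuples describing
    where the entry program and each appended subprog ended up.
    """
    for name, start, length in layout:
        if start <= abs_idx < start + length:
            return name, abs_idx - start
    return None


# The same 5-insn subprog appended to two entry programs of different size.
layout_a = [("main_a", 0, 10), ("sub", 10, 5)]
layout_b = [("main_b", 0, 30), ("sub", 30, 5)]

print(locate_insn(12, layout_a))  # subprog insn 2, shifted by 10
print(locate_insn(32, layout_b))  # the same subprog insn 2, shifted by 30
```

Without the per-entry-program layout, absolute index 12 is ambiguous:
it is inside the subprog for main_a but inside the main body for main_b.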

This information is going to be freed at the end of bpf_object__load()
step, as before (but this can change in the future if there is
a need for this information post load step).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
b82bb1ffbb selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
Enhance linked_funcs selftest with two tricky features that might not
obviously work correctly together. We add CO-RE relocations to entry BPF
programs and mark those programs as non-autoloadable with SEC("?...")
annotation. This makes sure that libbpf itself handles .BTF.ext CO-RE
relocation data matching correctly for SEC("?...") programs, as well as
ensures that BPF static linker handles this correctly (this was the case
before, no changes are necessary, but it wasn't explicitly tested).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-6-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
11d5daa892 libbpf: Avoid joining .BTF.ext data with BPF programs by section name
Instead of using ELF section names as a joining key between .BTF.ext and
corresponding BPF programs, pre-build .BTF.ext section number to ELF
section index mapping during bpf_object__open() and use it later for
matching .BTF.ext information (func/line info or CO-RE relocations) to
their respective BPF programs and subprograms.

This simplifies corresponding joining logic and lets libbpf do
manipulations with BPF program's ELF sections like dropping leading '?'
character for non-autoloaded programs. Original joining logic in
bpf_object__relocate_core() (see relevant comment that's now removed)
was never elegant, so it's a good improvement regardless. But it also
avoids unnecessary internal assumptions about preserving original ELF
section name as BPF program's section name (which was broken when
SEC("?abc") support was added).

Fixes: a3820c4811 ("libbpf: Support opting out from autoloading BPF programs declaratively")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
966a750932 libbpf: Fix logic for finding matching program for CO-RE relocation
Fix the bug in bpf_object__relocate_core() which can lead to finding
invalid matching BPF program when processing CO-RE relocation. If
matching program is not found, last encountered program will be assumed
to be correct program and thus error detection won't detect the problem.
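
The general shape of this kind of bug can be shown with a short Python
analogy. The dictionaries and both function names are hypothetical; the
real code is a C loop in bpf_object__relocate_core().

```python
def find_prog_buggy(progs, sec_name):
    """Search loop whose cursor variable leaks out when nothing matches."""
    prog = None
    for prog in progs:
        if prog["sec_name"] == sec_name:
            break
    return prog  # BUG: still points at the last program if no match was found


def find_prog_fixed(progs, sec_name):
    """Return the matching program, or None so the caller can report an error."""
    for prog in progs:
        if prog["sec_name"] == sec_name:
            return prog
    return None


progs = [
    {"name": "a", "sec_name": "kprobe/x"},
    {"name": "b", "sec_name": "tp/y"},
]
```

With the buggy variant a lookup for a missing section silently returns
program "b", so a later "not found" check never fires.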

Fixes: 9c82a63cf3 ("libbpf: Fix CO-RE relocs against .text section")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
0994a54c52 libbpf: Drop unhelpful "program too large" guess
libbpf pretends it knows actual limit of BPF program instructions based
on UAPI headers it compiled with. There is no guarantee that
UAPI headers match host kernel, nor does BPF verifier actually use
BPF_MAXINSNS constant anymore. Just drop unhelpful "guess", BPF verifier
will emit actual reason for failure in its logs anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org
2022-04-26 15:41:45 -07:00
Andrii Nakryiko
afe98d46ba libbpf: Fix anonymous type check in CO-RE logic
Use type name for checking whether CO-RE relocation is referring to
anonymous type. Using spec string makes no sense.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org
2022-04-26 15:41:45 -07:00
Menglong Dong
c317ab71fa bpf: Compute map_btf_id during build time
For now, the field 'map_btf_id' in 'struct bpf_map_ops' for all map
types is computed during vmlinux-btf init:

  btf_parse_vmlinux() -> btf_vmlinux_map_ids_init()

It will look up the btf_type according to the 'map_btf_name' field in
'struct bpf_map_ops'. This process can be done during build time,
thanks to Jiri's resolve_btfids.

selftest of map_ptr has passed:

  $96 map_ptr:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-04-26 11:35:21 -07:00