LoongArch architecture changes for 6.7 (BPF CPU v4 support) depend on
the bpf changes to fix conflictions in selftests and work, so merge them
to create a base.
- Untangle the initialization and updates of passive and active trip
points in the ACPI thermal driver (Rafael Wysocki).
- Reduce code duplication related to the initialization and updates
of trip points in the ACPI thermal driver (Rafael Wysocki).
- Use trip pointers for cooling device binding in the ACPI thermal
driver (Rafael Wysocki).
- Simplify critical and hot trips representation in the ACPI thermal
driver (Rafael Wysocki).
- Use trip pointers in thermal governors and in the related part of
the thermal core (Rafael Wysocki).
- Drop the trips_disabled bitmask that has become redundant from the
thermal core (Rafael Wysocki).
- Avoid updating trip points when the thermal zone temperature falls
into a trip point's hysteresis range (ícolas F. R. A. Prado).
- Add power floor notifications support to the int340x thermal control
driver (Srinivas Pandruvada).
- Rework updating trip points in the int340x thermal driver so that it
does not access thermal zone internals directly (Rafael Wysocki).
- Use param_get_byte() instead of param_get_int() as the max_idle module
parameter .get() callback in the Intel powerclamp thermal driver to
avoid possible out-of-bounds access (David Arcari).
- Add workload hints support to the int340x thermal driver (Srinivas
Pandruvada).
- Add support for Mediatek LVTS MT8192 along with suspend/resume
routines (Balsam Chihi).
- Fix probe for THERMAL_V2 in the Mediatek LVTS driver (Markus
Schneider-Pargmann).
- Remove duplicate error message from the max76620 driver when
thermal_of_zone_register() fails (Thierry Reding).
- Add i.MX7D compatible bindings to fix a warning from dtbs_check for
the imx6ul platform (Alexander Stein).
- Add sa8775p compatible to the QCom tsens driver (Priyansh Jain).
- Fix error check in lvts_debugfs_init() to be against PTR_ERR() in the
LVTS Mediatek driver (Minjie Du).
- Remove unused variable in thermal/tools (Kuan-Wei Chiu).
- Document the imx8dl thermal sensor (Fabio Estevam).
- Add variable names in callback prototypes to prevent warning from
checkpatch.pl in the imx8mm driver (Bragatheswaran Manickavel).
- Add missing unevaluatedProperties on child node schemas for tegra124
(Rob Herring)
- Add mt7988 support to the Mediatek LVTS driver (Frank Wunderlich).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmU6a+sSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx9s8P/A/u3d8jn5Dj7B4N3Y4uTL1IG3aleGnZ
EDxoiW6ju723s9ZMGsJg6qzXoksJDjleUTGtdyGYgbPyFz44/6/bcYqI6H0GsHlH
lah5jAgXEaJdbc9CwwhjBoVEN6wO5ATyDIJ1H+iU0x/6svZpXN+tlQp2dAFacAxD
CyZkFrBrSrEercO59NxCTyA9n9hfAfHdX7hBd5e686SXtcR+wdLyLfrPiEoE4oYq
vW9QkzTSK+nILrLTSGnrmtYvsOzbyRKqoIPrRhqSwgobvDHvYgkokdESYGzu/d3F
k922eHV8C4RyFZmVT9o+dVaK27gZ3KQI3d8BfpitMYD/Tb0N4axmwkZ44rrA8OOy
kpTFENvlICyUpfcTqwLYIZT9WK7rXKpxqjLeKMZvzrUWBoNm8tFsuJy3Hxd0IP6V
F2X62UH/sA4QAtJHD0VP5mc7FRbuFBdpyQ6Tq9ZNUQGoOeFT6VxQaqzRzqWwV6Na
lxxvX3FST4lRguNRg9NW3RrOxQOyLaQP1TIN3CaZ0rOs2dlGz8v2UN252uM/K1Y0
JO0rXS2HoXdUqWWaIMnVvNqOsaYeMGhVfCNySEn1DoChpInAZF50Z0/fL5Vpb51L
cl0VrNxReHwDHeuRtKcFgH1S8YrsehrRAZp3bDArmD/XszJQR+Rv5N6BSdURDLSU
1mkfn37l8fKc
=PSaG
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These further rework the ACPI thermal driver, after the changes made
to it in the previous cycle, to make it easier to grasp, get rid of
redundant pieces of internal data structures and eliminate its
reliance on a specific ordering of trip point objects in the thermal
core, make thermal core adjustments needed for the ACPI thermal driver
rework, modify the thermal governor interface so as to use trip
pointers for representing trip points in it, switch over multiple
thermal drivers to using void platform driver remove callbacks, add
support for 2 hardware features to the Intel int340x thermal driver,
add support for new hardware on ARM platforms, update documentation,
fix problems, clean up code and update the MAINTAINERS record for
thermal control.
Specifics:
- Untangle the initialization and updates of passive and active trip
points in the ACPI thermal driver (Rafael Wysocki)
- Reduce code duplication related to the initialization and updates
of trip points in the ACPI thermal driver (Rafael Wysocki)
- Use trip pointers for cooling device binding in the ACPI thermal
driver (Rafael Wysocki)
- Simplify critical and hot trips representation in the ACPI thermal
driver (Rafael Wysocki)
- Use trip pointers in thermal governors and in the related part of
the thermal core (Rafael Wysocki)
- Drop the trips_disabled bitmask that has become redundant from the
thermal core (Rafael Wysocki)
- Avoid updating trip points when the thermal zone temperature falls
into a trip point's hysteresis range (ícolas F. R. A. Prado)
- Add power floor notifications support to the int340x thermal
control driver (Srinivas Pandruvada)
- Rework updating trip points in the int340x thermal driver so that
it does not access thermal zone internals directly (Rafael
Wysocki)
- Use param_get_byte() instead of param_get_int() as the max_idle
module parameter .get() callback in the Intel powerclamp thermal
driver to avoid possible out-of-bounds access (David Arcari)
- Add workload hints support to the int340x thermal driver (Srinivas
Pandruvada)
- Add support for Mediatek LVTS MT8192 along with suspend/resume
routines (Balsam Chihi)
- Fix probe for THERMAL_V2 in the Mediatek LVTS driver (Markus
Schneider-Pargmann)
- Remove duplicate error message from the max76620 driver when
thermal_of_zone_register() fails (Thierry Reding)
- Add i.MX7D compatible bindings to fix a warning from dtbs_check for
the imx6ul platform (Alexander Stein)
- Add sa8775p compatible to the QCom tsens driver (Priyansh Jain)
- Fix error check in lvts_debugfs_init() to be against PTR_ERR() in
the LVTS Mediatek driver (Minjie Du)
- Remove unused variable in thermal/tools (Kuan-Wei Chiu)
- Document the imx8dl thermal sensor (Fabio Estevam)
- Add variable names in callback prototypes to prevent warning from
checkpatch.pl in the imx8mm driver (Bragatheswaran Manickavel)
- Add missing unevaluatedProperties on child node schemas for
tegra124 (Rob Herring)
- Add mt7988 support to the Mediatek LVTS driver (Frank Wunderlich)"
* tag 'thermal-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (111 commits)
thermal: ACPI: Include the right header file
thermal: core: Don't update trip points inside the hysteresis range
thermal: core: Pass trip pointer to governor throttle callback
thermal: gov_step_wise: Fold update_passive_instance() into its caller
thermal: gov_power_allocator: Use trip pointers instead of trip indices
thermal: gov_fair_share: Rearrange get_trip_level()
thermal: trip: Define for_each_trip() macro
thermal: trip: Simplify computing trip indices
thermal/qcom/tsens: Drop ops_v0_1
thermal/drivers/mediatek/lvts_thermal: Update calibration data documentation
thermal/drivers/mediatek/lvts_thermal: Add mt8192 support
thermal/drivers/mediatek/lvts_thermal: Add suspend and resume
dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for mt8192
thermal/drivers/mediatek: Fix probe for THERMAL_V2
thermal/drivers/max77620: Remove duplicate error message
dt-bindings: timer: add imx7d compatible
dt-bindings: net: microchip: Allow nvmem-cell usage
dt-bindings: imx-thermal: Add #thermal-sensor-cells property
dt-bindings: thermal: tsens: Add sa8775p compatible
thermal/drivers/mediatek/lvts_thermal: Fix error check in lvts_debugfs_init()
...
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its guest
- Optimization for vSGI injection, opportunistically compressing MPIDR
to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZUFJRgAKCRCivnWIJHzd
FtgYAP9cMsc1Mhlw3jNQnTc6q0cbTulD/SoEDPUat1dXMqjs+gEAnskwQTrTX834
fgGQeCAyp7Gmar+KeP64H0xm8kPSpAw=
=R4M7
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.7
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its guest
- Optimization for vSGI injection, opportunistically compressing MPIDR
to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
Restricted CXL Host (RCH) Error Handling undoes the topology munging of
CXL 1.1 to enabled some AER recovery, and lands some base infrastructure
for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include
this long running series finally for v6.7.
Core & protocols
----------------
- Support usec resolution of TCP timestamps, enabled selectively by
a route attribute.
- Defer regular TCP ACK while processing socket backlog, try to send
a cumulative ACK at the end. Increase single TCP flow performance
on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
- The Fair Queuing (FQ) packet scheduler:
- add built-in 3 band prio / WRR scheduling
- support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
- improve inactive flow reporting
- optimize the layout of structures for better cache locality
- Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
replacement for the old MD5 option.
- Add more retransmission timeout (RTO) related statistics to TCP_INFO.
- Support sending fragmented skbs over vsock sockets.
- Make sure we send SIGPIPE for vsock sockets if socket was shutdown().
- Add sysctl for ignoring lower limit on lifetime in Router
Advertisement PIO, based on an in-progress IETF draft.
- Add sysctl to control activation of TCP ping-pong mode.
- Add sysctl to make connection timeout in MPTCP configurable.
- Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
limit the number of wakeups.
- Support netlink GET for MDB (multicast forwarding), allowing user
space to request a single MDB entry instead of dumping the entire
table.
- Support selective FDB flushing in the VXLAN tunnel driver.
- Allow limiting learned FDB entries in bridges, prevent OOM attacks.
- Allow controlling via configfs netconsole targets which were created
via the kernel cmdline at boot, rather than via configfs at runtime.
- Support multiple PTP timestamp event queue readers with different
filters.
- MCTP over I3C.
BPF
---
- Add new veth-like netdevice where BPF program defines the logic
of the xmit routine. It can operate in L3 and L2 mode.
- Support exceptions - allow asserting conditions which should
never be true but are hard for the verifier to infer.
With some extra flexibility around handling of the exit / failure.
https://lwn.net/Articles/938435/
- Add support for local per-cpu kptr, allow allocating and storing
per-cpu objects in maps. Access to those objects operates on
the value for the current CPU. This allows to deprecate local
one-off implementations of per-CPU storage like
BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
- Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
for systemd to re-implement the LogNamespace feature which allows
running multiple instances of systemd-journald to process the logs
of different services.
- Enable open-coded task_vma iteration, after maple tree conversion
made it hard to directly walk VMAs in tracing programs.
- Add open-coded task, css_task and css iterator support.
One of the use cases is customizable OOM victim selection via BPF.
- Allow source address selection with bpf_*_fib_lookup().
- Add ability to pin BPF timer to the current CPU.
- Prevent creation of infinite loops by combining tail calls and
fentry/fexit programs.
- Add missed stats for kprobes to retrieve the number of missed kprobe
executions and subsequent executions of BPF programs.
- Inherit system settings for CPU security mitigations.
- Add BPF v4 CPU instruction support for arm32 and s390x.
Changes to common code
----------------------
- overflow: add DEFINE_FLEX() for on-stack definition of structs
with flexible array members.
- Process doc update with more guidance for reviewers.
Driver API
----------
- Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
mutex in most places and remove a lot of smaller locks.
- Create a common DPLL configuration API. Allow configuring
and querying state of PLL circuits used for clock syntonization,
in network time distribution.
- Unify fragmented and full page allocation APIs in page pool code.
Let drivers be ignorant of PAGE_SIZE.
- Rework PHY state machine to avoid races with calls to phy_stop().
- Notify DSA drivers of MAC address changes on user ports, improve
correctness of offloads which depend on matching port MAC addresses.
- Allow antenna control on injected WiFi frames.
- Reduce the number of variants of napi_schedule().
- Simplify error handling when composing devlink health messages.
Misc
----
- A lot of KCSAN data race "fixes", from Eric.
- A lot of __counted_by() annotations, from Kees.
- A lot of strncpy -> strscpy and printf format fixes.
- Replace master/slave terminology with conduit/user in DSA drivers.
- Handful of KUnit tests for netdev and WiFi core.
Removed
-------
- AppleTalk COPS.
- AppleTalk ipddp.
- TI AR7 CPMAC Ethernet driver.
Drivers
-------
- Ethernet high-speed NICs:
- Intel (100G, ice, idpf):
- add a driver for the Intel E2000 IPUs
- make CRC/FCS stripping configurable
- cross-timestamping for E823 devices
- basic support for E830 devices
- use aux-bus for managing client drivers
- i40e: report firmware versions via devlink
- nVidia/Mellanox:
- support 4-port NICs
- increase max number of channels to 256
- optimize / parallelize SF creation flow
- Broadcom (bnxt):
- enhance NIC temperature reporting
- support PAM4 speeds and lane configuration
- Marvell OcteonTX2:
- PTP pulse-per-second output support
- enable hardware timestamping for VFs
- Solarflare/AMD:
- conntrack NAT offload and offload for tunnels
- Wangxun (ngbe/txgbe):
- expose HW statistics
- Pensando/AMD:
- support PCI level reset
- narrow down the condition under which skbs are linearized
- Netronome/Corigine (nfp):
- support CHACHA20-POLY1305 crypto in IPsec offload
- Ethernet NICs embedded, slower, virtual:
- Synopsys (stmmac):
- add Loongson-1 SoC support
- enable use of HW queues with no offload capabilities
- enable PPS input support on all 5 channels
- increase TX coalesce timer to 5ms
- RealTek USB (r8152): improve efficiency of Rx by using GRO frags
- xen: support SW packet timestamping
- add drivers for implementations based on TI's PRUSS (AM64x EVM)
- nVidia/Mellanox Ethernet datacenter switches:
- avoid poor HW resource use on Spectrum-4 by better block selection
for IPv6 multicast forwarding and ordering of blocks in ACL region
- Ethernet embedded switches:
- Microchip:
- support configuring the drive strength for EMI compliance
- ksz9477: partial ACL support
- ksz9477: HSR offload
- ksz9477: Wake on LAN
- Realtek:
- rtl8366rb: respect device tree config of the CPU port
- Ethernet PHYs:
- support Broadcom BCM5221 PHYs
- TI dp83867: support hardware LED blinking
- CAN:
- add support for Linux-PHY based CAN transceivers
- at91_can: clean up and use rx-offload helpers
- WiFi:
- MediaTek (mt76):
- new sub-driver for mt7925 USB/PCIe devices
- HW wireless <> Ethernet bridging in MT7988 chips
- mt7603/mt7628 stability improvements
- Qualcomm (ath12k):
- WCN7850:
- enable 320 MHz channels in 6 GHz band
- hardware rfkill support
- enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS
to make scan faster
- read board data variant name from SMBIOS
- QCN9274: mesh support
- RealTek (rtw89):
- TDMA-based multi-channel concurrency (MCC)
- Silicon Labs (wfx):
- Remain-On-Channel (ROC) support
- Bluetooth:
- ISO: many improvements for broadcast support
- mark BCM4378/BCM4387 as BROKEN_LE_CODED
- add support for QCA2066
- btmtksdio: enable Bluetooth wakeup from suspend
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmU8XsYACgkQMUZtbf5S
Irv19RAAnud/24OOF5XMEJkIcYlnfqximh4XO6PujRSYkSkOUJdZTF6iJPgf3pSP
YpwoHYbYKHYfeOf8+3bTNESiQNSnoVmvmvwiS6/7lZ3behHUrGLQzW9Htc3EZyWH
2h6QkDZ5OOjfg0bwYSfp3vXkmMH2k8WE9Y0NvCkhcohqZi13Rmp14RnyPmNb2d1V
yZRYDMSM133KqE6gnBr1Ct65IEvnKeGlCUN2mTGqOJgdn6DZMsyxvtt0y4rmN7Ab
41+CgPU5SfxfbYpW+Dl2HJpgfte3WrC57KC6AM0PAPJzPmQWgeB/m9mjz/apj6Bg
bhsEIo7FdvbCnQm3yWPhK2OgCAcSwLr8jfGMU+Q+W4VnL5SRRR3Rm0zjsze+kHNP
OfqJgxzl3DpvoJqVBy1h5FGcZt0XHwhksm4cTxWqIahsF+veY0ECBXbuBBQx9XTF
Y7INfI8ulg7wISJs+CJfIClYkgOibTw2u8taBS5ikbtgxNqp5D4QqODn7UefQap1
PR/IDYODF+zRgmMJLeBqSa6fij6BkfOEDiOWak5kggBoZdtbtmeKI6tzze06CNdW
lWv1WEhRufxnwK+IuWsEkjhiMbs2WGLvkJ5JbgQV9BfqHfIfiqBCrcWtT/WbQnGt
lmU46CXh1t/FZEqbmK9h+8vsIIfrcDl6jb5npEiKPRG00vDKRTM=
=46nS
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support usec resolution of TCP timestamps, enabled selectively by a
route attribute.
- Defer regular TCP ACK while processing socket backlog, try to send
a cumulative ACK at the end. Increase single TCP flow performance
on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
- The Fair Queuing (FQ) packet scheduler:
- add built-in 3 band prio / WRR scheduling
- support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
- improve inactive flow reporting
- optimize the layout of structures for better cache locality
- Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
replacement for the old MD5 option.
- Add more retransmission timeout (RTO) related statistics to
TCP_INFO.
- Support sending fragmented skbs over vsock sockets.
- Make sure we send SIGPIPE for vsock sockets if socket was
shutdown().
- Add sysctl for ignoring lower limit on lifetime in Router
Advertisement PIO, based on an in-progress IETF draft.
- Add sysctl to control activation of TCP ping-pong mode.
- Add sysctl to make connection timeout in MPTCP configurable.
- Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
limit the number of wakeups.
- Support netlink GET for MDB (multicast forwarding), allowing user
space to request a single MDB entry instead of dumping the entire
table.
- Support selective FDB flushing in the VXLAN tunnel driver.
- Allow limiting learned FDB entries in bridges, prevent OOM attacks.
- Allow controlling via configfs netconsole targets which were
created via the kernel cmdline at boot, rather than via configfs at
runtime.
- Support multiple PTP timestamp event queue readers with different
filters.
- MCTP over I3C.
BPF:
- Add new veth-like netdevice where BPF program defines the logic of
the xmit routine. It can operate in L3 and L2 mode.
- Support exceptions - allow asserting conditions which should never
be true but are hard for the verifier to infer. With some extra
flexibility around handling of the exit / failure:
https://lwn.net/Articles/938435/
- Add support for local per-cpu kptr, allow allocating and storing
per-cpu objects in maps. Access to those objects operates on the
value for the current CPU.
This allows to deprecate local one-off implementations of per-CPU
storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
- Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
for systemd to re-implement the LogNamespace feature which allows
running multiple instances of systemd-journald to process the logs
of different services.
- Enable open-coded task_vma iteration, after maple tree conversion
made it hard to directly walk VMAs in tracing programs.
- Add open-coded task, css_task and css iterator support. One of the
use cases is customizable OOM victim selection via BPF.
- Allow source address selection with bpf_*_fib_lookup().
- Add ability to pin BPF timer to the current CPU.
- Prevent creation of infinite loops by combining tail calls and
fentry/fexit programs.
- Add missed stats for kprobes to retrieve the number of missed
kprobe executions and subsequent executions of BPF programs.
- Inherit system settings for CPU security mitigations.
- Add BPF v4 CPU instruction support for arm32 and s390x.
Changes to common code:
- overflow: add DEFINE_FLEX() for on-stack definition of structs with
flexible array members.
- Process doc update with more guidance for reviewers.
Driver API:
- Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
mutex in most places and remove a lot of smaller locks.
- Create a common DPLL configuration API. Allow configuring and
querying state of PLL circuits used for clock syntonization, in
network time distribution.
- Unify fragmented and full page allocation APIs in page pool code.
Let drivers be ignorant of PAGE_SIZE.
- Rework PHY state machine to avoid races with calls to phy_stop().
- Notify DSA drivers of MAC address changes on user ports, improve
correctness of offloads which depend on matching port MAC
addresses.
- Allow antenna control on injected WiFi frames.
- Reduce the number of variants of napi_schedule().
- Simplify error handling when composing devlink health messages.
Misc:
- A lot of KCSAN data race "fixes", from Eric.
- A lot of __counted_by() annotations, from Kees.
- A lot of strncpy -> strscpy and printf format fixes.
- Replace master/slave terminology with conduit/user in DSA drivers.
- Handful of KUnit tests for netdev and WiFi core.
Removed:
- AppleTalk COPS.
- AppleTalk ipddp.
- TI AR7 CPMAC Ethernet driver.
Drivers:
- Ethernet high-speed NICs:
- Intel (100G, ice, idpf):
- add a driver for the Intel E2000 IPUs
- make CRC/FCS stripping configurable
- cross-timestamping for E823 devices
- basic support for E830 devices
- use aux-bus for managing client drivers
- i40e: report firmware versions via devlink
- nVidia/Mellanox:
- support 4-port NICs
- increase max number of channels to 256
- optimize / parallelize SF creation flow
- Broadcom (bnxt):
- enhance NIC temperature reporting
- support PAM4 speeds and lane configuration
- Marvell OcteonTX2:
- PTP pulse-per-second output support
- enable hardware timestamping for VFs
- Solarflare/AMD:
- conntrack NAT offload and offload for tunnels
- Wangxun (ngbe/txgbe):
- expose HW statistics
- Pensando/AMD:
- support PCI level reset
- narrow down the condition under which skbs are linearized
- Netronome/Corigine (nfp):
- support CHACHA20-POLY1305 crypto in IPsec offload
- Ethernet NICs embedded, slower, virtual:
- Synopsys (stmmac):
- add Loongson-1 SoC support
- enable use of HW queues with no offload capabilities
- enable PPS input support on all 5 channels
- increase TX coalesce timer to 5ms
- RealTek USB (r8152): improve efficiency of Rx by using GRO frags
- xen: support SW packet timestamping
- add drivers for implementations based on TI's PRUSS (AM64x EVM)
- nVidia/Mellanox Ethernet datacenter switches:
- avoid poor HW resource use on Spectrum-4 by better block
selection for IPv6 multicast forwarding and ordering of blocks
in ACL region
- Ethernet embedded switches:
- Microchip:
- support configuring the drive strength for EMI compliance
- ksz9477: partial ACL support
- ksz9477: HSR offload
- ksz9477: Wake on LAN
- Realtek:
- rtl8366rb: respect device tree config of the CPU port
- Ethernet PHYs:
- support Broadcom BCM5221 PHYs
- TI dp83867: support hardware LED blinking
- CAN:
- add support for Linux-PHY based CAN transceivers
- at91_can: clean up and use rx-offload helpers
- WiFi:
- MediaTek (mt76):
- new sub-driver for mt7925 USB/PCIe devices
- HW wireless <> Ethernet bridging in MT7988 chips
- mt7603/mt7628 stability improvements
- Qualcomm (ath12k):
- WCN7850:
- enable 320 MHz channels in 6 GHz band
- hardware rfkill support
- enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
make scan faster
- read board data variant name from SMBIOS
- QCN9274: mesh support
- RealTek (rtw89):
- TDMA-based multi-channel concurrency (MCC)
- Silicon Labs (wfx):
- Remain-On-Channel (ROC) support
- Bluetooth:
- ISO: many improvements for broadcast support
- mark BCM4378/BCM4387 as BROKEN_LE_CODED
- add support for QCA2066
- btmtksdio: enable Bluetooth wakeup from suspend"
* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
net: mana: Use xdp_set_features_flag instead of direct assignment
vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
iavf: delete the iavf client interface
iavf: add a common function for undoing the interrupt scheme
iavf: use unregister_netdev
iavf: rely on netdev's own registered state
iavf: fix the waiting time for initial reset
iavf: in iavf_down, don't queue watchdog_task if comms failed
iavf: simplify mutex_trylock+sleep loops
iavf: fix comments about old bit locks
doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
tools: ynl: introduce option to process unknown attributes or types
ipvlan: properly track tx_errors
netdevsim: Block until all devices are released
nfp: using napi_build_skb() to replace build_skb()
net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
...
- Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
- Add IBPB and SBPB virtualization support.
- Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
- Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
- Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
- Use octal notation for file permissions through KVM x86.
- Fix a handful of typo fixes and warts.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmU8EugSHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5xS0P+gPTDO81CUZO70LrO2W4E7toRBf/F9x1
/v5D/76p9hG32Z6+BJs/xxDxJFagw75MtoR5oKivtXiip3TxbfOyDOlaQkIRo85E
/d95il/LRidL3Mv3TXRj1lykXnxSSz9tigAGEZti1Y9Fn9fXEIwurJH7dU5cBI1E
fin5bsDaTNRjG4jjTiEUbnKPRTlD/S7CQJn4CaYvZhMv/eJkYDLyBBVy4VLoLzvD
ctL6VJQLGPVxbxr9mEmulaqMrSuDIQQLkRVQJAViKyerBInTEc5d/GPCHuE8O3zi
0r/QSJbMS9titWLz07NhJ1UH4VJNyaEhRlyJPSFhBW4h6dzUb3EXdUe0Hwa+JH/S
H2cVqsANItTCIhvDtuEGIRDahu0eD+63h90InJ0gEVL1kSJS+UWZHB71PkUEQgAV
2OsuT1D26fuxrv+0b9ioBZURycqKw++zGsrwyVhe77eBgqBJ12tbL4TAD+QNjaQ5
HZTCe6YV83gZoOMeVkoTGSf96s9lGORgxsaAIXmFuLB9RVCVXhVh0ph2HZsnV8Hw
ZXEXpBEFo7GUhb0NIvsk2W73QL87A3fLv15yITWc8KuC7/dXP9z6KpSKjFySS69X
uWD1MVx6shhvbg97UzoJlXc3/z0aVzmdZJudE5d0gcFvAjIItqp6ICPOoKxfj8pT
tqRZu3kVHd61
=sfp8
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-misc-6.7' of https://github.com/kvm-x86/linux into HEAD
KVM x86 misc changes for 6.7:
- Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
- Add IBPB and SBPB virtualization support.
- Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
- Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
- Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
- Use octal notation for file permissions through KVM x86.
- Fix a handful of typo fixes and warts.
- Smstateen and Zicond support for Guest/VM
- Virtualized senvcfg CSR for Guest/VM
- Added Smstateen registers to the get-reg-list selftests
- Added Zicond to the get-reg-list selftests
- Virtualized SBI debug console (DBCN) for Guest/VM
- Added SBI debug console (DBCN) to the get-reg-list selftests
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmU3dKgACgkQrUjsVaLH
LAeh6g//RVlm3rLLwnWes3PwnBZDtkXYVVLy8oCEPjjjZGMJ+piwV8kcq4QpNear
e1nrOJTnndkjPG9TdJIxFW0qGnPLe4D9Ag6NoFpJXwxZRy/Glr8iWbt3pOcrv6vu
gk/9RP5gQpJpcP7D2scR+go6ryEzncCNY5GLtXoyFM9drQk+IROSrMWrWI5CXBJi
c0qwQWsSWb3KTBTZ7Gsm0Y7E9WIi5u6bBxcLJvblUKszS+PpXO9IA+8qOGZS4Cts
ocy5qqu+NNPECZjSa+kZti8ZSbdjNw9ORyB4OHlBUt55Uidp72pPFdTjGvSvs/jq
6wF70i1qjZOtjVUNdlkF93EN1xu4f3Hus/6/Q1pcyq+k9vo7eeJQQRWR4bJZ9SJR
ZkhYfOgqMUzMnrf2M20en9DygRDFstL2XW5mGDPmk9XLSNv1KwpKf96Xl2E20uPb
21EVQSiybmpl80WkWs0txwk62qQYzYUx+Io8XKXl8MrqNN2HCnzUUy/fjy/kd7iM
Oe1a8kNcV/eawpURBNVCcpld9qM1OPCmseqrr9tpvdR4SmAo1mAi/d7/I61umwpE
a36fqmwIZ0ppMs86BOcMRcTaee6EjPeldXnRCTjCLqo6YvsdHRrwpNBHRMbUOKpy
0688Xf+vjmu81IXhfaRTcs5dvKe1NI4puRtG5DUt7n1csSUyNWI=
=TFKQ
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-6.7-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.7
- Smstateen and Zicond support for Guest/VM
- Virtualized senvcfg CSR for Guest/VM
- Added Smstateen registers to the get-reg-list selftests
- Added Zicond to the get-reg-list selftests
- Virtualized SBI debug console (DBCN) for Guest/VM
- Added SBI debug console (DBCN) to the get-reg-list selftests
More updates for v6,7 following the early merge request:
- Fixes for handling of component name prefixing when name prefixes
are used by the machine driver.
- Fixes for noise when stopping some Sounwire CODECs.
- Support for AMD ACP 6.3 and 7.0, Awinc AW88399, more Intel
platforms and more Qualcomm SC7180 platforms.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmU/rAQACgkQJNaLcl1U
h9AXVwf/SwWrxTus3+O2hS5rwusjqQBn1t2mzlnxsYNVCYGBcjOpMGL4HrJzn++e
DJXoMyWis5FNKFWyPtKMGE1kZYdUUE/g7LpZOBew4P47nBv6SQWRvUxPfoq8mdOg
Xb4+kjBzTq1dhwZnZPvNdsknvM7cLfx/lyo2vUJR4peDL0rzgnx72fhRZAjzh2OH
CSz69aTjpliqikp+V7JVFYf2yma2LjTOCL2saiIF/PcxsqUUa73XTggg610EPY+R
pOFb2MBRDbZuJrETbaiytLwtcPuvrpiHRHhxuClsjHGVTDbGfE7GzePS+yUu66y9
LO8oAl7kJebw+WWffIOoL2IjXcG9tA==
=nn11
-----END PGP SIGNATURE-----
Merge tag 'asoc-v6.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v6.7
More updates for v6,7 following the early merge request:
- Fixes for handling of component name prefixing when name prefixes
are used by the machine driver.
- Fixes for noise when stopping some Sounwire CODECs.
- Support for AMD ACP 6.3 and 7.0, Awinc AW88399, more Intel
platforms and more Qualcomm SC7180 platforms.
* cpuset now supports remote partitions where CPUs can be reserved for
exclusive use down the tree without requiring all the intermediate nodes
to be partitions. This makes it easier to use partitions without modifying
existing cgroup hierarchy.
* cpuset partition configuration behavior improvement.
* cgroup_favordynmods= boot param added to allow setting the flag on boot on
cgroup1.
* Misc code and doc updates.
-----BEGIN PGP SIGNATURE-----
iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZUBUKA4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGWfMAP9WP+Z21qzzL2bY5I5kOxu+rD2fF9ORk7azILrI
c3gXSQD/bdxNWcdtrCQMvs+ToNEJvqDFxrmgG5uRUF8Ea+FCtQ8=
=gZj2
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- cpuset now supports remote partitions where CPUs can be reserved for
exclusive use down the tree without requiring all the intermediate
nodes to be partitions. This makes it easier to use partitions
without modifying existing cgroup hierarchy.
- cpuset partition configuration behavior improvement
- cgroup_favordynmods= boot param added to allow setting the flag on
boot on cgroup1
- Misc code and doc updates
* tag 'cgroup-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
docs/cgroup: Add the list of threaded controllers to cgroup-v2.rst
cgroup: use legacy_name for cgroup v1 disable info
cgroup/cpuset: Cleanup signedness issue in cpu_exclusive_check()
cgroup/cpuset: Enable invalid to valid local partition transition
cgroup: add cgroup_favordynmods= command-line option
cgroup/cpuset: Extend test_cpuset_prs.sh to test remote partition
cgroup/cpuset: Documentation update for partition
cgroup/cpuset: Check partition conflict with housekeeping setup
cgroup/cpuset: Introduce remote partition
cgroup/cpuset: Add cpuset.cpus.exclusive for v2
cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2
cgroup/cpuset: Fix load balance state in update_partition_sd_lb()
cgroup: Avoid extra dereference in css_populate_dir()
cgroup: Check for ret during cgroup1_base_files cft addition
- Add LKDTM test for stuck CPUs (Mark Rutland)
- Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)
- Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva)
- Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh)
- Convert group_info.usage to refcount_t (Elena Reshetova)
- Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)
- Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn)
- Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook)
- Fix strtomem() compile-time check for small sources (Kees Cook)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmU/3cUWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsEoEACBGPSiOmfSWdH3TOnIG270PD24
jGjg8KFv7RC/JTOdYmpLl0okdlGT9LvjN/ToSSDEw3PIayxoXUdhkbYy0MYtiV3m
yz2ozDTzJuplQX/W2fPE+nXSzIwHao2zjPPFjHnT7lt8IIjhgjiOtLfZ2gGUkW99
Mdu2aWh3u0r4tC8OS23++yN5ibRc5l72efsjDWjZ0aPXnxE1bjmLMiIPiizpndIf
beasPuDBs98sJVYouemCwnsPXuXOPz3Q1Cpo/fTd+TMTJCLSemCQZCTuOBU0acI/
ZjLCgCaJU1yIYKBMtrIN4G9kITZniXX3/Nm4o6NQMVlcCqMeNaHuflomqWoqWfhE
UPbRo2eghZOaMNiCKLLvZDIqPrh1IcsiEl6Ef3W4hICc42GTK96IuGisIvDXwQ4N
/SzTOupJuN42noh3z1M3XuZy5RoXJ99IYDNY5CTKf9IdqvA0bbGkU3nb1gZH/xw9
BjTqKzR/7K1kTXuSgagDZ1Wceej9pZxhX7E3IHYsP8ZOvKug3EeL4yybVwQ3HRfq
Qnzcp/qPB9cOkLSQXveRTFTsj2mX28Gixct/iDuc1jIYwGQlY1gI6dcUcqby6ptM
BrQti7eR2NH2+T3aE2UVCIWsZVhx7NaSF+z8JxfAuu56jicc4xJVsi8zrNveWX5M
m2VXyBl3121BVtKi4w==
=0iVF
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"One of the more voluminous set of changes is for adding the new
__counted_by annotation[1] to gain run-time bounds checking of
dynamically sized arrays with UBSan.
- Add LKDTM test for stuck CPUs (Mark Rutland)
- Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)
- Refactor more 1-element arrays into flexible arrays (Gustavo A. R.
Silva)
- Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem
Shaikh)
- Convert group_info.usage to refcount_t (Elena Reshetova)
- Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)
- Add Kconfig fragment for basic hardening options (Kees Cook, Lukas
Bulwahn)
- Fix randstruct GCC plugin performance mode to stay in groups (Kees
Cook)
- Fix strtomem() compile-time check for small sources (Kees Cook)"
* tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits)
hwmon: (acpi_power_meter) replace open-coded kmemdup_nul
reset: Annotate struct reset_control_array with __counted_by
kexec: Annotate struct crash_mem with __counted_by
virtio_console: Annotate struct port_buffer with __counted_by
ima: Add __counted_by for struct modsig and use struct_size()
MAINTAINERS: Include stackleak paths in hardening entry
string: Adjust strtomem() logic to allow for smaller sources
hardening: x86: drop reference to removed config AMD_IOMMU_V2
randstruct: Fix gcc-plugin performance mode to stay in group
mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by
drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by
irqchip/imx-intmux: Annotate struct intmux_data with __counted_by
KVM: Annotate struct kvm_irq_routing_table with __counted_by
virt: acrn: Annotate struct vm_memory_region_batch with __counted_by
hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by
sparc: Annotate struct cpuinfo_tree with __counted_by
isdn: kcapi: replace deprecated strncpy with strscpy_pad
isdn: replace deprecated strncpy with strscpy
NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by
nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by
...
This pull request contains the following branches:
rcu/torture: RCU torture, locktorture and generic torture infrastructure
updates that include various fixes, cleanups and consolidations.
Among the user visible things, ftrace dumps can now be found into
their own file, and module parameters get better documented and
reported on dumps.
rcu/fixes: Generic and misc fixes all over the place. Some highlights:
* Hotplug handling has seen some light cleanups and comments.
* An RCU barrier can now be triggered through sysfs to serialize
memory stress testing and avoid OOM.
* Object information is now dumped in case of invalid callback
invocation.
* Also various SRCU issues, too hard to trigger to deserve urgent
pull requests, have been fixed.
rcu/docs: RCU documentation updates
rcu/refscale: RCU reference scalability test minor fixes and doc
improvements.
rcu/tasks: RCU tasks minor fixes
rcu/stall: Stall detection updates. Introduce RCU CPU Stall notifiers
that allows a subsystem to provide informations to help debugging.
Also cure some false positive stalls.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmU21h0ACgkQhSRUR1CO
jHdUgA/+Myy5K5OxNrqlF/gIK+flOSg635RyZ0DBx8OMXZ/fAg9qRI+PKt5I4Lha
eXAg6EtmwSgHmIbjcg8WzsvwniEsqqjOF+n1qil447fHUI2Qqw6c7fIm/MXQkeHJ
qA7CODDRtsAnwnjmTteasmMeGV0bmXDENxhNrAZBFnVkRgTqfyDbFcn+nxOaPK6b
fmbKvnB07WUg1KOV8/MbEtAZPb8QgHo58bXSZRKjKkiqRQWB/D3On+tShFK7SYJi
wIqQ96MLyUXLaIWQ47v6xEO4PZO+3o1wAryvP1DRdb5UrPjO6yKFfQaoo5Mza92G
zhBJhnXkVvCoNoCU7GKJIDV54SgDHaB6Sf1GN5cjwfujOkLuGCyg0CpKktCGm7uH
n3X66PVep608Uj2Y/pAo/hv3Hbv7lCu4nfrERvVLG9YoxUvTJDsKmBv+SF/g2mxF
rHqFa39HUPr1yHA5WjqOQS3lLdqCXEGKvNi6zXCvOceiDbHbiJFkBo6p8TVrbSMX
FCOWZ3LoE+6uiLu/lLOEroTjeBd8GhDh1LgWgyVK7o0LhP1018DSBolrpcSwnmOo
Q/E4G2x+aPWs+5NTOmMGOIPY70khKQIM3c8YZelSRffJBo6O3yV68h6X45NQxYvx
keLvrDaza8h4hKwaof/QaX4ZJgTOZ0xjpawr1vR0hbK8LNtPrUw=
=cVD7
-----END PGP SIGNATURE-----
Merge tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks
Pull RCU updates from Frederic Weisbecker:
- RCU torture, locktorture and generic torture infrastructure updates
that include various fixes, cleanups and consolidations.
Among the user visible things, ftrace dumps can now be found into
their own file, and module parameters get better documented and
reported on dumps.
- Generic and misc fixes all over the place. Some highlights:
* Hotplug handling has seen some light cleanups and comments
* An RCU barrier can now be triggered through sysfs to serialize
memory stress testing and avoid OOM
* Object information is now dumped in case of invalid callback
invocation
* Also various SRCU issues, too hard to trigger to deserve urgent
pull requests, have been fixed
- RCU documentation updates
- RCU reference scalability test minor fixes and doc improvements.
- RCU tasks minor fixes
- Stall detection updates. Introduce RCU CPU Stall notifiers that
allows a subsystem to provide informations to help debugging. Also
cure some false positive stalls.
* tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (56 commits)
srcu: Only accelerate on enqueue time
locktorture: Check the correct variable for allocation failure
srcu: Fix callbacks acceleration mishandling
rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP
rcu: Standardize explicit CPU-hotplug calls
rcu: Conditionally build CPU-hotplug teardown callbacks
rcu: Remove references to rcu_migrate_callbacks() from diagrams
rcu: Assume rcu_report_dead() is always called locally
rcu: Assume IRQS disabled from rcu_report_dead()
rcu: Use rcu_segcblist_segempty() instead of open coding it
rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
srcu: Fix srcu_struct node grpmask overflow on 64-bit systems
torture: Convert parse-console.sh to mktemp
rcutorture: Traverse possible cpu to set maxcpu in rcu_nocb_toggle()
rcutorture: Replace schedule_timeout*() 1-jiffy waits with HZ/20
torture: Add kvm.sh --debug-info argument
locktorture: Rename readers_bind/writers_bind to bind_readers/bind_writers
doc: Catch-up update for locktorture module parameters
locktorture: Add call_rcu_chains module parameter
locktorture: Add new module parameters to lock_torture_print_module_parms()
...
o Add stdarg.h header and a few additional system-call upgrades.
o Add support for constructors and destructors.
o Add tests to verify the ability to link multiple .o files
against nolibc.
o Numerous string-function optimizations and improvements.
o Prevent redundant kernel relinks by avoiding embedding of
initramfs into the kernel image.
o Allow building i386 with multiarch compiler and make ppc64le
use qemu-system-ppc64.
o Miscellaneous fixups, including addition of -nostdinc for
nolibc-test, avoiding -Wstringop-overflow warnings, and avoiding
unused parameter warnings for ENOSYS fallbacks.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmU3A1ETHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jGyAEACRJX0s3JBvxRz5zKOA+l41sAN3DZ9z
ygKIghnlNeDXQEFSqkZXb9Pa8BUPaVnFet3X5gUy8/0jcbXPPjIHU68U6EGQWk3f
y5uTaxhTQkC+5gLyRhvq7FtjWxwYlg24D2e6ctrEw4pCt18PfkEhxof5PBhg/71K
UVrZ55cRvXG7CTLSm5p1+jNkAOJuNfV+zD32QuV9V+7CwNLU088TZS9jGALFjKC0
UyE8E5uvmTQ6QQOl64Z6GNhpQual/2BslIGDVtb/+/Ii5Ch2nA8gV2YiC8cVPzpz
r8yxqSEwfmiTNDPFH6PRIAx/optfgV/uScyyCNEiLwh/gFcag04BjC9GpWy5jKzA
akchr0n+7yfJTpzzNmM38OAoaqMgzcPedxW2RDP5Eeb4cw0AKoy7bD3WeBRfmpgl
tAgd8Gl7vpvSjecQSZfCY1hJ4F/qS2CfnObL4/EbHxIOfyLo0A6eEKLIf9PP02bT
w2YJkZVSprKi8CXvIaV5KAhxXUGp07FJ5PHYLFFjinez/e6ksb7AH/ltrnAULMoV
Ig3aQCYJ4oOFFVjH+h9+fFrqbI87Xo13UfO6PtkJU3gV749prqRIAg6FCTU0HkIz
TSvy/PEFAxaXvNSGsSXikxQxvx3Fph9laoxQ1dSgEZSXKY43v1sfnXp1h5vGDdKH
GPWaEtEBfNVUdg==
=UKBQ
-----END PGP SIGNATURE-----
Merge tag 'nolibc.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull nolibc updates from Paul McKenney:
- Add stdarg.h header and a few additional system-call upgrades
- Add support for constructors and destructors
- Add tests to verify the ability to link multiple .o files against
nolibc
- Numerous string-function optimizations and improvements
- Prevent redundant kernel relinks by avoiding embedding of initramfs
into the kernel image
- Allow building i386 with multiarch compiler and make ppc64le use
qemu-system-ppc64
- Miscellaneous fixups, including addition of -nostdinc for
nolibc-test, avoiding -Wstringop-overflow warnings, and avoiding
unused parameter warnings for ENOSYS fallbacks
* tag 'nolibc.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
selftests/nolibc: add tests for multi-object linkage
selftests/nolibc: use qemu-system-ppc64 for ppc64le
tools/nolibc: add support for constructors and destructors
tools/nolibc: drop test for getauxval(AT_PAGESZ)
tools/nolibc: automatically detect necessity to use pselect6
tools/nolibc: don't define new syscall number
tools/nolibc: avoid unused parameter warnings for ENOSYS fallbacks
selftests/nolibc: allow building i386 with multiarch compiler
selftests/nolibc: don't embed initramfs into kernel image
selftests/nolibc: libc-test: avoid -Wstringop-overflow warnings
tools/nolibc: string: Remove the `_nolibc_memcpy_up()` function
tools/nolibc: string: Remove the `_nolibc_memcpy_down()` function
tools/nolibc: x86-64: Use `rep stosb` for `memset()`
tools/nolibc: x86-64: Use `rep movsb` for `memcpy()` and `memmove()`
selftests/nolibc: use -nostdinc for nolibc-test
tools/nolibc: add stdarg.h header
- Add new NX-stack self-test
- Improve NUMA partial-CFMWS handling
- Fix #VC handler bugs resulting in SEV-SNP boot failures
- Drop the 4MB memory size restriction on minimal NUMA nodes
- Reorganize headers a bit, in preparation to header dependency reduction efforts
- Misc cleanups & fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU9Ek4RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gIJQ/+Mg6mzMaThyNXqhJszeZJBmDaBv2sqjAB
5tcferg1nJBdNBzX8bJ95UFt9fIqeYAcgH00qlQCYSmyzbC1TQTk9U2Pre1zbOw4
042ONK8sygKSje1zdYleHoBeqwnxD2VNM0NwBElhGjumwHRng/tbLiI9wx6qiz+C
VsFXavkBszHGA1pjy9wZLGixYIH5jCygMpH134Wp+CIhpS+C4nftcGdIL1D5Oil1
6Tm2XeI6uyfiQhm9IOwDjfoYeC7gUjx1rp8rHseGUMJxyO/BX9q5j1ixbsVriqfW
97ucYuRL9mza7ic516C9v7OlAA3AGH2xWV+SYOGK88i9Co4kYzP4WnamxXqOsD8+
popxG55oa6QelhaouTBZvgERpZ4fWupSDs/UccsDaE9leMCerNEbGHEzt/Mm/2sw
xopjMQ0y5Kn6/fS0dLv8U+XHu4ANkvXJkFd6Ny0h/WfgGefuQOOTG9ruYgfeqqB8
dViQ4R7CO8ySjD45KawAZl/EqL86x1M/CI1nlt0YY4vNwUuOJbebL7Jn8w3Fjxm5
FVfUlDmcPdhZfL9Vnrsi6MIou1cU1yJPw4D6sXJ4sg4s7A4ebBcRRrjayVQ4msjv
Q7cvBOMnWEHhOV11pvP50FmQuj74XW3bUqiuWrnK1SypvnhHavF6kc1XYpBLs1xZ
y8nueJW2qPw=
=tT5F
-----END PGP SIGNATURE-----
Merge tag 'x86-mm-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm handling updates from Ingo Molnar:
- Add new NX-stack self-test
- Improve NUMA partial-CFMWS handling
- Fix #VC handler bugs resulting in SEV-SNP boot failures
- Drop the 4MB memory size restriction on minimal NUMA nodes
- Reorganize headers a bit, in preparation to header dependency
reduction efforts
- Misc cleanups & fixes
* tag 'x86-mm-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
selftests/x86/lam: Zero out buffer for readlink()
x86/sev: Drop unneeded #include
x86/sev: Move sev_setup_arch() to mem_encrypt.c
x86/tdx: Replace deprecated strncpy() with strtomem_pad()
selftests/x86/mm: Add new test that userspace stack is in fact NX
x86/sev: Make boot_ghcb_page[] static
x86/boot: Move x86_cache_alignment initialization to correct spot
x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach
x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot
x86_64: Show CR4.PSE on auxiliaries like on BSP
x86/iommu/docs: Update AMD IOMMU specification document URL
x86/sev/docs: Update document URL in amd-memory-encryption.rst
x86/mm: Move arch_memory_failure() and arch_is_platform_page() definitions from <asm/processor.h> to <asm/pgtable.h>
ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window
x86/numa: Introduce numa_fill_memblks()
* kvm-arm64/pmu_pmcr_n:
: User-defined PMC limit, courtesy Raghavendra Rao Ananta
:
: Certain VMMs may want to reserve some PMCs for host use while running a
: KVM guest. This was a bit difficult before, as KVM advertised all
: supported counters to the guest. Userspace can now limit the number of
: advertised PMCs by writing to PMCR_EL0.N, as KVM's sysreg and PMU
: emulation enforce the specified limit for handling guest accesses.
KVM: selftests: aarch64: vPMU test for validating user accesses
KVM: selftests: aarch64: vPMU register test for unimplemented counters
KVM: selftests: aarch64: vPMU register test for implemented counters
KVM: selftests: aarch64: Introduce vpmu_counter_access test
tools: Import arm_pmuv3.h
KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
KVM: arm64: PMU: Introduce helpers to set the guest's PMU
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* kvm-arm64/writable-id-regs:
: Writable ID registers, courtesy of Jing Zhang
:
: This series significantly expands the architectural feature set that
: userspace can manipulate via the ID registers. A new ioctl is defined
: that makes the mutable fields in the ID registers discoverable to
: userspace.
KVM: selftests: Avoid using forced target for generating arm64 headers
tools headers arm64: Fix references to top srcdir in Makefile
KVM: arm64: selftests: Test for setting ID register from usersapce
tools headers arm64: Update sysreg.h with kernel sources
KVM: selftests: Generate sysreg-defs.h and add to include path
perf build: Generate arm64's sysreg-defs.h and add to include path
tools: arm64: Add a Makefile for generating sysreg-defs.h
KVM: arm64: Document vCPU feature selection UAPIs
KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1
KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1
KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1
KVM: arm64: Bump up the default KVM sanitised debug version to v8p8
KVM: arm64: Reject attempts to set invalid debug arch version
KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version
KVM: arm64: Use guest ID register values for the sake of emulation
KVM: arm64: Document KVM_ARM_GET_REG_WRITABLE_MASKS
KVM: arm64: Allow userspace to get the writable masks for feature ID registers
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The 'prepare' target that generates the arm64 sysreg headers had no
prerequisites, so it wound up forcing a rebuild of all KVM selftests
each invocation. Add a rule for the generated headers and just have
dependents use that for a prerequisite.
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: 9697d84cc3 ("KVM: selftests: Generate sysreg-defs.h and add to include path")
Tested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Link: https://lore.kernel.org/r/20231027005439.3142015-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* kvm-arm64/misc:
: Miscellaneous updates
:
: - Put an upper bound on the number of I-cache invalidations by
: cacheline to avoid soft lockups
:
: - Get rid of bogus refererence count transfer for THP mappings
:
: - Do a local TLB invalidation on permission fault race
:
: - Fixes for page_fault_test KVM selftest
:
: - Add a tracepoint for detecting MMIO instructions unsupported by KVM
KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
KVM: arm64: selftest: Perform ISB before reading PAR_EL1
KVM: arm64: selftest: Add the missing .guest_prepare()
KVM: arm64: Always invalidate TLB for stage-2 permission faults
KVM: arm64: Do not transfer page refcount for THP adjustment
KVM: arm64: Avoid soft lockups due to I-cache maintenance
arm64: tlbflush: Rename MAX_TLBI_OPS
KVM: arm64: Don't use kerneldoc comment for arm64_check_features()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
It looks like a mistake to issue ISB *after* reading PAR_EL1, we should
instead perform it between the AT instruction and the reads of PAR_EL1.
As according to DDI0487J.a IJTYVP,
"When an address translation instruction is executed, explicit
synchronization is required to guarantee the result is visible to
subsequent direct reads of PAR_EL1."
Otherwise all guest_at testcases fail on my box with
==== Test Assertion Failure ====
aarch64/page_fault_test.c:142: par & 1 == 0
pid=1355864 tid=1355864 errno=4 - Interrupted system call
1 0x0000000000402853: vcpu_run_loop at page_fault_test.c:681
2 0x0000000000402cdb: run_test at page_fault_test.c:730
3 0x0000000000403897: for_each_guest_mode at guest_modes.c:100
4 0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1105
5 (inlined by) main at page_fault_test.c:1131
6 0x0000ffffb153c03b: ?? ??:0
7 0x0000ffffb153c113: ?? ??:0
8 0x0000000000401aaf: _start at ??:?
0x1 != 0x0 (par & 1 != 0)
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-2-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Running page_fault_test on a Cortex A72 fails with
Test: ro_memslot_no_syndrome_guest_cas
Testing guest mode: PA-bits:40, VA-bits:48, 4K pages
Testing memory backing src type: anonymous
==== Test Assertion Failure ====
aarch64/page_fault_test.c:117: guest_check_lse()
pid=1944087 tid=1944087 errno=4 - Interrupted system call
1 0x00000000004028b3: vcpu_run_loop at page_fault_test.c:682
2 0x0000000000402d93: run_test at page_fault_test.c:731
3 0x0000000000403957: for_each_guest_mode at guest_modes.c:100
4 0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1108
5 (inlined by) main at page_fault_test.c:1134
6 0x0000ffff868e503b: ?? ??:0
7 0x0000ffff868e5113: ?? ??:0
8 0x0000000000401aaf: _start at ??:?
guest_check_lse()
because we don't have a guest_prepare stage to check the presence of
FEAT_LSE and skip the related guest_cas testing, and we end-up failing in
GUEST_ASSERT(guest_check_lse()).
Add the missing .guest_prepare() where it's indeed required.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The Component Register base address @component_reg_phys is no longer
used after the rework of the Component Register setup which now uses
struct member @reg_map instead. Remove the base address.
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231018171713.1883517-9-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The cxl-cli unit test for firmware update does operations like starting
an asynchronous firmware update, making sure it is in progress, and
attempting to cancel it. In some cases, such as with no or minimal
dynamic debugging turned on, the firmware update completes too quickly,
not allowing the test to have a chance to verify it was in progress.
This caused a failure of the signature:
expected fw_update_in_progress:true
test/cxl-update-firmware.sh: failed at line 88
Fix this by adding a delay (~1.5 - 2 ms) to each firmware transfer
request handled by the mocked interface.
Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20231026-vv-fw_upd_test_fix-v2-1-5282fd193883@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Root decoder granularity must match value from CFWMS, which may not
be the region's granularity for non-interleaved root decoders.
So when calculating granularities for host bridge decoders, use the
region's granularity instead of the root decoder's granularity to ensure
the correct granularities are set for the host bridge decoders and any
downstream switch decoders.
Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.
Region created with 2048 granularity using following command line:
cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
-g 2048 -s 2048M
Use "cxl list -PDE | grep granularity" to get a view of the granularity
set at each level of the topology.
Before this patch:
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":512,
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":512,
"interleave_granularity":256,
After:
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":4096,
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":4096,
"interleave_granularity":2048,
Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in
[djbw: fixup the prebuilt cxl_test region]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add 2 tests to the layout1 fixture:
* topology_changes_with_net_only: Checks that FS topology
changes are not denied by network-only restrictions.
* topology_changes_with_net_and_fs: Make sure that FS topology
changes are still denied with FS and network restrictions.
This specifically test commit d722036403 ("landlock: Allow FS topology
changes for domains without such rule type").
Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231027154615.815134-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Just like displaying "invert" after "Info: ", "simult" should be
displayed too when rm_subflow_nr doesn't match the expect value in
chk_rm_nr():
syn [ ok ]
synack [ ok ]
ack [ ok ]
add [ ok ]
echo [ ok ]
rm [ ok ]
rmsf [ ok ] 3 in [2:4]
Info: invert simult
syn [ ok ]
synack [ ok ]
ack [ ok ]
add [ ok ]
echo [ ok ]
rm [ ok ]
rmsf [ ok ]
Info: invert
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-10-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Global var mptcp_connect defined at the front of mptcp_sockopt.sh is
duplicate with local var mptcp_connect defined in do_transfer(), drop
this useless global one.
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-9-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The second input parameter of 'wait_rm_addr/sf $1 1' is misused. If it's
1, wait_rm_addr/sf will never break, and will loop ten times, then
'wait_rm_addr/sf' equals to 'sleep 1'. This delay time is too long,
which can sometimes make the tests fail.
A better way to use wait_rm_addr/sf is to use rm_addr/sf_count to obtain
the current value, and then pass into wait_rm_addr/sf.
Fixes: 4369c198e5 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-2-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some userspace pm tests failed are reported by CI:
112 userspace pm add & remove address
syn [ ok ]
synack [ ok ]
ack [ ok ]
add [ ok ]
echo [ ok ]
mptcp_info subflows=1:1 [ ok ]
subflows_total 2:2 [ ok ]
mptcp_info add_addr_signal=1:1 [ ok ]
rm [ ok ]
rmsf [ ok ]
Info: invert
mptcp_info subflows=0:0 [ ok ]
subflows_total 1:1 [fail]
got subflows 0:0 expected 1:1
Server ns stats
TcpPassiveOpens 2 0.0
TcpInSegs 118 0.0
This patch fixes them by changing 'speed' to 5 to run the tests much more
slowly.
Fixes: 4369c198e5 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-1-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test the new MDB get functionality by converting dump and grep to MDB
get.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test the new MDB get functionality by converting dump and grep to MDB
get.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZTp12QAKCRDbK58LschI
g8BrAQDifqp5liEEdXV8jdReBwJtqInjrL5tzy5LcyHUMQbTaAEA6Ph3Ct3B+3oA
mFnIW/y6UJiJrby0Xz4+vV5BXI/5WQg=
=pLCV
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-10-26
We've added 51 non-merge commits during the last 10 day(s) which contain
a total of 75 files changed, 5037 insertions(+), 200 deletions(-).
The main changes are:
1) Add open-coded task, css_task and css iterator support.
One of the use cases is customizable OOM victim selection via BPF,
from Chuyi Zhou.
2) Fix BPF verifier's iterator convergence logic to use exact states
comparison for convergence checks, from Eduard Zingerman,
Andrii Nakryiko and Alexei Starovoitov.
3) Add BPF programmable net device where bpf_mprog defines the logic
of its xmit routine. It can operate in L3 and L2 mode,
from Daniel Borkmann and Nikolay Aleksandrov.
4) Batch of fixes for BPF per-CPU kptr and re-enable unit_size checking
for global per-CPU allocator, from Hou Tao.
5) Fix libbpf which eagerly assumed that SHT_GNU_verdef ELF section
was going to be present whenever a binary has SHT_GNU_versym section,
from Andrii Nakryiko.
6) Fix BPF ringbuf correctness to fold smp_mb__before_atomic() into
atomic_set_release(), from Paul E. McKenney.
7) Add a warning if NAPI callback missed xdp_do_flush() under
CONFIG_DEBUG_NET which helps checking if drivers were missing
the former, from Sebastian Andrzej Siewior.
8) Fix missed RCU read-lock in bpf_task_under_cgroup() which was throwing
a warning under sleepable programs, from Yafang Shao.
9) Avoid unnecessary -EBUSY from htab_lock_bucket by disabling IRQ before
checking map_locked, from Song Liu.
10) Make BPF CI linked_list failure test more robust,
from Kumar Kartikeya Dwivedi.
11) Enable samples/bpf to be built as PIE in Fedora, from Viktor Malik.
12) Fix xsk starving when multiple xsk sockets were associated with
a single xsk_buff_pool, from Albert Huang.
13) Clarify the signed modulo implementation for the BPF ISA standardization
document that it uses truncated division, from Dave Thaler.
14) Improve BPF verifier's JEQ/JNE branch taken logic to also consider
signed bounds knowledge, from Andrii Nakryiko.
15) Add an option to XDP selftests to use multi-buffer AF_XDP
xdp_hw_metadata and mark used XDP programs as capable to use frags,
from Larysa Zaremba.
16) Fix bpftool's BTF dumper wrt printing a pointer value and another
one to fix struct_ops dump in an array, from Manu Bretelle.
* tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (51 commits)
netkit: Remove explicit active/peer ptr initialization
selftests/bpf: Fix selftests broken by mitigations=off
samples/bpf: Allow building with custom bpftool
samples/bpf: Fix passing LDFLAGS to libbpf
samples/bpf: Allow building with custom CFLAGS/LDFLAGS
bpf: Add more WARN_ON_ONCE checks for mismatched alloc and free
selftests/bpf: Add selftests for netkit
selftests/bpf: Add netlink helper library
bpftool: Extend net dump with netkit progs
bpftool: Implement link show support for netkit
libbpf: Add link-based API for netkit
tools: Sync if_link uapi header
netkit, bpf: Add bpf programmable net device
bpf: Improve JEQ/JNE branch taken logic
bpf: Fold smp_mb__before_atomic() into atomic_set_release()
bpf: Fix unnecessary -EBUSY from htab_lock_bucket
xsk: Avoid starving the xsk further down the list
bpf: print full verifier states on infinite loop detection
selftests/bpf: test if state loops are detected in a tricky case
bpf: correct loop detection for iterators convergence
...
====================
Link: https://lore.kernel.org/r/20231026150509.2824-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add 82 test suites to check edge cases related to bind() and connect()
actions. They are defined with 6 fixtures and their variants:
The "protocol" fixture is extended with 12 variants defined as a matrix
of: sandboxed/not-sandboxed, IPv4/IPv6/unix network domain, and
stream/datagram socket. 4 related tests suites are defined:
* bind: Tests bind action.
* connect: Tests connect action.
* bind_unspec: Tests bind action with the AF_UNSPEC socket family.
* connect_unspec: Tests connect action with the AF_UNSPEC socket family.
The "ipv4" fixture is extended with 4 variants defined as a matrix
of: sandboxed/not-sandboxed, and stream/datagram socket. 1 related test
suite is defined:
* from_unix_to_inet: Tests to make sure unix sockets' actions are not
restricted by Landlock rules applied to TCP ones.
The "tcp_layers" fixture is extended with 8 variants defined as a matrix
of: IPv4/IPv6 network domain, and different number of landlock rule
layers. 2 related tests suites are defined:
* ruleset_overlap: Tests nested layers with less constraints.
* ruleset_expand: Tests nested layers with more constraints.
In the "mini" fixture 4 tests suites are defined:
* network_access_rights: Tests handling of known access rights.
* unknown_access_rights: Tests handling of unknown access rights.
* inval: Tests unhandled allowed access and zero access value.
* tcp_port_overflow: Tests with port values greater than 65535.
The "ipv4_tcp" fixture supports IPv4 network domain with stream socket.
2 tests suites are defined:
* port_endianness: Tests with big/little endian port formats.
* with_fs: Tests a ruleset with both filesystem and network
restrictions.
The "port_specific" fixture is extended with 4 variants defined
as a matrix of: sandboxed/not-sandboxed, IPv4/IPv6 network domain,
and stream socket. 2 related tests suites are defined:
* bind_connect_zero: Tests with port 0.
* bind_connect_1023: Tests with port 1023.
Test coverage for security/landlock is 92.4% of 710 lines according to
gcc/gcov-13.
Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231026014751.414649-11-konstantin.meskhidze@huawei.com
[mic: Extend commit message, update test coverage, clean up capability
use, fix useless TEST_F_FORK, and improve ipv4_tcp.with_fs]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Add network rules support in the ruleset management helpers and the
landlock_create_ruleset() syscall. Extend user space API to support
network actions:
* Add new network access rights: LANDLOCK_ACCESS_NET_BIND_TCP and
LANDLOCK_ACCESS_NET_CONNECT_TCP.
* Add a new network rule type: LANDLOCK_RULE_NET_PORT tied to struct
landlock_net_port_attr. The allowed_access field contains the network
access rights, and the port field contains the port value according to
the controlled protocol. This field can take up to a 64-bit value
but the maximum value depends on the related protocol (e.g. 16-bit
value for TCP). Network port is in host endianness [1].
* Add a new handled_access_net field to struct landlock_ruleset_attr
that contains network access rights.
* Increment the Landlock ABI version to 4.
Implement socket_bind() and socket_connect() LSM hooks, which enable
to control TCP socket binding and connection to specific ports.
Expand access_masks_t from u16 to u32 to be able to store network access
rights alongside filesystem access rights for rulesets' handled access
rights.
Access rights are not tied to socket file descriptors but checked at
bind() or connect() call time against the caller's Landlock domain. For
the filesystem, a file descriptor is a direct access to a file/data.
However, for network sockets, we cannot identify for which data or peer
a newly created socket will give access to. Indeed, we need to wait for
a connect or bind request to identify the use case for this socket.
Likewise a directory file descriptor may enable to open another file
(i.e. a new data item), but this opening is also restricted by the
caller's domain, not the file descriptor's access rights [2].
[1] https://lore.kernel.org/r/278ab07f-7583-a4e0-3d37-1bacd091531d@digikod.net
[2] https://lore.kernel.org/r/263c1eb3-602f-57fe-8450-3f138581bee7@digikod.net
Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231026014751.414649-9-konstantin.meskhidze@huawei.com
[mic: Extend commit message, fix typo in comments, and specify
endianness in the documentation]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
The IOMMU_HWPT_ALLOC ioctl now supports passing user_data to allocate a
user-managed domain for nested HWPTs. Add its coverage for that. Also,
update _test_cmd_hwpt_alloc() and add test_cmd/err_hwpt_alloc_nested().
Link: https://lore.kernel.org/r/20231026043938.63898-11-yi.l.liu@intel.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
When we configure the kernel command line with 'mitigations=off' and set
the sysctl knob 'kernel.unprivileged_bpf_disabled' to 0, the commit
bc5bc309db ("bpf: Inherit system settings for CPU security mitigations")
causes issues in the execution of `test_progs -t verifier`. This is
because 'mitigations=off' bypasses Spectre v1 and Spectre v4 protections.
Currently, when a program requests to run in unprivileged mode
(kernel.unprivileged_bpf_disabled = 0), the BPF verifier may prevent
it from running due to the following conditions not being enabled:
- bypass_spec_v1
- bypass_spec_v4
- allow_ptr_leaks
- allow_uninit_stack
While 'mitigations=off' enables the first two conditions, it does not
enable the latter two. As a result, some test cases in
'test_progs -t verifier' that were expected to fail to run may run
successfully, while others still fail but with different error messages.
This makes it challenging to address them comprehensively.
Moreover, in the future, we may introduce more fine-grained control over
CPU mitigations, such as enabling only bypass_spec_v1 or bypass_spec_v4.
Given the complexity of the situation, rather than fixing each broken test
case individually, it's preferable to skip them when 'mitigations=off' is
in effect and introduce specific test cases for the new 'mitigations=off'
scenario. For instance, we can introduce new BTF declaration tags like
'__failure__nospec', '__failure_nospecv1' and '__failure_nospecv4'.
In this patch, the approach is to simply skip the broken test cases when
'mitigations=off' is enabled. The result of `test_progs -t verifier` as
follows after this commit,
Before this commit
==================
- without 'mitigations=off'
- kernel.unprivileged_bpf_disabled = 2
Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
- kernel.unprivileged_bpf_disabled = 0
Summary: 74/1336 PASSED, 0 SKIPPED, 0 FAILED <<<<
- with 'mitigations=off'
- kernel.unprivileged_bpf_disabled = 2
Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
- kernel.unprivileged_bpf_disabled = 0
Summary: 63/1276 PASSED, 0 SKIPPED, 11 FAILED <<<< 11 FAILED
After this commit
=================
- without 'mitigations=off'
- kernel.unprivileged_bpf_disabled = 2
Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
- kernel.unprivileged_bpf_disabled = 0
Summary: 74/1336 PASSED, 0 SKIPPED, 0 FAILED <<<<
- with this patch, with 'mitigations=off'
- kernel.unprivileged_bpf_disabled = 2
Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED
- kernel.unprivileged_bpf_disabled = 0
Summary: 74/948 PASSED, 388 SKIPPED, 0 FAILED <<<< SKIPPED
Fixes: bc5bc309db ("bpf: Inherit system settings for CPU security mitigations")
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Closes: https://lore.kernel.org/bpf/CAADnVQKUBJqg+hHtbLeeC2jhoJAWqnmRAzXW3hmUCNSV9kx4sQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20231025031144.5508-1-laoar.shao@gmail.com
Add a minimal netlink helper library for the BPF selftests. This has been
taken and cut down and cleaned up from iproute2. This covers basics such
as netdevice creation which we need for BPF selftests / BPF CI given
iproute2 package cannot cover it yet.
Stanislav Fomichev suggested that this could be replaced in future by ynl
tool generated C code once it has RTNL support to create devices. Once we
get to this point the BPF CI would also need to add libmnl. If no further
extensions are needed, a second option could be that we remove this code
again once iproute2 package has support.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20231024214904.29825-7-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add a vPMU test scenario to validate the userspace accesses for
the registers PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} to ensure
that KVM honors the architectural definitions of these registers
for a given PMCR.N.
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-13-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Add a new test case to the vpmu_counter_access test to check
if PMU registers or their bits for unimplemented counters are not
accessible or are RAZ, as expected.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-12-rananta@google.com
[Oliver: fix issues relating to exception return address]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Add a new test case to the vpmu_counter_access test to check if PMU
registers or their bits for implemented counters on the vCPU are
readable/writable as expected, and can be programmed to count events.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-11-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Introduce vpmu_counter_access test for arm64 platforms.
The test configures PMUv3 for a vCPU, sets PMCR_EL0.N for the vCPU,
and check if the guest can consistently see the same number of the
PMU event counters (PMCR_EL0.N) that userspace sets.
This test case is done with each of the PMCR_EL0.N values from
0 to 31 (With the PMCR_EL0.N values greater than the host value,
the test expects KVM_SET_ONE_REG for the PMCR_EL0 to fail).
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-10-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Change ifconfig with ip command, on a system where ifconfig is
not used this script will not work correcly.
Test result with this patchset:
sudo make TARGETS="net" kselftest
....
TAP version 13
1..1
timeout set to 1500
selftests: net: route_localnet.sh
run arp_announce test
net.ipv4.conf.veth0.route_localnet = 1
net.ipv4.conf.veth1.route_localnet = 1
net.ipv4.conf.veth0.arp_announce = 2
net.ipv4.conf.veth1.arp_announce = 2
PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
bytes of data.
64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.038 ms
64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.068 ms
64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.068 ms
64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.068 ms
64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.068 ms
--- 127.25.3.14 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4073ms
rtt min/avg/max/mdev = 0.038/0.062/0.068/0.012 ms
ok
run arp_ignore test
net.ipv4.conf.veth0.route_localnet = 1
net.ipv4.conf.veth1.route_localnet = 1
net.ipv4.conf.veth0.arp_ignore = 3
net.ipv4.conf.veth1.arp_ignore = 3
PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
bytes of data.
64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.065 ms
64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.066 ms
64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.065 ms
64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.065 ms
--- 127.25.3.14 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4092ms
rtt min/avg/max/mdev = 0.032/0.058/0.066/0.013 ms
ok
ok 1 selftests: net: route_localnet.sh
...
Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231023123422.2895-1-swarupkotikalapudi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
or aren't considered necessary for earlier kernel versions.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZTfz/QAKCRDdBJ7gKXxA
joMyAP99hLaLYeJbjlf+4tLJzhlpbVoFra1ieun2D+ZgFE78xQD/T4T3PYrZhYqD
WdrxGT9fiKOykXM5pmQRH9Zr4EvJBA0=
=Obbk
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 12 are cc:stable and the remainder address post-6.5
issues or aren't considered necessary for earlier kernel versions"
* tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()
selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
mailmap: correct email aliasing for Oleksij Rempel
mailmap: map Bartosz's old address to the current one
mm/damon/sysfs: check DAMOS regions update progress from before_terminate()
MAINTAINERS: Ondrej has moved
kasan: disable kasan_non_canonical_hook() for HW tags
kasan: print the original fault addr when access invalid shadow
hugetlbfs: close race between MADV_DONTNEED and page fault
hugetlbfs: extend hugetlb_vma_lock to private VMAs
hugetlbfs: clear resv_map pointer if mmap fails
mm: zswap: fix pool refcount bug around shrink_worker()
mm/migrate: fix do_pages_move for compat pointers
riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set
riscv: handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of panicking
mmap: fix error paths with dup_anon_vma()
mmap: fix vma_iterator in error path of vma_merge()
mm: fix vm_brk_flags() to not bail out while holding lock
mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer
mm/page_alloc: correct start page when guard page debug is enabled
Change test_mock_dirty_bitmaps() to pass a flag where it specifies the flag
under test. The test does the same thing as the GET_DIRTY_BITMAP regular
test. Except that it tests whether the dirtied bits are fetched all the
same a second time, as opposed to observing them cleared.
Link: https://lore.kernel.org/r/20231024135109.73787-19-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Enumerate the capabilities from the mock device and test whether it
advertises as expected. Include it as part of the iommufd_dirty_tracking
fixture.
Link: https://lore.kernel.org/r/20231024135109.73787-18-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add a new test ioctl for simulating the dirty IOVAs in the mock domain, and
implement the mock iommu domain ops that get the dirty tracking supported.
The selftest exercises the usual main workflow of:
1) Setting dirty tracking from the iommu domain
2) Read and clear dirty IOPTEs
Different fixtures will test different IOVA range sizes, that exercise
corner cases of the bitmaps.
Link: https://lore.kernel.org/r/20231024135109.73787-17-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Change mock_domain to supporting dirty tracking and add tests to exercise
the new SET_DIRTY_TRACKING API in the iommufd_dirty_tracking selftest
fixture.
Link: https://lore.kernel.org/r/20231024135109.73787-16-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In order to selftest the iommu domain dirty enforcing implement the
mock_domain necessary support and add a new dev_flags to test that the
hwpt_alloc/attach_device fails as expected.
Expand the existing mock_domain fixture with a enforce_dirty test that
exercises the hwpt_alloc and device attachment.
Link: https://lore.kernel.org/r/20231024135109.73787-15-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Expand mock_domain test to be able to manipulate the device capabilities.
This allows testing with mockdev without dirty tracking support advertised
and thus make sure enforce_dirty test does the expected.
To avoid breaking IOMMUFD_TEST UABI replicate the mock_domain struct and
thus add an input dev_flags at the end.
Link: https://lore.kernel.org/r/20231024135109.73787-14-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
A convoluted test case for iterators convergence logic that
demonstrates that states with branch count equal to 0 might still be
a part of not completely explored loop.
E.g. consider the following state diagram:
initial Here state 'succ' was processed first,
| it was eventually tracked to produce a
V state identical to 'hdr'.
.---------> hdr All branches from 'succ' had been explored
| | and thus 'succ' has its .branches == 0.
| V
| .------... Suppose states 'cur' and 'succ' correspond
| | | to the same instruction + callsites.
| V V In such case it is necessary to check
| ... ... whether 'succ' and 'cur' are identical.
| | | If 'succ' and 'cur' are a part of the same loop
| V V they have to be compared exactly.
| succ <- cur
| |
| V
| ...
| |
'----'
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231024000917.12153-7-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
These test cases try to hide read and precision marks from loop
convergence logic: marks would only be assigned on subsequent loop
iterations or after exploring states pushed to env->head stack first.
Without verifier fix to use exact states comparison logic for
iterators convergence these tests (except 'triple_continue') would be
errorneously marked as safe.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231024000917.12153-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Convergence for open coded iterators is computed in is_state_visited()
by examining states with branches count > 1 and using states_equal().
states_equal() computes sub-state relation using read and precision marks.
Read and precision marks are propagated from children states,
thus are not guaranteed to be complete inside a loop when branches
count > 1. This could be demonstrated using the following unsafe program:
1. r7 = -16
2. r6 = bpf_get_prandom_u32()
3. while (bpf_iter_num_next(&fp[-8])) {
4. if (r6 != 42) {
5. r7 = -32
6. r6 = bpf_get_prandom_u32()
7. continue
8. }
9. r0 = r10
10. r0 += r7
11. r8 = *(u64 *)(r0 + 0)
12. r6 = bpf_get_prandom_u32()
13. }
Here verifier would first visit path 1-3, create a checkpoint at 3
with r7=-16, continue to 4-7,3 with r7=-32.
Because instructions at 9-12 had not been visitied yet existing
checkpoint at 3 does not have read or precision mark for r7.
Thus states_equal() would return true and verifier would discard
current state, thus unsafe memory access at 11 would not be caught.
This commit fixes this loophole by introducing exact state comparisons
for iterator convergence logic:
- registers are compared using regs_exact() regardless of read or
precision marks;
- stack slots have to have identical type.
Unfortunately, this is too strict even for simple programs like below:
i = 0;
while(iter_next(&it))
i++;
At each iteration step i++ would produce a new distinct state and
eventually instruction processing limit would be reached.
To avoid such behavior speculatively forget (widen) range for
imprecise scalar registers, if those registers were not precise at the
end of the previous iteration and do not match exactly.
This a conservative heuristic that allows to verify wide range of
programs, however it precludes verification of programs that conjure
an imprecise value on the first loop iteration and use it as precise
on the second.
Test case iter_task_vma_for_each() presents one of such cases:
unsigned int seen = 0;
...
bpf_for_each(task_vma, vma, task, 0) {
if (seen >= 1000)
break;
...
seen++;
}
Here clang generates the following code:
<LBB0_4>:
24: r8 = r6 ; stash current value of
... body ... 'seen'
29: r1 = r10
30: r1 += -0x8
31: call bpf_iter_task_vma_next
32: r6 += 0x1 ; seen++;
33: if r0 == 0x0 goto +0x2 <LBB0_6> ; exit on next() == NULL
34: r7 += 0x10
35: if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000
<LBB0_6>:
... exit ...
Note that counter in r6 is copied to r8 and then incremented,
conditional jump is done using r8. Because of this precision mark for
r6 lags one state behind of precision mark on r8 and widening logic
kicks in.
Adding barrier_var(seen) after conditional is sufficient to force
clang use the same register for both counting and conditional jump.
This issue was discussed in the thread [1] which was started by
Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
in callback functions handling. The callbacks would be addressed
in a followup patch.
[1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20231024000917.12153-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
It delivers current TCP time stamp in ms unit, and is used
in place of confusing tcp_time_stamp_raw()
It is the same family than tcp_clock_ns() and tcp_clock_ms().
tcp_time_stamp_raw() will be replaced later for TSval
contexts with a more descriptive name.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- kprobe-events: Fix kprobe events to reject if the attached symbol
is not unique name because it may not the function which the user
want to attach to. (User can attach a probe to such symbol using
the nearest unique symbol + offset.)
- selftest: Add a testcase to ensure the kprobe event rejects non
unique symbol correctly.
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmUzdQobHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bMNAH/inFWv8e+rMm8F5Po6ZI
CmBxuZbxy2l+KfYDjXqSHu7TLKngVd6Bhdb5H2K7fgdwiZxrS0i6qvdppo+Cxgop
Yod06peDTM80IKavioCcOJOwLPGXXpZkMlK5fdC48HN6vrf9km4vws5ZAagfc1ng
YhnYm1HHeXcIYwtLkE2dCr6HkwkaOebWTLdZ8c70d1OPw0L9rzxH+edjhKCq8uIw
6WUg9ERxJYPUuCkQxOxVJrTdzNMRXsgf28FHc0LyYRm8kDpECT2BP6e/Y+TBbsX5
2pN5cUY5qfI6t3Pc1HDs2KX8ui2QCmj0mCvT0VixhdjThdHpRf0VjIFFAANf3LNO
XVA=
=O1Aa
-----END PGP SIGNATURE-----
Merge tag 'probes-fixes-v6.6-rc6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- kprobe-events: Fix kprobe events to reject if the attached symbol is
not unique name because it may not the function which the user want
to attach to. (User can attach a probe to such symbol using the
nearest unique symbol + offset.)
- selftest: Add a testcase to ensure the kprobe event rejects non
unique symbol correctly.
* tag 'probes-fixes-v6.6-rc6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
selftests/ftrace: Add new test case which checks non unique symbol
tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
Add a test to check if inner rt curves are upgraded to sc curves.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This Kselftest update for Linux 6.6-rc7 consists of one single fix
to assert check in user_events abi_test to properly check bit value
on Big Endian architectures. The current code treats the bit values
as Little Endian and the check fails on Big Endian.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmUytR4ACgkQCwJExA0N
QxzjoQ/+MWgtyggLZo/dP+qJ7AU+tG08YXWWuh99lkSxVs3xHvwl2EGaYf1PXN9c
ZUXb/KGfls8G4tv20KU2+/uSRSirkf+CFLN6HaBBG+2cun8o0KpHlVKGfmvRjb13
ZUX8UQJ5u9kTSTqV7gCxVbemV5iOhTazuwVQ1Aq7wFL6e/0oL4eolbgNP0ub76vy
UsZl9j/8pFhtVfdqRJqorKQ9H5RgvW7CCmobkUruGJoXg7jFuFdeqXtS4U1ziyJT
g42LtXuC053KrEnEmhj1EadZC4E1eXadffssanfyWolAXjRD3N+2vLYasJOiGDAO
mT111kQLaRNvZLJBULlrnWITkbVOhfE9Cxu14idxvdLShQiUO8kRCjoN2TbLZ2ET
n7u/4anHBu99ljO6uZVNe7JY5ZrwaM1o7Myvk9eOVRe4WKkgOXJKUM3lfwvfMK8Q
sEWvfBY04gL5y697DDiQvJ4g0fRwwnoadpHFvJTJX977H5C8/c750YZw/Emuip9D
zxyHGEVtncM6awbHP8bGBgg2f6XISWKs5qvzrwLmdXx42oi0VjU4z+cEVOUSx/rY
K+B+LVzcdYc/6UhjQzmylNBUs5CO7K6D05/tes07QaZ1MCCyw3b+DAR+LJn2U2+C
Y3+x/kQT16abMIIpTM8qVWqxswvaFSyZ+5hCkVrAsYwkM1fpB4c=
=PJhV
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest_active-fixes-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
"One single fix to assert check in user_events abi_test to properly
check bit value on Big Endian architectures. The code treated the bit
values as Little Endian and the check failed on Big Endian"
* tag 'linux_kselftest_active-fixes-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/user_events: Fix abi_test for BE archs
Add the following 3 test cases for bpf memory allocator:
1) Do allocation in bpf program and free through map free
2) Do batch per-cpu allocation and per-cpu free in bpf program
3) Do per-cpu allocation in bpf program and free through map free
For per-cpu allocation, because per-cpu allocation can not refill timely
sometimes, so test 2) and test 3) consider it is OK for
bpf_percpu_obj_new_impl() to return NULL.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20231020133202.4043247-8-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The linked list failure test 'pop_front_off' and 'pop_back_off'
currently rely on matching exact instruction and register values. The
purpose of the test is to ensure the offset is correctly incremented for
the returned pointers from list pop helpers, which can then be used with
container_of to obtain the real object. Hence, somehow obtaining the
information that the offset is 48 will work for us. Make the test more
robust by relying on verifier error string of bpf_spin_lock and remove
dependence on fragile instruction index or register number, which can be
affected by different clang versions used to build the selftests.
Fixes: 300f19dcdb ("selftests/bpf: Add BPF linked list API tests")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231020144839.2734006-1-memxor@gmail.com
If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.
Link: https://lore.kernel.org/all/20231020104250.9537-3-flaniel@linux.microsoft.com/
Cc: stable@vger.kernel.org
Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
We have a new SBI debug console (DBCN) extension supported by in-kernel
KVM so let us add this extension to get-reg-list test.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
This patch adds 4 subtests to demonstrate these patterns and validating
correctness.
subtest1:
1) We use task_iter to iterate all process in the system and search for the
current process with a given pid.
2) We create some threads in current process context, and use
BPF_TASK_ITER_PROC_THREADS to iterate all threads of current process. As
expected, we would find all the threads of current process.
3) We create some threads and use BPF_TASK_ITER_ALL_THREADS to iterate all
threads in the system. As expected, we would find all the threads which was
created.
subtest2:
We create a cgroup and add the current task to the cgroup. In the
BPF program, we would use bpf_for_each(css_task, task, css) to iterate all
tasks under the cgroup. As expected, we would find the current process.
subtest3:
1) We create a cgroup tree. In the BPF program, we use
bpf_for_each(css, pos, root, XXX) to iterate all descendant under the root
with pre and post order. As expected, we would find all descendant and the
last iterating cgroup in post-order is root cgroup, the first iterating
cgroup in pre-order is root cgroup.
2) We wse BPF_CGROUP_ITER_ANCESTORS_UP to traverse the cgroup tree starting
from leaf and root separately, and record the height. The diff of the
hights would be the total tree-high - 1.
subtest4:
Add some failure testcase when using css_task, task and css iters, e.g,
unlock when using task-iters to iterate tasks.
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Link: https://lore.kernel.org/r/20231018061746.111364-9-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The newly-added struct bpf_iter_task has a name collision with a selftest
for the seq_file task iter's bpf skel, so the selftests/bpf/progs file is
renamed in order to avoid the collision.
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231018061746.111364-8-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This Patch adds kfuncs bpf_iter_css_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_css in open-coded iterator
style. These kfuncs actually wrapps css_next_descendant_{pre, post}.
css_iter can be used to:
1) iterating a sepcific cgroup tree with pre/post/up order
2) iterating cgroup_subsystem in BPF Prog, like
for_each_mem_cgroup_tree/cpuset_for_each_descendant_pre in kernel.
The API design is consistent with cgroup_iter. bpf_iter_css_new accepts
parameters defining iteration order and starting css. Here we also reuse
BPF_CGROUP_ITER_DESCENDANTS_PRE, BPF_CGROUP_ITER_DESCENDANTS_POST,
BPF_CGROUP_ITER_ANCESTORS_UP enums.
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20231018061746.111364-5-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_task in open-coded iterator
style. BPF programs can use these kfuncs or through bpf_for_each macro to
iterate all processes in the system.
The API design keep consistent with SEC("iter/task"). bpf_iter_task_new()
accepts a specific task and iterating type which allows:
1. iterating all process in the system (BPF_TASK_ITER_ALL_PROCS)
2. iterating all threads in the system (BPF_TASK_ITER_ALL_THREADS)
3. iterating all threads of a specific task (BPF_TASK_ITER_PROC_THREADS)
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Link: https://lore.kernel.org/r/20231018061746.111364-4-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_css_task in open-coded
iterator style. These kfuncs actually wrapps css_task_iter_{start,next,
end}. BPF programs can use these kfuncs through bpf_for_each macro for
iteration of all tasks under a css.
css_task_iter_*() would try to get the global spin-lock *css_set_lock*, so
the bpf side has to be careful in where it allows to use this iter.
Currently we only allow it in bpf_lsm and bpf iter-s.
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20231018061746.111364-3-zhouchuyi@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Expand the sockopt test to use also check for io_uring {g,s}etsockopt
commands operations.
This patch starts by marking each test if they support io_uring support
or not.
Right now, io_uring cmd getsockopt() has a limitation of only
accepting level == SOL_SOCKET, otherwise it returns -EOPNOTSUPP. Since
there aren't any test exercising getsockopt(level == SOL_SOCKET), this
patch changes two tests to use level == SOL_SOCKET, they are
"getsockopt: support smaller ctx->optlen" and "getsockopt: read
ctx->optlen".
There is no limitation for the setsockopt() part.
Later, each test runs using regular {g,s}etsockopt systemcalls, and, if
liburing is supported, execute the same test (again), but calling
liburing {g,s}setsockopt commands.
This patch also changes the level of two tests to use SOL_SOCKET for the
following two tests. This is going to help to exercise the io_uring
subsystem:
* getsockopt: read ctx->optlen
* getsockopt: support smaller ctx->optlen
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231016134750.1381153-12-leitao@debian.org
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of defining basic io_uring functions in the test case, move them
to a common directory, so, other tests can use them.
This simplify the test code and reuse the common liburing
infrastructure. This is basically a copy of what we have in
io_uring_zerocopy_tx with some minor improvements to make checkpatch
happy.
A follow-up test will use the same helpers in a BPF sockopt test.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20231016134750.1381153-8-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Feels like an up-tick in regression fixes, mostly for older releases.
The hfsc fix, tcp_disconnect() and Intel WWAN fixes stand out as fairly
clear-cut user reported regressions. The mlx5 DMA bug was causing strife
for 390x folks. The fixes themselves are not particularly scary, tho.
No open investigations / outstanding reports at the time of writing.
Current release - regressions:
- eth: mlx5: perform DMA operations in the right locations,
make devices usable on s390x, again
- sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner curve,
previous fix of rejecting invalid config broke some scripts
- rfkill: reduce data->mtx scope in rfkill_fop_open, avoid deadlock
- revert "ethtool: Fix mod state of verbose no_mask bitset",
needs more work
Current release - new code bugs:
- tcp: fix listen() warning with v4-mapped-v6 address
Previous releases - regressions:
- tcp: allow tcp_disconnect() again when threads are waiting,
it was denied to plug a constant source of bugs but turns out
.NET depends on it
- eth: mlx5: fix double-free if buffer refill fails under OOM
- revert "net: wwan: iosm: enable runtime pm support for 7560",
it's causing regressions and the WWAN team at Intel disappeared
- tcp: tsq: relax tcp_small_queue_check() when rtx queue contains
a single skb, fix single-stream perf regression on some devices
Previous releases - always broken:
- Bluetooth:
- fix issues in legacy BR/EDR PIN code pairing
- correctly bounds check and pad HCI_MON_NEW_INDEX name
- netfilter:
- more fixes / follow ups for the large "commit protocol" rework,
which went in as a fix to 6.5
- fix null-derefs on netlink attrs which user may not pass in
- tcp: fix excessive TLP and RACK timeouts from HZ rounding
(bless Debian for keeping HZ=250 alive)
- net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation, prevent
letting frankenstein UDP super-frames from getting into the stack
- net: fix interface altnames when ifc moves to a new namespace
- eth: qed: fix the size of the RX buffers
- mptcp: avoid sending RST when closing the initial subflow
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmUxawUACgkQMUZtbf5S
Irt75w/+MC2ilnFQhfoQqpCxL2uHWOKzhenUz2PoNKR60OKgLLmTq8YBYcpyCVAQ
wBQaNzNu1/yjF5BG7aNS/j0suGeJsOfhfoahQvXlLaat9NuDqxTpaeoT5FZ7eNQw
RZ8+CJug3BRDV0TkaJH9UDVC/nfJTsnsGWNIhNXYGPsuveqAUun+xrnN8ZbvZIrn
6D9rMF+u9SdVO+ANCquXBC7+CWEWiJS1ljUrU7BRNiv/9FSlnPQtjOdpuKleeBO8
4usMS7TezHNgRdiAKC8GjSGUiIkIIMJT4y4wuczBEQAD4Pkki9UpBrui97ozOj7h
W4N7UOuPlUBIardvKNoYz9rZyiFXBcPPm0GruHiuCqpxyqmzoFgv2XJsb/6KfzNn
Dyro+lvh8smtbFHvFqiwaNu5y8ucfClaowvR4gjSe2KcB7hIpwNkh6vWC6OMGJK3
hiKHnDrnXBQMbnP1YfiJ4feLmm3UYCG8eFdv/ULZT0a9TzZ7fKfzAfywUwD+/O8Y
+S28Hr9srdDCHO7ih/gF3Wq9wtnnLy8QEkpt7cpXjRDj0uWH8JkHU3YEIGF2814Y
LNVGmX9y6RcgrHNp03K1PmUcgAzhTTuV9QRoQKEucuBT5AK9ALDQ8YomnWDWDgrp
UOdJPi1RUTsmqslADF15wZ9W5Ki/cDnUJsE4HU/MtnM2w95C49Q=
=z1F6
-----END PGP SIGNATURE-----
Merge tag 'net-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, netfilter, WiFi.
Feels like an up-tick in regression fixes, mostly for older releases.
The hfsc fix, tcp_disconnect() and Intel WWAN fixes stand out as
fairly clear-cut user reported regressions. The mlx5 DMA bug was
causing strife for 390x folks. The fixes themselves are not
particularly scary, tho. No open investigations / outstanding reports
at the time of writing.
Current release - regressions:
- eth: mlx5: perform DMA operations in the right locations, make
devices usable on s390x, again
- sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner
curve, previous fix of rejecting invalid config broke some scripts
- rfkill: reduce data->mtx scope in rfkill_fop_open, avoid deadlock
- revert "ethtool: Fix mod state of verbose no_mask bitset", needs
more work
Current release - new code bugs:
- tcp: fix listen() warning with v4-mapped-v6 address
Previous releases - regressions:
- tcp: allow tcp_disconnect() again when threads are waiting, it was
denied to plug a constant source of bugs but turns out .NET depends
on it
- eth: mlx5: fix double-free if buffer refill fails under OOM
- revert "net: wwan: iosm: enable runtime pm support for 7560", it's
causing regressions and the WWAN team at Intel disappeared
- tcp: tsq: relax tcp_small_queue_check() when rtx queue contains a
single skb, fix single-stream perf regression on some devices
Previous releases - always broken:
- Bluetooth:
- fix issues in legacy BR/EDR PIN code pairing
- correctly bounds check and pad HCI_MON_NEW_INDEX name
- netfilter:
- more fixes / follow ups for the large "commit protocol" rework,
which went in as a fix to 6.5
- fix null-derefs on netlink attrs which user may not pass in
- tcp: fix excessive TLP and RACK timeouts from HZ rounding (bless
Debian for keeping HZ=250 alive)
- net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation, prevent
letting frankenstein UDP super-frames from getting into the stack
- net: fix interface altnames when ifc moves to a new namespace
- eth: qed: fix the size of the RX buffers
- mptcp: avoid sending RST when closing the initial subflow"
* tag 'net-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
Revert "ethtool: Fix mod state of verbose no_mask bitset"
selftests: mptcp: join: no RST when rm subflow/addr
mptcp: avoid sending RST when closing the initial subflow
mptcp: more conservative check for zero probes
tcp: check mptcp-level constraints for backlog coalescing
selftests: mptcp: join: correctly check for no RST
net: ti: icssg-prueth: Fix r30 CMDs bitmasks
selftests: net: add very basic test for netdev names and namespaces
net: move altnames together with the netdevice
net: avoid UAF on deleted altname
net: check for altname conflicts when changing netdev's netns
net: fix ifname in netlink ntf during netns move
net: ethernet: ti: Fix mixed module-builtin object
net: phy: bcm7xxx: Add missing 16nm EPHY statistics
ipv4: fib: annotate races around nh->nh_saddr_genid and nh->nh_saddr
tcp_bpf: properly release resources on error paths
net/sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner curve
net: mdio-mux: fix C45 access returning -EIO after API change
tcp: tsq: relax tcp_small_queue_check() when rtx queue contains a single skb
octeon_ep: update BQL sent bytes before ringing doorbell
...
Recently, we noticed that some RST were wrongly generated when removing
the initial subflow.
This patch makes sure RST are not sent when removing any subflows or any
addresses.
Fixes: c2b2ae3925 ("mptcp: handle correctly disconnect() failures")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231018-send-net-20231018-v1-5-17ecb002e41d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The commit mentioned below was more tolerant with the number of RST seen
during a test because in some uncontrollable situations, multiple RST
can be generated.
But it was not taking into account the case where no RST are expected:
this validation was then no longer reporting issues for the 0 RST case
because it is not possible to have less than 0 RST in the counter. This
patch fixes the issue by adding a specific condition.
Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231018-send-net-20231018-v1-1-17ecb002e41d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add selftest for fixes around naming netdevs and namespaces.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Some taprio tests need auxiliary scripts to wait for workqueue events to
process. Move them to a dedicated folder in order to package them for
the kselftests tarball.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231017152309.3196320-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make sure CI builds using just tc-testing/config can run all tdc tests.
Some tests were broken because of missing knobs.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231017152309.3196320-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests to verify setting ID registers from userspace is handled
correctly by KVM. Also add a test case to use ioctl
KVM_ARM_GET_REG_WRITABLE_MASKS to get writable masks.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
The users of sysreg.h (perf, KVM selftests) are now generating the
necessary sysreg-defs.h; sync sysreg.h with the kernel sources and
fix the KVM selftests that use macros which suffered a rename.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Start generating sysreg-defs.h for arm64 builds in anticipation of
updating sysreg.h to a version that depends on it.
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
My original comment lied, output can be "0 A A B 0 0 0\n"
(see comment in the code).
I don't quite understand why
get_mm_counter(mm, MM_FILEPAGES) + get_mm_counter(mm, MM_SHMEMPAGES)
can stay positive but get_mm_counter(mm, MM_ANONPAGES) is always 0 after
everything is unmapped but that's just me.
[adobriyan@gmail.com: more or less rewritten]
Link: https://lkml.kernel.org/r/0721ca69-7bb4-40aa-8d01-0c5f91e5f363@p183
Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This patch add a new kselftest to demonstrate and verify the new hugetlb
memcg accounting behavior.
Link: https://lkml.kernel.org/r/20231006184629.155543-5-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Create a selftest that exercises the race between page faults and
madvise(MADV_DONTNEED) in the same huge page. Do it by running two
threads that touches the huge page and madvise(MADV_DONTNEED) at the same
time.
In case of a SIGBUS coming at pagefault, the test should fail, since we
hit the bug.
The test doesn't have a signal handler, and if it fails, it fails like
the following
----------------------------------
running ./hugetlb_fault_after_madv
----------------------------------
./run_vmtests.sh: line 186: 595563 Bus error (core dumped) "$@"
[FAIL]
This selftest goes together with the fix of the bug[1] itself.
[1] https://lore.kernel.org/all/20231001005659.2185316-1-riel@surriel.com/#r
Link: https://lkml.kernel.org/r/20231005163922.87568-3-leitao@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Tested-by: Rik van Riel <riel@surriel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "New selftest for mm", v2.
This is a simple test case that reproduces an mm problem[1], where a page
fault races with madvise(), and it is not trivial to reproduce and debug.
This test-case aims to avoid such race problems from happening again,
impacting workloads that leverages external allocators, such as tcmalloc,
jemalloc, etc.
[1] https://lore.kernel.org/all/20231001005659.2185316-1-riel@surriel.com/#r
This patch (of 2):
get_free_hugepages() is helpful for other hugepage tests. Export it to
the common file (vm_util.c) to be reused.
Link: https://lkml.kernel.org/r/20231005163922.87568-1-leitao@debian.org
Link: https://lkml.kernel.org/r/20231005163922.87568-2-leitao@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The bulk allocation is iterating through an array and storing enough
memory for the entire bulk allocation instead of a single array entry.
Only allocate an array element of the size set in the kmem_cache.
Link: https://lkml.kernel.org/r/20230929201359.2857583-1-Liam.Howlett@oracle.com
Fixes: cc86e0c2f3 ("radix tree test suite: add support for slab bulk APIs")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add pagemap ioctl tests. Add several different types of tests to judge
the correction of the interface.
Link: https://lkml.kernel.org/r/20230821141518.870589-7-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Miroslaw <emmir@google.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Paul Gofman <pgofman@codeweavers.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 20d96b25cc ("selftests/resctrl: Fix schemata write error
check") exposed a problem in feature detection logic in MBM selftest.
If schemata does not support MB:x=x entries, the schemata write to
initialize 100% memory bandwidth allocation in mbm_setup() will now
fail with -EINVAL due to the error handling corrected by the commit
20d96b25cc ("selftests/resctrl: Fix schemata write error check").
That commit just uncovers the failed write, it is not wrong itself.
If MB:x=x is not supported by schemata, it is safe to assume 100%
memory bandwidth is always set. Therefore, the previously ignored error
does not make the MBM test itself wrong.
Restore the previous behavior of MBM test by checking MB support before
attempting to write it into schemata which results in behavior
equivalent to ignoring the write error.
Fixes: 20d96b25cc ("selftests/resctrl: Fix schemata write error check")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The clone3() selftests currently report test results in a format that does
not mesh entirely well with automation. They log output for each test such
as:
# [1382411] Trying clone3() with flags 0 (size 0)
# I am the parent (1382411). My child's pid is 1382412
# I am the child, my PID is 1382412
# [1382411] clone3() with flags says: 0 expected 0
ok 1 [1382411] Result (0) matches expectation (0)
This is not ideal for automated parsers since the text after the "ok 1" is
treated as the test name when comparing runs by a lot of automation (tests
routinely get renumbered due to things like new tests being added based on
logical groupings). The PID means that the test names will frequently vary
and the rest of the name being a description of results means several tests
have identical text there.
Address this by refactoring things so that we have a static descriptive
name for each test which we use when logging passes, failures and skips
and since we now have a stable name for the test to hand log that before
starting the test to address the common issue reading logs where the test
name is only printed after any diagnostics. The result is:
# Running test 'simple clone3()'
# [1562777] Trying clone3() with flags 0 (size 0)
# I am the parent (1562777). My child's pid is 1562778
# I am the child, my PID is 1562778
# [1562777] clone3() with flags says: 0 expected 0
ok 1 simple clone3()
In order to handle skips a bit more neatly this is done in a moderately
invasive fashion where we move from a sequence of function calls to having
an array of test parameters. This hopefully also makes it a little easier
to see what the tests are doing when looking at both the source and the
logs.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
when the argument type is 'unsigned int',printf '%u'
in format string. Problem found during code reading.
Update commit log with information on how the problem
was found:
Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The opened file should be closed in main(), otherwise resource
leak will occur that this problem was discovered by code reading
Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This is the riscv variant of commit 9855c4626c ("selftests/ftrace:
Add ppc support for kprobe args tests").
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When execute the following command to test clone3 under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
we can see the following error info:
# [7538] Trying clone3() with flags 0x80 (size 0)
# Invalid argument - Failed to create new process
# [7538] clone3() with flags says: -22 expected 0
not ok 18 [7538] Result (-22) is different than expected (0)
...
# Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0
This is because if CONFIG_TIME_NS is not set, but the flag
CLONE_NEWTIME (0x80) is used to clone a time namespace, it
will return -EINVAL in copy_time_ns().
If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time
will be not exist, and then we should skip clone3() test with
CLONE_NEWTIME.
With this patch under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
...
# Time namespaces are not supported
ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
...
# Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0
Link: https://lkml.kernel.org/r/1689066814-13295-1-git-send-email-yangtiezhu@loongson.cn
Fixes: 515bddf0ec ("selftests/clone3: test clone3 with CLONE_NEWTIME")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Definition for MREMAP_DONTUNMAP is not present in glibc older than 2.32
thus throwing an undeclared error when running make on mm. Including
linux/mman.h solves the build error for people having older glibc.
Link: https://lkml.kernel.org/r/20231012155257.891776-1-samasth.norway.ananda@oracle.com
Fixes: 0183d777c2 ("selftests: mm: remove duplicate unneeded defines")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/linux-mm/CA+G9fYvV-71XqpCr_jhdDfEtN701fBdG3q+=bafaZiGwUXy_aA@mail.gmail.com/
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Don't mess with the host's firewall ruleset. Since audit logging is not
per-netns, add an initial delay of a second so other selftests' netns
cleanups have a chance to finish.
Fixes: e8dbde59ca ("selftests: netfilter: Test nf_tables audit logging")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
When resetting multiple objects at once (via dump request), emit a log
message per table (or filled skb) and resurrect the 'entries' parameter
to contain the number of objects being logged for.
To test the skb exhaustion path, perform some bulk counter and quota
adds in the kselftest.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com> (Audit)
Signed-off-by: Florian Westphal <fw@strlen.de>
Assume that caller's 'to' offset really represents an upper boundary for
the pattern search, so patterns extending past this offset are to be
rejected.
The old behaviour also was kind of inconsistent when it comes to
fragmentation (or otherwise non-linear skbs): If the pattern started in
between 'to' and 'from' offsets but extended to the next fragment, it
was not found if 'to' offset was still within the current fragment.
Test the new behaviour in a kselftest using iptables' string match.
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: f72b948dcb ("[NET]: skb_find_text ignores to argument")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>