Commit Graph

28121 Commits

Author SHA1 Message Date
Linus Torvalds
64ad946152 Merge tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:

 - Get rid of all the .fixup sections because this generates
   misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and
   LIVEPATCH as the backtrace misses the function which is being fixed
   up.

 - Add Straight Line Speculation mitigation support which uses a new
   compiler switch -mharden-sls= which sticks an INT3 after a RET or an
   indirect branch in order to block speculation after them. Reportedly,
   CPUs do speculate behind such insns.

 - The usual set of cleanups and improvements

* tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  x86/entry_32: Fix segment exceptions
  objtool: Remove .fixup handling
  x86: Remove .fixup section
  x86/word-at-a-time: Remove .fixup usage
  x86/usercopy: Remove .fixup usage
  x86/usercopy_32: Simplify __copy_user_intel_nocache()
  x86/sgx: Remove .fixup usage
  x86/checksum_32: Remove .fixup usage
  x86/vmx: Remove .fixup usage
  x86/kvm: Remove .fixup usage
  x86/segment: Remove .fixup usage
  x86/fpu: Remove .fixup usage
  x86/xen: Remove .fixup usage
  x86/uaccess: Remove .fixup usage
  x86/futex: Remove .fixup usage
  x86/msr: Remove .fixup usage
  x86/extable: Extend extable functionality
  x86/entry_32: Remove .fixup usage
  x86/entry_64: Remove .fixup usage
  x86/copy_mc_64: Remove .fixup usage
  ...
2022-01-12 16:31:19 -08:00
Linus Torvalds
362f533a2a Merge tag 'cxl-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL (Compute Express Link) updates from Dan Williams:
 "The highlight is initial support for CXL memory hotplug. The static
  NUMA node (ACPI SRAT Physical Address to Proximity Domain) information
  known to platform firmware is extended to support the potential
  performance-class / memory-target nodes dynamically created from
  available CXL memory device capacity.

  New unit test infrastructure is added for validating health
  information payloads.

  Fixes to module reload stress and stack usage from exposure in -next
  are included. A symbol rename and some other miscellaneous fixups are
  included as well.

  Summary:

   - Rework ACPI sub-table infrastructure to optionally be used outside
     of __init scenarios and use it for CEDT.CFMWS sub-table parsing.

   - Add support for extending num_possible_nodes by the potential
     hotplug CXL memory ranges

   - Extend tools/testing/cxl with mock memory device health information

   - Fix a module-reload workqueue race

   - Fix excessive stack-frame usage

   - Rename the driver context data structure from "cxl_mem" since that
     name collides with a proposed driver name

   - Use EXPORT_SYMBOL_NS_GPL instead of -DDEFAULT_SYMBOL_NAMESPACE at
     build time"

* tag 'cxl-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core: Remove cxld_const_init in cxl_decoder_alloc()
  cxl/pmem: Fix module reload vs workqueue state
  ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT
  cxl/test: Mock acpi_table_parse_cedt()
  cxl/acpi: Convert CFMWS parsing to ACPI sub-table helpers
  ACPI: Add a context argument for table parsing handlers
  ACPI: Teach ACPI table parsing about the CEDT header format
  ACPI: Keep sub-table parsing infrastructure available for modules
  tools/testing/cxl: add mock output for the GET_HEALTH_INFO command
  cxl/memdev: Remove unused cxlmd field
  cxl/core: Convert to EXPORT_SYMBOL_NS_GPL
  cxl/memdev: Change cxl_mem to a more descriptive name
  cxl/mbox: Remove bad comment
  cxl/pmem: Fix reference counting for delayed work
2022-01-12 15:57:59 -08:00
Linus Torvalds
3acbdbf42e Merge tag 'libnvdimm-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull dax and libnvdimm updates from Dan Williams:
 "The bulk of this is a rework of the dax_operations API after
  discovering the obstacles it posed to the work-in-progress DAX+reflink
  support for XFS and other copy-on-write filesystem mechanics.

  Primarily the need to plumb a block_device through the API to handle
  partition offsets was a sticking point and Christoph untangled that
  dependency in addition to other cleanups to make landing the
  DAX+reflink support easier.

  The DAX_PMEM_COMPAT option has been around for 4 years and not only
  are distributions shipping userspace that understand the current
  configuration API, but some are not even bothering to turn this option
  on anymore, so it seems a good time to remove it per the deprecation
  schedule. Recall that this was added after the device-dax subsystem
  moved from /sys/class/dax to /sys/bus/dax for its sysfs organization.
  All recent functionality depends on /sys/bus/dax.

  Some other miscellaneous cleanups and reflink prep patches are
  included as well.

  Summary:

   - Simplify the dax_operations API:

      - Eliminate bdev_dax_pgoff() in favor of the filesystem
        maintaining and applying a partition offset to all its DAX iomap
        operations.

      - Remove wrappers and device-mapper stacked callbacks for
        ->copy_from_iter() and ->copy_to_iter() in favor of moving
        block_device relative offset responsibility to the
        dax_direct_access() caller.

      - Remove the need for an @bdev in filesystem-DAX infrastructure

      - Remove unused uio helpers copy_from_iter_flushcache() and
        copy_mc_to_iter() as only the non-check_copy_size() versions are
        used for DAX.

   - Prepare XFS for the pending (next merge window) DAX+reflink support

   - Remove deprecated DEV_DAX_PMEM_COMPAT support

   - Cleanup a straggling misuse of the GUID api"

* tag 'libnvdimm-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (38 commits)
  iomap: Fix error handling in iomap_zero_iter()
  ACPI: NFIT: Import GUID before use
  dax: remove the copy_from_iter and copy_to_iter methods
  dax: remove the DAXDEV_F_SYNC flag
  dax: simplify dax_synchronous and set_dax_synchronous
  uio: remove copy_from_iter_flushcache() and copy_mc_to_iter()
  iomap: turn the byte variable in iomap_zero_iter into a ssize_t
  memremap: remove support for external pgmap refcounts
  fsdax: don't require CONFIG_BLOCK
  iomap: build the block based code conditionally
  dax: fix up some of the block device related ifdefs
  fsdax: shift partition offset handling into the file systems
  dax: return the partition offset from fs_dax_get_by_bdev
  iomap: add a IOMAP_DAX flag
  xfs: pass the mapping flags to xfs_bmbt_to_iomap
  xfs: use xfs_direct_write_iomap_ops for DAX zeroing
  xfs: move dax device handling into xfs_{alloc,free}_buftarg
  ext4: cleanup the dax handling in ext4_fill_super
  ext2: cleanup the dax handling in ext2_fill_super
  fsdax: decouple zeroing from the iomap buffered I/O code
  ...
2022-01-12 15:46:11 -08:00
Linus Torvalds
84bfcc0b69 Merge tag 'integrity-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity subsystem updates from Mimi Zohar:
 "The few changes are all kexec related:

   - The MOK keys are loaded onto the .platform keyring in order to
     verify the kexec kernel image signature.

     However, the MOK keys should only be trusted when secure boot is
     enabled. Before loading the MOK keys onto the .platform keyring,
     make sure the system is booted in secure boot mode.

   - When carrying the IMA measurement list across kexec, limit dumping
     the measurement list to when dynamic debug or CONFIG_DEBUG is
     enabled.

   - kselftest: add kexec_file_load selftest support for PowerNV and
     other cleanup"

* tag 'integrity-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  selftests/kexec: Enable secureboot tests for PowerPC
  ima: silence measurement list hexdump during kexec
  selftests/kexec: update searching for the Kconfig
  selftest/kexec: fix "ignored null byte in input" warning
  integrity: Do not load MOK and MOKx when secure boot be disabled
  ima: Fix undefined arch_ima_get_secureboot() and co
2022-01-11 13:11:10 -08:00
Linus Torvalds
c288ea6798 Merge tag 'gpio-updates-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
 "The gpio-sim module is back, this time without any changes to
  configfs. This results in a less elegant user-space interface but I
  never got any follow-up on the committable items and didn't want to
  delay this module for several more months.

  Other than that we have support for several new models and some
  support going away. We started working on converting GPIO drivers to
  using fwnode exclusively in order to limit references to OF symbols to
  gpiolib-of.c exclusively. We also have regular tweaks and improvements
  all over the place.

  Summary:

   - new testing module: gpio-sim that is scheduled to replace
     gpio-mockup

   - initial changes aiming at converting all GPIO drivers to using the
     fwnode interface and limiting any references to OF symbols to
     gpiolib-of.c

   - add support for Tegra234 and Tegra241 to gpio-tegra186

   - add support for new models (SSD201 and SSD202D) to gpio-msc313

   - add basic support for interrupts to gpio-aggregator

   - add support for AMDIF031 HID device to gpio-amdpt

   - drop support for unused platforms in gpio-xlp

   - cleanup leftovers from the removal of the legacy Samsung Exynos
     GPIO driver

   - use raw spinlocks in gpio-aspeed and gpio-aspeed-sgpio to make
     PREEMPT_RT happy

   - generalize the common 'ngpios' device property by reading it in the
     core gpiolib code so that we can remove duplicate reads from
     drivers

   - allow line names from device properties to override names set by
     drivers

   - code shrink in gpiod_add_lookup_table()

   - add new model to the DT bindings for gpio-vf610

   - convert DT bindings for tegra devices to YAML

   - improvements to interrupt handling in gpio-rcar and gpio-rockchip

   - updates to intel drivers from Andy (details in the merge commit)

   - some minor tweaks, improvements and coding-style fixes all around
     the subsystem"

* tag 'gpio-updates-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (59 commits)
  gpio: rcar: Propagate errors from devm_request_irq()
  gpio: rcar: Use platform_get_irq() to get the interrupt
  gpio: ts5500: Use platform_get_irq() to get the interrupt
  gpio: dwapb: Switch to use fwnode instead of of_node
  gpiolib: acpi: make fwnode take precedence in struct gpio_chip
  dt-bindings: gpio: samsung: drop unused bindings
  gpio: max3191x: Use bitmap_free() to free bitmap
  gpio: regmap: Switch to use fwnode instead of of_node
  gpio: tegra186: Add support for Tegra241
  dt-bindings: gpio: Add Tegra241 support
  gpio: brcmstb: Use local variable to access OF node
  gpio: Remove unused local OF node pointers
  gpio: sim: add missing fwnode_handle_put() in gpio_sim_probe()
  gpio: msc313: Add support for SSD201 and SSD202D
  gpio: msc313: Code clean ups
  dt-bindings: gpio: msc313: Add offsets for ssd20xd
  dt-bindings: gpio: msc313: Add compatible for ssd20xd
  gpio: sim: fix uninitialized ret variable
  gpio: Propagate firmware node from a parent device
  gpio: Setup parent device and get rid of unnecessary of_node assignment
  ...
2022-01-11 12:31:35 -08:00
Linus Torvalds
347708875a Merge tag 'platform-drivers-x86-v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
 "Highlights:

  New drivers:
   - asus-tf103c-dock
   - intel_crystal_cove_charger
   - lenovo-yogabook-wmi
   - simatic-ipc platform-code + led driver + watchdog driver
   - x86-android-tablets (kernel module to workaround DSDT bugs on
     these)

  amd-pmc:
   - bug-fixes
   - smar trace buffer support

  asus-wmi:
   - support for custom fan curves

  int3472 (camera info ACPI object for Intel IPU3/SkyCam cameras):
   - ACPI core + int3472 changes to delay enumeration of camera sensor
     I2C clients until the PMIC for the sensor has been fully probed
   - Add support for board data (DSDT info is incomplete) for setting up
     the tps68470 PMIC used on some boards with these cameras
   - Add board data for the Microsoft Surface Go (original, v2 and v3)

  thinkpad_acpi:
   - various cleanups
   - support for forced battery discharging (for battery calibration)
   - support to inhibit battery charging
   - this includes power_supply core changes to add new APIs for this

  think_lmi:
   - enhanced BIOS password support

  various other small fixes and hardware-id additions"

* tag 'platform-drivers-x86-v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (78 commits)
  power: supply: Provide stubs for charge_behaviour helpers
  platform/x86: x86-android-tablets: Fix GPIO lookup leak on error-exit
  platform/x86: int3472: Add board data for Surface Go 3
  platform/x86: Add Asus TF103C dock driver
  platform/x86: x86-android-tablets: Add TM800A550L data
  platform/x86: x86-android-tablets: Add Asus MeMO Pad 7 ME176C data
  platform/x86: x86-android-tablets: Add Asus TF103C data
  platform/x86: x86-android-tablets: Add support for preloading modules
  platform/x86: x86-android-tablets: Add support for registering GPIO lookup tables
  platform/x86: x86-android-tablets: Add support for instantiating serdevs
  platform/x86: x86-android-tablets: Add support for instantiating platform-devs
  platform/x86: x86-android-tablets: Add support for PMIC interrupts
  platform/x86: x86-android-tablets: Don't return -EPROBE_DEFER from a non probe() function
  platform/x86: touchscreen_dmi: Remove the Glavey TM800A550L entry
  platform/x86: touchscreen_dmi: Enable pen support on the Chuwi Hi10 Plus and Pro
  platform/x86: touchscreen_dmi: Correct min/max values for Chuwi Hi10 Pro (CWI529) tablet
  platform/x86: Add intel_crystal_cove_charger driver
  power: supply: fix charge_behaviour attribute initialization
  platform/x86: intel-uncore-frequency: use default_groups in kobj_type
  x86/platform/uv: use default_groups in kobj_type
  ...
2022-01-11 11:26:57 -08:00
Linus Torvalds
1be5bdf8cd Merge tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull KCSAN updates from Paul McKenney:
 "This provides KCSAN fixes and also the ability to take memory barriers
  into account for weakly-ordered systems. This last can increase the
  probability of detecting certain types of data races"

* tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  kcsan: Only test clear_bit_unlock_is_negative_byte if arch defines it
  kcsan: Avoid nested contexts reading inconsistent reorder_access
  kcsan: Turn barrier instrumentation into macros
  kcsan: Make barrier tests compatible with lockdep
  kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
  compiler_attributes.h: Add __disable_sanitizer_instrumentation
  objtool, kcsan: Remove memory barrier instrumentation from noinstr
  objtool, kcsan: Add memory barrier instrumentation to whitelist
  sched, kcsan: Enable memory barrier instrumentation
  mm, kcsan: Enable barrier instrumentation
  x86/qspinlock, kcsan: Instrument barrier of pv_queued_spin_unlock()
  x86/barriers, kcsan: Use generic instrumentation for non-smp barriers
  asm-generic/bitops, kcsan: Add instrumentation for barriers
  locking/atomics, kcsan: Add instrumentation for barriers
  locking/barriers, kcsan: Support generic instrumentation
  locking/barriers, kcsan: Add instrumentation for barriers
  kcsan: selftest: Add test case to check memory barrier instrumentation
  kcsan: Ignore GCC 11+ warnings about TSan runtime support
  kcsan: test: Add test cases for memory barrier instrumentation
  kcsan: test: Match reordered or normal accesses
  ...
2022-01-11 09:51:26 -08:00
Linus Torvalds
1c824bf768 Merge tag 'lkmm.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull memory model documentation updates from Paul McKenney:
 "This series contains documentation and litmus tests for locking,
  courtesy of Boqun Feng"

* tag 'lkmm.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  tools/memory-model: litmus: Add two tests for unlock(A)+lock(B) ordering
  tools/memory-model: doc: Describe the requirement of the litmus-tests directory
  tools/memory-model: Provide extra ordering for unlock+lock pair on the same CPU
2022-01-11 09:38:03 -08:00
Linus Torvalds
e7d38f16c2 Merge tag 'rcu.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:

 - Documentation updates, perhaps most notably Neil Brown's writeup of
   the reference-counting analogy to RCU.

 - Expedited grace-period cleanups.

 - Remove CONFIG_RCU_FAST_NO_HZ due to lack of valid users. I have asked
   around, posted a blog entry, and sent this series to LKML without
   result.

 - Miscellaneous fixes.

 - RCU callback offloading updates, perhaps most notably Frederic
   Weisbecker's updates allowing CPUs booted in the de-offloaded state
   to be offloaded at runtime.

 - nolibc fixes from Willy Tarreau and Anmar Faizi, but also including
   Mark Brown's addition of gettid().

 - RCU Tasks Trace fixes, including changes that increase the
   scalability of call_rcu_tasks_trace() for the BPF folks (Martin Lau
   and KP Singh).

 - Various fixes including those from Wander Lairson Costa and Li
   Zhijian.

 - Fixes plus addition of tests for the increased call_rcu_tasks_trace()
   scalability.

* tag 'rcu.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (87 commits)
  rcu/nocb: Merge rcu_spawn_cpu_nocb_kthread() and rcu_spawn_one_nocb_kthread()
  rcu/nocb: Allow empty "rcu_nocbs" kernel parameter
  rcu/nocb: Create kthreads on all CPUs if "rcu_nocbs=" or "nohz_full=" are passed
  rcu/nocb: Optimize kthreads and rdp initialization
  rcu/nocb: Prepare nocb_cb_wait() to start with a non-offloaded rdp
  rcu/nocb: Remove rcu_node structure from nocb list when de-offloaded
  rcu-tasks: Use fewer callbacks queues if callback flood ends
  rcu-tasks: Use separate ->percpu_dequeue_lim for callback dequeueing
  rcu-tasks: Use more callback queues if contention encountered
  rcu-tasks: Avoid raw-spinlocked wakeups from call_rcu_tasks_generic()
  rcu-tasks: Count trylocks to estimate call_rcu_tasks() contention
  rcu-tasks: Add rcupdate.rcu_task_enqueue_lim to set initial queueing
  rcu-tasks: Make rcu_barrier_tasks*() handle multiple callback queues
  rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations
  rcu-tasks: Abstract invocations of callbacks
  rcu-tasks: Abstract checking of callback lists
  rcu-tasks: Add a ->percpu_enqueue_lim to the rcu_tasks structure
  rcu-tasks: Inspect stalled task's trc state in locked state
  rcu-tasks: Use spin_lock_rcu_node() and friends
  rcutorture: Combine n_max_cbs from all kthreads in a callback flood
  ...
2022-01-11 09:29:44 -08:00
Linus Torvalds
fe2437ccbd Merge tag 'thermal-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
 "These add a new driver for Renesas RZ/G2L TSU, update a few existing
  thermal control drivers and clean up the tmon utility.

  Specifics:

   - Add new TSU driver and DT bindings for the Renesas RZ/G2L platform
     (Biju Das).

   - Fix missing check when calling reset_control_deassert() in the
     rz2gl thermal driver (Biju Das).

   - In preparation for FORTIFY_SOURCE performing compile-time and
     run-time field bounds checking for memcpy(), avoid intentionally
     writing across neighboring fields in the int340x thermal control
     driver (Kees Cook).

   - Fix RFIM mailbox write commands handling in the int340x thermal
     control driver (Sumeet Pawnikar).

   - Fix PM issue occurring in the iMX thermal control driver during
     suspend/resume by implementing PM runtime support in it (Oleksij
     Rempel).

   - Add 'const' annotation to thermal_cooling_ops in the Intel
     powerclamp driver (Rikard Falkeborn).

   - Fix missing ADC bit set in the iMX8MP thermal driver to enable the
     sensor (Paul Gerber).

   - Drop unused local variable definition from tmon (ran jianping)"

* tag 'thermal-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/drivers/int340x: Fix RFIM mailbox write commands
  thermal/drivers/rz2gl: Add error check for reset_control_deassert()
  thermal/drivers/imx8mm: Enable ADC when enabling monitor
  thermal/drivers: Add TSU driver for RZ/G2L
  dt-bindings: thermal: Document Renesas RZ/G2L TSU
  thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_ops
  thermal/drivers/imx: Implement runtime PM support
  thermal: tools: tmon: remove unneeded local variable
  thermal: int340x: Use struct_group() for memcpy() region
2022-01-10 20:43:54 -08:00
Linus Torvalds
8efd0d9c31 Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Defer freeing TCP skbs to the BH handler, whenever possible, or at
     least perform the freeing outside of the socket lock section to
     decrease cross-CPU allocator work and improve latency.

   - Add netdevice refcount tracking to locate sources of netdevice and
     net namespace refcount leaks.

   - Make Tx watchdog less intrusive - avoid pausing Tx and restarting
     all queues from a single CPU removing latency spikes.

   - Various small optimizations throughout the stack from Eric Dumazet.

   - Make netdev->dev_addr[] constant, force modifications to go via
     appropriate helpers to allow us to keep addresses in ordered data
     structures.

   - Replace unix_table_lock with per-hash locks, improving performance
     of bind() calls.

   - Extend skb drop tracepoint with a drop reason.

   - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.

  BPF
  ---

   - New helpers:
      - bpf_find_vma(), find and inspect VMAs for profiling use cases
      - bpf_loop(), runtime-bounded loop helper trading some execution
        time for much faster (if at all converging) verification
      - bpf_strncmp(), improve performance, avoid compiler flakiness
      - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
        for tracing programs, all inlined by the verifier

   - Support BPF relocations (CO-RE) in the kernel loader.

   - Further the support for BTF_TYPE_TAG annotations.

   - Allow access to local storage in sleepable helpers.

   - Convert verifier argument types to a composable form with different
     attributes which can be shared across types (ro, maybe-null).

   - Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
     creating new, extensible ones where missing and deprecating those
     to be removed.

  Protocols
  ---------

   - WiFi (mac80211/cfg80211):
      - notify user space about long "come back in N" AP responses,
        allow it to react to such temporary rejections
      - allow non-standard VHT MCS 10/11 rates
      - use coarse time in airtime fairness code to save CPU cycles

   - Bluetooth:
      - rework of HCI command execution serialization to use a common
        queue and work struct, and improve handling errors reported in
        the middle of a batch of commands
      - rework HCI event handling to use skb_pull_data, avoiding packet
        parsing pitfalls
      - support AOSP Bluetooth Quality Report

   - SMC:
      - support net namespaces, following the RDMA model
      - improve connection establishment latency by pre-clearing buffers
      - introduce TCP ULP for automatic redirection to SMC

   - Multi-Path TCP:
      - support ioctls: SIOCINQ, OUTQ, and OUTQNSD
      - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
        IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
      - support cmsgs: TCP_INQ
      - improvements in the data scheduler (assigning data to subflows)
      - support fastclose option (quick shutdown of the full MPTCP
        connection, similar to TCP RST in regular TCP)

   - MCTP (Management Component Transport) over serial, as defined by
     DMTF spec DSP0253 - "MCTP Serial Transport Binding".

  Driver API
  ----------

   - Support timestamping on bond interfaces in active/passive mode.

   - Introduce generic phylink link mode validation for drivers which
     don't have any quirks and where MAC capability bits fully express
     what's supported. Allow PCS layer to participate in the validation.
     Convert a number of drivers.

   - Add support to set/get size of buffers on the Rx rings and size of
     the tx copybreak buffer via ethtool.

   - Support offloading TC actions as first-class citizens rather than
     only as attributes of filters, improve sharing and device resource
     utilization.

   - WiFi (mac80211/cfg80211):
      - support forwarding offload (ndo_fill_forward_path)
      - support for background radar detection hardware
      - SA Query Procedures offload on the AP side

  New hardware / drivers
  ----------------------

   - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
     real-time requirements for isochronous communication with protocols
     like OPC UA Pub/Sub.

   - Qualcomm BAM-DMUX WWAN - driver for data channels of modems
     integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974
     (qcom_bam_dmux).

   - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver
     with support for bridging, VLANs and multicast forwarding
     (lan966x).

   - iwlmei driver for co-operating between Intel's WiFi driver and
     Intel's Active Management Technology (AMT) devices.

   - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips

   - Bluetooth:
      - MediaTek MT7921 SDIO devices
      - Foxconn MT7922A
      - Realtek RTL8852AE

  Drivers
  -------

   - Significantly improve performance in the datapaths of: lan78xx,
     ax88179_178a, lantiq_xrx200, bnxt.

   - Intel Ethernet NICs:
      - igb: support PTP/time PEROUT and EXTTS SDP functions on
        82580/i354/i350 adapters
      - ixgbevf: new PF -> VF mailbox API which avoids the risk of
        mailbox corruption with ESXi
      - iavf: support configuration of VLAN features of finer
        granularity, stacked tags and filtering
      - ice: PTP support for new E822 devices with sub-ns precision
      - ice: support firmware activation without reboot

   - Mellanox Ethernet NICs (mlx5):
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
      - support TC forwarding when tunnel encap and decap happen between
        two ports of the same NIC
      - dynamically size and allow disabling various features to save
        resources for running in embedded / SmartNIC scenarios

   - Broadcom Ethernet NICs (bnxt):
      - use page frag allocator to improve Rx performance
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool

   - Other Ethernet NICs:
      - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support

   - Microsoft cloud/virtual NIC (mana):
      - add XDP support (PASS, DROP, TX)

   - Mellanox Ethernet switches (mlxsw):
      - initial support for Spectrum-4 ASICs
      - VxLAN with IPv6 underlay

   - Marvell Ethernet switches (prestera):
      - support flower flow templates
      - add basic IP forwarding support

   - NXP embedded Ethernet switches (ocelot & felix):
      - support Per-Stream Filtering and Policing (PSFP)
      - enable cut-through forwarding between ports by default
      - support FDMA to improve packet Rx/Tx to CPU

   - Other embedded switches:
      - hellcreek: improve trapping management (STP and PTP) packets
      - qca8k: support link aggregation and port mirroring

   - Qualcomm 802.11ax WiFi (ath11k):
      - qca6390, wcn6855: enable 802.11 power save mode in station mode
      - BSS color change support
      - WCN6855 hw2.1 support
      - 11d scan offload support
      - scan MAC address randomization support
      - full monitor mode, only supported on QCN9074
      - qca6390/wcn6855: report signal and tx bitrate
      - qca6390: rfkill support
      - qca6390/wcn6855: regdb.bin support

   - Intel WiFi (iwlwifi):
      - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
        in cooperation with the BIOS
      - support for Optimized Connectivity Experience (OCE) scan
      - support firmware API version 68
      - lots of preparatory work for the upcoming Bz device family

   - MediaTek WiFi (mt76):
      - Specific Absorption Rate (SAR) support
      - mt7921: 160 MHz channel support

   - RealTek WiFi (rtw88):
      - Specific Absorption Rate (SAR) support
      - scan offload

   - Other WiFi NICs
      - ath10k: support fetching (pre-)calibration data from nvmem
      - brcmfmac: configure keep-alive packet on suspend
      - wcn36xx: beacon filter support"

* tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits)
  tcp: tcp_send_challenge_ack delete useless param `skb`
  net/qla3xxx: Remove useless DMA-32 fallback configuration
  rocker: Remove useless DMA-32 fallback configuration
  hinic: Remove useless DMA-32 fallback configuration
  lan743x: Remove useless DMA-32 fallback configuration
  net: enetc: Remove useless DMA-32 fallback configuration
  cxgb4vf: Remove useless DMA-32 fallback configuration
  cxgb4: Remove useless DMA-32 fallback configuration
  cxgb3: Remove useless DMA-32 fallback configuration
  bnx2x: Remove useless DMA-32 fallback configuration
  et131x: Remove useless DMA-32 fallback configuration
  be2net: Remove useless DMA-32 fallback configuration
  vmxnet3: Remove useless DMA-32 fallback configuration
  bna: Simplify DMA setting
  net: alteon: Simplify DMA setting
  myri10ge: Simplify DMA setting
  qlcnic: Simplify DMA setting
  net: allwinner: Fix print format
  page_pool: remove spinlock in page_pool_refill_alloc_cache()
  amt: fix wrong return type of amt_send_membership_update()
  ...
2022-01-10 19:06:09 -08:00
Linus Torvalds
bf4eebf8cf Merge tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
 "This consists of several fixes and enhancements. A few highlights:

   - Option --kconfig_add option allows easily tweaking kunitconfigs

   - make build subcommand can reconfigure if needed

   - doesn't error on tests without test plans

   - doesn't crash if no parameters are generated

   - defaults --jobs to # of cups

   - reports test parameter results as (K)TAP subtests"

* tag 'linux-kselftest-kunit-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: Default --jobs to number of CPUs
  kunit: tool: fix newly introduced typechecker errors
  kunit: tool: make `build` subcommand also reconfigure if needed
  kunit: tool: delete kunit_parser.TestResult type
  kunit: tool: use dataclass instead of collections.namedtuple
  kunit: tool: suggest using decode_stacktrace.sh on kernel crash
  kunit: tool: reconfigure when the used kunitconfig changes
  kunit: tool: revamp message for invalid kunitconfig
  kunit: tool: add --kconfig_add to allow easily tweaking kunitconfigs
  kunit: tool: move Kconfig read_from_file/parse_from_string to package-level
  kunit: tool: print parsed test results fully incrementally
  kunit: Report test parameter results as (K)TAP subtests
  kunit: Don't crash if no parameters are generated
  kunit: tool: Report an error if any test has no subtests
  kunit: tool: Do not error on tests without test plans
  kunit: add run_checks.py script to validate kunit changes
  Documentation: kunit: remove claims that kunit is a mocking framework
  kunit: tool: fix --json output for skipped tests
2022-01-10 12:16:48 -08:00
Linus Torvalds
4369b3cec2 Merge tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest update from Shuah Khan:
 "Fixes to build errors, false negatives, and several code cleanups,
  including the ARRAY_SIZE cleanup that removes 25+ duplicates
  ARRAY_SIZE defines from individual tests"

* tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vm: remove ARRAY_SIZE define from individual tests
  selftests/timens: remove ARRAY_SIZE define from individual tests
  selftests/sparc64: remove ARRAY_SIZE define from adi-test
  selftests/seccomp: remove ARRAY_SIZE define from seccomp_benchmark
  selftests/rseq: remove ARRAY_SIZE define from individual tests
  selftests/net: remove ARRAY_SIZE define from individual tests
  selftests/landlock: remove ARRAY_SIZE define from common.h
  selftests/ir: remove ARRAY_SIZE define from ir_loopback.c
  selftests/core: remove ARRAY_SIZE define from close_range_test.c
  selftests/cgroup: remove ARRAY_SIZE define from cgroup_util.h
  selftests/arm64: remove ARRAY_SIZE define from vec-syscfg.c
  tools: fix ARRAY_SIZE defines in tools and selftests hdrs
  selftests: cgroup: build error multiple outpt files
  selftests/move_mount_set_group remove unneeded conversion to bool
  selftests/mount: remove unneeded conversion to bool
  selftests: harness: avoid false negatives if test has no ASSERTs
  selftests/ftrace: make kprobe profile testcase description unique
  selftests: clone3: clone3: add case CLONE3_ARGS_NO_TEST
  selftests: timers: Remove unneeded semicolon
  kselftests: timers:Remove unneeded semicolon
2022-01-10 12:08:12 -08:00
Linus Torvalds
9d3a1e0a88 Merge tag 'seccomp-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
 "The core seccomp code hasn't changed for this cycle, but the selftests
  were improved while helping to debug the recent signal handling
  refactoring work Eric did.

  Summary:

   - Improve seccomp selftests in support of signal handler refactoring
     (Kees Cook)"

* tag 'seccomp-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests/seccomp: Report event mismatches more clearly
  selftests/seccomp: Stop USER_NOTIF test if kcmp() fails
2022-01-10 11:50:57 -08:00
Linus Torvalds
bfed6efb8e Merge tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Borislav Petkov:

 - Add support for handling hw errors in SGX pages: poisoning,
   recovering from poison memory and error injection into SGX pages

 - A bunch of changes to the SGX selftests to simplify and allow of SGX
   features testing without the need of a whole SGX software stack

 - Add a sysfs attribute which is supposed to show the amount of SGX
   memory in a NUMA node, similar to what /proc/meminfo is to normal
   memory

 - The usual bunch of fixes and cleanups too

* tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/sgx: Fix NULL pointer dereference on non-SGX systems
  selftests/sgx: Fix corrupted cpuid macro invocation
  x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node
  x86/sgx: Fix minor documentation issues
  selftests/sgx: Add test for multiple TCS entry
  selftests/sgx: Enable multiple thread support
  selftests/sgx: Add page permission and exception test
  selftests/sgx: Rename test properties in preparation for more enclave tests
  selftests/sgx: Provide per-op parameter structs for the test enclave
  selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed
  selftests/sgx: Move setup_test_encl() to each TEST_F()
  selftests/sgx: Encpsulate the test enclave creation
  selftests/sgx: Dump segments and /proc/self/maps only on failure
  selftests/sgx: Create a heap for the test enclave
  selftests/sgx: Make data measurement for an enclave segment optional
  selftests/sgx: Assign source for each segment
  selftests/sgx: Fix a benign linker warning
  x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
  x86/sgx: Add hook to error injection address validation
  x86/sgx: Hook arch_memory_failure() into mainline code
  ...
2022-01-10 09:44:09 -08:00
Linus Torvalds
9b9e211360 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:

 - KCSAN enabled for arm64.

 - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD.

 - Some more SVE clean-ups and refactoring in preparation for SME
   support (scalable matrix extensions).

 - BTI clean-ups (SYM_FUNC macros etc.)

 - arm64 atomics clean-up and codegen improvements.

 - HWCAPs for FEAT_AFP (alternate floating point behaviour) and
   FEAT_RPRESS (increased precision of reciprocal estimate and
   reciprocal square root estimate).

 - Use SHA3 instructions to speed-up XOR.

 - arm64 unwind code refactoring/unification.

 - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP ==
   1 (potentially set by a hypervisor; user-space already does this).

 - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU,
   Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups.

 - Other fixes and clean-ups; highlights: fix the handling of erratum
   1418040, correct the calculation of the nomap region boundaries,
   introduce io_stop_wc() mapped to the new DGH instruction (data
   gathering hint).

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
  arm64: Use correct method to calculate nomap region boundaries
  arm64: Drop outdated links in comments
  arm64: perf: Don't register user access sysctl handler multiple times
  drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
  perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
  arm64: errata: Fix exec handling in erratum 1418040 workaround
  arm64: Unhash early pointer print plus improve comment
  asm-generic: introduce io_stop_wc() and add implementation for ARM64
  arm64: Ensure that the 'bti' macro is defined where linkage.h is included
  arm64: remove __dma_*_area() aliases
  docs/arm64: delete a space from tagged-address-abi
  arm64: Enable KCSAN
  kselftest/arm64: Add pidbench for floating point syscall cases
  arm64/fp: Add comments documenting the usage of state restore functions
  kselftest/arm64: Add a test program to exercise the syscall ABI
  kselftest/arm64: Allow signal tests to trigger from a function
  kselftest/arm64: Parameterise ptrace vector length information
  arm64/sve: Minor clarification of ABI documentation
  arm64/sve: Generalise vector length configuration prctl() for SME
  arm64/sve: Make sysctl interface for SVE reusable by SME
  ...
2022-01-10 08:49:37 -08:00
Jakub Kicinski
8aaaf2f3af Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in fixes directly in prep for the 5.17 merge window.
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09 17:00:17 -08:00
Jakub Kicinski
77bbcb60f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next. This
includes one patch to update ovs and act_ct to use nf_ct_put() instead
of nf_conntrack_put().

1) Add netns_tracker to nfnetlink_log and masquerade, from Eric Dumazet.

2) Remove redundant rcu read-size lock in nf_tables packet path.

3) Replace BUG() by WARN_ON_ONCE() in nft_payload.

4) Consolidate rule verdict tracing.

5) Replace WARN_ON() by WARN_ON_ONCE() in nf_tables core.

6) Make counter support built-in in nf_tables.

7) Add new field to conntrack object to identify locally generated
   traffic, from Florian Westphal.

8) Prevent NAT from shadowing well-known ports, from Florian Westphal.

9) Merge nf_flow_table_{ipv4,ipv6} into nf_flow_table_inet, also from
   Florian.

10) Remove redundant pointer in nft_pipapo AVX2 support, from Colin Ian King.

11) Replace opencoded max() in conntrack, from Jiapeng Chong.

12) Update conntrack to use refcount_t API, from Florian Westphal.

13) Move ip_ct_attach indirection into the nf_ct_hook structure.

14) Constify several pointer object in the netfilter codebase,
    from Florian Westphal.

15) Tree-wide replacement of nf_conntrack_put() by nf_ct_put(), also
    from Florian.

16) Fix egress splat due to incorrect rcu notation, from Florian.

17) Move stateful fields of connlimit, last, quota, numgen and limit
    out of the expression data area.

18) Build a blob to represent the ruleset in nf_tables, this is a
    requirement of the new register tracking infrastructure.

19) Add NFT_REG32_NUM to define the maximum number of 32-bit registers.

20) Add register tracking infrastructure to skip redundant
    store-to-register operations, this includes support for payload,
    meta and bitwise expresssions.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next: (32 commits)
  netfilter: nft_meta: cancel register tracking after meta update
  netfilter: nft_payload: cancel register tracking after payload update
  netfilter: nft_bitwise: track register operations
  netfilter: nft_meta: track register operations
  netfilter: nft_payload: track register operations
  netfilter: nf_tables: add register tracking infrastructure
  netfilter: nf_tables: add NFT_REG32_NUM
  netfilter: nf_tables: add rule blob layout
  netfilter: nft_limit: move stateful fields out of expression data
  netfilter: nft_limit: rename stateful structure
  netfilter: nft_numgen: move stateful fields out of expression data
  netfilter: nft_quota: move stateful fields out of expression data
  netfilter: nft_last: move stateful fields out of expression data
  netfilter: nft_connlimit: move stateful fields out of expression data
  netfilter: egress: avoid a lockdep splat
  net: prefer nf_ct_put instead of nf_conntrack_put
  netfilter: conntrack: avoid useless indirection during conntrack destruction
  netfilter: make function op structures const
  netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
  netfilter: conntrack: convert to refcount_t api
  ...
====================

Link: https://lore.kernel.org/r/20220109231640.104123-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09 15:59:23 -08:00
Linus Torvalds
9a12a5aa17 Merge tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Revert "libtraceevent: Increase libtraceevent logging when verbose",
   breaks the build with libtraceevent-1.3.0, i.e. when building with
   'LIBTRACEEVENT_DYNAMIC=1'.

 - Avoid early exit in 'perf trace' due to running SIGCHLD handler
   before it makes sense to. It can happen when using a BPF source code
   event that have to be first built into an object file.

* tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  Revert "libtraceevent: Increase libtraceevent logging when verbose"
  perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to
2022-01-09 10:37:07 -08:00
Paolo Abeni
327b9a94e2 selftests: mptcp: more stable join tests-cases
MPTCP join self-tests are a bit fragile as they reply on
delays instead of events to catch-up with the expected
sockets states.

Replace the delay with state checking where possible and
reduce the number of sleeps in the most complex scenarios.

This will both reduce the tests run-time and will improve
stability.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07 19:00:43 -08:00
Arnaldo Carvalho de Melo
dc9f2dd5de Revert "libtraceevent: Increase libtraceevent logging when verbose"
This reverts commit 08efcb4a63.

This breaks the build as it will prefer using libbpf-devel header files,
even when not using LIBBPF_DYNAMIC=1, breaking the build.

This was detected on OpenSuSE Tumbleweed with libtraceevent-devel 1.3.0,
as described by Jiri Slaby:

=======================================================================
It breaks build with LIBTRACEEVENT_DYNAMIC and version 1.3.0:
> util/debug.c: In function ‘perf_debug_option’:
> util/debug.c:243:17: error: implicit declaration of function
‘tep_set_loglevel’ [-Werror=implicit-function-declaration]
>   243 |                 tep_set_loglevel(TEP_LOG_INFO);
>       |                 ^~~~~~~~~~~~~~~~
> util/debug.c:243:34: error: ‘TEP_LOG_INFO’ undeclared (first use in this
function); did you mean ‘TEP_PRINT_INFO’?
>   243 |                 tep_set_loglevel(TEP_LOG_INFO);
>       |                                  ^~~~~~~~~~~~
>       |                                  TEP_PRINT_INFO
> util/debug.c:243:34: note: each undeclared identifier is reported only once
for each function it appears in
> util/debug.c:245:34: error: ‘TEP_LOG_DEBUG’ undeclared (first use in this
function)
>   245 |                 tep_set_loglevel(TEP_LOG_DEBUG);
>       |                                  ^~~~~~~~~~~~~
> util/debug.c:247:34: error: ‘TEP_LOG_ALL’ undeclared (first use in this
function)
>   247 |                 tep_set_loglevel(TEP_LOG_ALL);
>       |                                  ^~~~~~~~~~~

It is because the gcc's command line looks like:
gcc
...
-I/home/abuild/rpmbuild/BUILD/tools/lib/
...
-DLIBTRACEEVENT_VERSION=65790
...
=======================================================================

The proper way to fix this is more involved and so not suitable for this
late in the 5.16-rc stage.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/lkml/bc2b0786-8965-1bcd-2316-9d9bb37b9c31@kernel.org
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/YddGjjmlMZzxUZbN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-07 16:08:50 -03:00
Jiri Olsa
f06a82f9d3 perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to
When running 'perf trace' with an BPF object like:

  # perf trace -e openat,tools/perf/examples/bpf/hello.c

the event parsing eventually calls llvm__get_kbuild_opts() that runs a
script and that ends up with SIGCHLD delivered to the 'perf trace'
handler, which assumes the workload process is done and quits 'perf
trace'.

Move the SIGCHLD handler setup directly to trace__run(), where the event
is parsed and the object is already compiled.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Christy Lee <christyc.y.lee@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220106222030.227499-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-07 15:45:19 -03:00
Paolo Abeni
46e967d187 selftests: mptcp: add tests for subflow creation failure
Verify that, when multiple endpoints are available, subflows
creation proceed even when the first additional subflow creation
fails - due to packet drop on the relevant link

Co-developed-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07 11:27:07 +00:00
Paolo Abeni
86e39e0448 mptcp: keep track of local endpoint still available for each msk
Include into the path manager status a bitmap tracking the list
of local endpoints still available - not yet used - for the
relevant mptcp socket.

Keep such map updated at endpoint creation/deletion time, so
that we can easily skip already used endpoint at local address
selection time.

The endpoint used by the initial subflow is lazyly accounted at
subflow creation time: the usage bitmap is be up2date before
endpoint selection and we avoid such unneeded task in some relevant
scenarios - e.g. busy servers accepting incoming subflows but
not creating any additional ones nor annuncing additional addresses.

Overall this allows for fair local endpoints usage in case of
subflow failure.

As a side effect, this patch also enforces that each endpoint
is used at most once for each mptcp connection.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07 11:27:07 +00:00
Paolo Abeni
05be5e273c selftests: mptcp: add disconnect tests
Performs several disconnect/reconnect on the same socket,
ensuring the overall transfer is succesful.

The new test leverages ioctl(SIOCOUTQ) to ensure all the
pending data is acked before disconnecting.

Additionally order alphabetically the test program arguments list
for better maintainability.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07 11:27:07 +00:00
Jakub Kicinski
29507144c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Refcount leak in ipt_CLUSTERIP rule loading path, from Xin Xiong.

2) Use socat in netfilter selftests, from Hangbin Liu.

3) Skip layer checksum 4 update for IP fragments.

4) Missing allocation of pcpu scratch maps on clone in
   nft_set_pipapo, from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: nft_set_pipapo: allocate pcpu scratch maps on clone
  netfilter: nft_payload: do not update layer 4 checksum when mangling fragments
  selftests: netfilter: switch to socat for tests using -q option
  netfilter: ipt_CLUSTERIP: fix refcount leak in clusterip_tg_check()
====================

Link: https://lore.kernel.org/r/20220106215139.170824-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06 18:37:45 -08:00
Jakub Kicinski
257367c0c9 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-01-06

We've added 41 non-merge commits during the last 2 day(s) which contain
a total of 36 files changed, 1214 insertions(+), 368 deletions(-).

The main changes are:

1) Various fixes in the verifier, from Kris and Daniel.

2) Fixes in sockmap, from John.

3) bpf_getsockopt fix, from Kuniyuki.

4) INET_POST_BIND fix, from Menglong.

5) arm64 JIT fix for bpf pseudo funcs, from Hou.

6) BPF ISA doc improvements, from Christoph.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
  bpf: selftests: Add bind retry for post_bind{4, 6}
  bpf: selftests: Use C99 initializers in test_sock.c
  net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()
  bpf/selftests: Test bpf_d_path on rdonly_mem.
  libbpf: Add documentation for bpf_map batch operations
  selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3
  xdp: Add xdp_do_redirect_frame() for pre-computed xdp_frames
  xdp: Move conversion to xdp_frame out of map functions
  page_pool: Store the XDP mem id
  page_pool: Add callback to init pages when they are allocated
  xdp: Allow registering memory model without rxq reference
  samples/bpf: xdpsock: Add timestamp for Tx-only operation
  samples/bpf: xdpsock: Add time-out for cleaning Tx
  samples/bpf: xdpsock: Add sched policy and priority support
  samples/bpf: xdpsock: Add cyclic TX operation capability
  samples/bpf: xdpsock: Add clockid selection support
  samples/bpf: xdpsock: Add Dest and Src MAC setting for Tx-only operation
  samples/bpf: xdpsock: Add VLAN support for Tx-only operation
  libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
  libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
  ...
====================

Link: https://lore.kernel.org/r/20220107013626.53943-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06 18:07:26 -08:00
Menglong Dong
f734248174 bpf: selftests: Add bind retry for post_bind{4, 6}
With previous patch, kernel is able to 'put_port' after sys_bind()
fails. Add the test for that case: rebind another port after
sys_bind() fails. If the bind success, it means previous bind
operation is already undoed.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220106132022.3470772-4-imagedong@tencent.com
2022-01-06 17:08:35 -08:00
Menglong Dong
6fd92c7f0c bpf: selftests: Use C99 initializers in test_sock.c
Use C99 initializers for the initialization of 'tests' in test_sock.c.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220106132022.3470772-3-imagedong@tencent.com
2022-01-06 17:08:35 -08:00
Hao Luo
44bab87d8c bpf/selftests: Test bpf_d_path on rdonly_mem.
The second parameter of bpf_d_path() can only accept writable
memories. Rdonly_mem obtained from bpf_per_cpu_ptr() can not
be passed into bpf_d_path for modification. This patch adds
a selftest to verify this behavior.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220106205525.2116218-1-haoluo@google.com
2022-01-06 15:20:49 -08:00
Grant Seltzer
e59618f0f4 libbpf: Add documentation for bpf_map batch operations
This adds documention for:

- bpf_map_delete_batch()
- bpf_map_lookup_batch()
- bpf_map_lookup_and_delete_batch()
- bpf_map_update_batch()

This also updates the public API for the `keys` parameter
of `bpf_map_delete_batch()`, and both the
`keys` and `values` parameters of `bpf_map_update_batch()`
to be constants.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220106201304.112675-1-grantseltzer@gmail.com
2022-01-06 15:12:42 -08:00
Andrii Nakryiko
70bc793382 selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3
PT_REGS*() macro on some architectures force-cast struct pt_regs to
other types (user_pt_regs, etc) and might drop volatile modifiers, if any.
Volatile isn't really required as pt_regs value isn't supposed to change
during the BPF program run, so this is correct behavior.

But progs/loop3.c relies on that volatile modifier to ensure that loop
is preserved. Fix loop3.c by declaring i and sum variables as volatile
instead. It preserves the loop and makes the test pass on all
architectures (including s390x which is currently broken).

Fixes: 3cc31d7940 ("libbpf: Normalize PT_REGS_xxx() macro definitions")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220106205156.955373-1-andrii@kernel.org
2022-01-06 22:25:53 +01:00
Tejun Heo
bf35a7879f selftests: cgroup: Test open-time cgroup namespace usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
613e040e4d selftests: cgroup: Test open-time credential usage for migration checks
When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.

Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Tejun Heo
b09c2baa56 selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.

Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06 11:02:29 -10:00
Coco Li
eac1b93c14 gro: add ability to control gro max packet size
Eric Dumazet suggested to allow users to modify max GRO packet size.

We have seen GRO being disabled by users of appliances (such as
wifi access points) because of claimed bufferbloat issues,
or some work arounds in sch_cake, to split GRO/GSO packets.

Instead of disabling GRO completely, one can chose to limit
the maximum packet size of GRO packets, depending on their
latency constraints.

This patch adds a per device gro_max_size attribute
that can be changed with ip link command.

ip link set dev eth0 gro_max_size 16000

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06 12:27:05 +00:00
Hangbin Liu
1585f590a2 selftests: netfilter: switch to socat for tests using -q option
The nc cmd(nmap-ncat) that distributed with Fedora/Red Hat does not have
option -q. This make some tests failed with:

	nc: invalid option -- 'q'

Let's switch to socat which is far more dependable.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-06 10:43:18 +01:00
Miroslav Lichvar
87eee9c558 testptp: set pin function before other requests
When the -L option of the testptp utility is specified with other
options (e.g. -p to enable PPS output), the user probably wants to
apply it to the pin configured by the -L option.

Reorder the code to set the pin function before other function requests
to avoid confusing users.

Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20220105152506.3256026-1-mlichvar@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 17:08:58 -08:00
Christy Lee
5f60826428 libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
API created with simplistic assumptions about BPF map definitions.
It hasn’t worked for a while, deprecate it in preparation for
libbpf 1.0.

  [0] Closes: https://github.com/libbpf/libbpf/issues/302

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220105003120.2222673-1-christylee@fb.com
2022-01-05 16:11:32 -08:00
Christy Lee
9855c131b9 libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
Deprecate bpf_map__is_offload_neutral(). It’s most probably broken
already. PERF_EVENT_ARRAY isn’t the only map that’s not suitable
for hardware offloading. Applications can directly check map type
instead.

  [0] Closes: https://github.com/libbpf/libbpf/issues/306

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220105000601.2090044-1-christylee@fb.com
2022-01-05 16:09:06 -08:00
Qiang Wang
51a33c60f1 libbpf: Support repeated legacy kprobes on same function
If repeated legacy kprobes on same function in one process,
libbpf will register using the same probe name and got -EBUSY
error. So append index to the probe name format to fix this
problem.

Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211227130713.66933-2-wangqiang.wq.frank@bytedance.com
2022-01-05 15:38:21 -08:00
Qiang Wang
71cff670ba libbpf: Use probe_name for legacy kprobe
Fix a bug in commit 46ed5fc33d, which wrongly used the
func_name instead of probe_name to register legacy kprobe.

Fixes: 46ed5fc33d ("libbpf: Refactor and simplify legacy kprobe code")
Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211227130713.66933-1-wangqiang.wq.frank@bytedance.com
2022-01-05 15:38:21 -08:00
Christy Lee
7218c28c87 libbpf: Deprecate bpf_perf_event_read_simple() API
With perf_buffer__poll() and perf_buffer__consume() APIs available,
there is no reason to expose bpf_perf_event_read_simple() API to
users. If users need custom perf buffer, they could re-implement
the function.

Mark bpf_perf_event_read_simple() and move the logic to a new
static function so it can still be called by other functions in the
same file.

  [0] Closes: https://github.com/libbpf/libbpf/issues/310

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211229204156.13569-1-christylee@fb.com
2022-01-05 15:27:43 -08:00
Jakub Kicinski
b9adba350a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 14:36:10 -08:00
Linus Torvalds
75acfdb6fd Merge tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski"
 "Networking fixes, including fixes from bpf, and WiFi. One last pull
  request, turns out some of the recent fixes did more harm than good.

  Current release - regressions:

   - Revert "xsk: Do not sleep in poll() when need_wakeup set", made the
     problem worse

   - Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in
     __fixed_phy_register", broke EPROBE_DEFER handling

   - Revert "net: usb: r8152: Add MAC pass-through support for more
     Lenovo Docks", broke setups without a Lenovo dock

  Current release - new code bugs:

   - selftests: set amt.sh executable

  Previous releases - regressions:

   - batman-adv: mcast: don't send link-local multicast to mcast routers

  Previous releases - always broken:

   - ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY

   - sctp: hold endpoint before calling cb in
     sctp_transport_lookup_process

   - mac80211: mesh: embed mesh_paths and mpp_paths into
     ieee80211_if_mesh to avoid complicated handling of sub-object
     allocation failures

   - seg6: fix traceroute in the presence of SRv6

   - tipc: fix a kernel-infoleak in __tipc_sendmsg()"

* tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
  selftests: set amt.sh executable
  Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks"
  sfc: The RX page_ring is optional
  iavf: Fix limit of total number of queues to active queues of VF
  i40e: Fix incorrect netdev's real number of RX/TX queues
  i40e: Fix for displaying message regarding NVM version
  i40e: fix use-after-free in i40e_sync_filters_subtask()
  i40e: Fix to not show opcode msg on unsuccessful VF MAC change
  ieee802154: atusb: fix uninit value in atusb_set_extended_addr
  mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
  mac80211: initialize variable have_higher_than_11mbit
  sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
  netrom: fix copying in user data in nr_setsockopt
  udp6: Use Segment Routing Header for dest address if present
  icmp: ICMPV6: Examine invoking packet for Segment Route Headers.
  seg6: export get_srh() for ICMP handling
  Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register"
  ipv6: Do cleanup if attribute validation fails in multipath route
  ipv6: Continue processing multipath route even if gateway attribute is invalid
  net/fsl: Remove leftover definition in xgmac_mdio
  ...
2022-01-05 14:08:56 -08:00
Daniel Borkmann
ca796fe66f bpf, selftests: Add verifier test for mem_or_null register with offset.
Add a new test case with mem_or_null typed register with off > 0 to ensure
it gets rejected by the verifier:

  # ./test_verifier 1011
  #1009/u check with invalid reg offset 0 OK
  #1009/p check with invalid reg offset 0 OK
  Summary: 2 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-05 12:00:19 -08:00
Taehee Yoo
db54c12a3d selftests: set amt.sh executable
amt.sh test script will not work because it doesn't have execution
permission. So, it adds execution permission.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: c08e8baea7 ("selftests: add amt interface selftest script")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220105144436.13415-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 10:27:19 -08:00
Nageswara R Sastry
65e38e32a9 selftests/kexec: Enable secureboot tests for PowerPC
Existing test cases determine secureboot state using efi variable, which
is available only on x86 architecture.  Add support for determining
secureboot state using device tree property on PowerNV architecture.

Signed-off-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2022-01-05 11:44:57 -05:00
Jiri Olsa
5e22dd1862 bpf/selftests: Fix namespace mount setup in tc_redirect
The tc_redirect umounts /sys in the new namespace, which can be
mounted as shared and cause global umount. The lazy umount also
takes down mounted trees under /sys like debugfs, which won't be
available after sysfs mounts again and could cause fails in other
tests.

  # cat /proc/self/mountinfo | grep debugfs
  34 23 0:7 / /sys/kernel/debug rw,nosuid,nodev,noexec,relatime shared:14 - debugfs debugfs rw
  # cat /proc/self/mountinfo | grep sysfs
  23 86 0:22 / /sys rw,nosuid,nodev,noexec,relatime shared:2 - sysfs sysfs rw
  # mount | grep debugfs
  debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)

  # ./test_progs -t tc_redirect
  #164 tc_redirect:OK
  Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED

  # mount | grep debugfs
  # cat /proc/self/mountinfo | grep debugfs
  # cat /proc/self/mountinfo | grep sysfs
  25 86 0:22 / /sys rw,relatime shared:2 - sysfs sysfs rw

Making the sysfs private under the new namespace so the umount won't
trigger the global sysfs umount.

Reported-by: Hangbin Liu <haliu@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jussi Maki <joamaki@gmail.com>
Link: https://lore.kernel.org/bpf/20220104121030.138216-1-jolsa@kernel.org
2022-01-05 13:35:18 +01:00
Paul Chaignon
0fd800b245 bpftool: Probe for instruction set extensions
This patch introduces new probes to check whether the kernel supports
instruction set extensions v2 and v3. The first introduced eBPF
instructions BPF_J{LT,LE,SLT,SLE} in commit 92b31a9af7 ("bpf: add
BPF_J{LT,LE,SLT,SLE} instructions"). The second introduces 32-bit
variants of all jump instructions in commit 092ed0968b ("bpf:
verifier support JMP32").

These probes are useful for userspace BPF projects that want to use newer
instruction set extensions on newer kernels, to reduce the programs'
sizes or their complexity. LLVM already provides an mcpu=probe option to
automatically probe the kernel and select the newest-supported
instruction set extension. That is however not flexible enough for all
use cases. For example, in Cilium, we only want to use the v3
instruction set extension on v5.10+, even though it is supported on all
kernels v5.1+.

Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/3bfedcd9898c1f41ac67ca61f144fec84c6c3a92.1641314075.git.paul@isovalent.com
2022-01-05 13:31:40 +01:00