Commit Graph

19748 Commits

Author SHA1 Message Date
Joel Granados
78eb4ea25c sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata, which will ensure that proc_handler function
pointers cannot be modified.

This patch has been generated by the following coccinelle script:

```
  virtual patch

  @r1@
  identifier ctl, write, buffer, lenp, ppos;
  identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

  @r2@
  identifier func, ctl, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos)
  { ... }

  @r3@
  identifier func;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int , void *, size_t *, loff_t *);

  @r4@
  identifier func, ctl;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int , void *, size_t *, loff_t *);

  @r5@
  identifier func, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
  conventions. The xfs_stats_clear_proc_handler,
  xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax were
  adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
  This is called from a proc_handler itself and is calling back into
  another proc_handler, making it necessary to change it as part of the
  proc_handler migration.
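
As an illustration of what the signature change enables, here is a minimal userspace sketch, not kernel code. The `struct ctl_table` below is a cut-down stand-in with only a few fields, the handler type omits the `loff_t *ppos` argument for brevity, and `demo_handler`, `demo_table`, `demo_value` and `read_via_table` are hypothetical names. Once every handler accepts a `const struct ctl_table *`, the table itself can be declared `const`, which is what allows a toolchain to place it in read-only data.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the kernel's struct ctl_table (assumed fields). */
struct ctl_table;
typedef int proc_handler_fn(const struct ctl_table *ctl, int write,
                            void *buffer, size_t *lenp);

struct ctl_table {
    const char *procname;
    void *data;                    /* points at the tunable's storage */
    size_t maxlen;
    proc_handler_fn *proc_handler;
};

static int demo_value = 42;

/* The handler only reads the table entry, so a const pointer is enough. */
static int demo_handler(const struct ctl_table *ctl, int write,
                        void *buffer, size_t *lenp)
{
    if (*lenp < ctl->maxlen)
        return -1;
    if (write)
        memcpy(ctl->data, buffer, ctl->maxlen);
    else
        memcpy(buffer, ctl->data, ctl->maxlen);
    *lenp = ctl->maxlen;
    return 0;
}

/* No handler needs a mutable table any more, so the table can be const. */
static const struct ctl_table demo_table = {
    .procname = "demo",
    .data = &demo_value,
    .maxlen = sizeof(demo_value),
    .proc_handler = demo_handler,
};

/* Read the tunable through the table entry, as a caller would. */
int read_via_table(int *out)
{
    size_t len = sizeof(*out);

    return demo_table.proc_handler(&demo_table, 0, out, &len);
}
```

With the old non-const signature, passing `&demo_table` to the handler would discard the `const` qualifier and draw a compiler diagnostic, which is why the handlers had to be converted first.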

Co-developed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Co-developed-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-07-24 20:59:29 +02:00
Linus Torvalds
91bd008d4e Probes updates for v6.11:

Merge tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:
 "Uprobes:

   - x86/shstk: Make return uprobe work with shadow stack

   - Add uretprobe syscall which speeds up the uretprobe by 10-30%.
     This syscall is automatically used from user-space trampolines
     which are generated by the uretprobe. If this syscall is used by
     a normal user program, it will cause SIGILL. Note that this is
     currently only implemented on x86_64.

     (This also has two fixes for adjusting the syscall number to avoid
     conflict with new *attrat syscalls.)

   - uprobes/perf: fix user stack traces in the presence of pending
     uretprobe. This replaces the uretprobe's trampoline address in the
     stack trace with the correct return address

   - selftests/x86: Add a return uprobe with shadow stack test

   - selftests/bpf: Add uretprobe syscall related tests.
      - test case for register integrity check
      - test case with register changing case
      - test case for uretprobe syscall without uprobes (expected to fail)
      - test case for uretprobe with shadow stack

   - selftests/bpf: add test validating uprobe/uretprobe stack traces

   - MAINTAINERS: Add uprobes entry. This does not specify the tree,
     but clarifies who maintains and reviews uprobes

  Kprobes:

   - tracing/kprobes: Test case cleanups.

     Replace redundant WARN_ON_ONCE() + pr_warn() with WARN_ONCE() and
     remove unnecessary code from selftest

   - tracing/kprobes: Add symbol counting check when module loads.

     This checks the uniqueness of the probed symbol on modules. The
     same check is already done for kernel symbols

     (This also has a fix for build error with CONFIG_MODULES=n)

  Cleanup:

   - Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples"
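
The symbol counting check described under Kprobes can be sketched in plain C as follows. This is a userspace illustration under stated assumptions, not the kernel implementation: the symbol table `demo_module_syms`, the helpers `count_symbol` and `check_probe_symbol`, and the choice of error codes are all hypothetical. The idea is only that a probe target is accepted when its name matches exactly one symbol.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table of a freshly loaded module. */
static const char *const demo_module_syms[] = {
    "init_once", "do_work", "do_work", "cleanup",
};

/* Count how many entries in the table carry the given name. */
size_t count_symbol(const char *const *syms, size_t n, const char *name)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (strcmp(syms[i], name) == 0)
            hits++;
    return hits;
}

/* Accept a probe target only if the name resolves to exactly one symbol. */
int check_probe_symbol(const char *const *syms, size_t n, const char *name)
{
    size_t hits = count_symbol(syms, n, name);

    if (hits == 0)
        return -ENOENT;        /* nothing to probe (assumed error code) */
    if (hits > 1)
        return -EADDRNOTAVAIL; /* ambiguous name (assumed error code) */
    return 0;
}
```

In this sketch `do_work` appears twice and is rejected as ambiguous, while `cleanup` is accepted.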

* tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  MAINTAINERS: Add uprobes entry
  selftests/bpf: Change uretprobe syscall number in uprobe_syscall test
  uprobe: Change uretprobe syscall scope and number
  tracing/kprobes: Fix build error when find_module() is not available
  tracing/kprobes: Add symbol counting check when module loads
  selftests/bpf: add test validating uprobe/uretprobe stack traces
  perf,uprobes: fix user stack traces in the presence of pending uretprobes
  tracing/kprobe: Remove cleanup code unrelated to selftest
  tracing/kprobe: Integrate test warnings into WARN_ONCE
  selftests/bpf: Add uretprobe shadow stack test
  selftests/bpf: Add uretprobe syscall call from user space test
  selftests/bpf: Add uretprobe syscall test for regs changes
  selftests/bpf: Add uretprobe syscall test for regs integrity
  selftests/x86: Add return uprobe shadow stack test
  uprobe: Add uretprobe syscall to speed up return probe
  uprobe: Wire up uretprobe system call
  x86/shstk: Make return uprobe work with shadow stack
  samples: kprobes: add missing MODULE_DESCRIPTION() macros
  fprobe: add missing MODULE_DESCRIPTION() macro
2024-07-18 12:19:20 -07:00
Linus Torvalds
b3ce7a3084 drm next for 6.11-rc1:

Merge tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "There's a lot of stuff in here, amd, i915 and xe have new platform
  work, lots of core rework around EDID handling, some new COMPILE_TEST
  options, maintainer changes and lots of other stuff. Summary:

  core:
   - deprecate DRM date and return 0 date
   - connector: Create a set of helpers to help with HDMI support
   - Remove driver owner assignments
   - Allow more drivers to compile with COMPILE_TEST
   - Conversions to drm_edid
   - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
   - Remove drm_mm_replace_node
   - print: Add a drm prefix to warn level messages too, remove
            ___drm_dbg, consolidate prefix handling
   - New monochrome TV mode variant

  ttm:
   - improve number of page faults on some platforms
   - fix test builds under PREEMPT_RT
   - more test coverage

  ci:
   - Require a more recent version of mesa
   - improve farm setup and test generation

  dma-buf:
   - warn if reserving 0 fence slots
   - internal API heap enhancements

  fbdev:
   - Create memory manager optimized fbdev emulation

  panic:
   - Allow selecting fonts
   - improve drm_fb_dma_get_scanout_buffer
   - Allow dumping kmsg to the screen

  bridge:
   - Remove redundant checks on bridge->encoder
   - Remove drm_bridge_chain_mode_fixup
   - bridge-connector: Plumb in the new HDMI helper
   - analogix_dp: Various improvements, handle AUX transfers timeout
   - samsung-dsim: Fix timings calculation
   - tc358767: Plenty of small fixes, fix no connector attach, fix
               clocks
   - sii902x: state validation improvements

  panels:
   - Switch panels from register table initialization to proper code
   - Now that the panel code tracks the panel state, remove every ad-hoc
     implementation in the panel drivers
   - More cleanup of prepare / enable state tracking in drivers
   - edp: Drop legacy panel compatibles
   - simple-bridge: Switch to devm_drm_bridge_add
   - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
                 13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
                 BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
                 PM070WL4, Lincoln Technologies LCD197, Ortustech
                 COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti

  amdgpu:
   - DCN 4.0.x support
   - GC 12.0 support
   - GMC 12.0 support
   - SDMA 7.0 support
   - MES12 support
   - MMHUB 4.1 support
   - GFX12 modifier and DCC support
   - lots of IP fixes/updates

  amdkfd:
   - Contiguous VRAM allocations
   - GC 12.0 support
   - SDMA 7.0 support
   - SR-IOV fixes
   - KFD GFX ALU exceptions

  i915:
   - Battlemage Xe2 HPD display enablement
   - Panel Replay enabling
   - DP AUX-less ALPM/LOBF
   - Enable link training failure fallback for DP MST links
   - CMRR (Content Match Refresh Rate) enabling
   - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
   - Enable eDP AUX based HDR backlight
   - Support replaying GPU hangs with captured context image
   - Automate CCS Mode setting during engine resets
   - lots of refactoring
   - Increase FLR timeout from 3s to 9s
   - Enable w/a 16021333562 for DG2, MTL and ARL [guc]

  xe:
   - update MAINTAINERS
   - New uapi adding OA functionality to Xe
   - expose l3 bank mask
   - fix display detect on ADL-N
   - runtime PM Fixes
   - Fix silent backmerge issues
   - More prep for SR-IOV
   - HWmon additions
   - per client usage info
   - Rework GPU page fault handling
   - Drop EXEC_QUEUE_FLAG_BANNED
   - Add BMG PCI IDs
   - Scheduler fixes and improvements
   - Rename xe_exec_queue::compute to xe_exec_queue::lr
   - Use ttm_uncached for BO with NEEDS_UC flag
   - Rename xe perf layer as xe observation layer
   - lots of refactoring

  radeon:
   - Backlight workaround for iMac
   - Silence UBSAN flex array warnings

  msm:
   - Validate registers XML description against schema in CI
   - core/dpu: SM7150 support
   - mdp5: Add support for MSM8937
   - gpu: Add param for userspace to know if raytracing is supported
   - gpu: X185 support (aka gpu in X1 laptop chips)
   - gpu: a505 support

  ivpu:
   - hardware scheduler support
   - profiling support
   - improvements to the platform support layer
   - firmware handling improvements
   - clocks/power mgmt improvements
   - scheduler/logging improvements

  habanalabs:
   - Gradual sleep in polling memory macro
   - Reduce Gaudi2 MSI-X interrupt count to 128
   - Add Gaudi2-D revision support
   - Add timestamp to CPLD info
   - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
   - Align Gaudi2 interrupt names
   - Check for errors after preboot is ready
   - Change habanalabs maintainer and git repo path

  mgag200:
   - refactoring and improvements
   - Add BMC output
   - enable polling

  nouveau:
   - add registry command line

  v3d:
   - perf counters improvements

  zynqmp:
   - irq and debugfs improvements

  atmel-hlcdc:
   - Support XLCDC in sam9x7

  mipi-dbi:
   - Remove mipi_dbi_machine_little_endian
   - make SPI bits per word configurable
   - support RGB888
   - allow pixel formats to be specified in the DT

  sun4i:
   - Rework the blender setup for DE2

  panfrost:
   - Enable MT8188 support

  vc4:
   - Monochrome TV support

  exynos:
   - fix fallback mode regression
   - fix memory leak
   - Use drm_edid_duplicate() instead of kmemdup()

  etnaviv:
   - fix i.MX8MP NPU clock gating
   - workaround FE register cdc issues on some cores
   - fix DMA sync handling for cached buffers
   - fix job timeout handling
   - keep TS enabled on MMUv2 cores for improved performance

  mediatek:
   - Convert to platform remove callback returning void
   - Drop chain_mode_fixup call in mode_valid()
   - Fix errors in the MediaTek display driver found by IGT
   - Add display support for the MT8365-EVK board
   - Fix bit depth overwritten for mtk_ovl_set_bit_depth()
   - Fix possible_crtcs calculation
   - Fix spurious kfree()

  ast:
   - refactor mode setting code

  stm:
   - Add LVDS support
   - DSI PHY updates"

* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
  drm/amdgpu/mes12: add missing opcode string
  drm/amdgpu/mes11: update opcode strings
  Revert "drm/amd/display: Reset freesync config before update new state"
  drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
  drm/xe: Drop trace_xe_hw_fence_free
  drm/xe/uapi: Rename xe perf layer as xe observation layer
  drm/amdgpu: remove exp hw support check for gfx12
  drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
  drm/amdgpu: flush all cached ras bad pages to eeprom
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/display: Allow display DCC for DCN401
  drm/amdgpu: select compute ME engines dynamically
  drm/amdgpu/job: Replace DRM_INFO/ERROR logging
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/pm: Ignore initial value in smu response register
  drm/amdgpu: Initialize VF partition mode
  drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
  MAINTAINERS: fix Xinhui's name
  MAINTAINERS: update powerplay and swsmu
  drm/qxl: Pin buffer objects for internal mappings
  ...
2024-07-18 09:34:02 -07:00
Linus Torvalds
41906248d0 Power management updates for 6.11-rc1

Merge tag 'pm-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add a new cpufreq driver for Loongson-3, add support for new
  features in the intel_pstate (Lunar Lake and Arrow Lake platforms, OOB
  mode for Emerald Rapids, highest performance change interrupt),
  amd-pstate (fast CPPC) and sun50i (Allwinner H700 speed bin) cpufreq
  drivers, simplify the cpufreq driver interface, simplify the teo
  cpuidle governor, adjust the pm-graph utility for a new version of
  Python, address issues and clean up code.

  Specifics:

   - Add Loongson-3 CPUFreq driver support (Huacai Chen)

   - Add support for the Arrow Lake and Lunar Lake platforms and the
     out-of-band (OOB) mode on Emerald Rapids to the intel_pstate
     cpufreq driver, make it support the highest performance change
     interrupt and clean it up (Srinivas Pandruvada)

   - Switch cpufreq to new Intel CPU model defines (Tony Luck)

   - Simplify the cpufreq driver interface by switching the .exit()
     driver callback to the void return data type (Lizhe, Viresh Kumar)

   - Make cpufreq_boost_enabled() return bool (Dhruva Gole)

   - Add fast CPPC support to the amd-pstate cpufreq driver, address
     multiple assorted issues in it and clean it up (Perry Yuan, Mario
     Limonciello, Dhananjay Ugwekar, Meng Li, Xiaojian Du)

   - Add Allwinner H700 speed bin to the sun50i cpufreq driver (Ryan
     Walklin)

   - Fix memory leaks and of_node_put() usage in the sun50i and
     qcom-nvmem cpufreq drivers (Javier Carrasco)

   - Clean up the sti and dt-platdev cpufreq drivers (Jeff Johnson,
     Raphael Gallais-Pou)

   - Fix deferred probe handling in the TI cpufreq driver and wrong
     return values of ti_opp_supply_probe(), and add OPP tables for the
     AM62Ax and AM62Px SoCs to it (Bryan Brattlof, Primoz Fiser)

   - Avoid overflow of target_freq in .fast_switch() in the SCMI cpufreq
     driver (Jagadeesh Kona)

   - Use dev_err_probe() in every error path in probe in the Mediatek
     cpufreq driver (Nícolas Prado)

   - Fix kernel-doc param for longhaul_setstate in the longhaul cpufreq
     driver (Yang Li)

   - Fix system resume handling in the CPPC cpufreq driver (Riwen Lu)

   - Improve the teo cpuidle governor and clean up leftover comments
     from the menu cpuidle governor (Christian Loehle)

   - Clean up a comment typo in the teo cpuidle governor (Atul Kumar
     Pant)

   - Add missing MODULE_DESCRIPTION() macro to cpuidle haltpoll (Jeff
     Johnson)

   - Switch the intel_idle driver to new Intel CPU model defines (Tony
     Luck)

   - Switch the Intel RAPL driver to new Intel CPU model defines (Tony
     Luck)

   - Simplify if condition in the idle_inject driver (Thorsten Blum)

   - Fix missing cleanup on error in _opp_attach_genpd() (Viresh Kumar)

   - Introduce an OF helper function to inform if required-opps is used
     and drop a redundant in-parameter to _set_opp_level() (Ulf Hansson)

   - Update pm-graph to v5.12, which includes fixes and a major code
     revamp for python3.12 (Todd Brandt)

   - Address several assorted issues in the cpupower utility (Roman
     Storozhenko)"
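
The target_freq overflow item above is the kind of bug that a small arithmetic sketch makes concrete. The summary does not say how the overflow arises, so the following is an assumption made for illustration: a frequency held in kHz in a 32-bit variable is converted to Hz by multiplying by 1000. The function names `khz_to_hz_wrapping` and `khz_to_hz_widened` are hypothetical.

```c
#include <stdint.h>

/* Problem form: the multiplication is done in 32 bits and wraps. */
uint64_t khz_to_hz_wrapping(uint32_t target_khz)
{
    uint32_t hz = target_khz * 1000u;   /* wraps modulo 2^32 */

    return hz;
}

/* Safe form: widen to 64 bits first, then multiply. */
uint64_t khz_to_hz_widened(uint32_t target_khz)
{
    return (uint64_t)target_khz * 1000u;
}
```

For a 4.4 GHz request (4400000 kHz) the widened form yields 4400000000 Hz, while the 32-bit form wraps to 4400000000 - 2^32 = 105032704 Hz. Frequencies below roughly 4.29 GHz are unaffected, which is why such a bug can go unnoticed on slower parts.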

* tag 'pm-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (77 commits)
  cpufreq: sti: fix build warning
  cpufreq: mediatek: Use dev_err_probe in every error path in probe
  cpufreq: Add Loongson-3 CPUFreq driver support
  cpufreq: Make cpufreq_driver->exit() return void
  cpufreq/amd-pstate: Fix the scaling_max_freq setting on shared memory CPPC systems
  cpufreq/amd-pstate-ut: Convert nominal_freq to khz during comparisons
  cpufreq: pcc: Remove empty exit() callback
  cpufreq: loongson2: Remove empty exit() callback
  cpufreq: nforce2: Remove empty exit() callback
  cpupower: fix lib default installation path
  cpufreq: docs: Add missing scaling_available_frequencies description
  cpuidle: teo: Don't count non-existent intercepts
  cpupower: Disable direct build of the 'bench' subproject
  cpuidle: teo: Remove recent intercepts metric
  Revert: "cpuidle: teo: Introduce util-awareness"
  cpufreq: make cpufreq_boost_enabled() return bool
  cpufreq: intel_pstate: Support highest performance change interrupt
  x86/cpufeatures: Add HWP highest perf change feature flag
  Documentation: cpufreq: amd-pstate: update doc for Per CPU boost control method
  cpufreq: amd-pstate: Cap the CPPC.max_perf to nominal_perf if CPB is off
  ...
2024-07-16 15:54:03 -07:00
Linus Torvalds
ce5a51bfac hardening updates for v6.11-rc1

Merge tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - lkdtm/bugs: add test for hung smp_call_function_single() (Mark
   Rutland)

 - gcc-plugins: Remove duplicate included header file stringpool.h
   (Thorsten Blum)

 - ARM: Remove address checking for MMUless devices (Yanjun Yang)

 - randomize_kstack: Clean up per-arch entropy and codegen

 - KCFI: Make FineIBT mode Kconfig selectable

 - fortify: Do not special-case 0-sized destinations

* tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randomize_kstack: Improve stack alignment codegen
  ARM: Remove address checking for MMUless devices
  gcc-plugins: Remove duplicate included header file stringpool.h
  randomize_kstack: Remove non-functional per-arch entropy filtering
  fortify: Do not special-case 0-sized destinations
  x86/alternatives: Make FineIBT mode Kconfig selectable
  lkdtm/bugs: add test for hung smp_call_function_single()
2024-07-16 13:45:43 -07:00
Linus Torvalds
e55037c879 EFI updates for v6.11
- Drop support for the 'fake' EFI memory map on x86
 
 - Add an SMBIOS based tweak to the EFI stub instructing the firmware on
   x86 Macbook Pros to keep both GPUs enabled
 
 - Replace 0-sized array with flexible array in EFI memory attributes
   table handling
 
 - Drop redundant BSS clearing when booting via the native PE entrypoint
   on x86
 
 - Avoid returning EFI_SUCCESS when aborting on an out-of-memory
   condition
 
 - Cosmetic tweak for arm64 KASLR loading logic

Merge tag 'efi-next-for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "Note the removal of the EFI fake memory map support - this is believed
  to be unused and no longer worth supporting. However, we could easily
  bring it back if needed.

  With recent developments regarding confidential VMs and unaccepted
  memory, combined with kexec, creating a known inaccurate view of the
  firmware's memory map and handing it to the OS is a feature we can
  live without, hence the removal. Alternatively, I could imagine making
  this feature mutually exclusive with those confidential VM related
  features, but let's try simply removing it first.

  Summary:

   - Drop support for the 'fake' EFI memory map on x86

   - Add an SMBIOS based tweak to the EFI stub instructing the firmware
     on x86 MacBook Pros to keep both GPUs enabled

   - Replace 0-sized array with flexible array in EFI memory attributes
     table handling

   - Drop redundant BSS clearing when booting via the native PE
     entrypoint on x86

   - Avoid returning EFI_SUCCESS when aborting on an out-of-memory
     condition

   - Cosmetic tweak for arm64 KASLR loading logic"

* tag 'efi-next-for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: Replace efi_memory_attributes_table_t 0-sized array with flexible array
  efi: Rename efi_early_memdesc_ptr() to efi_memdesc_ptr()
  arm64/efistub: Clean up KASLR logic
  x86/efistub: Drop redundant clearing of BSS
  x86/efistub: Avoid returning EFI_SUCCESS on error
  x86/efistub: Call Apple set_os protocol on dual GPU Intel Macs
  x86/efistub: Enable SMBIOS protocol handling for x86
  efistub/smbios: Simplify SMBIOS enumeration API
  x86/efi: Drop support for fake EFI memory maps
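
Illustrative note on the "Replace 0-sized array with flexible array" item above (not part of the pull request): a minimal user-space C sketch of the general pattern, assuming a hypothetical table layout. The struct name attr_table, its fields, and both helper functions are invented for this example and are not the kernel's efi_memory_attributes_table_t definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical table: a fixed header followed by num_entries records of
 * desc_size bytes each. The old GNU style would declare "entry[0]"; the
 * C99 flexible array member "entry[]" expresses the same layout in
 * standard C and lets compilers reason about the trailing storage.
 */
struct attr_table {
	uint32_t version;
	uint32_t num_entries;
	uint32_t desc_size;
	uint32_t flags;
	uint8_t entry[];	/* flexible array member, adds no size */
};

/* Allocate header plus payload in one block; NULL on overflow or OOM. */
struct attr_table *attr_table_alloc(uint32_t num_entries, uint32_t desc_size)
{
	struct attr_table *t;
	size_t payload;

	/* Reject sizes whose total would overflow size_t. */
	if (desc_size != 0 &&
	    num_entries > (SIZE_MAX - sizeof(*t)) / desc_size)
		return NULL;
	payload = (size_t)num_entries * desc_size;

	t = calloc(1, sizeof(*t) + payload);
	if (t == NULL)
		return NULL;
	t->version = 1;
	t->num_entries = num_entries;
	t->desc_size = desc_size;
	return t;
}

/* Return a pointer to record i, or NULL if i is out of range. */
uint8_t *attr_table_entry(struct attr_table *t, uint32_t i)
{
	assert(t != NULL);
	if (i >= t->num_entries)
		return NULL;
	return t->entry + (size_t)i * t->desc_size;
}
```

Records are addressed by stride (desc_size) rather than by a fixed element type, which is one plausible reason to keep the trailing member as raw bytes; that design choice is an assumption of this sketch.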
2024-07-16 12:22:07 -07:00
Linus Torvalds
408323581b - Add support for running the kernel in a SEV-SNP guest, over a Secure
    VM Service Module (SVSM).
 
    When running over a SVSM, different services can run at different
    protection levels, apart from the guest OS but still within the
    secure SNP environment.  They can provide services to the guest, like
    a vTPM, for example.
 
    This series adds the required facilities to interface with such a SVSM
    module.
 
  - The usual fixlets, refactoring and cleanups

Merge tag 'x86_sev_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SEV updates from Borislav Petkov:

 - Add support for running the kernel in a SEV-SNP guest, over a Secure
   VM Service Module (SVSM).

   When running over a SVSM, different services can run at different
   protection levels, apart from the guest OS but still within the
   secure SNP environment. They can provide services to the guest, like
   a vTPM, for example.

   This series adds the required facilities to interface with such a
   SVSM module.

 - The usual fixlets, refactoring and cleanups

[ And as always: "SEV" is AMD's "Secure Encrypted Virtualization".

  I can't be the only one who gets all the newer x86 TLA's confused,
  can I?
              - Linus ]

* tag 'x86_sev_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/ABI/configfs-tsm: Fix an unexpected indentation silly
  x86/sev: Do RMP memory coverage check after max_pfn has been set
  x86/sev: Move SEV compilation units
  virt: sev-guest: Mark driver struct with __refdata to prevent section mismatch
  x86/sev: Allow non-VMPL0 execution when an SVSM is present
  x86/sev: Extend the config-fs attestation support for an SVSM
  x86/sev: Take advantage of configfs visibility support in TSM
  fs/configfs: Add a callback to determine attribute visibility
  sev-guest: configfs-tsm: Allow the privlevel_floor attribute to be updated
  virt: sev-guest: Choose the VMPCK key based on executing VMPL
  x86/sev: Provide guest VMPL level to userspace
  x86/sev: Provide SVSM discovery support
  x86/sev: Use the SVSM to create a vCPU when not in VMPL0
  x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0
  x86/sev: Use kernel provided SVSM Calling Areas
  x86/sev: Check for the presence of an SVSM in the SNP secrets page
  x86/irqflags: Provide native versions of the local_irq_save()/restore()
2024-07-16 11:12:25 -07:00
Linus Torvalds
b84b338190 - Enable Sub-NUMA clustering to work with resource control on Intel by
    teaching resctrl to handle scopes due to the clustering which
    partitions the L3 cache into sets. Modify and extend the subsystem to
    handle such scopes properly

Merge tag 'x86_cache_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resource control updates from Borislav Petkov:

 - Enable Sub-NUMA clustering to work with resource control on Intel by
   teaching resctrl to handle scopes due to the clustering which
   partitions the L3 cache into sets. Modify and extend the subsystem to
   handle such scopes properly

* tag 'x86_cache_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Update documentation with Sub-NUMA cluster changes
  x86/resctrl: Detect Sub-NUMA Cluster (SNC) mode
  x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems
  x86/resctrl: Make __mon_event_count() handle sum domains
  x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter
  x86/resctrl: Handle removing directories in Sub-NUMA Cluster (SNC) mode
  x86/resctrl: Create Sub-NUMA Cluster (SNC) monitor files
  x86/resctrl: Allocate a new field in union mon_data_bits
  x86/resctrl: Refactor mkdir_mondata_subdir() with a helper function
  x86/resctrl: Initialize on-stack struct rmid_read instances
  x86/resctrl: Add a new field to struct rmid_read for summation of domains
  x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files
  x86/resctrl: Block use of mba_MBps mount option on Sub-NUMA Cluster (SNC) systems
  x86/resctrl: Introduce snc_nodes_per_l3_cache
  x86/resctrl: Add node-scope to the options for feature scope
  x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
  x86/resctrl: Prepare for different scope for control/monitor operations
  x86/resctrl: Prepare to split rdt_domain structure
  x86/resctrl: Prepare for new domain scope
2024-07-16 10:53:54 -07:00
Linus Torvalds
d679783188 - Flip the logic to add feature names to /proc/cpuinfo to having to
   explicitly specify the flag if there's a valid reason to show it in
   /proc/cpuinfo
 
 - Switch a bunch of Intel x86 model checking code to the new CPU model
   defines
 
 - Fixes and cleanups

Merge tag 'x86_cpu_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu model updates from Borislav Petkov:

 - Flip the logic to add feature names to /proc/cpuinfo to having to
   explicitly specify the flag if there's a valid reason to show it in
   /proc/cpuinfo

 - Switch a bunch of Intel x86 model checking code to the new CPU model
   defines

 - Fixes and cleanups

* tag 'x86_cpu_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/intel: Drop stray FAM6 check with new Intel CPU model defines
  x86/cpufeatures: Flip the /proc/cpuinfo appearance logic
  x86/CPU/AMD: Always inline amd_clear_divider()
  x86/mce/inject: Add missing MODULE_DESCRIPTION() line
  perf/x86/rapl: Switch to new Intel CPU model defines
  x86/boot: Switch to new Intel CPU model defines
  x86/cpu: Switch to new Intel CPU model defines
  perf/x86/intel: Switch to new Intel CPU model defines
  x86/virt/tdx: Switch to new Intel CPU model defines
  x86/PCI: Switch to new Intel CPU model defines
  x86/cpu/intel: Switch to new Intel CPU model defines
  x86/platform/intel-mid: Switch to new Intel CPU model defines
  x86/pconfig: Remove unused MKTME pconfig code
  x86/cpu: Remove useless work in detect_tme_early()
2024-07-15 20:25:16 -07:00
Linus Torvalds
2439a5eaa7 - Add a spectre_bhi=vmexit mitigation option aimed at cloud
    environments
 
  - Remove duplicated Spectre cmdline option documentation
 
  - Add separate macro definitions for syscall handlers which do not
    return in order to address objtool warnings

Merge tag 'x86_bugs_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu mitigation updates from Borislav Petkov:

 - Add a spectre_bhi=vmexit mitigation option aimed at cloud
   environments

 - Remove duplicated Spectre cmdline option documentation

 - Add separate macro definitions for syscall handlers which do not
   return in order to address objtool warnings

* tag 'x86_bugs_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
  x86/bugs: Remove duplicate Spectre cmdline option descriptions
  x86/syscall: Mark exit[_group] syscall handlers __noreturn
2024-07-15 20:07:27 -07:00
Linus Torvalds
f998678baf - Add a unified VMware hypercall API layer which should be used by all
    callers instead of them doing homegrown solutions. This will provide for
    adding API support for confidential computing solutions like TDX

Merge tag 'x86_vmware_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 vmware updates from Borislav Petkov:

 - Add a unified VMware hypercall API layer which should be used by all
   callers instead of them doing homegrown solutions. This will provide
   for adding API support for confidential computing solutions like TDX

* tag 'x86_vmware_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vmware: Add TDX hypercall support
  x86/vmware: Remove legacy VMWARE_HYPERCALL* macros
  x86/vmware: Correct macro names
  x86/vmware: Use VMware hypercall API
  drm/vmwgfx: Use VMware hypercall API
  input/vmmouse: Use VMware hypercall API
  ptp/vmware: Use VMware hypercall API
  x86/vmware: Introduce VMware hypercall API
2024-07-15 20:05:40 -07:00
Linus Torvalds
222dfb8326 - Make error checking of AMD SMN accesses more robust in the callers as
    they're the only ones who can interpret the results properly
 
  - The usual cleanups and fixes, left and right

Merge tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:

 - Make error checking of AMD SMN accesses more robust in the callers as
   they're the only ones who can interpret the results properly

 - The usual cleanups and fixes, left and right

* tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kmsan: Fix hook for unaligned accesses
  x86/platform/iosf_mbi: Convert PCIBIOS_* return codes to errnos
  x86/pci/xen: Fix PCIBIOS_* return code handling
  x86/pci/intel_mid_pci: Fix PCIBIOS_* return code handling
  x86/of: Return consistent error type from x86_of_pci_irq_enable()
  hwmon: (k10temp) Rename _data variable
  hwmon: (k10temp) Remove unused HAVE_TDIE() macro
  hwmon: (k10temp) Reduce k10temp_get_ccd_support() parameters
  hwmon: (k10temp) Define a helper function to read CCD temperature
  x86/amd_nb: Enhance SMN access error checking
  hwmon: (k10temp) Check return value of amd_smn_read()
  EDAC/amd64: Check return value of amd_smn_read()
  EDAC/amd64: Remove unused register accesses
  tools/x86/kcpuid: Add missing dir via Makefile
  x86, arm: Add missing license tag to syscall tables files
2024-07-15 19:53:07 -07:00
Linus Torvalds
98896d8795 - Unrelated x86/cc changes queued here to avoid ugly cross-merges and
   conflicts:
 
    - Carve out CPU hotplug function declarations into a separate header
      with the goal to be able to use the lockdep assertions in a more
      flexible manner
 
    - As a result, refactor cacheinfo code after carving out a function
      to return the cache ID associated with a given cache level
 
    - Cleanups
 
 - Add support to be able to kexec TDX guests. For that:
 
    - Expand ACPI MADT CPU offlining support
 
    - Add machinery to prepare CoCo guests memory before kexec-ing into a new
      kernel
 
    - Cleanup, readjust and massage related code

Merge tag 'x86_cc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 confidential computing updates from Borislav Petkov:
 "Unrelated x86/cc changes queued here to avoid ugly cross-merges and
  conflicts:

   - Carve out CPU hotplug function declarations into a separate header
     with the goal to be able to use the lockdep assertions in a more
     flexible manner

   - As a result, refactor cacheinfo code after carving out a function
     to return the cache ID associated with a given cache level

   - Cleanups

  Add support to be able to kexec TDX guests:

   - Expand ACPI MADT CPU offlining support

   - Add machinery to prepare CoCo guests memory before kexec-ing into a
     new kernel

   - Cleanup, readjust and massage related code"

* tag 'x86_cc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed
  x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
  x86/mm: Introduce kernel_ident_mapping_free()
  x86/smp: Add smp_ops.stop_this_cpu() callback
  x86/acpi: Do not attempt to bring up secondary CPUs in the kexec case
  x86/acpi: Rename fields in the acpi_madt_multiproc_wakeup structure
  x86/mm: Do not zap page table entries mapping unaccepted memory table during kdump
  x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges
  x86/tdx: Convert shared memory back to private on kexec
  x86/mm: Add callbacks to prepare encrypted memory for kexec
  x86/tdx: Account shared memory
  x86/mm: Return correct level from lookup_address() if pte is none
  x86/mm: Make x86_platform.guest.enc_status_change_*() return an error
  x86/kexec: Keep CR4.MCE set during kexec for TDX guest
  x86/relocate_kernel: Use named labels for less confusion
  cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup
  cpu/hotplug: Add support for declaring CPU offlining not supported
  x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init
  x86/acpi: Extract ACPI MADT wakeup code into a separate file
  x86/kexec: Remove spurious unconditional JMP from from identity_mapped()
  ...
2024-07-15 19:36:01 -07:00
Linus Torvalds
4578d072fa - Add a check to warn when cmdline parsing happens before the final cmdline
    string has been built and thus arguments can get lost
 
  - Code cleanups and simplifications

Merge tag 'x86_boot_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Borislav Petkov:

 - Add a check to warn when cmdline parsing happens before the final
   cmdline string has been built and thus arguments can get lost

 - Code cleanups and simplifications

* tag 'x86_boot_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/setup: Warn when option parsing is done too early
  x86/boot: Clean up the arch/x86/boot/main.c code a bit
  x86/boot: Use current_stack_pointer to avoid asm() in init_heap()
2024-07-15 19:31:59 -07:00
Linus Torvalds
208c6772d3 - This is basically PeterZ's idea to nest the alternative macros to avoid the
   need to "spell out" the number of alternates in an ALTERNATIVE_n() macro and
   thus have an ever-increasing complexity in those definitions.
 
   For ease of bisection, the old macros are converted to the new, nested
   variants in a step-by-step manner so that in case an issue is encountered
   during testing, one can pinpoint the place where it fails easier. Because
   debugging alternatives is a serious pain.

Merge tag 'x86_alternatives_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 alternatives updates from Borislav Petkov:
 "This is basically PeterZ's idea to nest the alternative macros to
  avoid the need to "spell out" the number of alternates in an
  ALTERNATIVE_n() macro and thus have an ever-increasing complexity in
  those definitions.

  For ease of bisection, the old macros are converted to the new, nested
  variants in a step-by-step manner so that in case an issue is
  encountered during testing, one can pinpoint the place where it fails
  easier.

  Because debugging alternatives is a serious pain"

* tag 'x86_alternatives_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives, kvm: Fix a couple of CALLs without a frame pointer
  x86/alternative: Replace the old macros
  x86/alternative: Convert the asm ALTERNATIVE_3() macro
  x86/alternative: Convert the asm ALTERNATIVE_2() macro
  x86/alternative: Convert the asm ALTERNATIVE() macro
  x86/alternative: Convert ALTERNATIVE_3()
  x86/alternative: Convert ALTERNATIVE_TERNARY()
  x86/alternative: Convert alternative_call_2()
  x86/alternative: Convert alternative_call()
  x86/alternative: Convert alternative_io()
  x86/alternative: Convert alternative_input()
  x86/alternative: Convert alternative_2()
  x86/alternative: Convert alternative()
  x86/alternatives: Add nested alternatives macros
  x86/alternative: Zap alternative_ternary()
2024-07-15 19:11:28 -07:00
Linus Torvalds
1467b49869 - A cleanup and a correction to the error injection driver to inject
   a MCA_MISC value only when one has actually been supplied by the user

Merge tag 'ras_core_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:

 - A cleanup and a correction to the error injection driver to inject a
   MCA_MISC value only when one has actually been supplied by the user

* tag 'ras_core_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Remove unused variable and return value in machine_check_poll()
  x86/mce/inject: Only write MCA_MISC when a value has been supplied
2024-07-15 18:22:48 -07:00
Linus Torvalds
4fd9435641 Updates for timers, timekeeping and related functionality:
   - Core:
 
     - Make the takeover of a hrtimer based broadcast timer reliable during
       CPU hot-unplug. The current implementation suffers from a race which
       can lead to broadcast timer starvation in the worst case.
 
     - VDSO related cleanups and simplifications
 
     - Small cleanups and enhancements all over the place
 
   - PTP:
 
     - Replace the architecture specific base clock to clocksource, e.g. ART
       to TSC, conversion function with generic functionality to avoid
       exposing such internals to drivers and convert all existing drivers
       over. This also allows to provide functionality which converts the
       other way round in the core code based on the same parameter set.
 
     - Provide a function to convert CLOCK_REALTIME to the base clock to
       support the upcoming PPS output driver on Intel platforms.
 
   - Drivers:
 
     - A set of Device Tree bindings for new hardware
 
     - Cleanups and enhancements all over the place

Merge tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for timers, timekeeping and related functionality:

  Core:

   - Make the takeover of a hrtimer based broadcast timer reliable
     during CPU hot-unplug. The current implementation suffers from a
     race which can lead to broadcast timer starvation in the worst
     case.

   - VDSO related cleanups and simplifications

   - Small cleanups and enhancements all over the place

  PTP:

   - Replace the architecture specific base clock to clocksource, e.g.
     ART to TSC, conversion function with generic functionality to avoid
     exposing such internals to drivers and convert all existing drivers
     over. This also allows to provide functionality which converts the
     other way round in the core code based on the same parameter set.

   - Provide a function to convert CLOCK_REALTIME to the base clock to
     support the upcoming PPS output driver on Intel platforms.

  Drivers:

   - A set of Device Tree bindings for new hardware

   - Cleanups and enhancements all over the place"

* tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  clocksource/drivers/realtek: Add timer driver for rtl-otto platforms
  dt-bindings: timer: Add schema for realtek,otto-timer
  dt-bindings: timer: Add SOPHGO SG2002 clint
  dt-bindings: timer: renesas,tmu: Add R-Car Gen2 support
  dt-bindings: timer: renesas,tmu: Add RZ/G1 support
  dt-bindings: timer: renesas,tmu: Add R-Mobile APE6 support
  clocksource/drivers/mips-gic-timer: Correct sched_clock width
  clocksource/drivers/mips-gic-timer: Refine rating computation
  clocksource/drivers/sh_cmt: Address race condition for clock events
  clocksource/driver/arm_global_timer: Remove unnecessary ‘0’ values from err
  clocksource/drivers/arm_arch_timer: Remove unnecessary ‘0’ values from irq
  tick/broadcast: Make takeover of broadcast hrtimer reliable
  tick/sched: Combine WARN_ON_ONCE and print_once
  x86/vdso: Remove unused include
  x86/vgtod: Remove unused typedef gtod_long_t
  x86/vdso: Fix function reference in comment
  vdso: Add comment about reason for vdso struct ordering
  vdso/gettimeofday: Clarify comment about open coded function
  timekeeping: Add missing kernel-doc function comments
  tick: Remove unnused tick_nohz_get_idle_calls()
  ...
2024-07-15 15:03:09 -07:00
Linus Torvalds
a5819099f6 Merge branch 'runtime-constants'
Merge runtime constants infrastructure with implementations for x86 and
arm64.

This is one of four branches that came out of me looking at profiles of
my kernel build filesystem load on my 128-core Altra arm64 system, where
pathname walking and the user copies (particularly strncpy_from_user()
for fetching the pathname from user space) are very hot.

This is a very specialized "instruction alternatives" model where the
dentry hash pointer and hash count will be constants for the lifetime of
the kernel, but the allocations are not static but done early during the
kernel boot.  In order to avoid the pointer load and dynamic shift, we
just rewrite the constants in the instructions in place.

We can't use the "generic" alternative instructions infrastructure,
because different architectures do it very differently, and it's
actually simpler to just have very specific helpers, with a fallback to
the generic ("old") model of just using variables for architectures that
do not implement the runtime constant patching infrastructure.
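A minimal sketch of the generic ("old") fallback model described above, where the "constants" are plain variables read through helper macros. The macro names follow the helpers this branch introduces; the macro bodies, the int-typed table and the variable names dentry_hashtable and d_hash_shift are simplified assumptions for illustration, not the in-tree code.

```c
#include <stdint.h>

/*
 * Fallback model: the "constants" are ordinary variables, set once
 * during early boot and only read afterwards. An architecture with
 * real support would instead patch the values into the instructions.
 */
#define runtime_const_ptr(sym)			(sym)
#define runtime_const_shift_right_32(val, sym)	((uint32_t)(val) >> (sym))
#define runtime_const_init(type, sym)		do { } while (0) /* nothing to patch */

/* Hypothetical stand-ins for the dentry hash table and its shift count. */
static unsigned int d_hash_shift;
static int *dentry_hashtable;

/* Pick the hash bucket: one pointer "constant" plus one shift "constant". */
static int *hash_bucket(uint32_t hash)
{
	return runtime_const_ptr(dentry_hashtable) +
	       runtime_const_shift_right_32(hash, d_hash_shift);
}
```

With a 16-entry table and a shift of 28, hash_bucket(0xF0000000) selects entry 15. In this fallback both values are still loaded from memory on every call, which is the cost the patched variants are meant to avoid.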

Link: https://lore.kernel.org/all/CAHk-=widPe38fUNjUOmX11ByDckaeEo9tN4Eiyke9u1SAtu9sA@mail.gmail.com/

* runtime-constants:
  arm64: add 'runtime constant' support
  runtime constants: add x86 architecture support
  runtime constants: add default dummy infrastructure
  vfs: dcache: move hashlen_hash() from callers into d_hash()
2024-07-15 08:36:13 -07:00
Thomas Gleixner
b7625d67eb Merge tag 'timers-v6.11-rc1' of https://git.linaro.org/people/daniel.lezcano/linux into timers/core

Pull clocksource/event driver updates from Daniel Lezcano:

  - Remove unnecessary initialization of local variables in the ARM arch
    timer and the ARM global timer, as they are assigned right afterwards
    in the code path anyway (Li kunyu)

  - Fix a race condition in the interrupt leading to a deadlock on the
    SH CMT driver. Note that this fix was not tested on the platform
    using this timer but the fix seems reasonable enough to be picked
    confidently (Niklas Söderlund)

  - Increase the rating of the gic-timer and use the configured width
    clocksource register on the MIPS architecture (Jiaxun Yang)

  - Add the DT bindings for the TMU on the Renesas platforms (Geert
    Uytterhoeven)

  - Add the DT bindings for the SOPHGO SG2002 clint on RISC-V (Thomas
    Bonnefille)

  - Add the rtl-otto timer driver along with the DT bindings for the
    Realtek platform (Chris Packham)

Link: https://lore.kernel.org/all/91cd05de-4c5d-4242-a381-3b8a4fe6a2a2@linaro.org
2024-07-13 12:07:10 +02:00
Borislav Petkov (AMD)
38918e0bb2 x86/sev: Move SEV compilation units
A long time ago it was agreed upon that the coco stuff needs to go where
it belongs:

  https://lore.kernel.org/all/Yg5nh1RknPRwIrb8@zn.tnic

and not keep it in arch/x86/kernel. TDX did that and SEV can't find time
to do so. So lemme do it. If people have trouble converting their
ongoing featuritis patches, ask me for a sed script.

No functional changes.

Move the instrumentation exclusion bits too, as helpfully caught and
reported by the 0day folks.

Closes: https://lore.kernel.org/oe-kbuild-all/202406220748.hG3qlmDx-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202407091342.46d7dbb-oliver.sang@intel.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Ashish Kalra <ashish.kalra@amd.com>
Tested-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/r/20240619093014.17962-1-bp@kernel.org
2024-07-11 11:55:58 +02:00
Rafael J. Wysocki
9dabb5b48f Merge back cpufreq material for 6.11.
2024-07-10 13:03:11 +02:00
Daniel Vetter
86634fa4e6 Linux 6.10-rc6

Merge v6.10-rc6 into drm-next

The exynos-next pull is based on a newer -rc than drm-next, hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.

Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-07-05 10:47:28 +02:00
Tony Luck
13488150f5 x86/resctrl: Detect Sub-NUMA Cluster (SNC) mode
There isn't a simple hardware bit that indicates whether a CPU is running in
Sub-NUMA Cluster (SNC) mode. Infer the state by comparing the number of CPUs
sharing the L3 cache with CPU0 to the number of CPUs in the same NUMA node as
CPU0.
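The inference above can be sketched as a ratio of two CPU counts. This is an illustration only: the function and parameter names are hypothetical, and treating counts that do not divide evenly as "SNC off" is an assumption rather than the kernel's documented behavior.

```c
/*
 * Infer the number of SNC nodes per L3 cache.
 *
 * cpus_sharing_l3: CPUs sharing the L3 cache with CPU0
 * cpus_in_node:    CPUs in the same NUMA node as CPU0
 *
 * Returns 1 when SNC appears to be off or the counts look inconsistent.
 */
static int snc_nodes_per_l3(int cpus_sharing_l3, int cpus_in_node)
{
	if (cpus_in_node <= 0 || cpus_sharing_l3 < cpus_in_node)
		return 1;
	if (cpus_sharing_l3 % cpus_in_node)
		return 1;	/* not an even split: assume no SNC */
	return cpus_sharing_l3 / cpus_in_node;
}
```

For example, 64 CPUs sharing the L3 with 32 CPUs in CPU0's node suggests two SNC nodes per L3, while equal counts mean SNC is off.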

Add the missing definition of pr_fmt() to monitor.c. This wasn't noticed
before as there are only "can't happen" console messages from this file.

  [ bp: Massage commit message. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-19-tony.luck@intel.com
2024-07-02 20:02:11 +02:00
Tony Luck
21b362cc76 x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems
Hardware has two RMID configuration options for SNC systems. The default
mode divides RMID counters between SNC nodes. E.g. with 200 RMIDs and
two SNC nodes per L3 cache, RMIDs 0..99 are used on node 0, and 100..199
on node 1. This isn't compatible with Linux resctrl usage. On this
example system a process using RMID 5 would only update monitor counters
while running on SNC node 0.

The other mode is "RMID Sharing Mode". This is enabled by clearing bit
0 of the RMID_SNC_CONFIG (0xCA0) model specific register. In this mode
the number of logical RMIDs is the number of physical RMIDs (from CPUID
leaf 0xF) divided by the number of SNC nodes per L3 cache instance. A
process can use the same RMID across different SNC nodes.
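As a rough sketch of the bit manipulation involved, the helpers below operate on a plain 64-bit value standing in for the MSR contents, since a real MSR write needs kernel privileges. The macro and function names are assumptions for illustration; only the address 0xCA0 and the meaning of bit 0 come from the text above.

```c
#include <stdint.h>

#define MSR_RMID_SNC_CONFIG	0xCA0			/* MSR address from the text */
#define RMID_SNC_CONFIG_BIT0	((uint64_t)1 << 0)	/* hypothetical name for bit 0 */

/* Return the MSR value with bit 0 cleared, i.e. RMID Sharing Mode selected. */
static uint64_t rmid_sharing_enable(uint64_t msr_val)
{
	return msr_val & ~RMID_SNC_CONFIG_BIT0;
}

/* Sharing mode is active when bit 0 is clear. */
static int rmid_sharing_enabled(uint64_t msr_val)
{
	return !(msr_val & RMID_SNC_CONFIG_BIT0);
}
```

Clearing the bit is idempotent, which is consistent with the note below that updating the MSR more than once per L3 cache instance does no harm.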

See the "Intel Resource Director Technology Architecture Specification"
for additional details.

When SNC is enabled, update the MSR when a monitor domain is marked
online. Technically this is overkill. It only needs to be done once
per L3 cache instance rather than per SNC domain. But there is no harm
in doing it more than once, and this is not in a critical path.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240702173820.90368-3-tony.luck@intel.com
2024-07-02 19:57:51 +02:00
Tony Luck
9fbb303ec9 x86/resctrl: Make __mon_event_count() handle sum domains
Legacy resctrl monitor files must provide the sum of event values across
all Sub-NUMA Cluster (SNC) domains that share an L3 cache instance.

There are now two cases:
1) A specific domain is provided in struct rmid_read
   This is either a non-SNC system, or the request is to read data
   from just one SNC node.
2) Domain pointer is NULL. In this case the cacheinfo field in struct
   rmid_read indicates that all SNC nodes that share that L3 cache
   instance should have the event read and return the sum of all
   values.
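The two cases can be sketched as follows. The structures are simplified, hypothetical stand-ins (an integer L3 id replaces the cacheinfo pointer, and a stored count replaces the hardware read); the point is only the NULL-domain convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified monitor domain. */
struct mon_domain_sketch {
	int l3_id;		/* id of the enclosing L3 cache instance */
	uint64_t count;		/* event value for this domain */
};

/* Hypothetical, simplified read request. */
struct read_req_sketch {
	const struct mon_domain_sketch *d;	/* one domain, or NULL to sum */
	int l3_id;				/* consulted only when d is NULL */
};

static uint64_t event_count(const struct read_req_sketch *rr,
			    const struct mon_domain_sketch *doms, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	/* Case 1: a specific domain was requested. */
	if (rr->d)
		return rr->d->count;

	/* Case 2: sum every domain that shares the requested L3 cache. */
	for (i = 0; i < n; i++)
		if (doms[i].l3_id == rr->l3_id)
			sum += doms[i].count;
	return sum;
}
```

With two SNC domains on L3 id 0 holding 10 and 32, a request with a NULL domain pointer and l3_id 0 yields 42, while a request naming the second domain yields 32.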

Update the CPU sanity check. The existing check that an event is read
from a CPU in the requested domain still applies when reading a single
domain. But when summing across domains a more relaxed check that the
current CPU is in the scope of the L3 cache instance is appropriate
since the MSRs to read events are scoped at L3 cache level.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-17-tony.luck@intel.com
2024-07-02 19:57:22 +02:00
Tony Luck
c8c7d3d904 x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter
mon_event_read() fills out most fields of the struct rmid_read that is passed
via an smp_call*() function to a CPU that is part of the correct domain to
read the monitor counters.

With Sub-NUMA Cluster (SNC) mode there are now two cases to handle:

1) Reading a file that returns a value for a single domain.
   + Choose the CPU to execute from the domain cpu_mask

2) Reading a file that must sum across domains sharing an L3 cache
   instance.
   + Indicate to called code that a sum is needed by passing a NULL
     rdt_mon_domain pointer.
   + Choose the CPU from the L3 shared_cpu_map.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-16-tony.luck@intel.com
2024-07-02 19:57:19 +02:00
Tony Luck
6b48b80b08 x86/resctrl: Handle removing directories in Sub-NUMA Cluster (SNC) mode
In SNC mode, there are multiple subdirectories in each L3 level monitor
directory (one for each SNC node). If all the CPUs in an SNC node are taken
offline, just remove the SNC directory for that node. In non-SNC mode, or when
the last SNC node directory is removed, remove the L3 monitor directory.

Add a helper function to avoid duplicated code.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240702173820.90368-2-tony.luck@intel.com
2024-07-02 19:51:06 +02:00
Tony Luck
0158ed6a13 x86/resctrl: Create Sub-NUMA Cluster (SNC) monitor files
When SNC mode is enabled, create subdirectories and files to monitor at the SNC
node granularity. Legacy behavior is preserved by tagging the monitor files at
the L3 granularity with the "sum" attribute.  When the user reads these files
the kernel will read monitor data from all SNC nodes that share the same L3
cache instance and return the aggregated value to the user.

Note that the "domid" field for files that must sum across SNC domains has the
L3 cache instance id, while non-summing files use the domain id.

The "sum" files do not need to make a call to mon_event_read() to initialize
the MBM counters. This will be handled by initializing the individual SNC nodes
that share the L3.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-14-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
92b5d0b118 x86/resctrl: Allocate a new field in union mon_data_bits
When Sub-NUMA Cluster (SNC) mode is enabled, the legacy monitor reporting files
must report the sum of the data from all of the SNC nodes that share the L3
cache that is referenced by the monitor file.

Resctrl squeezes all the attributes of these files into 32 bits so they can be
stored in the "priv" field of struct kernfs_node.

Currently, only three monitor events are defined by enum resctrl_event_id so
reducing it from 8 bits to 7 bits still provides more than enough space to
represent all the known event types.

But note that this choice was arbitrary.  The "rid" field is also far wider
than needed for the current number of resource id types.  This structure is
purely internal to resctrl, no ABI issues with modifying it. Subsequent changes
may rearrange the allocation of bits between each of the fields as needed.

Give the bit to a new "sum" field that indicates that reading this file must
sum across SNC nodes. This bit also indicates that the domid field is the id of
an L3 cache (instead of a domain id) to find which domains must be summed.
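A sketch of how such a 32-bit packing might look. Only the 7-bit event id and the 1-bit "sum" flag come from the text above; the widths chosen for "rid" and "domid" (10 and 14 bits) and the uint32_t overlay are assumptions made so the fields add up to 32 bits.

```c
#include <stdint.h>

/* Illustrative layout only; field widths other than evtid/sum are assumed. */
union mon_data_bits_sketch {
	uint32_t raw;			/* the packed value stored as "priv" */
	struct {
		unsigned int rid   : 10;	/* resource id (assumed width) */
		unsigned int evtid : 7;		/* event id, reduced from 8 bits */
		unsigned int sum   : 1;		/* 1: sum across SNC nodes */
		unsigned int domid : 14;	/* domain id, or L3 cache id if sum */
	} u;
};

/* The whole point of the packing: everything must fit in 32 bits. */
_Static_assert(sizeof(union mon_data_bits_sketch) == sizeof(uint32_t),
	       "monitor file attributes must fit in 32 bits");
```

Seven bits still allow 128 event ids, far more than the three currently defined.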

Fix up other issues in the kerneldoc description for mon_data_bits.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-13-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
603cf1e288 x86/resctrl: Refactor mkdir_mondata_subdir() with a helper function
In Sub-NUMA Cluster (SNC) mode Linux must create the monitor
files in the original "mon_L3_XX" directories and also in each
of the "mon_sub_L3_YY" directories.

Refactor mkdir_mondata_subdir() to move the creation of monitoring files
into a helper function to avoid the need to duplicate code later.

No functional change.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-12-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
587edd7069 x86/resctrl: Initialize on-stack struct rmid_read instances
New semantics rely on some struct rmid_read members having NULL values to
distinguish between the SNC and non-SNC scenarios.  resctrl can thus no longer
tolerate this struct being left partially uninitialized.

Initialize all on-stack declarations of struct rmid_read:

  rdtgroup_mondata_show()
  mbm_update()
  mkdir_mondata_subdir()

to ensure that garbage values from the stack are not passed down to other
functions.
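A small sketch of the pattern, using a hypothetical cut-down structure rather than the real struct rmid_read: zero-initializing the on-stack instance guarantees that pointer members not set explicitly read back as NULL.

```c
#include <stddef.h>

/* Hypothetical, reduced version of the request structure. */
struct rmid_read_sketch {
	void *d;			/* domain pointer; NULL has meaning */
	void *ci;			/* cache info pointer; NULL has meaning */
	int evtid;
	unsigned long long val;
};

static struct rmid_read_sketch new_request(int evtid)
{
	/* "= {0}" zeroes every member, so d and ci start out as NULL. */
	struct rmid_read_sketch rr = {0};

	rr.evtid = evtid;
	return rr;
}
```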

  [ bp: Massage commit message. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-11-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
fb1f51f677 x86/resctrl: Add a new field to struct rmid_read for summation of domains
When a user reads a monitor file rdtgroup_mondata_show() calls mon_event_read()
to package up all the required details into an rmid_read structure which is
passed across the smp_call*() infrastructure to code that will read data from
hardware and return the value (or error status) in the rmid_read structure.

Sub-NUMA Cluster (SNC) mode adds files with new semantics. These require the
smp_call-ed code to sum event data from all domains that share an L3 cache.

Add a pointer to the L3 "cacheinfo" structure to struct rmid_read for the data
collection routines to use to pick the domains to be summed.

  [ Reinette: the rmid_read structure has become complex enough so document each
    of its fields and provide the kerneldoc documentation for struct rmid_read. ]

Co-developed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-10-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
328ea68874 x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files
When SNC is enabled, monitoring data is collected at the SNC node granularity,
but must be reported at L3-cache granularity for backwards compatibility in
addition to reporting at the node level.

Add a "ci" field to the rdt_mon_domain structure to save the cache information
about the enclosing L3 cache for the domain.  This provides:

1) The cache id which is needed to compose the name of the legacy monitoring
   directory, and to determine which domains should be summed to provide
   L3-scoped data.

2) The shared_cpu_map which is needed to determine which CPUs can be used to
   read the RMID counters with the MSR interface.

This is the first step to an eventual goal of monitor reporting files like this
(for a system with two SNC nodes per L3):

  $ cd /sys/fs/resctrl/mon_data
  $ tree mon_L3_00
  mon_L3_00			<- 00 here is L3 cache id
  ├── llc_occupancy		\  These files provide legacy support
  ├── mbm_local_bytes		 > for non-SNC aware monitor apps
  ├── mbm_total_bytes		/  that expect data at L3 cache level
  ├── mon_sub_L3_00		<- 00 here is SNC node id
  │   ├── llc_occupancy		\  These files are finer grained
  │   ├── mbm_local_bytes		 > data from each SNC node
  │   └── mbm_total_bytes		/
  └── mon_sub_L3_01
      ├── llc_occupancy		\
      ├── mbm_local_bytes		 > As above, but for node 1.
      └── mbm_total_bytes		/

  [ bp: Massage commit message. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-9-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
ac20aa4230 x86/resctrl: Block use of mba_MBps mount option on Sub-NUMA Cluster (SNC) systems
When SNC is enabled there is a mismatch between the MBA control function
which operates at L3 cache scope and the MBM monitor functions which
measure memory bandwidth on each SNC node.

Block use of the mba_MBps mount option when scopes for MBA/MBM do not match.

Improve user diagnostics by adding invalfc() message when mba_MBps
is not supported.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-8-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
e13db55b5a x86/resctrl: Introduce snc_nodes_per_l3_cache
Intel Sub-NUMA Cluster (SNC) is a feature that subdivides the CPU cores
and memory controllers on a socket into two or more groups. These are
presented to the operating system as NUMA nodes.

This may enable some workloads to have slightly lower latency to memory
as the memory controller(s) in an SNC node are electrically closer to the
CPU cores on that SNC node. This benefit may be offset by lower bandwidth
since the memory accesses for each core can only be interleaved between
the memory controllers on the same SNC node.

Resctrl monitoring on an Intel system depends upon attaching RMIDs to tasks
to track L3 cache occupancy and memory bandwidth. There is an MSR that
controls how the RMIDs are shared between SNC nodes.

The default mode divides them numerically. E.g. when there are two SNC
nodes on a socket the lower number half of the RMIDs are given to the
first node, the remainder to the second node. This would be difficult
to use with the Linux resctrl interface as specific RMID values assigned
to resctrl groups are not visible to users.

RMID sharing mode divides the physical RMIDs evenly between SNC nodes
but uses a logical RMID in the IA32_PQR_ASSOC MSR. For example a system
with 200 physical RMIDs (as enumerated by CPUID leaf 0xF) that has two
SNC nodes per L3 cache instance would have 100 logical RMIDs available
for Linux to use. A task running on SNC node 0 with RMID 5 would
accumulate LLC occupancy and MBM bandwidth data in physical RMID 5.
Another task using RMID 5, but running on SNC node 1 would accumulate
data in physical RMID 105.

Even with this renumbering SNC mode requires several changes in resctrl
behavior for correct operation.

Add a static global to arch/x86/kernel/cpu/resctrl/monitor.c to indicate
how many SNC domains share an L3 cache instance.  Initialize this to
"1". Runtime detection of SNC mode will adjust this value.

Update all places to take appropriate action when SNC mode is enabled:
1) The number of logical RMIDs per L3 cache available for use is the
   number of physical RMIDs divided by the number of SNC nodes.
2) Likewise the "mon_scale" value must be divided by the number of SNC
   nodes.
3) Add a function to convert from logical RMID values (assigned to
   tasks and loaded into the IA32_PQR_ASSOC MSR on context switch)
   to physical RMID values to load into IA32_QM_EVTSEL MSR when
   reading counters on each SNC node.
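The RMID arithmetic from points 1) and 3) can be sketched like this. The function names and the node_idx parameter (the index of an SNC node within its L3 cache) are hypothetical; the numbers in the usage note are the ones from the example earlier in this message.

```c
/* Logical RMIDs usable per L3 cache: physical RMIDs split across SNC nodes. */
static unsigned int logical_rmids(unsigned int physical_rmids,
				  unsigned int snc_nodes)
{
	return physical_rmids / snc_nodes;
}

/*
 * Convert a logical RMID (as assigned to a task) into the physical RMID
 * to use when reading counters on SNC node number node_idx.
 */
static unsigned int logical_to_physical_rmid(unsigned int lrmid,
					     unsigned int node_idx,
					     unsigned int physical_rmids,
					     unsigned int snc_nodes)
{
	return lrmid + node_idx * logical_rmids(physical_rmids, snc_nodes);
}
```

With 200 physical RMIDs and two SNC nodes there are 100 logical RMIDs; logical RMID 5 maps to physical RMID 5 on node 0 and to physical RMID 105 on node 1.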

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-7-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
1a171608ee x86/resctrl: Add node-scope to the options for feature scope
All currently supported resctrl features have a domain scope that matches
the scope of the L2 or L3 caches.

Add RESCTRL_L3_NODE as a new option for features that are scoped at the
same granularity as NUMA nodes. This is needed for Intel's Sub-NUMA
Cluster (SNC) feature where monitoring features are divided between
nodes that share an L3 cache.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-6-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
cae2bcb6a2 x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
The same rdt_domain structure is used for both control and monitor
functions. But this results in wasted memory as some of the fields are
only used by control functions, while most are only used for monitor
functions.

Split into separate rdt_ctrl_domain and rdt_mon_domain structures with
just the fields required for control and monitoring respectively.

Similar split of the rdt_hw_domain structure into rdt_hw_ctrl_domain
and rdt_hw_mon_domain.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-5-tony.luck@intel.com
2024-07-02 19:49:54 +02:00
Tony Luck
cd84f72b6a x86/resctrl: Prepare for different scope for control/monitor operations
Resctrl assumes that control and monitor operations on a resource are
performed at the same scope.

Prepare for systems that use different scope (specifically Intel needs
to split the RDT_RESOURCE_L3 resource to use L3 scope for cache control
and NODE scope for cache occupancy and memory bandwidth monitoring).

Create separate domain lists for control and monitor operations.

Note that errors during initialization of either control or monitor
functions on a domain would previously result in that domain being
excluded from both control and monitor operations. Now the domains are
allocated independently it is no longer required to disable both control
and monitor operations if either fails.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-4-tony.luck@intel.com
2024-07-02 19:49:53 +02:00
Tony Luck
c103d4d48e x86/resctrl: Prepare to split rdt_domain structure
The rdt_domain structure is used for both control and monitor features.
It is about to be split into separate structures for these two usages
because the scope for control and monitoring features for a resource
will be different for future resources.

To allow for common code that scans a list of domains looking for a
specific domain id, move all the common fields ("list", "id", "cpu_mask")
into their own structure within the rdt_domain structure.
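A sketch of the idea, under simplifying assumptions: the list is a bare next pointer and the CPU mask a single word, and every type and function name here is hypothetical. Because the common fields sit in an embedded header, one lookup routine works for any structure that embeds it.

```c
#include <stddef.h>

/* Common fields ("list", "id", "cpu_mask") gathered into one header. */
struct domain_hdr_sketch {
	struct domain_hdr_sketch *next;	/* stand-in for the list linkage */
	int id;
	unsigned long cpu_mask;		/* stand-in for a real cpumask */
};

/* Two structures with different payloads embedding the same header. */
struct ctrl_domain_sketch {
	struct domain_hdr_sketch hdr;
	unsigned int ctrl_val;
};

struct mon_domain_hdr_user_sketch {
	struct domain_hdr_sketch hdr;
	unsigned long long mbm_total;
};

/* Common code: scan a list of headers for a specific domain id. */
static struct domain_hdr_sketch *find_domain(struct domain_hdr_sketch *head,
					     int id)
{
	for (; head; head = head->next)
		if (head->id == id)
			return head;
	return NULL;
}
```

A caller that needs the enclosing structure can recover it from the header pointer, since the header is the first member in both sketches.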

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-3-tony.luck@intel.com
2024-07-02 19:49:53 +02:00
Tony Luck
f436cb6913 x86/resctrl: Prepare for new domain scope
Resctrl resources operate on subsets of CPUs in the system with the
defining attribute of each subset being an instance of a particular
level of cache. E.g. all CPUs sharing an L3 cache would be part of the
same domain.

In preparation for features that are scoped at the NUMA node level,
change the code from explicit references to "cache_level" to a more
generic scope. At this point the only options for this scope are groups
of CPUs that share an L2 cache or L3 cache.

Clean up the error handling when looking up domains. Report invalid ids
before calling rdt_find_domain() in preparation for better messages when
scope can be other than cache scope. This means that rdt_find_domain()
will never return an error. So remove checks for error from the call sites.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/20240628215619.76401-2-tony.luck@intel.com
2024-07-02 19:49:53 +02:00
Ard Biesheuvel
37aee82c21 x86/efi: Drop support for fake EFI memory maps
Between kexec and confidential VM support, handling the EFI memory maps
correctly on x86 is already proving to be rather difficult (as opposed
to other EFI architectures which manage to never modify the EFI memory
map to begin with).

EFI fake memory map support is essentially a development hack (for
testing new support for the 'special purpose' and 'more reliable' EFI
memory attributes) that leaked into production code. The regions marked
in this manner are not actually recognized as such by the firmware
itself or the EFI stub (and never have been), and marking memory as 'more
reliable' seems rather futile if the underlying memory is just ordinary
RAM.

Marking memory as 'special purpose' in this way is also dubious, but may
be in use in production code nonetheless. However, the same should be
achievable by using the memmap= command line option with the ! operator.

EFI fake memmap support is not enabled by any of the major distros
(Debian, Fedora, SUSE, Ubuntu) and does not exist on other
architectures, so let's drop support for it.

Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-02 00:26:24 +02:00
Borislav Petkov (AMD)
0d3db1f14a x86/alternatives, kvm: Fix a couple of CALLs without a frame pointer
objtool complains:

  arch/x86/kvm/kvm.o: warning: objtool: .altinstr_replacement+0xc5: call without frame pointer save/setup
  vmlinux.o: warning: objtool: .altinstr_replacement+0x2eb: call without frame pointer save/setup

Make sure %rSP is an output operand to the respective asm() statements.

The test_cc() hunk and ALT_OUTPUT_SP() courtesy of peterz. Also from him
add some helpful debugging info to the documentation.

Now on to the explanations:

tl;dr: The alternatives macros are pretty fragile.

If I do ALT_OUTPUT_SP(output) in order to be able to package in a %rsp
reference for objtool so that a stack frame gets properly generated, the
inline asm input operand with positional argument 0 in clear_page():

	"0" (page)

gets "renumbered" due to the added

	: "+r" (current_stack_pointer), "=D" (page)

and then gcc says:

  ./arch/x86/include/asm/page_64.h:53:9: error: inconsistent operand constraints in an ‘asm’

The fix is to use an explicit "D" constraint which points to a singleton
register class (gcc terminology) which ends up doing what is expected
here: the page pointer - input and output - should be in the same %rdi
register.

Other register classes have more than one register in them - example:
"r" and "=r" or "A":

  ‘A’
	The ‘a’ and ‘d’ registers.  This class is used for
	instructions that return double word results in the ‘ax:dx’
	register pair.  Single word values will be allocated either in
	‘ax’ or ‘dx’.

so using "D" and "=D" just works in this particular case.

And yes, one would say, sure, why don't you do "+D" but then:

  : "+r" (current_stack_pointer), "+D" (page)
  : [old] "i" (clear_page_orig), [new1] "i" (clear_page_rep), [new2] "i" (clear_page_erms),
  : "cc", "memory", "rax", "rcx")

now find the Waldo^Wcomma which throws a wrench into all this.

Because that silly macro has an "input..." consume-all last macro arg
and in it, one is supposed to supply input *and* clobbers, leading to
silly syntax snafus.

Yap, they need to be cleaned up, one fine day...

Closes: https://lore.kernel.org/oe-kbuild-all/202406141648.jO9qNGLa-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240625112056.GDZnqoGDXgYuWBDUwu@fat_crate.local
2024-07-01 12:41:11 +02:00
Andrew Cooper
34b3fc558b x86/cpu/intel: Drop stray FAM6 check with new Intel CPU model defines
The outer if () should have been dropped when switching to c->x86_vfm.

Fixes: 6568fc18c2 ("x86/cpu/intel: Switch to new Intel CPU model defines")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20240529183605.17520-1-andrew.cooper3@citrix.com
2024-06-29 16:10:37 +02:00
Linus Torvalds
093d9603b6 x86: stop playing stack games in profile_pc()
The 'profile_pc()' function is used for timer-based profiling, which
isn't really all that relevant any more to begin with, but it also ends
up making assumptions based on the stack layout that aren't necessarily
valid.

Basically, the code tries to account the time spent in spinlocks to the
caller rather than the spinlock, and while I support that as a concept,
it's not worth the code complexity or the KASAN warnings when no serious
profiling is done using timers anyway these days.

And the code really does depend on stack layout that is only true in the
simplest of cases.  We've lost the comment at some point (I think when
the 32-bit and 64-bit code was unified), but it used to say:

	Assume the lock function has either no stack frame or a copy
	of eflags from PUSHF.

which explains why it just blindly loads a word or two straight off the
stack pointer and then takes a minimal look at the values to just check
if they might be eflags or the return pc:

	Eflags always has bits 22 and up cleared unlike kernel addresses

but that basic stack layout assumption assumes that there isn't any lock
debugging etc going on that would complicate the code and cause a stack
frame.
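For reference, the kind of check being removed can be sketched as below. This is an illustration of the heuristic quoted above with a hypothetical function name, not the removed code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Heuristic from the old comment: a saved eflags value has bits 22 and
 * up cleared, whereas a kernel return address has high bits set.
 */
static bool looks_like_eflags(uint64_t stack_word)
{
	return (stack_word >> 22) == 0;
}
```

The test only says what a stack word might be. It cannot tell whether the word was actually placed there by the lock function, which is why it breaks down as soon as the stack layout differs from the simplest case.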

It causes KASAN unhappiness reported for years by syzkaller [1] and
others [2].

With no real practical reason for this any more, just remove the code.

Just for historical interest, here's some background commits relating to
this code from 2006:

  0cb91a2293 ("i386: Account spinlocks to the caller during profiling for !FP kernels")
  31679f38d8 ("Simplify profile_pc on x86-64")

and a code unification from 2009:

  ef4512882d ("x86: time_32/64.c unify profile_pc")

but the basics of this thing actually go back to before the git tree.

Link: https://syzkaller.appspot.com/bug?extid=84fe685c02cd112a2ac3 [1]
Link: https://lore.kernel.org/all/CAK55_s7Xyq=nh97=K=G1sxueOFrJDAvPOJAL4TPTCAYvmxO9_A@mail.gmail.com/ [2]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-06-28 14:27:22 -07:00
Josh Poimboeuf
42c141fbb6 x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
In cloud environments it can be useful to *only* enable the vmexit
mitigation and leave syscalls vulnerable.  Add that as an option.

This is similar to the old spectre_bhi=auto option which was removed
with the following commit:

  36d4fe147c ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")

with the main difference being that this has a more descriptive name and
is disabled by default.
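
A minimal sketch of how such an option value might be mapped to a
mitigation mode.  The enum, its member names and parse_spectre_bhi() are
hypothetical, and the "on" and "off" values are assumed from the existing
spectre_bhi= option; this is not the kernel's parser.

```c
#include <string.h>

/* Hypothetical mode names for illustration; not the kernel's identifiers. */
enum bhi_mode {
	BHI_OFF,
	BHI_ON,
	BHI_VMEXIT_ONLY,	/* mitigate VM exits only, leave syscalls alone */
};

/*
 * Map the value of a spectre_bhi= option to a mode.  Unknown or missing
 * values leave the current mode unchanged.
 */
static enum bhi_mode parse_spectre_bhi(const char *str, enum bhi_mode current)
{
	if (!str)
		return current;
	if (!strcmp(str, "off"))
		return BHI_OFF;
	if (!strcmp(str, "on"))
		return BHI_ON;
	if (!strcmp(str, "vmexit"))
		return BHI_VMEXIT_ONLY;
	return current;
}
```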

Mitigation switch requested by Maksim Davydov <davydov-max@yandex-team.ru>.

  [ bp: Massage. ]

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/2cbad706a6d5e1da2829e5e123d8d5c80330148c.1719381528.git.jpoimboe@kernel.org
2024-06-28 15:35:54 +02:00
Rafael J. Wysocki
b11ec63abe Merge back cpufreq material for v6.11. 2024-06-27 21:20:10 +02:00
Alexey Makhalov
57b7b6acb4 x86/vmware: Add TDX hypercall support
VMware hypercalls use I/O port, VMCALL or VMMCALL instructions.  Add a call to
__tdx_hypercall() in order to support TDX guests.

No change in high bandwidth hypercalls, as only low bandwidth ones are supported
for TDX guests.

  [ bp: Massage, clear on-stack struct tdx_module_args variable. ]

Co-developed-by: Tim Merrifield <tim.merrifield@broadcom.com>
Signed-off-by: Tim Merrifield <tim.merrifield@broadcom.com>
Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240613191650.9913-9-alexey.makhalov@broadcom.com
2024-06-25 17:15:48 +02:00
Alexey Makhalov
86cb65448d x86/vmware: Correct macro names
VCPU_RESERVED and LEGACY_X2APIC are not VMware hypercall commands.  These are
bits in the return value of the VMWARE_CMD_GETVCPU_INFO command.  Change the
VMWARE_CMD_ prefix to GETVCPU_INFO_ and move the bit-shift operation into the
macro body.
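
A rough before/after sketch of the rename.  The bit positions and the
helper names are assumptions made for illustration; only the
GETVCPU_INFO_ prefix and the two bit names come from this change.

```c
#include <stdint.h>

/* Old style: the macro is a bare bit position, so every user shifts by hand. */
#define OLD_LEGACY_X2APIC_SHIFT		3	/* assumed position */

/* New style: named after GETVCPU_INFO, with the shift in the macro body. */
#define GETVCPU_INFO_LEGACY_X2APIC	(1u << 3)	/* assumed position */
#define GETVCPU_INFO_VCPU_RESERVED	(1u << 31)	/* assumed position */

/* info stands for the value returned by the GETVCPU_INFO command. */
static int has_legacy_x2apic_old(uint32_t info)
{
	return (info & (1u << OLD_LEGACY_X2APIC_SHIFT)) != 0;
}

static int has_legacy_x2apic_new(uint32_t info)
{
	return (info & GETVCPU_INFO_LEGACY_X2APIC) != 0;
}
```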

Fixes: 4cca6ea04d ("x86/apic: Allow x2apic without IR on VMware platform")
Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240613191650.9913-7-alexey.makhalov@broadcom.com
2024-06-25 17:15:48 +02:00
Alexey Makhalov
b2c13c23ea x86/vmware: Use VMware hypercall API
Remove VMWARE_CMD macro and move to vmware_hypercall API.
No functional changes intended.

Use u32/u64 instead of uint32_t/uint64_t across the file.

Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240613191650.9913-6-alexey.makhalov@broadcom.com
2024-06-25 17:15:47 +02:00
Alexey Makhalov
34bf25e820 x86/vmware: Introduce VMware hypercall API
Introduce a vmware_hypercall family of functions. It is a common implementation
to be used by the VMware guest code and virtual device drivers in an
architecture-independent manner.

The API consists of vmware_hypercallX and vmware_hypercall_hb_{out,in}
set of functions analogous to KVM's hypercall API. Architecture-specific
implementation is hidden inside.

It will simplify future enhancements to VMware hypercalls, such as SEV-ES and
TDX related changes, without needing to modify callers in device driver code.

The current implementation extends an idea from

  bac7b4e843 ("x86/vmware: Update platform detection code for VMCALL/VMMCALL hypercalls")

to have a slow but safe path, vmware_hypercall_slow(), early during boot
when alternatives are not yet applied.  The code inherits VMWARE_CMD logic from
the commit mentioned above.
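
The slow-path idea can be illustrated with a small userspace mock.  Every
name here is a hypothetical stand-in, and the return values are markers
that only show which path ran; the real functions issue hypercalls and
have different signatures.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins for illustration. */
static bool alternatives_applied;	/* set once boot-time patching is done */

/* Stand-in for the slow but safe path used early during boot. */
static uint32_t hypercall_slow(uint32_t cmd)
{
	return cmd | 0x100;	/* marker: slow path ran */
}

/* Stand-in for the patched fast path used after alternatives are applied. */
static uint32_t hypercall_fast(uint32_t cmd)
{
	return cmd | 0x200;	/* marker: fast path ran */
}

/* Callers use one entry point and never choose the path themselves. */
static uint32_t hypercall1(uint32_t cmd)
{
	if (!alternatives_applied)
		return hypercall_slow(cmd);
	return hypercall_fast(cmd);
}
```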

Move common macros from vmware.c to vmware.h.

  [ bp: Fold in a fix:
    https://lore.kernel.org/r/20240625083348.2299-1-alexey.makhalov@broadcom.com ]

Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240613191650.9913-2-alexey.makhalov@broadcom.com
2024-06-25 17:01:33 +02:00