Commit Graph

1220732 Commits

Author SHA1 Message Date
Kent Overstreet
8480905765 bcachefs: Improve io option handling in data move path
The data move path now correctly picks IO options when inodes in
different snapshots have different options applied.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
bbe682c767 bcachefs: Ensure devices are always correctly initialized
We can't mark device superblocks or allocate journal on a device that
isn't online.

That means we may need to do this on every mount, because we may have
formatted a new filesystem and then done the first mount
(bch2_fs_initialize()) in degraded mode.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
d0261559c4 bcachefs: Delete duplicate time stats initialization
This code duplicated initialization already done in
bch2_fs_btree_iter_init().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
e38356d65e bcachefs: Kill dead code extent_save()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
253ba178c8 bcachefs: Fix ca->oldest_gen allocation
The ca->oldest_gen array needs to be the same size as the bucket_gens
array; ca->mi.nbuckets is updated with only state_lock held, not
gc_lock, so bch2_gc_gens() could race with device resize and allocate
too small of an oldest_gens array.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
a1d97d8417 bcachefs: Fix shrinker names
Shrinkers are now exported to debugfs, so the names can't have slashes
in them.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
50a38ca1ba bcachefs: Fix btree_node_type enum
More forwards compatibility fixups: having BKEY_TYPE_btree at the end of
the enum conflicts with unnkown btree IDs, this shifts BKEY_TYPE_btree
to slot 0 and fixes things up accordingly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
88dfe193bd bcachefs: bch2_btree_id_str()
Since we can run with unknown btree IDs, we can't directly index btree
IDs into fixed size arrays.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
b0b5bbf99f bcachefs: Don't run bch2_delete_dead_snapshots() unnecessarily
Be a bit more careful about when bch2_delete_dead_snapshots needs to
run: it only needs to run synchronously if we're running fsck, and it
only needs to run at all if we have snapshot nodes to delete or if fsck
has noticed that it needs to run.

Also:
  Rename BCH_FS_HAVE_DELETED_SNAPSHOTS -> BCH_FS_NEED_DELETE_DEAD_SNAPSHOTS

  Kill bch2_delete_dead_snapshots_hook(), move functionality to
  bch2_mark_snapshot()

  Factor out bch2_check_snapshot_needs_deletion(), to explicitly check
  if we need to be running snapshot deletion.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
0dd092bf10 bcachefs: Fix lock ordering with snapshot_create_lock
We must not hold btree locks while taking snapshot_create_lock - this
fixes a lockdep splat.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:11 -04:00
Kent Overstreet
20e425d301 six locks: Lock contended tracepoints
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 22:26:45 -04:00
Kent Overstreet
ee526b88ca closures: Fix race in closure_sync()
As pointed out by Linus, closure_sync() was racy; we could skip blocking
immediately after a get() and a put(), but then that would skip any
barrier corresponding to the other thread's put() barrier.

To fix this, always do the full __closure_sync() sequence whenever any
get() has happened and the closure might have been used by other
threads.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 21:48:22 -04:00
Kent Overstreet
2bce6368c4 closures: Better memory barriers
atomic_(dec|sub)_return_release() are a thing now - use them.

Also, delete the useless barrier in set_closure_fn(): it's redundant
with the memory barrier in closure_put(0.

Since closure_put() would now otherwise just have a release barrier, we
also need a new barrier when the ref hits 0 -
smp_acquire__after_ctrl_dep().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 21:48:22 -04:00
Linus Torvalds
cd063c8b9e Misc fixes and cleanups:
- Fix potential MAX_NAME_LEN limit related build failures
  - Fix scripts/faddr2line symbol filtering bug
  - Fix scripts/faddr2line on LLVM=1
  - Fix scripts/faddr2line to accept readelf output with mapping symbols
  - Minor cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU88VYRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g2rQ//dvzezrAs+ZEhKLbRLSabbAlCeJ+J9zuP
 c0xBmaLwUh47sSDKfBLLEFN3IMDfgMdKjfb3E32vT/WQ+ASdfEMs6FfwRtaErypG
 XfZFpfC2WE1+Gq0MAgrXYuQgDv1Lygdimoy0aCwMlrgb7ZgWL1xorG0VSEemyKhd
 CoRFURKjeJIKJN1oOvTXKhp/SZyk39KHXeF4qSAjIGkrzsfDtEUSNR6NjBmeGUS4
 zNVWus/CucHK/6MMpHtdWw1/Ygemc1CBzYC3ZSMGimqy4Rqe2RsiGa0Y3XhlMCyn
 ekNFuUm9bxStaTknM3ZXga0xHPdKnTPkihxykLDzo0Nh9eysuFlmFrFJ2xL/B87k
 IxlpXvwxjxTSmGDhGQFVnXma6M2le3YFWGClS8UyhSPG08qg09ClwZ8OtVDi8ITI
 rj0VoFbFLuc8aeHF/tyF2t323JmcMHq0aHi+kMUElszm6+B+fPnD54gHU+REXVxO
 YIRkK9RY52mfU4KFf8xlO/UhFF6nP8pgE8pVnNF4lC034M0t4z+i/TLjOsspjVt3
 yMoZakD7sfUkAaCBq4mVfdWwo5UzTVse0BarbEcKxoME6wLEfN+efE850zGdy7n1
 iRC9AddddEyo4BnSHbWdWu/PDYJKPiH7dAtHBcfnEMJjLQewnRHlsHHbCA55jtrX
 363jNE3x6K4=
 =9U5x
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
 "Misc fixes and cleanups:

   - Fix potential MAX_NAME_LEN limit related build failures

   - Fix scripts/faddr2line symbol filtering bug

   - Fix scripts/faddr2line on LLVM=1

   - Fix scripts/faddr2line to accept readelf output with mapping
     symbols

   - Minor cleanups"

* tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  scripts/faddr2line: Skip over mapping symbols in output from readelf
  scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
  scripts/faddr2line: Don't filter out non-function symbols from readelf
  objtool: Remove max symbol name length limitation
  objtool: Propagate early errors
  objtool: Use 'the fallthrough' pseudo-keyword
  x86/speculation, objtool: Use absolute relocations for annotations
  x86/unwind/orc: Remove redundant initialization of 'mid' pointer in __orc_find()
2023-10-30 13:20:02 -10:00
Linus Torvalds
63ce50fff9 Scheduler changes for v6.7 are:
- Fair scheduler (SCHED_OTHER) improvements:
 
     - Remove the old and now unused SIS_PROP code & option
     - Scan cluster before LLC in the wake-up path
     - Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
 
  - NUMA scheduling improvements:
 
     - Improve the VMA access-PID code to better skip/scan VMAs
     - Extend tracing to cover VMA-skipping decisions
     - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
     - Generalize numa_map_to_online_node()
 
  - Energy scheduling improvements:
 
     - Remove the EM_MAX_COMPLEXITY limit
     - Add tracepoints to track energy computation
     - Make the behavior of the 'sched_energy_aware' sysctl more consistent
     - Consolidate and clean up access to a CPU's max compute capacity
     - Fix uclamp code corner cases
 
  - RT scheduling improvements:
 
     - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
     - Drive the ->rto_mask with rt_rq->pushable_tasks updates
 
  - Scheduler scalability improvements:
 
     - Rate-limit updates to tg->load_avg
     - On x86 disable IBRS when CPU is offline to improve single-threaded performance
     - Micro-optimize in_task() and in_interrupt()
     - Micro-optimize the PSI code
     - Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
 
  - Core scheduler infrastructure improvements:
 
     - Use saved_state to reduce some spurious freezer wakeups
     - Bring in a handful of fast-headers improvements to scheduler headers
     - Make the scheduler UAPI headers more widely usable by user-space
     - Simplify the control flow of scheduler syscalls by using lock guards
     - Fix sched_setaffinity() vs. CPU hotplug race
 
  - Scheduler debuggability improvements:
     - Disallow writing invalid values to sched_rt_period_us
     - Fix a race in the rq-clock debugging code triggering warnings
     - Fix a warning in the bandwidth distribution code
     - Micro-optimize in_atomic_preempt_off() checks
     - Enforce that the tasklist_lock is held in for_each_thread()
     - Print the TGID in sched_show_task()
     - Remove the /proc/sys/kernel/sched_child_runs_first sysctl
 
  - Misc cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU8/NoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gN+xAAvKGYNZBCBG4jowxccgqAbCx81KOhhsy/
 KUaOmdLPg9WaXuqjZ5sggXQCMT0wUqBYAmqV7ts53VhWcma2I1ap4dCM6Jj+RLrc
 vNwkeNetsikiZtarMoCJs5NahL8ULh3liBaoAkkToPjQ5r43aZ/eKwDovEdIKc+g
 +Vgn7jUY8ssIrAOKT1midSwY1y8kAU2AzWOSFDTgedkJP4PgOu9/lBl9jSJ2sYaX
 N4XqONYPXTwOHUtvmzkYILxLz0k0GgJ7hmt78E8Xy2rC4taGCRwCfCMBYxREuwiP
 huo3O1P/iIe5svm4/EBUvcpvf44eAWTV+CD0dnJPwOc9IvFhpSzqSZZAsyy/JQKt
 Lnzmc/xmyc1PnXCYJfHuXrw2/m+MyUHaegPzh5iLJFrlqa79GavOElj0jNTAMzbZ
 39fybzPtuFP+64faRfu0BBlQZfORPBNc/oWMpPKqgP58YGuveKTWaUF5rl5lM7Ne
 nm07uOmq02JVR8YzPl/FcfhU2dPMawWuMwUjEr2eU+lAunY3PF88vu0FALj7iOBd
 66F8qrtpDHJanOxrdEUwSJ7hgw79qY1iw66Db7cQYjMazFKZONxArQPqFUZ0ngLI
 n9hVa7brg1bAQKrQflqjcIAIbpVu3SjPEl15cKpAJTB/gn5H66TQgw8uQ6HfG+h2
 GtOsn1nlvuk=
 =GDqb
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Fair scheduler (SCHED_OTHER) improvements:
   - Remove the old and now unused SIS_PROP code & option
   - Scan cluster before LLC in the wake-up path
   - Use candidate prev/recent_used CPU if scanning failed for cluster
     wakeup

  NUMA scheduling improvements:
   - Improve the VMA access-PID code to better skip/scan VMAs
   - Extend tracing to cover VMA-skipping decisions
   - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
   - Generalize numa_map_to_online_node()

  Energy scheduling improvements:
   - Remove the EM_MAX_COMPLEXITY limit
   - Add tracepoints to track energy computation
   - Make the behavior of the 'sched_energy_aware' sysctl more
     consistent
   - Consolidate and clean up access to a CPU's max compute capacity
   - Fix uclamp code corner cases

  RT scheduling improvements:
   - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
   - Drive the ->rto_mask with rt_rq->pushable_tasks updates

  Scheduler scalability improvements:
   - Rate-limit updates to tg->load_avg
   - On x86 disable IBRS when CPU is offline to improve single-threaded
     performance
   - Micro-optimize in_task() and in_interrupt()
   - Micro-optimize the PSI code
   - Avoid updating PSI triggers and ->rtpoll_total when there are no
     state changes

  Core scheduler infrastructure improvements:
   - Use saved_state to reduce some spurious freezer wakeups
   - Bring in a handful of fast-headers improvements to scheduler
     headers
   - Make the scheduler UAPI headers more widely usable by user-space
   - Simplify the control flow of scheduler syscalls by using lock
     guards
   - Fix sched_setaffinity() vs. CPU hotplug race

  Scheduler debuggability improvements:
   - Disallow writing invalid values to sched_rt_period_us
   - Fix a race in the rq-clock debugging code triggering warnings
   - Fix a warning in the bandwidth distribution code
   - Micro-optimize in_atomic_preempt_off() checks
   - Enforce that the tasklist_lock is held in for_each_thread()
   - Print the TGID in sched_show_task()
   - Remove the /proc/sys/kernel/sched_child_runs_first sysctl

  ... and misc cleanups & fixes"

* tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
  sched/fair: Remove SIS_PROP
  sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
  sched/fair: Scan cluster before scanning LLC in wake-up path
  sched: Add cpus_share_resources API
  sched/core: Fix RQCF_ACT_SKIP leak
  sched/fair: Remove unused 'curr' argument from pick_next_entity()
  sched/nohz: Update comments about NEWILB_KICK
  sched/fair: Remove duplicate #include
  sched/psi: Update poll => rtpoll in relevant comments
  sched: Make PELT acronym definition searchable
  sched: Fix stop_one_cpu_nowait() vs hotplug
  sched/psi: Bail out early from irq time accounting
  sched/topology: Rename 'DIE' domain to 'PKG'
  sched/psi: Delete the 'update_total' function parameter from update_triggers()
  sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
  sched/headers: Remove comment referring to rq::cpu_load, since this has been removed
  sched/numa: Complete scanning of inactive VMAs when there is no alternative
  sched/numa: Complete scanning of partial VMAs regardless of PID activity
  sched/numa: Move up the access pid reset logic
  sched/numa: Trace decisions related to skipping VMAs
  ...
2023-10-30 13:12:15 -10:00
Linus Torvalds
3cf3fabccb Locking changes in this cycle are:
- Futex improvements:
 
     - Add the 'futex2' syscall ABI, which is an attempt to get away from the
       multiplex syscall and adds a little room for extentions, while lifting
       some limitations.
 
     - Fix futex PI recursive rt_mutex waiter state bug
 
     - Fix inter-process shared futexes on no-MMU systems
 
     - Use folios instead of pages
 
  - Micro-optimizations of locking primitives:
 
     - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
       architectures, to improve lockref code generation.
 
     - Improve the x86-32 lockref_get_not_zero() main loop by adding
       build-time CMPXCHG8B support detection for the relevant lockref code,
       and by better interfacing the CMPXCHG8B assembly code with the compiler.
 
     - Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg()
       code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg().
 
     - Micro-optimize rcuref_put_slowpath()
 
  - Locking debuggability improvements:
 
     - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
 
     - Enforce atomicity of sched_submit_work(), which is de-facto atomic but
       was un-enforced previously.
 
     - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics
 
     - Fix ww_mutex self-tests
 
     - Clean up const-propagation in <linux/seqlock.h> and simplify
       the API-instantiation macros a bit.
 
  - RT locking improvements:
 
     - Provide the rt_mutex_*_schedule() primitives/helpers and use them
       in the rtmutex code to avoid recursion vs. rtlock on the PI state.
 
     - Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock()
       and rwbase_read_lock().
 
  - Plus misc fixes & cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU877IRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g9jw/+N7rxQ78dmFCYh4UWnLCYvuKP0/ivHErG
 493JcB8MupuA2tfJHIkDdr4aM2mNq2E61w69/WlZAQWWD6pdOhwgF5Xf5eoEcJm0
 vsAhWBGLxihXdtevPuMAx0dEpg3AMp2wc6i5PkN831KdPUgCNsrKq9Bfnfef7/G8
 MQTSHjmtba6jxleyxfEa4tE2xe5PJX825nRfkX2e1cf+stkYua+uJFxVxUfxFWGE
 4pBy70D9OC7MsJ44WWOA1gwkVtMMiBTmRPNjlP8Gz2GQ0f3ERHRwYk3jDHOPHZI6
 0GNt7pE3IMXQn2UuDtfkvv9IFTd+U5qD+APnWIn2ntWXqzGLFqOlmovMrobVn7El
 olYDCyweWPG71m1Qblsb1VK2QjRPQVJ9NAEg8RlDHIu2ThxHbMysDVGPVOYnPFq4
 S8QFpmldzbNoPU4rDJyT1fAmoUIrusBHkl+Us3yGfC74iM+fHnDEvaSoMZbzEdY1
 x/Nocj9XgKEgfXdYzrCWFmZ9xXqHkO25/wDL6yKqBdQtvaEalXuHTT6mQcYxrUPm
 Xx1BPan2Jg7p4u2oOFcVtKewUtRH9KBx8qytr5S+JK4PJbrBsixMnr84HLd/3X2V
 ykYkO+367T5MTYv4TnJDE5vdurzUqekKSCFPY3skPujPJfdLj1vsPzYf9iMkCLdo
 hU2f/R+Wpdk=
 =36Ff
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Info Molnar:
 "Futex improvements:

   - Add the 'futex2' syscall ABI, which is an attempt to get away from
     the multiplex syscall and adds a little room for extentions, while
     lifting some limitations.

   - Fix futex PI recursive rt_mutex waiter state bug

   - Fix inter-process shared futexes on no-MMU systems

   - Use folios instead of pages

  Micro-optimizations of locking primitives:

   - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
     architectures, to improve lockref code generation

   - Improve the x86-32 lockref_get_not_zero() main loop by adding
     build-time CMPXCHG8B support detection for the relevant lockref
     code, and by better interfacing the CMPXCHG8B assembly code with
     the compiler

   - Introduce arch_sync_try_cmpxchg() on x86 to improve
     sync_try_cmpxchg() code generation. Convert some sync_cmpxchg()
     users to sync_try_cmpxchg().

   - Micro-optimize rcuref_put_slowpath()

  Locking debuggability improvements:

   - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well

   - Enforce atomicity of sched_submit_work(), which is de-facto atomic
     but was un-enforced previously.

   - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check
     semantics

   - Fix ww_mutex self-tests

   - Clean up const-propagation in <linux/seqlock.h> and simplify the
     API-instantiation macros a bit

  RT locking improvements:

   - Provide the rt_mutex_*_schedule() primitives/helpers and use them
     in the rtmutex code to avoid recursion vs. rtlock on the PI state.

   - Add nested blocking lockdep asserts to rt_mutex_lock(),
     rtlock_lock() and rwbase_read_lock()

  .. plus misc fixes & cleanups"

* tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  futex: Don't include process MM in futex key on no-MMU
  locking/seqlock: Fix grammar in comment
  alpha: Fix up new futex syscall numbers
  locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts
  locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning
  locking/seqlock: Change __seqprop() to return the function pointer
  locking/seqlock: Simplify SEQCOUNT_LOCKNAME()
  locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
  locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg()
  locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
  locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback
  locking/seqlock: Fix typo in comment
  futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic()
  locking/local, arch: Rewrite local_add_unless() as a static inline function
  locking/debug: Fix debugfs API return value checks to use IS_ERR()
  locking/ww_mutex/test: Make sure we bail out instead of livelock
  locking/ww_mutex/test: Fix potential workqueue corruption
  locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup
  futex: Add sys_futex_requeue()
  futex: Add flags2 argument to futex_requeue()
  ...
2023-10-30 12:38:48 -10:00
Linus Torvalds
9cda4eb04a - A kernel-doc fix
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU8zJkACgkQEsHwGGHe
 VUpXxg/+Jn2nZG6jJr1PelUnGGBMY3GZuXmB947Ueb9BgbZEvi2VCp93ipO6EkWl
 ba0/3Uw0nO29FHsVM508sbLTHqo5RaMsjfg933HKOwiNx9pPIVtYfB4bs2b1Habq
 qCSCCUTNhNuDnxDduE/tglDVS5vab0fhODLnkYnzrvJ7D/3Zn5BwETbTYslRdnmD
 sIA8R1msot8n+a9aqHb99Lp+wmkinCsuhJbf+rIajVZUAFCSF+sL3hhaVIExgIWq
 3spI7ldKuxcNunmwB/LZYFGtQBeJb5r1JAg2LPOhui1W2D+oRbdXZCh8K2wA7Hec
 z+oSkep0qVbOED+GBfTDqxHYAiBkM0b1QwZdyr1cI3qSSZlb9Z7DreAewS5cxbj/
 hhYQl7voNDn7JH3ROOVyoo5seYIhtjO6GMQ7pPmXkvtXgm05cm7v8MDVN6m/ECPm
 1Iv9paI6EfVGORQLcG7ZndzLkN91KgT3h3OfYGISTrNObmDzMNxPSJT3/GY7TJ8l
 V/QR5hzfw7nuWBES1FRypfB9H8qiezfitMuaEmCHawJiSIwOBbsDHX0mAJzbeU9U
 64HFn98NHDRQLWFIq3s1Vl4omgVGMF0LI6FNOADlEIW0cWzxMk1tLkfVqhhUk6Ar
 b3f8DFAYXSCLSiTvm9yNErdQoSn5g2e+V1Ugs8Qu2pgRhmd1nh0=
 =E12/
 -----END PGP SIGNATURE-----

Merge tag 'x86_fpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu fixlet from Borislav Petkov:

 - kernel-doc fix

* tag 'x86_fpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/xstate: Address kernel-doc warning
2023-10-30 12:36:41 -10:00
Linus Torvalds
f155f3b3ed - Make sure PCI function 4 IDs of AMD family 0x19, models 0x60-0x7f are
actually used in the amd_nb.c enumeration
 
 - Add support for extracting NUMA information from devicetree for
   Hyper-V usages
 
 - Add PCI device IDs for the new AMD MI300 AI accelerators
 
 - Annotate an array in struct uv_rtc_timer_head with the new
   __counted_by attribute
 
 - Rework UV's NMI action parameter handling
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU78I4ACgkQEsHwGGHe
 VUoDFQ//a8KYzcj2wC+GA+tAScMCP4lT3hsSYoHvFE6Fd4HBngMUBcaEHY0InxWc
 jfXRHFZBS54sLMH2XPVOnozTBDLUEF+4cnApj9iVf//RwjSz/mnJ1P/fqwGQHmfY
 eyOKn38nzkdcf+MW69HAyOeOLxOhOcxEdrzJ4AuPijjTH0GfHWXtuZrqCgaRB6XZ
 yYw7PwO9azSBIYA3c87IPo5uGDa0Ht/BWWHv+XD21DjH0E2n2jaGPgExojBwpI1F
 zJZkx8xJWlQANSD9qiYWoYFdEluGAJ6Fyn989hMRVUqFLbGB9ppEu0BD5TWor08A
 UMX3LYIXM+706EnpOYKc5m5dajbUNOEUl73M6S6J67ugvRUWjCZpoejEx1Aejup7
 aCcYrkbSiuVPFn8Wj9/JdkItu+zyzzsp3fjEdYlYVfefLh3EY4IrFAkQGUk7A0cA
 p6Idau8j4Cx+4lvNEFGMk/u7SAxcInJx7NDy8rJL9CG1BdbjzkwJH+GJFELuI/zM
 dSd+N2r5YUMClHR8Tor17PDh61i+ovg1kPhOyDyf7bNv0HQG0Hbcded4lyiVrsXa
 m5GNYRXEQFZXMeBxo6dJKDAj2rdSPsACC2FBTa9B+bciklcWid4FVTvf/9sSFeQv
 cvGFfGbFdW/uddAHsWqHMiuTfBsQneZ9UbZCKq/kAkHAtfdhRa8=
 =7JVZ
 -----END PGP SIGNATURE-----

Merge tag 'x86_platform_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Borislav Petkov:

 - Make sure PCI function 4 IDs of AMD family 0x19, models 0x60-0x7f are
   actually used in the amd_nb.c enumeration

 - Add support for extracting NUMA information from devicetree for
   Hyper-V usages

 - Add PCI device IDs for the new AMD MI300 AI accelerators

 - Annotate an array in struct uv_rtc_timer_head with the new
   __counted_by attribute

 - Rework UV's NMI action parameter handling

* tag 'x86_platform_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/amd_nb: Use Family 19h Models 60h-7Fh Function 4 IDs
  x86/numa: Add Devicetree support
  x86/of: Move the x86_flattree_get_config() call out of x86_dtb_init()
  x86/amd_nb: Add AMD Family MI300 PCI IDs
  x86/platform/uv: Annotate struct uv_rtc_timer_head with __counted_by
  x86/platform/uv: Rework NMI "action" modparam handling
2023-10-30 12:32:48 -10:00
Linus Torvalds
ca2e9c3bee - Make sure the "svm" feature flag is cleared from /proc/cpuinfo when
virtualization support is disabled in the BIOS on AMD and Hygon
   platforms
 
 - A minor cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU77KoACgkQEsHwGGHe
 VUrophAAtfsB+WhRydin0V6kjQeH+RbiWyx/jOw6eNqvzOzaOPxVXn0cAHRSgAO4
 +S8tKIqaWpXNNNKpOIKBVaDkh9qr50/p36/jfVkXi8GOLYrK633F0BMjcG4+/vYQ
 A9b5iNiJhZ7xWE6+qRrqdg+o+a6UyPUGz34HNp3KwJVTdaHU2OnXXwuWeiUkgRrJ
 uQSfLc4+UIeefIzNy8Tqg083iaENBYMya7U90rzewD64NF0bsA15AEPut/6tnUVq
 ej3UU3cqO7nKXyhuZX+zpt856MZFa1rNYVXUAfoAO4xhqdN0Q5LFWO506sqajNx/
 hqbT+hKDoC03zuLmbZO21s/uWQdtVFo63FU0h9QBRp1m6Ug5P3rQQCK8ydJc5xwr
 Yd7je6UPK9jIKBo9VP1qmsyzGwADNevNf1qGExHI2T6Wml7HgDmPysAHnGiKqRGI
 1o9+Yqa+VBt8Wml9M8Ny+dLyr5F/2uq8sMrQedQlXdFMSzVm2JYecukJ5BvUWE/r
 Qyll8mTpIdgGXjBt56lMrgH7ibMC5ct/4MvTHOHuA997g/PwuwtWj7QyKXpUq2Rf
 o/c3zKKWIFxevjzwU86haCBaz+5xAQlB6dJw61ExxsmUuT/kZzkN15w6aqGZtpns
 PsARwnvuwZJ7vfqFLIa0ZkPN4OgnkRX7HlNqrVyKpONDTocZd9E=
 =i9On
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpuid updates from Borislav Petkov:

 - Make sure the "svm" feature flag is cleared from /proc/cpuinfo when
   virtualization support is disabled in the BIOS on AMD and Hygon
   platforms

 - A minor cleanup

* tag 'x86_cpu_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/amd: Remove redundant 'break' statement
  x86/cpu: Clear SVM feature if disabled by BIOS
2023-10-30 12:10:24 -10:00
Linus Torvalds
9ab021a1b5 - Add support for non-contiguous capacity bitmasks being added to
Intel's CAT implementation
 
 - Other improvements to resctrl code: better configuration,
   simplifications, debugging support, fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU7x3YACgkQEsHwGGHe
 VUqj4BAAn9HiQPuBWW5UPVpLoHBmKHtoNuIn2AWD3wcRwFwd+mO1JbPgQzMude0C
 QV/Dpm+PPxyFNATtCiRtqns3qHSt8wVMy/mrOKT7R20mmBxIhMb+323YvoamFSzc
 gamSKDZJFEp8Dqj2ccnwpFIdPjlTZGuOCumcxrHbrEs10ezsZ1UCSOjlQrRJSrFX
 M/KCVmrwVt6VR24Sz3K7e6atWgnl5Gj926VB0wXFSVAHII22Pirx7rdsHZXkgI0z
 pBHomVyZxyjzo3XV9szG1h+3iTPIebWH6A+25YGpmh7PZeFzuJhn6XXbpZ35tjSw
 3EKjkbwJyDXLLfAV+MMzVYZeCzpxy5MEuTW6aNi4y59k6GAhyeHClq9HWePo8rp7
 lMXVfSeFpdtG0n7WUVF2ctm7mAqTF8id2WGNfvXoP/bzB2mkXQ4x1GV+TvxAyex8
 OAk7iHk0IhrakfDj1XAE1o0BiSkGKaNo0eT8LnuByaUvHQHSBo/24fFcFOC9V1NL
 K4eGfgn7yyXBFWvJch5LtdmG3LHQEJ9Dh4zZ8TkyHOt42Lc32JdHtBrRGdWRMyq9
 p5lhLvPwuumjfjTeaXG4ABacdED1a8fiUzydumHmAux+an7irTqxwP51Y9MrxR1O
 37+YBgEcO2nmubCUKUjvBga4ztFo0f/hMVqGqnME+tYwr7NzXx8=
 =GGQX
 -----END PGP SIGNATURE-----

Merge tag 'x86_cache_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resource control updates from Borislav Petkov:

 - Add support for non-contiguous capacity bitmasks being added to
   Intel's CAT implementation

 - Other improvements to resctrl code: better configuration,
   simplifications, debugging support, fixes

* tag 'x86_cache_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Display RMID of resource group
  x86/resctrl: Add support for the files of MON groups only
  x86/resctrl: Display CLOSID for resource group
  x86/resctrl: Introduce "-o debug" mount option
  x86/resctrl: Move default group file creation to mount
  x86/resctrl: Unwind properly from rdt_enable_ctx()
  x86/resctrl: Rename rftype flags for consistency
  x86/resctrl: Simplify rftype flag definitions
  x86/resctrl: Add multiple tasks to the resctrl group at once
  Documentation/x86: Document resctrl's new sparse_masks
  x86/resctrl: Add sparse_masks file in info
  x86/resctrl: Enable non-contiguous CBMs in Intel CAT
  x86/resctrl: Rename arch_has_sparse_bitmaps
  x86/resctrl: Fix remaining kernel-doc warnings
2023-10-30 12:07:29 -10:00
Linus Torvalds
f84a52eef5 - A bunch of improvements, cleanups and fixlets to the SRSO mitigation
machinery and other, general cleanups to the hw mitigations code,
   by Josh Poimboeuf
 
 - Improve the return thunk detection by objtool as it is absolutely
   important that the default return thunk is not used after returns
   have been patched. Future work to detect and report this better is
   pending
 
 - Other misc cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU7mFEACgkQEsHwGGHe
 VUpbBxAAtS4X5LCntPWUsDEBU80SBYAunEp0Wd0ttYEj+UrEk4tvnWVGFiIEr47A
 PrRKK9JCJtC6ko0+dwPtMi66L/T7mCpoNPI1kzfRG1IHJBfvCTGJhzZsesogvkA2
 1X9Je+RCVW4xVybIryxhjMGdB6jUoGEU1a4DmQXq481qiLB3ilvA1bIAaNo9BBYP
 rxKPrPcdOxn2NjxuOWg+FXjSc8LuAVSu3HqsgCW2AHJ6XIKEYWEq9FkXhwj9OJOr
 ax1F4qD1IY++jYZO9DJiltjeJyj0wC+yp8kDDURoLbcTk85WHlpD5vK0g64mELOA
 y0375thHep+vsrtQ/qZAmi/eVTaTekgbi7McahjoZebK7FbKOYRk6GZ+5+m29AVr
 DfQSJ7xQQqbCbpimeFmZ+gQf7mFexyDWvjUPyBl+OelOY1umdPM9IZVTnqib5LPr
 D2M+uqWfJhSwACi2o05LRv0gyhkAz0bGHrwZPmCVuxE5kBbhOpj4aT87fetUp/MW
 8lEFa3PHx/gkh2VOJ7ZgKzpeD75Vjo8TRAXOe4O2jn/L54gNEJ+1mukvrjW3+lp1
 ShmcZokl3ldPq6F5ioE+u45hVAfHkaruWM+5Rj3hsA/fdFN3isTVLhIRIsypPTKc
 p1ITT8Yhek8vkm9PcRBE5xWRmEZ2XE5ooDld930nJxra8QNVVQw=
 =E7c4
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 hw mitigation updates from Borislav Petkov:

 - A bunch of improvements, cleanups and fixlets to the SRSO mitigation
   machinery and other, general cleanups to the hw mitigations code, by
   Josh Poimboeuf

 - Improve the return thunk detection by objtool as it is absolutely
   important that the default return thunk is not used after returns
   have been patched. Future work to detect and report this better is
   pending

 - Other misc cleanups and fixes

* tag 'x86_bugs_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/retpoline: Document some thunk handling aspects
  x86/retpoline: Make sure there are no unconverted return thunks due to KCSAN
  x86/callthunks: Delete unused "struct thunk_desc"
  x86/vdso: Run objtool on vdso32-setup.o
  objtool: Fix return thunk patching in retpolines
  x86/srso: Remove unnecessary semicolon
  x86/pti: Fix kernel warnings for pti= and nopti cmdline options
  x86/calldepth: Rename __x86_return_skl() to call_depth_return_thunk()
  x86/nospec: Refactor UNTRAIN_RET[_*]
  x86/rethunk: Use SYM_CODE_START[_LOCAL]_NOALIGN macros
  x86/srso: Disentangle rethunk-dependent options
  x86/srso: Move retbleed IBPB check into existing 'has_microcode' code block
  x86/bugs: Remove default case for fully switched enums
  x86/srso: Remove 'pred_cmd' label
  x86/srso: Unexport untraining functions
  x86/srso: Improve i-cache locality for alias mitigation
  x86/srso: Fix unret validation dependencies
  x86/srso: Fix vulnerability reporting for missing microcode
  x86/srso: Print mitigation for retbleed IBPB case
  x86/srso: Print actual mitigation if requested mitigation isn't possible
  ...
2023-10-30 11:48:49 -10:00
Linus Torvalds
01ae815c50 - Specify what error addresses reported on AMD are actually usable
memory error addresses for further decoding
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU7kb8ACgkQEsHwGGHe
 VUq3/A//VTrsON+RRS+M7PVewXMiTbwjVytum/9gWXtuUBEFdWCQjCe4TSaI6+mX
 v8inAomBE7s3SoQYkosF1VO2l0r68aJLOm6hczzbjz+ZjGvramDiv5qCs0iMM8m4
 Nvwyjeo1+2G6JeaX2rR7fqnZkA4NcYE1/s05pksNEaXMsAhpSOWenRgUK1EyQXLE
 y1u63G5GLMT4cpjEmEcbp9Lb02WwQzB9inZ1f4MFoujkI5VJ/9b68D+DpGwHd4Ag
 HNrg6LR/YpVwioVnsa+xEiQSxxwuRCHvS8kbc27d3qhfT4cRhmtAIsHYkyeO75TJ
 jkXU1Gme/k2RDEYHOz7heSVWgmG3y9/swc3UZJFE0QAnjdajY7mUsM9+o5uCm4Y6
 rALf8z7t0+oMpG1YML5Y+0wvgcPk9pih6Mm9tbBlFCXPi2OQ5bieNkHe7RQXHcQx
 xwFoQI0ByWvW7omu+jqA8iN4YSLaQhST2wzghPF1Wu7KAewu5lpU+9kmgmL7utme
 aHIQFdhRFusYEABqlr8XQSew5FMtIfOeWWCdgWHghUQp6LCsA0QeUxcQR9VdY9Th
 IgY1j4G2lQeLpDlnWE9VPMCWk4cuTABRyVWEu1B4wScU3xRWD8jOh+LcF76RdoYz
 k7GW0d68DGDCRgU7q86LNM/bNH0zyIIteTj64uHBzQm0ygJcdZU=
 =NcVX
 -----END PGP SIGNATURE-----

Merge tag 'ras_core_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RAS updates from Borislav Petkov:

 - Specify what error addresses reported on AMD are actually usable
   memory error addresses for further decoding

* tag 'ras_core_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Cleanup mce_usable_address()
  x86/mce: Define amd_mce_usable_address()
  x86/MCE/AMD: Split amd_mce_is_memory_error()
2023-10-30 11:47:03 -10:00
Linus Torvalds
66cc8838c7 - A new EDAC driver for Xilinx's Versal integrated memory controller
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmU7igwACgkQEsHwGGHe
 VUowxg/+OB0V3j93bL9TCakU8wa/yoe8nMZZYdnnPUadeC8nKzoW8hblxkLy5b25
 3WF8e+CtcergGNqNhPbl+1ozkZCtrVs71JuWRW/TASNoW66nFPJhWtqSwg7TJuMl
 PPkwodDDKznxTPpGshpw2S1ZR8mU02ts5aws5js8+VuVb8EcEGX5wXdW93zf3Ei8
 ZtiSJbxh8Wc4gtqq8iMqXPDiMCoC+oyZ1Y1SQu0h1ZGE5ui2iLMNcagwB2dmQP3q
 NJYDkZK2aNkPmhqzkZsxtIFzDpXLMzJwgTYqd6Szk3wZuB0K/3rvKq2Eb5dbIzlw
 X4z6Y1KdhQnQi1Q6EmE6w1qdrrVGcwARRW1z5exQ9ZS3xqQ9OPBS4IlU+Z+2msTJ
 1rL81PFgta6b482N87sxeekPt6Y7DMZ+9/UvAwo7YnNq9W+/Q7l/UZXt7DqBjV2G
 ddyCXHgdWH/365w+EGWLt6tnoQjcXIeA7CJpCJZ/ebpFTRJ43qYj1DhJfcDtIj5L
 F0Ai9/s/TjnxY8V6sGLn+OlL/MzmNCIioPzhBg1YKcbH3sNO30BE/oAUPgIIqFtl
 k56bMXO0N2Sn0OjkFpkIGJtDsUW6VRnpmL53CvdHCZUKp0aSRybsL5AlCsHNaCsb
 p4KVvP2ONzscO+DPf5dOz2/kVZjPZneMyy/5CvRaNWeOkhx/FO0=
 =3V+h
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - A new EDAC driver for Xilinx's Versal integrated memory controller

* tag 'edac_updates_for_v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/versal: Add a Xilinx Versal memory controller driver
  dt-bindings: memory-controllers: Add support for Xilinx Versal EDAC for DDRMC
2023-10-30 11:45:36 -10:00
Linus Torvalds
9e87705289 Initial bcachefs pull request for 6.7-rc1
Here's the bcachefs filesystem pull request.
 
 One new patch since last week: the exportfs constants ended up
 conflicting with other filesystems that are also getting added to the
 global enum, so switched to new constants picked by Amir.
 
 I'll also be sending another pull request later on in the cycle bringing
 things up to date my master branch that people are currently running;
 that will be restricted to fs/bcachefs/, naturally.
 
 Testing - fstests as well as the bcachefs specific tests in ktest:
   https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-for-upstream
 
 It's also been soaking in linux-next, which resulted in a whole bunch of
 smatch complaints and fixes and a patch or two from Kees.
 
 The only new non fs/bcachefs/ patch is the objtool patch that adds
 bcachefs functions to the list of noreturns. The patch that exports
 osq_lock() has been dropped for now, per Ingo.
 
 Prereq patch list:
 
 faf1dce852 objtool: Add bcachefs noreturns
 73badee428 lib/generic-radix-tree.c: Add peek_prev()
 9492261ff2 lib/generic-radix-tree.c: Don't overflow in peek()
 0fb5d567f5 MAINTAINERS: Add entry for generic-radix-tree
 b414e8ecd4 closures: Add a missing include
 48b7935722 closures: closure_nr_remaining()
 ced58fc7ab closures: closure_wait_event()
 bd0d22e41e MAINTAINERS: Add entry for closures
 8c8d2d9670 bcache: move closures to lib/
 957e48087d locking: export contention tracepoints for bcachefs six locks
 21db931445 lib: Export errname
 83feeb1955 lib/string_helpers: string_get_size() now returns characters wrote
 7d672f4094 stacktrace: Export stack_trace_save_tsk
 771eb4fe8b fs: factor out d_mark_tmpfile()
 2b69987be5 sched: Add task_struct->faults_disabled_mapping
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmU/wyIACgkQE6szbY3K
 bnZc1xAAqjQBGXdtgtKQvk0/ru0WaMZguMsOHd3BUXIbm30F6eJqnoXQ/ahALofc
 Ju6NrOgcy9wmdPKWpbeF+aK3WnkAW9jShDd0QieVH6PkhyYyh5r11iR/EVtjjLu5
 6Teodn8fyTqn9WSDtKG15QreTCJrEasAoGFQKQDA8oiXC7zc+RSpLUkkTWD/pxyW
 zVqkGGiAUG4x6FON+X2a3QBa9WCahIgV6XzHstGLsmOECxKO/LopGR5jThuIhv9t
 Yo0wodQTKAgb9QviG6V3f2dJLQKKUVDmVEGTXv+8Hl3d8CiYBJeIh+icp+VESBo1
 m8ev0y2xbTPLwgm5v0Uj4o/G8ISZ+qmcexV2zQ9xUWUAd2AjEBzhCh9BrNXM5qSg
 o7mphH+Pt6bJXgzxb2RkYJixU11yG3yuHPOCrRGGFpVHiNYhdHuJeDZOqChWZB8x
 6kY0uvU0X0tqVfWKxMwTwuqG8mJ5BkJNvnEvYi05QEZG0dDcUhgOqYlNNaL8vGkl
 qVixOwE4aH4kscdmW2gXY1c76VSebheyN8n6Wj1zrmTw4hTJH7ZWXPtmbRqQzpB6
 U6w3NjVyopbIjuF+syWeGqitTT/8fpvgZU4E9MpKGmHX4ADgecp6YSZQzzxTJn7D
 cbVX7YQxhmsM50C1PW7A8yLCspD/uRNiKLvzb/g9gFSInk4rV+U=
 =g+ia
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs

Pull initial bcachefs updates from Kent Overstreet:
 "Here's the bcachefs filesystem pull request.

  One new patch since last week: the exportfs constants ended up
  conflicting with other filesystems that are also getting added to the
  global enum, so switched to new constants picked by Amir.

  The only new non fs/bcachefs/ patch is the objtool patch that adds
  bcachefs functions to the list of noreturns. The patch that exports
  osq_lock() has been dropped for now, per Ingo"

* tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits)
  exportfs: Change bcachefs fid_type enum to avoid conflicts
  bcachefs: Refactor memcpy into direct assignment
  bcachefs: Fix drop_alloc_keys()
  bcachefs: snapshot_create_lock
  bcachefs: Fix snapshot skiplists during snapshot deletion
  bcachefs: bch2_sb_field_get() refactoring
  bcachefs: KEY_TYPE_error now counts towards i_sectors
  bcachefs: Fix handling of unknown bkey types
  bcachefs: Switch to unsafe_memcpy() in a few places
  bcachefs: Use struct_size()
  bcachefs: Correctly initialize new buckets on device resize
  bcachefs: Fix another smatch complaint
  bcachefs: Use strsep() in split_devs()
  bcachefs: Add iops fields to bch_member
  bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1
  bcachefs: New superblock section members_v2
  bcachefs: Add new helper to retrieve bch_member from sb
  bcachefs: bucket_lock() is now a sleepable lock
  bcachefs: fix crc32c checksum merge byte order problem
  bcachefs: Fix bch2_inode_delete_keys()
  ...
2023-10-30 11:09:38 -10:00
Linus Torvalds
d5acbc60fa for-6.7-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmU/xAEACgkQxWXV+ddt
 WDvYKg//SjTimA5Nins9mb4jdz8n+dDeZnQhKzy3FqInU41EzDRc4WwnEODmDlTa
 AyU9rGB3k0JNSUc075jZFCyLqq/ARiOqRi4x33Gk0ckIlc4X5OgBoqP2XkPh0VlP
 txskLCrmhc3pwyR4ErlFDX2jebIUXfkv39bJuE40grGvUatRe+WNq0ERIrgO8RAr
 Rc3hBotMH8AIqfD1L6j1ZiZIAyrOkT1BJMuqeoq27/gJZn/MRhM9TCrMTzfWGaoW
 SxPrQiCDEN3KECsOY/caroMn3AekDijg/ley1Nf7Z0N6oEV+n4VWWPBFE9HhRz83
 9fIdvSbGjSJF6ekzTjcVXPAbcuKZFzeqOdBRMIW3TIUo7mZQyJTVkMsc1y/NL2Z3
 9DhlRLIzvWJJjt1CEK0u18n5IU+dGngdktbhWWIuIlo8r+G/iKR/7zqU92VfWLHL
 Z7/eh6HgH5zr2bm+yKORbrUjkv4IVhGVarW8D4aM+MCG0lFN2GaPcJCCUrp4n7rZ
 PzpQbxXa38ANBk6hsp4ndS8TJSBL9moY8tumzLcKg97nzNMV6KpBdV/G6/QfRLCN
 3kM6UbwTAkMwGcQS86Mqx6s04ORLnQeD6f7N6X4Ppx0Mi/zkjI2HkRuvQGp12B0v
 iZjCCZAYY2Iu+/TU0GrCXSss/grzIAUPzM9msyV3XGO/VBpwdec=
 =9TVx
 -----END PGP SIGNATURE-----

Merge tag 'for-6.7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "New features:

   - raid-stripe-tree

     New tree for logical file extent mapping where the physical mapping
     may not match on multiple devices. This is now used in zoned mode
     to implement RAID0/RAID1* profiles, but can be used in non-zoned
     mode as well. The support for RAID56 is in development and will
     eventually fix the problems with the current implementation. This
     is a backward incompatible feature and has to be enabled at mkfs
     time.

   - simple quota accounting (squota)

     A simplified mode of qgroup that accounts all space on the initial
     extent owners (a subvolume), the snapshots are then cheap to create
     and delete. The deletion of snapshots in fully accounting qgroups
     is a known CPU/IO performance bottleneck.

     The squota is not suitable for the general use case but works well
     for containers where the original subvolume exists for the whole
     time. This is a backward incompatible feature as it needs extending
     some structures, but can be enabled on an existing filesystem.

   - temporary filesystem fsid (temp_fsid)

     The fsid identifies a filesystem and is hard coded in the
     structures, which disallows mounting the same fsid found on
     different devices.

     For a single device filesystem this is not strictly necessary, a
     new temporary fsid can be generated on mount e.g. after a device is
     cloned. This will be used by Steam Deck for root partition A/B
     testing, or can be used for VM root images.

  Other user visible changes:

   - filesystems with partially finished metadata_uuid conversion cannot
     be mounted anymore and the uuid fixup has to be done by btrfs-progs
     (btrfstune).

  Performance improvements:

   - reduce reservations for checksum deletions (with enabled free space
     tree by factor of 4), on a sample workload on file with many
     extents the deletion time decreased by 12%

   - make extent state merges more efficient during insertions, reduce
     rb-tree iterations (run time of critical functions reduced by 5%)

  Core changes:

   - the integrity check functionality has been removed, this was a
     debugging feature and removal does not affect other integrity
     checks like checksums or tree-checker

   - space reservation changes:

      - more efficient delayed ref reservations, this avoids building up
        too much work or overusing or exhausting the global block
        reserve in some situations

      - move delayed refs reservation to the transaction start time,
        this prevents some ENOSPC corner cases related to exhaustion of
        global reserve

      - improvements in reducing excessive reservations for block group
        items

      - adjust overcommit logic in near full situations, account for one
        more chunk to eventually allocate metadata chunk, this is mostly
        relevant for small filesystems (<10GiB)

   - single device filesystems are scanned but not registered (except
     seed devices), this allows temp_fsid to work

   - qgroup iterations do not need GFP_ATOMIC allocations anymore

   - cleanups, refactoring, reduced data structure size, function
     parameter simplifications, error handling fixes"

* tag 'for-6.7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (156 commits)
  btrfs: open code timespec64 in struct btrfs_inode
  btrfs: remove redundant log root tree index assignment during log sync
  btrfs: remove redundant initialization of variable dirty in btrfs_update_time()
  btrfs: sysfs: show temp_fsid feature
  btrfs: disable the device add feature for temp-fsid
  btrfs: disable the seed feature for temp-fsid
  btrfs: update comment for temp-fsid, fsid, and metadata_uuid
  btrfs: remove pointless empty log context list check when syncing log
  btrfs: update comment for struct btrfs_inode::lock
  btrfs: remove pointless barrier from btrfs_sync_file()
  btrfs: add and use helpers for reading and writing last_trans_committed
  btrfs: add and use helpers for reading and writing fs_info->generation
  btrfs: add and use helpers for reading and writing log_transid
  btrfs: add and use helpers for reading and writing last_log_commit
  btrfs: support cloned-device mount capability
  btrfs: add helper function find_fsid_by_disk
  btrfs: stop reserving excessive space for block group item insertions
  btrfs: stop reserving excessive space for block group item updates
  btrfs: reorder btrfs_inode to fill gaps
  btrfs: open code btrfs_ordered_inode_tree in btrfs_inode
  ...
2023-10-30 10:42:06 -10:00
Linus Torvalds
8829687a4a fscrypt updates for 6.7
This update adds support for configuring the crypto data unit size (i.e.
 the granularity of file contents encryption) to be less than the
 filesystem block size. This can allow users to use inline encryption
 hardware in some cases when it wouldn't otherwise be possible.
 
 In addition, there are two commits that are prerequisites for the
 extent-based encryption support that the btrfs folks are working on.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZT8acBQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+czAQDkStgX1ICJQANnxwbrg/SUVdZjPuFH
 sJw3sUVpBR81TwEA/SyWh3YzVNZdpE7PWNrCknrC+qnO8hd9QBEjnQfwIQc=
 =t44a
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt updates from Eric Biggers:
 "This update adds support for configuring the crypto data unit size
  (i.e. the granularity of file contents encryption) to be less than the
  filesystem block size. This can allow users to use inline encryption
  hardware in some cases when it wouldn't otherwise be possible.

  In addition, there are two commits that are prerequisites for the
  extent-based encryption support that the btrfs folks are working on"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: track master key presence separately from secret
  fscrypt: rename fscrypt_info => fscrypt_inode_info
  fscrypt: support crypto data unit size less than filesystem block size
  fscrypt: replace get_ino_and_lblk_bits with just has_32bit_inodes
  fscrypt: compute max_lblk_bits from s_maxbytes and block size
  fscrypt: make the bounce page pool opt-in instead of opt-out
  fscrypt: make it clearer that key_prefix is deprecated
2023-10-30 10:23:42 -10:00
Linus Torvalds
8b16da681e NFSD 6.7 Release Notes
This release completes the SunRPC thread scheduler work that was
 begun in v6.6. The scheduler can now find an svc thread to wake in
 constant time and without a list walk. Thanks again to Neil Brown
 for this overhaul.
 
 Lorenzo Bianconi contributed infrastructure for a netlink-based
 NFSD control plane. The long-term plan is to provide the same
 functionality as found in /proc/fs/nfsd, plus some interesting
 additions, and then migrate the NFSD user space utilities to
 netlink.
 
 A long series to overhaul NFSD's NFSv4 operation encoding was
 applied in this release. The goals are to bring this family of
 encoding functions in line with the matching NFSv4 decoding
 functions and with the NFSv2 and NFSv3 XDR functions, preparing
 the way for better memory safety and maintainability.
 
 A further improvement to NFSD's write delegation support was
 contributed by Dai Ngo. This adds a CB_GETATTR callback,
 enabling the server to retrieve cached size and mtime data from
 clients holding write delegations. If the server can retrieve
 this information, it does not have to recall the delegation in
 some cases.
 
 The usual panoply of bug fixes and minor improvements round out
 this release. As always I am grateful to all contributors,
 reviewers, and testers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmU5IuoACgkQM2qzM29m
 f5eVsg//bVp8S93ci/oDlKfzOwH2fO5e5rna91wrDpJxkd51h6KTx55dSRG5sjAZ
 EywIVOann6xCtsixAPyff5Cweg2dWvzQRsy1ZnvWQ1qZBzD5KAJY5LPkeSFUCKBo
 Zani/qTOYbxzgFMjZx+yDSXDPKG68WYZBQK59SI7mURu4SYdk8aRyNY8mjHfr0Vh
 Aqrcny4oVtXV4sL5P5G/2FUW7WKT3olA3jSYlRRNMhbs2qpEemRCCrspOEMMad+b
 t1+ZCg+U27PMranvOJnof4RU7peZbaxDWA0gyiUbivVXVtZn9uOs0ffhktkvechL
 ePc33dqdp2ITdKIPA6JlaRv5WflKXQw0YYM9Kv5mcR4A2el7owL4f/pMlPhtbYwJ
 IOJv15KdKVN979G2e6WMYiKK+iHfaUUguhMEXnfnGoAajHOZNQiUEo3iFQAD7LDc
 DvMF8d9QqYmB9IW8FOYaRRfZGJOQHf3TL79Nd08z/bn5swvlvfj77leux9Sb+0/m
 Luk2Xvz2AJVSXE31wzabaGHkizN+BtH+e4MMbXUHBPW5jE9v7XOnEUFr4UdZyr9P
 Gl87A7NcrzNjJWT5TrnzM4sOslNsx46Aeg+VuNt2fSRn2dm6iBu2B8s0N4imx6dV
 PX1y9VSLq5WRhjrFZ1qeiZdsuTaQtrEiNDoRIQR6nCJPAV80iFk=
 =B4wJ
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This release completes the SunRPC thread scheduler work that was begun
  in v6.6. The scheduler can now find an svc thread to wake in constant
  time and without a list walk. Thanks again to Neil Brown for this
  overhaul.

  Lorenzo Bianconi contributed infrastructure for a netlink-based NFSD
  control plane. The long-term plan is to provide the same functionality
  as found in /proc/fs/nfsd, plus some interesting additions, and then
  migrate the NFSD user space utilities to netlink.

  A long series to overhaul NFSD's NFSv4 operation encoding was applied
  in this release. The goals are to bring this family of encoding
  functions in line with the matching NFSv4 decoding functions and with
  the NFSv2 and NFSv3 XDR functions, preparing the way for better memory
  safety and maintainability.

  A further improvement to NFSD's write delegation support was
  contributed by Dai Ngo. This adds a CB_GETATTR callback, enabling the
  server to retrieve cached size and mtime data from clients holding
  write delegations. If the server can retrieve this information, it
  does not have to recall the delegation in some cases.

  The usual panoply of bug fixes and minor improvements round out this
  release. As always I am grateful to all contributors, reviewers, and
  testers"

* tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (127 commits)
  svcrdma: Fix tracepoint printk format
  svcrdma: Drop connection after an RDMA Read error
  NFSD: clean up alloc_init_deleg()
  NFSD: Fix frame size warning in svc_export_parse()
  NFSD: Rewrite synopsis of nfsd_percpu_counters_init()
  nfsd: Clean up errors in nfs3proc.c
  nfsd: Clean up errors in nfs4state.c
  NFSD: Clean up errors in stats.c
  NFSD: simplify error paths in nfsd_svc()
  NFSD: Clean up nfsd4_encode_seek()
  NFSD: Clean up nfsd4_encode_offset_status()
  NFSD: Clean up nfsd4_encode_copy_notify()
  NFSD: Clean up nfsd4_encode_copy()
  NFSD: Clean up nfsd4_encode_test_stateid()
  NFSD: Clean up nfsd4_encode_exchange_id()
  NFSD: Clean up nfsd4_do_encode_secinfo()
  NFSD: Clean up nfsd4_encode_access()
  NFSD: Clean up nfsd4_encode_readdir()
  NFSD: Clean up nfsd4_encode_entry4()
  NFSD: Add an nfsd4_encode_nfs_cookie4() helper
  ...
2023-10-30 10:12:29 -10:00
Linus Torvalds
14ab6d425e vfs-6.7.ctime
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppYgAKCRCRxhvAZXjc
 okIHAP9anLz1QDyMLH12ASuHjgBc0Of3jcB6NB97IWGpL4O21gEA46ohaD+vcJuC
 YkBLU3lXqQ87nfu28ExFAzh10hG2jwM=
 =m4pB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs inode time accessor updates from Christian Brauner:
 "This finishes the conversion of all inode time fields to accessor
  functions as discussed on list. Changing timestamps manually as we
  used to do before is error prone. Using accessors function makes this
  robust.

  It does not contain the switch of the time fields to discrete 64 bit
  integers to replace struct timespec and free up space in struct inode.
  But after this, the switch can be trivially made and the patch should
  only affect the vfs if we decide to do it"

* tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (86 commits)
  fs: rename inode i_atime and i_mtime fields
  security: convert to new timestamp accessors
  selinux: convert to new timestamp accessors
  apparmor: convert to new timestamp accessors
  sunrpc: convert to new timestamp accessors
  mm: convert to new timestamp accessors
  bpf: convert to new timestamp accessors
  ipc: convert to new timestamp accessors
  linux: convert to new timestamp accessors
  zonefs: convert to new timestamp accessors
  xfs: convert to new timestamp accessors
  vboxsf: convert to new timestamp accessors
  ufs: convert to new timestamp accessors
  udf: convert to new timestamp accessors
  ubifs: convert to new timestamp accessors
  tracefs: convert to new timestamp accessors
  sysv: convert to new timestamp accessors
  squashfs: convert to new timestamp accessors
  server: convert to new timestamp accessors
  client: convert to new timestamp accessors
  ...
2023-10-30 09:47:13 -10:00
Linus Torvalds
7352a6765c vfs-6.7.xattr
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppWAAKCRCRxhvAZXjc
 okB2AP4jjoRErJBwj245OIDJqzoj4m4UVOVd0MH2AkiSpANczwD/TToChdpusY2y
 qAYg1fQoGMbDVlb7Txaj9qI9ieCf9w0=
 =2PXg
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs xattr updates from Christian Brauner:
 "The 's_xattr' field of 'struct super_block' currently requires a
  mutable table of 'struct xattr_handler' entries (although each handler
  itself is const). However, no code in vfs actually modifies the
  tables.

  This changes the type of 's_xattr' to allow const tables, and modifies
  existing file systems to move their tables to .rodata. This is
  desirable because these tables contain entries with function pointers
  in them; moving them to .rodata makes it considerably less likely to
  be modified accidentally or maliciously at runtime"

* tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
  const_structs.checkpatch: add xattr_handler
  net: move sockfs_xattr_handlers to .rodata
  shmem: move shmem_xattr_handlers to .rodata
  overlayfs: move xattr tables to .rodata
  xfs: move xfs_xattr_handlers to .rodata
  ubifs: move ubifs_xattr_handlers to .rodata
  squashfs: move squashfs_xattr_handlers to .rodata
  smb: move cifs_xattr_handlers to .rodata
  reiserfs: move reiserfs_xattr_handlers to .rodata
  orangefs: move orangefs_xattr_handlers to .rodata
  ocfs2: move ocfs2_xattr_handlers and ocfs2_xattr_handler_map to .rodata
  ntfs3: move ntfs_xattr_handlers to .rodata
  nfs: move nfs4_xattr_handlers to .rodata
  kernfs: move kernfs_xattr_handlers to .rodata
  jfs: move jfs_xattr_handlers to .rodata
  jffs2: move jffs2_xattr_handlers to .rodata
  hfsplus: move hfsplus_xattr_handlers to .rodata
  hfs: move hfs_xattr_handlers to .rodata
  gfs2: move gfs2_xattr_handlers_max to .rodata
  fuse: move fuse_xattr_handlers to .rodata
  ...
2023-10-30 09:29:44 -10:00
Linus Torvalds
df9c65b5fc vfs-6.7.iov_iter
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppQwAKCRCRxhvAZXjc
 om2kAP4u+eLsrhJHfUPUttGUEkSkZE+5/s/f1A/1GcV51usLSgEAu8urxAnP49GW
 INaDABXaFfKx8/KI/H2YFZPKGwlNEwY=
 =PDDE
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull iov_iter updates from Christian Brauner:
 "This contain's David's iov_iter cleanup work to convert the iov_iter
  iteration macros to inline functions:

   - Remove last_offset from iov_iter as it was only used by ITER_PIPE

   - Add a __user tag on copy_mc_to_user()'s dst argument on x86 to
     match that on powerpc and get rid of a sparse warning

   - Convert iter->user_backed to user_backed_iter() in the sound PCM
     driver

   - Convert iter->user_backed to user_backed_iter() in a couple of
     infiniband drivers

   - Renumber the type enum so that the ITER_* constants match the order
     in iterate_and_advance*()

   - Since the preceding patch puts UBUF and IOVEC at 0 and 1, change
     user_backed_iter() to just use the type value and get rid of the
     extra flag

   - Convert the iov_iter iteration macros to always-inline functions to
     make the code easier to follow. It uses function pointers, but they
     get optimised away

   - Move the check for ->copy_mc to _copy_from_iter() and
     copy_page_from_iter_atomic() rather than in memcpy_from_iter_mc()
     where it gets repeated for every segment. Instead, we check once
     and invoke a side function that can use iterate_bvec() rather than
     iterate_and_advance() and supply a different step function

   - Move the copy-and-csum code to net/ where it can be in proximity
     with the code that uses it

   - Fold memcpy_and_csum() in to its two users

   - Move csum_and_copy_from_iter_full() out of line and merge in
     csum_and_copy_from_iter() since the former is the only caller of
     the latter

   - Move hash_and_copy_to_iter() to net/ where it can be with its only
     caller"

* tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter, net: Move hash_and_copy_to_iter() to net/
  iov_iter, net: Merge csum_and_copy_from_iter{,_full}() together
  iov_iter, net: Fold in csum_and_memcpy()
  iov_iter, net: Move csum_and_copy_to/from_iter() to net/
  iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
  iov_iter: Convert iterate*() to inline funcs
  iov_iter: Derive user-backedness from the iterator type
  iov_iter: Renumber ITER_* constants
  infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC
  sound: Fix snd_pcm_readv()/writev() to use iov access functions
  iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user()
  iov_iter: Remove last_offset from iov_iter as it was for ITER_PIPE
2023-10-30 09:24:21 -10:00
Linus Torvalds
3b3f874cc1 vfs-6.7.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTpoQAAKCRCRxhvAZXjc
 ovFNAQDgIRjXfZ1Ku+USxsRRdqp8geJVaNc3PuMmYhOYhUenqgEAmC1m+p0y31dS
 P6+HlL16Mqgu0tpLCcJK9BibpDZ0Ew4=
 =7yD1
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual miscellaneous features, cleanups, and fixes
  for vfs and individual fses.

  Features:

   - Rename and export helpers that get write access to a mount. They
     are used in overlayfs to get write access to the upper mount.

   - Print the pretty name of the root device on boot failure. This
     helps in scenarios where we would usually only print
     "unknown-block(1,2)".

   - Add an internal SB_I_NOUMASK flag. This is another part in the
     endless POSIX ACL saga in a way.

     When POSIX ACLs are enabled via SB_POSIXACL the vfs cannot strip
     the umask because if the relevant inode has POSIX ACLs set it might
     take the umask from there. But if the inode doesn't have any POSIX
     ACLs set then we apply the umask in the filesytem itself. So we end
     up with:

      (1) no SB_POSIXACL -> strip umask in vfs
      (2) SB_POSIXACL    -> strip umask in filesystem

     The umask semantics associated with SB_POSIXACL allowed filesystems
     that don't even support POSIX ACLs at all to raise SB_POSIXACL
     purely to avoid umask stripping. That specifically means NFS v4 and
     Overlayfs. NFS v4 does it because it delegates this to the server
     and Overlayfs because it needs to delegate umask stripping to the
     upper filesystem, i.e., the filesystem used as the writable layer.

     This went so far that SB_POSIXACL is raised eve on kernels that
     don't even have POSIX ACL support at all.

     Stop this blatant abuse and add SB_I_NOUMASK which is an internal
     superblock flag that filesystems can raise to opt out of umask
     handling. That should really only be the two mentioned above. It's
     not that we want any filesystems to do this. Ideally we have all
     umask handling always in the vfs.

   - Make overlayfs use SB_I_NOUMASK too.

   - Now that we have SB_I_NOUMASK, stop checking for SB_POSIXACL in
     IS_POSIXACL() if the kernel doesn't have support for it. This is a
     very old patch but it's only possible to do this now with the wider
     cleanup that was done.

   - Follow-up work on fake path handling from last cycle. Citing mostly
     from Amir:

     When overlayfs was first merged, overlayfs files of regular files
     and directories, the ones that are installed in file table, had a
     "fake" path, namely, f_path is the overlayfs path and f_inode is
     the "real" inode on the underlying filesystem.

     In v6.5, we took another small step by introducing of the
     backing_file container and the file_real_path() helper. This change
     allowed vfs and filesystem code to get the "real" path of an
     overlayfs backing file. With this change, we were able to make
     fsnotify work correctly and report events on the "real" filesystem
     objects that were accessed via overlayfs.

     This method works fine, but it still leaves the vfs vulnerable to
     new code that is not aware of files with fake path. A recent
     example is commit db1d1e8b98 ("IMA: use vfs_getattr_nosec to get
     the i_version"). This commit uses direct referencing to f_path in
     IMA code that otherwise uses file_inode() and file_dentry() to
     reference the filesystem objects that it is measuring.

     This contains work to switch things around: instead of having
     filesystem code opt-in to get the "real" path, have generic code
     opt-in for the "fake" path in the few places that it is needed.

     Is it far more likely that new filesystems code that does not use
     the file_dentry() and file_real_path() helpers will end up causing
     crashes or averting LSM/audit rules if we keep the "fake" path
     exposed by default.

     This change already makes file_dentry() moot, but for now we did
     not change this helper just added a WARN_ON() in ovl_d_real() to
     catch if we have made any wrong assumptions.

     After the dust settles on this change, we can make file_dentry() a
     plain accessor and we can drop the inode argument to ->d_real().

   - Switch struct file to SLAB_TYPESAFE_BY_RCU. This looks like a small
     change but it really isn't and I would like to see everyone on
     their tippie toes for any possible bugs from this work.

     Essentially we've been doing most of what SLAB_TYPESAFE_BY_RCU for
     files since a very long time because of the nasty interactions
     between the SCM_RIGHTS file descriptor garbage collection. So
     extending it makes a lot of sense but it is a subtle change. There
     are almost no places that fiddle with file rcu semantics directly
     and the ones that did mess around with struct file internal under
     rcu have been made to stop doing that because it really was always
     dodgy.

     I forgot to put in the link tag for this change and the discussion
     in the commit so adding it into the merge message:

       https://lore.kernel.org/r/20230926162228.68666-1-mjguzik@gmail.com

  Cleanups:

   - Various smaller pipe cleanups including the removal of a spin lock
     that was only used to protect against writes without pipe_lock()
     from O_NOTIFICATION_PIPE aka watch queues. As that was never
     implemented remove the additional locking from pipe_write().

   - Annotate struct watch_filter with the new __counted_by attribute.

   - Clarify do_unlinkat() cleanup so that it doesn't look like an extra
     iput() is done that would cause issues.

   - Simplify file cleanup when the file has never been opened.

   - Use module helper instead of open-coding it.

   - Predict error unlikely for stale retry.

   - Use WRITE_ONCE() for mount expiry field instead of just commenting
     that one hopes the compiler doesn't get smart.

  Fixes:

   - Fix readahead on block devices.

   - Fix writeback when layztime is enabled and inodes whose timestamp
     is the only thing that changed reside on wb->b_dirty_time. This
     caused excessively large zombie memory cgroup when lazytime was
     enabled as such inodes weren't handled fast enough.

   - Convert BUG_ON() to WARN_ON_ONCE() in open_last_lookups()"

* tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (26 commits)
  file, i915: fix file reference for mmap_singleton()
  vfs: Convert BUG_ON to WARN_ON_ONCE in open_last_lookups
  writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs
  chardev: Simplify usage of try_module_get()
  ovl: rely on SB_I_NOUMASK
  fs: fix umask on NFS with CONFIG_FS_POSIX_ACL=n
  fs: store real path instead of fake path in backing file f_path
  fs: create helper file_user_path() for user displayed mapped file path
  fs: get mnt_writers count for an open backing file's real path
  vfs: stop counting on gcc not messing with mnt_expiry_mark if not asked
  vfs: predict the error in retry_estale as unlikely
  backing file: free directly
  vfs: fix readahead(2) on block devices
  io_uring: use files_lookup_fd_locked()
  file: convert to SLAB_TYPESAFE_BY_RCU
  vfs: shave work on failed file open
  fs: simplify misleading code to remove ambiguity regarding ihold()/iput()
  watch_queue: Annotate struct watch_filter with __counted_by
  fs/pipe: use spinlock in pipe_read() only if there is a watch_queue
  fs/pipe: remove unnecessary spinlock from pipe_write()
  ...
2023-10-30 09:14:19 -10:00
Linus Torvalds
0d63d8b229 vfs-6.7.autofs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppJgAKCRCRxhvAZXjc
 omYoAQD+g3BxYbxEdEuhrnbaZMljp2GEYn+L6I2txdvmp/TpSQEAsQipcEgMC1WI
 uc9IDiakYWWCSaN8F7BGR7zKsK5feAc=
 =V+jp
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.autofs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull autofs mount api updates from Christian Brauner:
 "This ports autofs to the new mount api. The patchset has existed for
  quite a while but never made it upstream. Ian picked it back up.

  This also fixes a bug where fs_param_is_fd() was passed a garbage
  param->dirfd but it expected it to be set to the fd that was used to
  set param->file otherwise result->uint_32 contains nonsense. So make
  sure it's set.

  One less filesystem using the old mount api. We're getting there,
  albeit rather slow. The last remaining major filesystem that hasn't
  converted is btrfs. Patches exist - I even wrote them - but so far
  they haven't made it upstream"

* tag 'vfs-6.7.autofs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  autofs: fix add autofs_parse_fd()
  fsconfig: ensure that dirfd is set to aux
  autofs: fix protocol sub version setting
  autofs: convert autofs to use the new mount api
  autofs: validate protocol version
  autofs: refactor parse_options()
  autofs: reformat 0pt enum declaration
  autofs: refactor super block info init
  autofs: add autofs_parse_fd()
  autofs: refactor autofs_prepare_pipe()
2023-10-30 09:10:21 -10:00
Linus Torvalds
d4e175f2c4 vfs-6.7.super
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZT0C2gAKCRCRxhvAZXjc
 otV8AQCK5F9ONoQ7ISpdrKyUJiswySGXx0CYPfXbSg5gHH87zgEAua3vwVKeGXXF
 5iVsdiNzIIQDwGDx7FyxufL4ggcN6gQ=
 =E1kV
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs superblock updates from Christian Brauner:
 "This contains the work to make block device opening functions return a
  struct bdev_handle instead of just a struct block_device. The same
  struct bdev_handle is then also passed to block device closing
  functions.

  This allows us to propagate context from opening to closing a block
  device without having to modify all users everytime.

  Sidenote, in the future we might even want to try and have block
  device opening functions return a struct file directly but that's a
  series on top of this.

  These are further preparatory changes to be able to count writable
  opens and blocking writes to mounted block devices. That's a separate
  piece of work for next cycle and for that we absolutely need the
  changes to btrfs that have been quietly dropped somehow.

  Originally the series contained a patch that removed the old
  blkdev_*() helpers. But since this would've caused needles churn in
  -next for bcachefs we ended up delaying it.

  The second piece of work addresses one of the major annoyances about
  the work last cycle, namely that we required dropping s_umount
  whenever we used the superblock and fs_holder_ops for a block device.

  The reason for that requirement had been that in some codepaths
  s_umount could've been taken under disk->open_mutex (that's always
  been the case, at least theoretically). For example, on surprise block
  device removal or media change. And opening and closing block devices
  required grabbing disk->open_mutex as well.

  So we did the work and went through the block layer and fixed all
  those places so that s_umount is never taken under disk->open_mutex.
  This means no more brittle games where we yield and reacquire s_umount
  during block device opening and closing and no more requirements where
  block devices need to be closed. Filesystems don't need to care about
  this.

  There's a bunch of other follow-up work such as moving block device
  freezing and thawing to holder operations which makes it work for all
  block devices and not just the main block device just as we did for
  surprise removal. But that is for next cycle.

  Tested with fstests for all major fses, blktests, LTP"

* tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (37 commits)
  porting: update locking requirements
  fs: assert that open_mutex isn't held over holder ops
  block: assert that we're not holding open_mutex over blk_report_disk_dead
  block: move bdev_mark_dead out of disk_check_media_change
  block: WARN_ON_ONCE() when we remove active partitions
  block: simplify bdev_del_partition()
  fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock
  jfs: fix log->bdev_handle null ptr deref in lbmStartIO
  bcache: Fixup error handling in register_cache()
  xfs: Convert to bdev_open_by_path()
  reiserfs: Convert to bdev_open_by_dev/path()
  ocfs2: Convert to use bdev_open_by_dev()
  nfs/blocklayout: Convert to use bdev_open_by_dev/path()
  jfs: Convert to bdev_open_by_dev()
  f2fs: Convert to bdev_open_by_dev/path()
  ext4: Convert to bdev_open_by_dev()
  erofs: Convert to use bdev_open_by_path()
  btrfs: Convert to bdev_open_by_path()
  fs: Convert to bdev_open_by_dev()
  mm/swap: Convert to use bdev_open_by_dev()
  ...
2023-10-30 08:59:05 -10:00
Linus Torvalds
ffc253263a Linux 6.6 2023-10-29 16:31:08 -10:00
Linus Torvalds
2af9b20dbb Misc fixes:
- Fix a possible CPU hotplug deadlock bug caused by the new
    TSC synchronization code.
 
  - Fix a legacy PIC discovery bug that results in device troubles on
    affected systems, such as non-working keybards, etc.
 
  - Add a new Intel CPU model number to <asm/intel-family.h>.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU85uARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jSrhAAr3FrzkDWqLUUuDMWFEEVkTGz9GLc2uMP
 bGT7HApp301wGpFv35bbBVp7HKLSB9nFeLsWKQF6MBa2uSUiCxSBcSNfHVJ6jX1O
 Ac4hgl0Q/tZl0yZoag3u4j3FS57d9Qkzpg0QLTz9MVnQXEPTcDkli9tfSbRpxx3B
 WvcGwAD+VeMjeghIJHxTJW9KwouIf3gAzqoKE7TKEozJ4S2LbNs3nEsDGsRHJsPN
 jy/Fxlny6pdlkJArP/ILA9WO3yqBn2doIN9oqWKQCQtYbt1USnuJb5oCOXYAOijG
 A9iLJmMjDyN2TZ8lKYHBcHVmEVTeB+6VnikejcAaT4EMrArj5mdjL3K7McZ24NFh
 s8mx/S+v4mBExwNsaoq8kuUEZySWxNbFPLBjOq8BKIZYYDQUp1NK58CRWv9BAB6y
 GjgYlMYI6EGDb+QAoTKZ5KNqrwtUPuk2ijjJTB15DpgLkl7XmyoVOulrtVDrhuGZ
 Uuge9h0CLKnqD7lLfJdI7NLYxmZxPiQKbL9vrIqlk989UM5ItvttxWBNzhAdSTsG
 VIaXAOJgGDj74dJC+3b0umugKJcrycUFBy5/RxOg+OcnSPd5g/L4XWcLLk+OElE3
 pFBetWmO2HU4pxhL1GdWsDuF+RkYug4XKQqQP2LJ8Zepff/jkF+hVJoLYUhQIF6A
 OMQv1N6/nSg=
 =k19e
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 fixes from Ingo Molnar:

 - Fix a possible CPU hotplug deadlock bug caused by the new TSC
   synchronization code

 - Fix a legacy PIC discovery bug that results in device troubles on
   affected systems, such as non-working keybards, etc

 - Add a new Intel CPU model number to <asm/intel-family.h>

* tag 'x86-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Defer marking TSC unstable to a worker
  x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
  x86/cpu: Add model number for Intel Arrow Lake mobile processor
2023-10-28 08:15:07 -10:00
Linus Torvalds
e663ab6bb5 Restore unintentionally lost quirk settings in the GIC irqchip driver,
which broke certain devices.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU9CZsRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1i1VQ//Sb0gT+Okqoqco7GZfpD/RcJO5yfr+xRd
 1XfI1S+3EbiPnJ+h7kuVD2lopZx3MjxtZWTgio9Rl1lJGTu345WoV1P0r+dsf8Hy
 Yj7pAby8hPmzG89Bx37Qr2+s34pkykUps19bu7u/BO/gO/bHmEaGMljg74x0H8QE
 vky9V8BG1Uo9PuZeAiXJn1fnEtYKv9w2SGL61Y70yAdog8Lkvyxaar3xGKbkyXVC
 Io2k+aZlX+6Xk4eljDaHHtrxA13WDfwSFki8KOR5Ec8NA8iVc6+fNKbwSHtIqIf0
 7F4WnPqQSbBSVlhVamu4EICqZr5aQ0MN9Tkv/o41APIPO/ClsoRwPy6NnXbCq3Vy
 4qH/JcquYiaL6GiY3bta47wEwX7Z0l/gvuUU/Pjh3MlobNwW0I8aTIQ+y0XPz/Ku
 vcrXtHw6BToZFceyWYEKREBcONF4Uds8Bxsb6rMhxfVF47MnagMl4B4uJio7nS+B
 Lta34Yjw69fVmwZAL/zWMT//rgu7QmI5SxkUZ2RwkLk+rOpx77roqd8nlmcFWmi2
 LMWB0wjrmy8gwm4HNQusYVoXVOjtGIUlzPlPS9PHwC3oYBND8n8YzWfNu4HV1eWz
 BYlDpyliE2dG62zsggNgJPVUBLWSAo8O5bXWQYV9PDcVnOxUXuoQ5uN3qP+3iHVD
 apdDrbheD28=
 =ETNN
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Restore unintentionally lost quirk settings in the GIC irqchip driver,
  which broke certain devices"

* tag 'irq-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Don't override quirk settings with default values
2023-10-28 08:12:34 -10:00
Linus Torvalds
4714de0332 Fix a potential NULL dereference bug.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU839YRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1isRg//S7E94bSvBE1uaclhehlro/V8t8qXiO2y
 RIvxCR16tornBWHYg49vVlZDGMVC5kf0O/6/b3p2VOwpZ+m9qp/4v5ImhYIhl1SI
 M2UFJ6pjy+ykbUR98WjuePXTNy6nEntJ8uYt+PxnGrApNG0DTnKbL03deimX/e2Z
 tOEYBh8iaHNx0AhuoWkLOXAbIFlwUeYVXZM1X5/3AS8AKcNYWzUkkyKWE4u6AY68
 E7uokwo+Z+rdSWIk+8mqALnf2IeIWl0ecyaA7P/wCf6ei3Yyys/H3N6qjwq0Yq2g
 gT2urQCBkPrYvkz3YS7i+P7hSe7cf6nPoTz+pN0oCEKT7cEenQTd+EtDnmpmPjxJ
 X7zTnag/l268cWudFS54DaZeUGOPx/AIG+k0RbN0w1XcDCg8DVTIB/MB0rTMeWPp
 y3lZMeU8ott+pHHjVUtDU7ERDWFf+EWpuPP8o9lq6oQV3W31l0XL3uL16mRvZtLB
 gWlR7DovFW+y6I9ISs3k18pQOKU8B4foyAbvlS5n4wmKZMn7ygryw3Tcg77mUZTK
 /xYdGQ5ZR6PDnqrn8uy0KeIIbtFkcxEKWanZzjjs49p820GvXvtOmLeirUqi8oc1
 c2mYqJH5T7U3KQUeG1JgytZuRpa/ph8GWw4LsS+4QLOAEXqU1x7dJYtvJMWj1bcx
 vpI9MHwKkck=
 =FjJZ
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fix from Ingo Molnar:
 "Fix a potential NULL dereference bug"

* tag 'perf-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix potential NULL deref
2023-10-28 08:10:47 -10:00
Linus Torvalds
51a7691038 Probes fixes for v6.6-rc7:
- tracing/kprobes: Fix kernel-doc warnings for the variable length
   arguments.
 
 - tracing/kprobes: Fix to count the symbols in modules even if the
   module name is not specified so that user can probe the symbols in
   the modules without module name.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmU82MUbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bMZ0H+wZHWVUsmqGLGNCt3gfi
 m2EJX83VMwY8PzpwZ5ezrx4ibAcUyo7Dhh8OniGgEazC3BNeggoUu/HwpirS22gI
 Tx0EMlgLOJQykauiUe6FPem0IbrlbQMI1gLplx6cVd8lgIYZQfMIM5gI0kuCywT3
 Ka9sCgp6y3UKQNtHKFwtPRLYFTF3Afyy2C01wdsa800SEqeOAeTD9+8yz7ZnuFt+
 bNgu6vJGFfJHkEkvYCwFFqZ1eIfXON6lUFpijNpCGvMN2h1XArLexSk8JRBf6j2+
 8+1FrRQsTXRk3G6v9uQABeK7z5W2F8gufmSFyBlXajbZp2HT6j4s2S86u5lP9P9J
 l1U=
 =etyx
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - tracing/kprobes: Fix kernel-doc warnings for the variable length
   arguments

 - tracing/kprobes: Fix to count the symbols in modules even if the
   module name is not specified so that user can probe the symbols in
   the modules without module name

* tag 'probes-fixes-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/kprobes: Fix symbol counting logic by looking at modules as well
  tracing/kprobes: Fix the description of variable length arguments
2023-10-28 08:04:56 -10:00
Linus Torvalds
bd80d2ef59 dma-mapping fix for Linux 6.6
- reduce the initialy dynamic swiotlb size to remove an annoying but
    harmless warning from the page allocator (Petr Tesarik)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmU9Lt8LHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYM2DxAAt01E6LudY/THouSHpUri8HnDUeNLEY/3AppdMrLs
 gNs+KhBZqXJlRFbbIjiQCnzuKJg6fMQHjrPiMkXsAzq4vklQOfywSxtcWOsSCSTl
 pN2pY7nYuSBLlum0qkwu9E1rtfx1FN58Z0SlmsGUqCb0GbsmNEKhJcGz/peaQi0+
 /Y4mqr7O5OXFbzLZMtB86XCPpVDoyjGVWFUJB6Ruz82zbmMqXs1P7JGaiv/aFn0y
 Diy1iNUrhluCAVbskoNHcFTXU1uHn+7JrMJGS7Usj49OyqfKSK6nhMEh5Kpe6ATg
 szmMxBknHBOfEb+ciHzvSna44+hx845Z2ylMtRUn6IXmWNsdrYP4zTkV+W5ioMHn
 QaE4Il+VuxHBL0Hy6tLHDElP/7Xx9o+KS6PJ/56Q5Qi360ZmjQCGTzum//9r0JB0
 LPFmWN5TMBYGd0uBp9kddIUl2MPiXdHulK8CZgdC+4o/KOnjINACFjynyKKyEEW9
 HZqWu6aoKwCfcsr9Wz+CZVnpvxPoP5b+BpwOJntrdqNtVycUOIe4z3ILuw1Vfs9P
 zkZ61QDk4N3vwBiqEiGDBwYVX0pq4/e1YHbxBOuLKUlYbmnHHLxp+PXDxKiaVbJc
 FyZUtN8I4hzvpH6RnX/W/UUz02nBpxcnamHUnXOEBorV3tJatvxav2A4QQJbBuqo
 M4w=
 =8Jfz
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.6-2023-10-28' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

 - reduce the initialy dynamic swiotlb size to remove an annoying but
   harmless warning from the page allocator (Petr Tesarik)

* tag 'dma-mapping-6.6-2023-10-28' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: do not try to allocate a TLB bigger than MAX_ORDER pages
2023-10-28 08:01:31 -10:00
Linus Torvalds
db5cda7fd4 Char/Misc driver fixes for 6.6-final
Here are some very small driver fixes for 6.6-final that have shown up
 in the past 2 weeks.  Included in here are:
   - tiny fastrpc bugfixes for reported errors
   - nvmem register fixes
   - iio driver fixes for some reported problems
   - fpga test fix
   - MAINTAINERS file update for fpga
 
 All of these have been in linux-next this week with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZTy13A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykHzACdHeaGvszwWwcjHX0SmZBRNbHoj80AnRaM0RP0
 SpAvYj7k79XtxOH0KTe0
 =y5Zi
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some very small driver fixes for 6.6-final that have shown up
  in the past two weeks. Included in here are:

   - tiny fastrpc bugfixes for reported errors

   - nvmem register fixes

   - iio driver fixes for some reported problems

   - fpga test fix

   - MAINTAINERS file update for fpga

  All of these have been in linux-next this week with no reported
  problems"

* tag 'char-misc-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  fpga: Fix memory leak for fpga_region_test_class_find()
  fpga: m10bmc-sec: Change contact for secure update driver
  fpga: disable KUnit test suites when module support is enabled
  iio: afe: rescale: Accept only offset channels
  nvmem: imx: correct nregs for i.MX6ULL
  nvmem: imx: correct nregs for i.MX6UL
  nvmem: imx: correct nregs for i.MX6SLL
  misc: fastrpc: Unmap only if buffer is unmapped from DSP
  misc: fastrpc: Clean buffers on remote invocation failures
  misc: fastrpc: Free DMA handles for RPC calls with no arguments
  misc: fastrpc: Reset metadata buffer to avoid incorrect free
  iio: exynos-adc: request second interupt only when touchscreen mode is used
  iio: adc: xilinx-xadc: Correct temperature offset/scale for UltraScale
  iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature thresholds
  dt-bindings: iio: add missing reset-gpios constrain
2023-10-28 07:51:27 -10:00
Linus Torvalds
8907bfd2b1 Bugfixes for Axxia when it is a target and for PEC handling of stm32f7.
Plus, fixing an of node leak pattern in the mux subsystem.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmU8qEIACgkQFA3kzBSg
 Kba5Sg//UAZYYdy1UsnXm1CpaPDiuZ4vDbJNlkSl1gkU/qBQCUJgb+RgOt7yqJxN
 NwgpuO5bmN/9lcMhHsCSdd9iL4CiqpC0lUPMWqZ1bfJoygHhyk4Re/nQeT8S783F
 vRQNHxt6XJSC9MchwByujJ+cj1lmnXOSqOgr7wW4UQ7TMhSFAqszgYT+9eFUC+87
 v4ZBsj1UNOaEGpq5fIWdSFWL0VjK+BPx1GCfG16c7qKWXt1jLSfIoVCAz4v92ldt
 DQLv8z/lnl+Fkx5xAJ7BI661fCjb8mxCbeD/vgqJV0w5+h6UEtxODUOonyJniUBB
 3TnWYnYGV4NCJHxybfP5B273R1fNrE+THRx2qKXdZYn6gR/+u8cecpSARFDS3dD1
 oyhyItbdW4d7ewsh8MFyO4MBOTGk6TswQyQjzo8GNjJts4NeiDur/EbCOvLqj4+5
 15af+ltsuhWY9/SWcdpJC7Wcdm26PCUqsunwnSg0g7xfnWinjwvEeVEGT9bEKPXO
 lRKlV/8Iw29h+oRVaFjAENzUEkia2XPo4trdkTUTjflHMvZ4SEBP9lqQO7fBKxdb
 4zoYhhWcE+rhqhEa+W5u0YEsXPBvpxuvxg9XNOYSswYmqQEO4kPpCdGFJFCbOr98
 8Zzkt6NZGUcDegtXbTYnrrGhfEO7n7gC0wZfKEq9gOhc1iG8u6I=
 =bIWp
 -----END PGP SIGNATURE-----

Merge tag 'i2c-for-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Bugfixes for Axxia when it is a target and for PEC handling of
  stm32f7.

  Plus, fix an OF node leak pattern in the mux subsystem"

* tag 'i2c-for-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: stm32f7: Fix PEC handling in case of SMBUS transfers
  i2c: muxes: i2c-mux-gpmux: Use of_get_i2c_adapter_by_node()
  i2c: muxes: i2c-demux-pinctrl: Use of_get_i2c_adapter_by_node()
  i2c: muxes: i2c-mux-pinctrl: Use of_get_i2c_adapter_by_node()
  i2c: aspeed: Fix i2c bus hang in slave read
2023-10-28 07:48:37 -10:00
Christian Brauner
5aa9130acb
porting: update locking requirements
Now that s_umount is never taken under open_mutex update the
documentation to say so.

Link: https://lore.kernel.org/r/20231017184823.1383356-1-hch@lst.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:23 +02:00
Christian Brauner
3b224e1df6
fs: assert that open_mutex isn't held over holder ops
With recent block level changes we should never be in a situation where
we hold disk->open_mutex when calling into these helpers. So assert that
in the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-6-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:23 +02:00
Christian Brauner
f61033390b
block: assert that we're not holding open_mutex over blk_report_disk_dead
blk_report_disk_dead() has the following major callers:

(1) del_gendisk()
(2) blk_mark_disk_dead()

Since del_gendisk() acquires disk->open_mutex it's clear that all
callers are assumed to be called without disk->open_mutex held.
In turn, blk_report_disk_dead() is called without disk->open_mutex held
in del_gendisk().

All callers of blk_mark_disk_dead() call it without disk->open_mutex as
well.

Ensure that it is clear that blk_report_disk_dead() is called without
disk->open_mutex on purpose by asserting it and a comment in the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-5-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:23 +02:00
Christoph Hellwig
6e57236ed6
block: move bdev_mark_dead out of disk_check_media_change
disk_check_media_change is mostly called from ->open where it makes
little sense to mark the file system on the device as dead, as we
are just opening it.  So instead of calling bdev_mark_dead from
disk_check_media_change move it into the few callers that are not
in an open instance.  This avoid calling into bdev_mark_dead and
thus taking s_umount with open_mutex held.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-4-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:23 +02:00
Christian Brauner
51b4cb4f3e
block: WARN_ON_ONCE() when we remove active partitions
The logic for disk->open_partitions is:

blkdev_get_by_*()
-> bdev_is_partition()
   -> blkdev_get_part()
      -> blkdev_get_whole() // bdev_whole->bd_openers++
      -> if (part->bd_openers == 0)
                 disk->open_partitions++
         part->bd_openers

In other words, when we first claim/open a partition we increment
disk->open_partitions and only when all part->bd_openers are closed will
disk->open_partitions be zero. That should mean that
disk->open_partitions is always > 0 as long as there's anyone that
has an open partition.

So the check for disk->open_partitions should mean that we can never
remove an active partition that has a holder and holder ops set. Assert
that in the code. The main disk isn't removed so that check doesn't work
for disk->part0 which is what we want. After all we only care about
partition not about the main disk.

Link: https://lore.kernel.org/r/20231017184823.1383356-3-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:22 +02:00
Christian Brauner
c30b9787a4
block: simplify bdev_del_partition()
BLKPG_DEL_PARTITION refuses to delete partitions that still have
openers, i.e., that has an elevated @bdev->bd_openers count. If a device
is claimed by setting @bdev->bd_holder and @bdev->bd_holder_ops
@bdev->bd_openers and @bdev->bd_holders are incremented.
@bdev->bd_openers is effectively guaranteed to be >= @bdev->bd_holders.
So as long as @bdev->bd_openers isn't zero we know that this partition
is still in active use and that there might still be @bdev->bd_holder
and @bdev->bd_holder_ops set.

The only current example is @fs_holder_ops for filesystems. But that
means bdev_mark_dead() which calls into
bdev->bd_holder_ops->mark_dead::fs_bdev_mark_dead() is a nop. As long as
there's an elevated @bdev->bd_openers count we can't delete the
partition and if there isn't an elevated @bdev->bd_openers count then
there's no @bdev->bd_holder or @bdev->bd_holder_ops.

So simply open-code what we need to do. This gets rid of one more
instance where we acquire s_umount under @disk->open_mutex.

Link: https://lore.kernel.org/r/20231016-fototermin-umriss-59f1ea6c1fe6@brauner
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231017184823.1383356-2-hch@lst.de
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:22 +02:00
Jan Kara
fd1464105c
fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock
The implementation of bdev holder operations such as fs_bdev_mark_dead()
and fs_bdev_sync() grab sb->s_umount semaphore under
bdev->bd_holder_lock. This is problematic because it leads to
disk->open_mutex -> sb->s_umount lock ordering which is counterintuitive
(usually we grab higher level (e.g. filesystem) locks first and lower
level (e.g. block layer) locks later) and indeed makes lockdep complain
about possible locking cycles whenever we open a block device while
holding sb->s_umount semaphore. Implement a function
bdev_super_lock_shared() which safely transitions from holding
bdev->bd_holder_lock to holding sb->s_umount on alive superblock without
introducing the problematic lock dependency. We use this function
fs_bdev_sync() and fs_bdev_mark_dead().

Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231018152924.3858-1-jack@suse.cz
Link: https://lore.kernel.org/r/20231017184823.1383356-1-hch@lst.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:22 +02:00
Lizhi Xu
6306ff39a7
jfs: fix log->bdev_handle null ptr deref in lbmStartIO
When sbi->flag is JFS_NOINTEGRITY in lmLogOpen(), log->bdev_handle can't
be inited, so it value will be NULL.
Therefore, add the "log ->no_integrity=1" judgment in lbmStartIO() to avoid such
problems.

Reported-and-tested-by: syzbot+23bc20037854bb335d59@syzkaller.appspotmail.com
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Link: https://lore.kernel.org/r/20231009094557.1398920-1-lizhi.xu@windriver.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:22 +02:00
Jan Kara
b3856da790
bcache: Fixup error handling in register_cache()
Coverity has noticed that the printing of error message in
register_cache() uses already freed bdev_handle to get to bdev. In fact
the problem has been there even before commit "bcache: Convert to
bdev_open_by_path()" just a bit more subtle one - cache object itself
could have been freed by the time we looked at ca->bdev and we don't
hold any reference to bdev either so even that could in principle go
away (due to device unplug or similar). Fix all these problems by
printing the error message before closing the bdev.

Fixes: dc893f51d24a ("bcache: Convert to bdev_open_by_path()")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231004093757.11560-1-jack@suse.cz
Asked-by: Coly Li <colyli@suse.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:22 +02:00