Commit Graph

1202254 Commits

Author SHA1 Message Date
Linus Torvalds
3bb156a556 fsverity updates for 6.6
Several cleanups for fs/verity/, including two commits that make the
 builtin signature support more cleanly separated from the base feature.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZOwXSRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+kzAP9UsTtZjjQLvEDF6OYlysFLQppDrmk0
 zh9iH/Z7qT8BQwEA0azRHW5/FDLvkHZWsU0QCLjDpUPrNHc712aeM1pLpgc=
 =zOdL
 -----END PGP SIGNATURE-----

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux

Pull fsverity updates from Eric Biggers:
 "Several cleanups for fs/verity/, including two commits that make the
  builtin signature support more cleanly separated from the base
  feature"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  fsverity: skip PKCS#7 parser when keyring is empty
  fsverity: move sysctl registration out of signature.c
  fsverity: simplify handling of errors during initcall
  fsverity: explicitly check that there is no algorithm 0
2023-08-28 12:16:42 -07:00
Linus Torvalds
cc0a38d0f6 fscrypt update for 6.6
Just a small documentation improvement.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZOwUaRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyGLAQC3nfSZ/sNNyUwPV46yBWzIDTuKXuCD
 iyd6L1N99pwMiwD5AT0PAylP7xQew7sMBimLvFJLoATP0OPApGR241yH4wo=
 =CYAT
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt update from Eric Biggers:
 "Just a small documentation improvement"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: improve the "Encryption modes and usage" section
2023-08-28 12:15:00 -07:00
Linus Torvalds
6016fc9162 New code for 6.6:
* Make large writes to the page cache fill sparse parts of the cache
    with large folios, then use large memcpy calls for the large folio.
  * Track the per-block dirty state of each large folio so that a
    buffered write to a single byte on a large folio does not result in a
    (potentially) multi-megabyte writeback IO.
  * Allow some directio completions to be performed in the initiating
    task's context instead of punting through a workqueue.  This will
    reduce latency for some io_uring requests.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZM0Z1AAKCRBKO3ySh0YR
 pp7BAQCzkKejCM0185tNIH/faHjzidSisNQkJ5HoB4Opq9U66AEA6IPuAdlPlM/J
 FPW1oPq33Yn7AV4wXjUNFfDLzVb/Fgg=
 =dFBU
 -----END PGP SIGNATURE-----

Merge tag 'iomap-6.6-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap updates from Darrick Wong:
 "We've got some big changes for this release -- I'm very happy to be
  landing willy's work to enable large folios for the page cache for
  general read and write IOs when the fs can make contiguous space
  allocations, and Ritesh's work to track sub-folio dirty state to
  eliminate the write amplification problems inherent in using large
  folios.

  As a bonus, io_uring can now process write completions in the caller's
  context instead of bouncing through a workqueue, which should reduce
  io latency dramatically. IOWs, XFS should see a nice performance bump
  for both IO paths.

  Summary:

   - Make large writes to the page cache fill sparse parts of the cache
     with large folios, then use large memcpy calls for the large folio.

   - Track the per-block dirty state of each large folio so that a
     buffered write to a single byte on a large folio does not result in
     a (potentially) multi-megabyte writeback IO.

   - Allow some directio completions to be performed in the initiating
     task's context instead of punting through a workqueue. This will
     reduce latency for some io_uring requests"

* tag 'iomap-6.6-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (26 commits)
  iomap: support IOCB_DIO_CALLER_COMP
  io_uring/rw: add write support for IOCB_DIO_CALLER_COMP
  fs: add IOCB flags related to passing back dio completions
  iomap: add IOMAP_DIO_INLINE_COMP
  iomap: only set iocb->private for polled bio
  iomap: treat a write through cache the same as FUA
  iomap: use an unsigned type for IOMAP_DIO_* defines
  iomap: cleanup up iomap_dio_bio_end_io()
  iomap: Add per-block dirty state tracking to improve performance
  iomap: Allocate ifs in ->write_begin() early
  iomap: Refactor iomap_write_delalloc_punch() function out
  iomap: Use iomap_punch_t typedef
  iomap: Fix possible overflow condition in iomap_write_delalloc_scan
  iomap: Add some uptodate state handling helpers for ifs state bitmap
  iomap: Drop ifs argument from iomap_set_range_uptodate()
  iomap: Rename iomap_page to iomap_folio_state and others
  iomap: Copy larger chunks from userspace
  iomap: Create large folios in the buffered write path
  filemap: Allow __filemap_get_folio to allocate large folios
  filemap: Add fgf_t typedef
  ...
2023-08-28 11:59:52 -07:00
Linus Torvalds
dd2c0198a8 Changes since last update:
- Support xattr bloom filter to optimize negative xattr lookups;
 
  - Support DEFLATE compression algorithm as an alternative;
 
  - Fix a regression that ztailpacking pclusters don't release properly;
 
  - Avoid warning dedupe and fragments features anymore;
 
  - Some folio conversions and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCZOvhIBEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBFgqAP4/gcxH5vhgxMunxmgBkSxMFBQf/W7CfOiN
 QkGHjSKl8gEA78EBwAJ3vDJ1JgQRTb9/9UBrtW7n2hzj/eVS/LIyYQI=
 =o3Bx
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "In this cycle, a xattr bloom filter feature is introduced to speed up
  negative xattr lookups, which was originally suggested by Alexander
  for Composefs use cases.

  Additionally, the DEFLATE algorithm is now supported, which can be
  used together with hardware accelerators for our cloud workloads. Each
  supported compression algorithm can be selected on a per-file basis
  for specific access patterns too.

  There are also some random fixes and cleanups as usual:

   - Support xattr bloom filter to optimize negative xattr lookups

   - Support DEFLATE compression algorithm as an alternative

   - Fix a regression that ztailpacking pclusters don't release properly

   - Avoid warning dedupe and fragments features anymore

   - Some folio conversions and cleanups"

* tag 'erofs-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: release ztailpacking pclusters properly
  erofs: don't warn dedupe and fragments features anymore
  erofs: adapt folios for z_erofs_read_folio()
  erofs: adapt folios for z_erofs_readahead()
  erofs: get rid of fe->backmost for cache decompression
  erofs: drop z_erofs_page_mark_eio()
  erofs: tidy up z_erofs_do_read_page()
  erofs: move preparation logic into z_erofs_pcluster_begin()
  erofs: avoid obsolete {collector,collection} terms
  erofs: simplify z_erofs_read_fragment()
  erofs: remove redundant erofs_fs_type declaration in super.c
  erofs: add necessary kmem_cache_create flags for erofs inode cache
  erofs: clean up redundant comment and adjust code alignment
  erofs: refine warning messages for zdata I/Os
  erofs: boost negative xattr lookup with bloom filter
  erofs: update on-disk format for xattr name filter
  erofs: DEFLATE compression support
2023-08-28 11:52:10 -07:00
Linus Torvalds
f20ae9cf5b File locking fixes for v6.6
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmTngQgACgkQAA5oQRlW
 ghXfYg//cbUlOoba3RLBmKTeN68BWaLTIgeibJ17ioOSjVnxxMqyt4tLH3FD1pnT
 2QisXiR3qrJav+sbwClpC23tafCHDfTo4uGbdXlehgAu4OSJ1+8gF2k1SsdAd4Qe
 tyfP1I2LO29/Yfres2Oy86JQi99Ds4slxjj63z/pYBdo4LEDJc83m/9limgMv4ro
 Geop6f9lV3Z03rOZIixQinMtOQTXW0NJbiXMEGzFS1Y05xQ34kX+v33MBAvtnbY9
 nhbWuJKYlgZZhH5CghZUAvKfq2a4suvpZbnzmEZLJ0DdRE0h/4ctIxTSK8tnrq5d
 07cs34NOGWljlguJc7cYotrKv51lu6yGbSCHyNdAEdBotuIXJ4Q3krAUHBnuGzy3
 fa/a7nozRO7RDzshGTO4dBRerTEyS7bWUEnIhRp6BHR2ny9jhK6zMa1ys3eAMG4F
 5a61AH/3qS+i8gQQSF9HFRyD+D0qDVMg2YCE/lKveF5tNBeKqrfj/Wz3HnY2zB/L
 luQY0esX5idWTH8YYJlntxgmWEZBvNgmuWcbDVu2Jj/jXUGP6S3dnY0MZio2Sbdr
 zM7/HenAkvXTlbtJsLmdGGZSMnINuiJi4sT2aYn/8JG4HbHFlAU89ILhIGrF3ayP
 669VpO6g/DoEpVRyOnzC3MCHE09SWoiHcXo4VCsz/270uXOpcVs=
 =f9CF
 -----END PGP SIGNATURE-----

Merge tag 'filelock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking updates from Jeff Layton:

 - new functionality for F_OFD_GETLK: requesting a type of F_UNLCK will
   find info about whatever lock happens to be first in the given range,
   regardless of type.

 - an OFD lock selftest

 - bugfix involving a UAF in a tracepoint

 - comment typo fix

* tag 'filelock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock
  fs/locks: Fix typo
  selftests: add OFD lock tests
  fs/locks: F_UNLCK extension for F_OFD_GETLK
2023-08-28 11:47:24 -07:00
Linus Torvalds
b4a04f92a4 v6.6-fs.proc.uapi
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXT2QAKCRCRxhvAZXjc
 olkFAQCT4nRkRTpBvbiv4DgvCIy+URqLNfHGxCxdAX1B09o3UwEAyepf1tz7aFpB
 wB67V265JFDMWtvQkSx4ORNpAjZ9Kg0=
 =Opqi
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-fs.proc.uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull procfs fixes from Christian Brauner:
 "Mode changes to files under /proc/<pid>/ aren't supported ever since
  commit 6d76fa58b0 ("Don't allow chmod() on the /proc/<pid>/ files").

  Due to an oversight in commit 1b3044e39a ("procfs: fix pthread
  cross-thread naming if !PR_DUMPABLE") in switching from REG to NOD,
  mode changes on /proc/thread-self/comm were accidently allowed.

  Similar, mode changes for all files beneath /proc/<pid>/net/ are
  blocked but mode changes on /proc/<pid>/net itself were accidently
  allowed.

  Both issues come down to not using the generic proc_setattr() helper
  which blocks all mode changes. This is rectified with this pull
  request.

  This also removes a strange nolibc test that abused /proc/<pid>/net
  for testing mode changes. Using procfs for this test never made a lot
  of sense given procfs has special semantics for almost everything
  anway.

  Both changes are minor user-visible changes. It is however very
  unlikely that mode changes on proc/<pid>/net and
  /proc/thread-self/comm are something that userspace relies on"

* tag 'v6.6-fs.proc.uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  procfs: block chmod on /proc/thread-self/comm
  proc: use generic setattr() for /proc/$PID/net
  selftests/nolibc: drop test chmod_net
2023-08-28 11:43:19 -07:00
Linus Torvalds
2e0afa7e78 v6.6-vfs.autofs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXUDgAKCRCRxhvAZXjc
 ogplAQCYXt+zcfs1GMhCUtPFzyyCwNsraMNzVwTdFbMz4R1JuQD9HL82VKyvMwmZ
 uo6uGVd9xN6cEy61Lpz9K8dn59uVAQE=
 =851o
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull autofs fixes from Christian Brauner:
 "This fixes a memory leak in autofs reported by syzkaller and a missing
  conversion from uninterruptible to interruptible wake up when autofs
  is in catatonic mode"

* tag 'v6.6-vfs.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  autofs: use wake_up() instead of wake_up_interruptible(()
  autofs: fix memory leak of waitqueues in autofs_catatonic_mode
2023-08-28 11:39:14 -07:00
Linus Torvalds
475d4df827 v6.6-vfs.fchmodat2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXT7QAKCRCRxhvAZXjc
 ort3AP0VIK/oJk5skgjpinQrCfvtVz0XOtawuBtn0f1weIfb6AD9Hg1rqOKnQD5z
 dkvn3xaEr3gPOVzqU5SvFwVoCM0cMwA=
 =24Ha
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fchmodat2 system call from Christian Brauner:
 "This adds the fchmodat2() system call. It is a revised version of the
  fchmodat() system call, adding a missing flag argument. Support for
  both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH are included.

  Adding this system call revision has been a longstanding request but
  so far has always fallen through the cracks. While the kernel
  implementation of fchmodat() does not have a flag argument the libc
  provided POSIX-compliant fchmodat(3) version does. Both glibc and musl
  have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW
  (see [1] and [2]).

  The workaround is brittle because it relies not just on O_PATH and
  O_NOFOLLOW semantics and procfs magic links but also on our rather
  inconsistent symlink semantics.

  This gives userspace a proper fchmodat2() system call that libcs can
  use to properly implement fchmodat(3) and allows them to get rid of
  their hacks. In this case it will immediately benefit them as the
  current workaround is already defunct because of aformentioned
  inconsistencies.

  In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use
  AT_EMPTY_PATH with fchmodat2(). This is already possible with
  fchownat() so there's no reason to not also support it for
  fchmodat2().

  The implementation is simple and comes with selftests. Implementation
  of the system call and wiring up the system call are done as separate
  patches even though they could arguably be one patch. But in case
  there are merge conflicts from other system call additions it can be
  beneficial to have separate patches"

Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1]
Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2]

* tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests: fchmodat2: remove duplicate unneeded defines
  fchmodat2: add support for AT_EMPTY_PATH
  selftests: Add fchmodat2 selftest
  arch: Register fchmodat2, usually as syscall 452
  fs: Add fchmodat2()
  Non-functional cleanup of a "__user * filename"
2023-08-28 11:25:27 -07:00
Linus Torvalds
511fb5bafe v6.6-vfs.super
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXpbgAKCRCRxhvAZXjc
 oi8PAQCtXelGZHmTcmevsO8p4Qz7hFpkonZ/TnxKf+RdnlNgPgD+NWi+LoRBpaAj
 xk4z8SqJaTTP4WXrG5JZ6o7EQkUL8gE=
 =2e9I
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull superblock updates from Christian Brauner:
 "This contains the super rework that was ready for this cycle. The
  first part changes the order of how we open block devices and allocate
  superblocks, contains various cleanups, simplifications, and a new
  mechanism to wait on superblock state changes.

  This unblocks work to ultimately limit the number of writers to a
  block device. Jan has already scheduled follow-up work that will be
  ready for v6.7 and allows us to restrict the number of writers to a
  given block device. That series builds on this work right here.

  The second part contains filesystem freezing updates.

  Overview:

  The generic superblock changes are rougly organized as follows
  (ignoring additional minor cleanups):

   (1) Removal of the bd_super member from struct block_device.

       This was a very odd back pointer to struct super_block with
       unclear rules. For all relevant places we have other means to get
       the same information so just get rid of this.

   (2) Simplify rules for superblock cleanup.

       Roughly, everything that is allocated during fs_context
       initialization and that's stored in fs_context->s_fs_info needs
       to be cleaned up by the fs_context->free() implementation before
       the superblock allocation function has been called successfully.

       After sget_fc() returned fs_context->s_fs_info has been
       transferred to sb->s_fs_info at which point sb->kill_sb() if
       fully responsible for cleanup. Adhering to these rules means that
       cleanup of sb->s_fs_info in fill_super() is to be avoided as it's
       brittle and inconsistent.

       Cleanup shouldn't be duplicated between sb->put_super() as
       sb->put_super() is only called if sb->s_root has been set aka
       when the filesystem has been successfully born (SB_BORN). That
       complexity should be avoided.

       This also means that block devices are to be closed in
       sb->kill_sb() instead of sb->put_super(). More details in the
       lower section.

   (3) Make it possible to lookup or create a superblock before opening
       block devices

       There's a subtle dependency on (2) as some filesystems did rely
       on fill_super() to be called in order to correctly clean up
       sb->s_fs_info. All these filesystems have been fixed.

   (4) Switch most filesystem to follow the same logic as the generic
       mount code now does as outlined in (3).

   (5) Use the superblock as the holder of the block device. We can now
       easily go back from block device to owning superblock.

   (6) Export and extend the generic fs_holder_ops and use them as
       holder ops everywhere and remove the filesystem specific holder
       ops.

   (7) Call from the block layer up into the filesystem layer when the
       block device is removed, allowing to shut down the filesystem
       without risk of deadlocks.

   (8) Get rid of get_super().

       We can now easily go back from the block device to owning
       superblock and can call up from the block layer into the
       filesystem layer when the device is removed. So no need to wade
       through all registered superblock to find the owning superblock
       anymore"

Link: https://lore.kernel.org/lkml/20230824-prall-intakt-95dbffdee4a0@brauner/

* tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (47 commits)
  super: use higher-level helper for {freeze,thaw}
  super: wait until we passed kill super
  super: wait for nascent superblocks
  super: make locking naming consistent
  super: use locking helpers
  fs: simplify invalidate_inodes
  fs: remove get_super
  block: call into the file system for ioctl BLKFLSBUF
  block: call into the file system for bdev_mark_dead
  block: consolidate __invalidate_device and fsync_bdev
  block: drop the "busy inodes on changed media" log message
  dasd: also call __invalidate_device when setting the device offline
  amiflop: don't call fsync_bdev in FDFMTBEG
  floppy: call disk_force_media_change when changing the format
  block: simplify the disk_force_media_change interface
  nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
  xfs use fs_holder_ops for the log and RT devices
  xfs: drop s_umount over opening the log and RT devices
  ext4: use fs_holder_ops for the log device
  ext4: drop s_umount over opening the log device
  ...
2023-08-28 11:04:18 -07:00
Linus Torvalds
de16588a77 v6.6-vfs.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTxQAKCRCRxhvAZXjc
 okaVAP94WAlItvDRt/z2Wtzf0+RqPZeTXEdGTxua8+RxqCyYIQD+OO5nRfKQPHlV
 AqqGJMKItQMSMIYgB5ftqVhNWZfnHgM=
 =pSEW
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual miscellaneous features, cleanups, and fixes
  for vfs and individual filesystems.

  Features:

   - Block mode changes on symlinks and rectify our broken semantics

   - Report file modifications via fsnotify() for splice

   - Allow specifying an explicit timeout for the "rootwait" kernel
     command line option. This allows to timeout and reboot instead of
     always waiting indefinitely for the root device to show up

   - Use synchronous fput for the close system call

  Cleanups:

   - Get rid of open-coded lockdep workarounds for async io submitters
     and replace it all with a single consolidated helper

   - Simplify epoll allocation helper

   - Convert simple_write_begin and simple_write_end to use a folio

   - Convert page_cache_pipe_buf_confirm() to use a folio

   - Simplify __range_close to avoid pointless locking

   - Disable per-cpu buffer head cache for isolated cpus

   - Port ecryptfs to kmap_local_page() api

   - Remove redundant initialization of pointer buf in pipe code

   - Unexport the d_genocide() function which is only used within core
     vfs

   - Replace printk(KERN_ERR) and WARN_ON() with WARN()

  Fixes:

   - Fix various kernel-doc issues

   - Fix refcount underflow for eventfds when used as EFD_SEMAPHORE

   - Fix a mainly theoretical issue in devpts

   - Check the return value of __getblk() in reiserfs

   - Fix a racy assert in i_readcount_dec

   - Fix integer conversion issues in various functions

   - Fix LSM security context handling during automounts that prevented
     NFS superblock sharing"

* tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits)
  cachefiles: use kiocb_{start,end}_write() helpers
  ovl: use kiocb_{start,end}_write() helpers
  aio: use kiocb_{start,end}_write() helpers
  io_uring: use kiocb_{start,end}_write() helpers
  fs: create kiocb_{start,end}_write() helpers
  fs: add kerneldoc to file_{start,end}_write() helpers
  io_uring: rename kiocb_end_write() local helper
  splice: Convert page_cache_pipe_buf_confirm() to use a folio
  libfs: Convert simple_write_begin and simple_write_end to use a folio
  fs/dcache: Replace printk and WARN_ON by WARN
  fs/pipe: remove redundant initialization of pointer buf
  fs: Fix kernel-doc warnings
  devpts: Fix kernel-doc warnings
  doc: idmappings: fix an error and rephrase a paragraph
  init: Add support for rootwait timeout parameter
  vfs: fix up the assert in i_readcount_dec
  fs: Fix one kernel-doc comment
  docs: filesystems: idmappings: clarify from where idmappings are taken
  fs/buffer.c: disable per-CPU buffer_head cache for isolated CPUs
  vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
  ...
2023-08-28 10:17:14 -07:00
Linus Torvalds
ecd7db2047 v6.6-vfs.tmpfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTkgAKCRCRxhvAZXjc
 ouZsAPwNBHB2aPKtzWURuKx5RX02vXTzHX+A/LpuDz5WBFe8zQD+NlaBa4j0MBtS
 rVYM+CjOXnjnsLc8W0euMnfYNvViKgQ=
 =L2+2
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull libfs and tmpfs updates from Christian Brauner:
 "This cycle saw a lot of work for tmpfs that required changes to the
  vfs layer. Andrew, Hugh, and I decided to take tmpfs through vfs this
  cycle. Things will go back to mm next cycle.

  Features
  ========

   - By far the biggest work is the quota support for tmpfs. New tmpfs
     quota infrastructure is added to support it and a new QFMT_SHMEM
     uapi option is exposed.

     This offers user and group quotas to tmpfs (project quotas will be
     added later). Similar to other filesystems tmpfs quota are not
     supported within user namespaces yet.

   - Add support for user xattrs. While tmpfs already supports security
     xattrs (security.*) and POSIX ACLs for a long time it lacked
     support for user xattrs (user.*). With this pull request tmpfs will
     be able to support a limited number of user xattrs.

     This is accompanied by a fix (see below) to limit persistent simple
     xattr allocations.

   - Add support for stable directory offsets. Currently tmpfs relies on
     the libfs provided cursor-based mechanism for readdir. This causes
     issues when a tmpfs filesystem is exported via NFS.

     NFS clients do not open directories. Instead, each server-side
     readdir operation opens the directory, reads it, and then closes
     it. Since the cursor state for that directory is associated with
     the opened file it is discarded after each readdir operation. Such
     directory offsets are not just cached by NFS clients but also
     various userspace libraries based on these clients.

     As it stands there is no way to invalidate the caches when
     directory offsets have changed and the whole application depends on
     unchanging directory offsets.

     At LSFMM we discussed how to solve this problem and decided to
     support stable directory offsets. libfs now allows filesystems like
     tmpfs to use an xarrary to map a directory offset to a dentry. This
     mechanism is currently only used by tmpfs but can be supported by
     others as well.

  Fixes
  =====

   - Change persistent simple xattrs allocations in libfs from
     GFP_KERNEL to GPF_KERNEL_ACCOUNT so they're subject to memory
     cgroup limits. Since this is a change to libfs it affects both
     tmpfs and kernfs.

   - Correctly verify {g,u}id mount options.

     A new filesystem context is created via fsopen() which records the
     namespace that becomes the owning namespace of the superblock when
     fsconfig(FSCONFIG_CMD_CREATE) is called for filesystems that are
     mountable in namespaces. However, fsconfig() calls can occur in a
     namespace different from the namespace where fsopen() has been
     called.

     Currently, when fsconfig() is called to set {g,u}id mount options
     the requested {g,u}id is mapped into a k{g,u}id according to the
     namespace where fsconfig() was called from. The resulting k{g,u}id
     is not guaranteed to be resolvable in the namespace of the
     filesystem (the one that fsopen() was called in).

     This means it's possible for an unprivileged user to create files
     owned by any group in a tmpfs mount since it's possible to set the
     setid bits on the tmpfs directory.

     The contract for {g,u}id mount options and {g,u}id values in
     general set from userspace has always been that they are translated
     according to the caller's idmapping. In so far, tmpfs has been
     doing the correct thing. But since tmpfs is mountable in
     unprivileged contexts it is also necessary to verify that the
     resulting {k,g}uid is representable in the namespace of the
     superblock to avoid such bugs.

     The new mount api's cross-namespace delegation abilities are
     already widely used. Having talked to a bunch of userspace this is
     the most faithful solution with minimal regression risks"

* tag 'v6.6-vfs.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  tmpfs,xattr: GFP_KERNEL_ACCOUNT for simple xattrs
  mm: invalidation check mapping before folio_contains
  tmpfs: trivial support for direct IO
  tmpfs,xattr: enable limited user extended attributes
  tmpfs: track free_ispace instead of free_inodes
  xattr: simple_xattr_set() return old_xattr to be freed
  tmpfs: verify {g,u}id mount options correctly
  shmem: move spinlock into shmem_recalc_inode() to fix quota support
  libfs: Remove parent dentry locking in offset_iterate_dir()
  libfs: Add a lock class for the offset map's xa_lock
  shmem: stable directory offsets
  shmem: Refactor shmem_symlink()
  libfs: Add directory operations for stable offsets
  shmem: fix quota lock nesting in huge hole handling
  shmem: Add default quota limit mount options
  shmem: quota support
  shmem: prepare shmem quota infrastructure
  quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks
  shmem: make shmem_get_inode() return ERR_PTR instead of NULL
  shmem: make shmem_inode_acct_block() return error
2023-08-28 09:55:25 -07:00
Linus Torvalds
615e95831e v6.6-vfs.ctime
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTKAAKCRCRxhvAZXjc
 oifJAQCzi/p+AdQu8LA/0XvR7fTwaq64ZDCibU4BISuLGT2kEgEAuGbuoFZa0rs2
 XYD/s4+gi64p9Z01MmXm2XO1pu3GPg0=
 =eJz5
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs timestamp updates from Christian Brauner:
 "This adds VFS support for multi-grain timestamps and converts tmpfs,
  xfs, ext4, and btrfs to use them. This carries acks from all relevant
  filesystems.

  The VFS always uses coarse-grained timestamps when updating the ctime
  and mtime after a change. This has the benefit of allowing filesystems
  to optimize away a lot of metadata updates, down to around 1 per
  jiffy, even when a file is under heavy writes.

  Unfortunately, this has always been an issue when we're exporting via
  NFSv3, which relies on timestamps to validate caches. A lot of changes
  can happen in a jiffy, so timestamps aren't sufficient to help the
  client decide to invalidate the cache.

  Even with NFSv4, a lot of exported filesystems don't properly support
  a change attribute and are subject to the same problems with timestamp
  granularity. Other applications have similar issues with timestamps
  (e.g., backup applications).

  If we were to always use fine-grained timestamps, that would improve
  the situation, but that becomes rather expensive, as the underlying
  filesystem would have to log a lot more metadata updates.

  This introduces fine-grained timestamps that are used when they are
  actively queried.

  This uses the 31st bit of the ctime tv_nsec field to indicate that
  something has queried the inode for the mtime or ctime. When this flag
  is set, on the next mtime or ctime update, the kernel will fetch a
  fine-grained timestamp instead of the usual coarse-grained one.

  As POSIX generally mandates that when the mtime changes, the ctime
  must also change the kernel always stores normalized ctime values, so
  only the first 30 bits of the tv_nsec field are ever used.

  Filesytems can opt into this behavior by setting the FS_MGTIME flag in
  the fstype. Filesystems that don't set this flag will continue to use
  coarse-grained timestamps.

  Various preparatory changes, fixes and cleanups are included:

   - Fixup all relevant places where POSIX requires updating ctime
     together with mtime. This is a wide-range of places and all
     maintainers provided necessary Acks.

   - Add new accessors for inode->i_ctime directly and change all
     callers to rely on them. Plain accesses to inode->i_ctime are now
     gone and it is accordingly rename to inode->__i_ctime and commented
     as requiring accessors.

   - Extend generic_fillattr() to pass in a request mask mirroring in a
     sense the statx() uapi. This allows callers to pass in a request
     mask to only get a subset of attributes filled in.

   - Rework timestamp updates so it's possible to drop the @now
     parameter the update_time() inode operation and associated helpers.

   - Add inode_update_timestamps() and convert all filesystems to it
     removing a bunch of open-coding"

* tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits)
  btrfs: convert to multigrain timestamps
  ext4: switch to multigrain timestamps
  xfs: switch to multigrain timestamps
  tmpfs: add support for multigrain timestamps
  fs: add infrastructure for multigrain timestamps
  fs: drop the timespec64 argument from update_time
  xfs: have xfs_vn_update_time gets its own timestamp
  fat: make fat_update_time get its own timestamp
  fat: remove i_version handling from fat_update_time
  ubifs: have ubifs_update_time use inode_update_timestamps
  btrfs: have it use inode_update_timestamps
  fs: drop the timespec64 arg from generic_update_time
  fs: pass the request_mask to generic_fillattr
  fs: remove silly warning from current_time
  gfs2: fix timestamp handling on quota inodes
  fs: rename i_ctime field to __i_ctime
  selinux: convert to ctime accessor functions
  security: convert to ctime accessor functions
  apparmor: convert to ctime accessor functions
  sunrpc: convert to ctime accessor functions
  ...
2023-08-28 09:31:32 -07:00
Linus Torvalds
84ab1277ce v6.6-vfs.fs_context
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXUHQAKCRCRxhvAZXjc
 opWuAQC5wYyKWMwpxc3GaGcHiC7nq0uyYCcVgzeebsw1eGzFvgD9FoYRphC2pqi1
 p8qUexEK2aOZmPquFWmRDTRMcZ23YAk=
 =UKnx
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.fs_context' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull mount API updates from Christian Brauner:
 "This introduces FSCONFIG_CMD_CREATE_EXCL which allows userspace to
  implement something like

      $ mount -t ext4 --exclusive /dev/sda /B

  which fails if a superblock for the requested filesystem does already
  exist instead of silently reusing an existing superblock.

  Without it, in the sequence

      $ move-mount -f xfs -o       source=/dev/sda4 /A
      $ move-mount -f xfs -o noacl,source=/dev/sda4 /B

  the initial mounter will create a superblock. The second mounter will
  reuse the existing superblock, creating a bind-mount (see [1] for the
  source of the move-mount binary).

  The problem is that reusing an existing superblock means all mount
  options other than read-only and read-write will be silently ignored
  even if they are incompatible requests. For example, the second mount
  has requested no POSIX ACL support but since the existing superblock
  is reused POSIX ACL support will remain enabled.

  Such silent superblock reuse can easily become a security issue.

  After adding support for FSCONFIG_CMD_CREATE_EXCL to mount(8) in
  util-linux this can be fixed:

      $ move-mount -f xfs --exclusive -o       source=/dev/sda4 /A
      $ move-mount -f xfs --exclusive -o noacl,source=/dev/sda4 /B
      Device or resource busy | move-mount.c: 300: do_fsconfig: i xfs: reusing existing filesystem not allowed

  This requires the new mount api. With the old mount api it would be
  necessary to plumb this through every legacy filesystem's
  file_system_type->mount() method. If they want this feature they are
  most welcome to switch to the new mount api"

Link: https://github.com/brauner/move-mount-beneath [1]
Link: https://lore.kernel.org/linux-block/20230704-fasching-wertarbeit-7c6ffb01c83d@brauner
Link: https://lore.kernel.org/linux-block/20230705-pumpwerk-vielversprechend-a4b1fd947b65@brauner
Link: https://lore.kernel.org/linux-fsdevel/20230725-einnahmen-warnschilder-17779aec0a97@brauner
Link: https://lore.kernel.org/lkml/20230824-anzog-allheilmittel-e8c63e429a79@brauner/

* tag 'v6.6-vfs.fs_context' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: add FSCONFIG_CMD_CREATE_EXCL
  fs: add vfs_cmd_reconfigure()
  fs: add vfs_cmd_create()
  super: remove get_tree_single_reconf()
2023-08-28 09:00:09 -07:00
Linus Torvalds
2dde18cd1d Linux 6.5 2023-08-27 14:49:51 -07:00
Linus Torvalds
85eb043618 SCSI fixes on 20230827
Three small driver fixes and one larger unused function set removal in
 the raid class (so no external impact).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZOr0iCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaV8AQDtFvZp
 KI3GW2x6XjZeXVW3buQVmwLmdBfIIx0yDZGLqAEAm8qZGROMsMhBCvK/iizjsrir
 KerWJQ1LU9oMcjbesuk=
 =z12E
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three small driver fixes and one larger unused function set removal in
  the raid class (so no external impact)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: snic: Fix double free in snic_tgt_create()
  scsi: core: raid_class: Remove raid_component_add()
  scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW major version > 5
  scsi: ufs: mcq: Fix the search/wrap around logic
2023-08-27 07:33:54 -07:00
Linus Torvalds
28f20a1929 Fix an FPU invalidation bug on exec(), and fix a performance
regression due to a missing setting of X86_FEATURE_OSXSAVE.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmTqO5ARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gXvA//dBph3091OIibZ23n+3eJODmcItKV9TBK
 fJmlKrfY4q3zsbt4WRQKplTVfRRGZviNzoL1S6nEHMBQ9NFAZjPETRmnpypfS6VM
 Hd093iIYN1LWvL549FJdAwB1jJQdzuYCys8qAvmhjJUzHJhO2QRgFoiI6BCuiu4U
 LoyBRKakLQRLCCirfXBjlb0BPpXnHHeIiuOn+xrxJphCyjcnS5bE1ud54g9ws+Ji
 neYZ/kMpqj+zHsMHQkNNwPuW+WyBKlM1O2yax1OFwjKQnIUuq2qdL/YICj+Yr67A
 8EpTSOx4XPNROqu32Roa0WsFQy9OloaZLNRAdIjR+jf1jeSwBbY3QCvghnzOojTa
 jnPvrvAf9e0AOVt94FmYaygtraybVp4lwem1/eqKQMarWGZtQmZV7VTLooqG81jH
 +I/rkNvTyHrhv9qICzvD2AkT9AK5Ayo/d6O3F7OHN1/tcbQCs/jGGF4vUGsKf9WB
 HULb9wE6cdBQYah6my6jzVFDkBcFLH/mbigQXHO5MX4bwuA5bZdwwkOFVtd/J8dN
 dsvF5a7i+qpK3bVCInilUs20gzymDEsqOQm78IDLYOSW/sOS29cNOdO52/jk2YB/
 8pDp1tpgdghR6oTagP6PUERFU6m4XAnMz68yfyugiCSsB9V6Srp3H7yQXbD1aLBA
 /iz4x4rZq0o=
 =bPF+
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2023-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Fix an FPU invalidation bug on exec(), and fix a performance
  regression due to a missing setting of X86_FEATURE_OSXSAVE"

* tag 'x86-urgent-2023-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4
  x86/fpu: Invalidate FPU state correctly on exec()
2023-08-26 10:57:29 -07:00
Linus Torvalds
3b35375f19 A last minute fix for a regression introduced in the v6.5 merge window. The
conversion of the software based interrupt resend mechanism to hlist missed
 to add a check whether the descriptor is already enqueued and dropped the
 interrupt descriptor lookup for nested interrupts.
 
 The missing check whether the descriptor is already queued causes hlist
 corruption and can be observed in the wild. The dropped parent descriptor
 lookup has not yet caused problems, but it would result in stale interrupt
 line in the worst case.
 
 Add the missing enqueued check and bring the descriptor lookup back to cure
 this.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmTqNLQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTeBD/0b8zbNhmO5TXhP6GrCPXahFM6aTmyK
 NveZMzh1c7tQZzMBNNEnRoaYvmcgPOviZ1Yi3+/Hs3oaR/b6nLt36K8+MRC7J+15
 j6cIylmpTp9eH5Na3IT1wmTNfCVAdoejoZVYq4PPHAHUrzqu7ESOTLzHbPmWS97E
 VGdvUrKnQ7J4ajOZn7bXWaia+qCuIij87CYAKH++c9JVMIc0iTs2Zd7FG2sncgLm
 OJdvjmMy/qN9a1jYdM4DrGOS8HBdvuYb9EEDuZB4NEY3nBR+svQqBHsD462LgxNe
 +OTzLBVMoP9heKbyTU9357PUq2qz6OmpC0vE1n5XgkSEdrvm9x1UjYcPQnagRm25
 JZp/pEI/ryD8oGQNWzsPe7PDyyKHV5F0Q1KPHGUvvEJxwF+USVe9Zm6damfZvGeA
 dp34zYg0mFCH0hmqdYs6+cc8sJcEy8aR8FFUgI1Uj5nr9zZ3vV7WTsOjJ12NDFo/
 L+oDKz6/sdL2X/EKddP3ffQrImPF8DdSYfEPmoukTMhihfgXewBlgvg3b9HekVVm
 9j7UhqsQw/mdPcTpkM6cd5ngxB71X64gMjAfotwsproJg/EUw978CM++9sGKmKy8
 jU7hlgZQ3DniSCyCpXB/7vZxAFej8TKTWmTc4KZYKiMfej2vqI3FjA3KLGY6GzK+
 ls/Rm57EOhKZlw==
 =Snax
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2023-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
 "A last minute fix for a regression introduced in the v6.5 merge
  window.

  The conversion of the software based interrupt resend mechanism to
  hlist missed to add a check whether the descriptor is already enqueued
  and dropped the interrupt descriptor lookup for nested interrupts.

  The missing check whether the descriptor is already queued causes
  hlist corruption and can be observed in the wild. The dropped parent
  descriptor lookup has not yet caused problems, but it would result in
  stale interrupt line in the worst case.

  Add the missing enqueued check and bring the descriptor lookup back to
  cure this"

* tag 'irq-urgent-2023-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Fix software resend lockup and nested resend
2023-08-26 10:34:29 -07:00
Linus Torvalds
c313761337 LoongArch fixes for v6.5-final
-----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmTqJf4WHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImesu4D/4yXb19/F4JZRx8T46Osx1OZ4pn
 Z0WlAS8e3QUV4HNAVsgMnp8IkPnK82weliZdIZM4T6Vgid9UUV5egCbresMK4wCy
 8wpwDOK13V0pqHcdlGTL3wQTe3gdJDorQN5ReK4OOugYuG5dAW8W+c5Q0kfe3to8
 or8nzjEomf2jBdbsGfJ9vYbucE9vB7eei8V/rp94VijmPTnIk6WooYPNwrG4oh2o
 p5SSB3P1Z3OfI7tCRNM3Y5BGFvI8YJ8ujjE+Qk7YI1EeHSHfMypJxTWGimjq5Dgq
 QGyy25gg5XHLxR7u3RUcQHoKC8BFSOwkOkSBHG8rzUovySkYA6u75aZQNA+xQiJZ
 JT9+6p0U5QCBBeyjfTiCO8LDwulrSdXsDKPiUqrkjITg2dFW9cukZl7iP8BUUwr9
 3M2Ml7Y/QKlk7/3qGgWRZ8030aGbCuWEFT46W9MZqCh/a6+ij5anRIlvcPhKEAxw
 0gJWMkKCLlbMvCyRJvi6WVH00xoNMXvlgcJAdswIVtUrOQMBLSCIiHvdox+jjiNo
 LcRb/6SpSVKi3ux3jIFJ9DBP9lmWwQPGHvZaoddMXvbsps5+QX1byfuJlfTYjGm2
 Mw9SwV7m4vcRJKc+MNVJ2/gBMz0qCgYv0KsfI2ZlBfOaGos1rMu9ubZjrV3Tgf5Y
 4zw/VKoRw0zyZSQWZA==
 =lord
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-fixes-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix a ptrace bug, a hw_breakpoint bug, some build errors/warnings and
  some trivial cleanups"

* tag 'loongarch-fixes-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Fix hw_breakpoint_control() for watchpoints
  LoongArch: Ensure FP/SIMD registers in the core dump file is up to date
  LoongArch: Put the body of play_dead() into arch_cpu_idle_dead()
  LoongArch: Add identifier names to arguments of die() declaration
  LoongArch: Return earlier in die() if notify_die() returns NOTIFY_STOP
  LoongArch: Do not kill the task in die() if notify_die() returns NOTIFY_STOP
  LoongArch: Remove <asm/export.h>
  LoongArch: Replace #include <asm/export.h> with #include <linux/export.h>
  LoongArch: Remove unneeded #include <asm/export.h>
  LoongArch: Replace -ffreestanding with finer-grained -fno-builtin's
  LoongArch: Remove redundant "source drivers/firmware/Kconfig"
2023-08-26 10:28:52 -07:00
Johan Hovold
9f5deb5516 genirq: Fix software resend lockup and nested resend
The switch to using hlist for managing software resend of interrupts
broke resend in at least two ways:

First, unconditionally adding interrupt descriptors to the resend list can
corrupt the list when the descriptor in question has already been
added. This causes the resend tasklet to loop indefinitely with interrupts
disabled as was recently reported with the Lenovo ThinkPad X13s after
threaded NAPI was disabled in the ath11k WiFi driver.

This bug is easily fixed by restoring the old semantics of irq_sw_resend()
so that it can be called also for descriptors that have already been marked
for resend.

Second, the offending commit also broke software resend of nested
interrupts by simply discarding the code that made sure that such
interrupts are retriggered using the parent interrupt.

Add back the corresponding code that adds the parent descriptor to the
resend list.

Fixes: bc06a9e087 ("genirq: Use hlist for managing resend handlers")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/lkml/20230809073432.4193-1-johan+linaro@kernel.org/
Link: https://lore.kernel.org/r/20230826154004.1417-1-johan+linaro@kernel.org
2023-08-26 19:14:31 +02:00
Huacai Chen
9730870b48 LoongArch: Fix hw_breakpoint_control() for watchpoints
In hw_breakpoint_control(), encode_ctrl_reg() has already encoded the
MWPnCFG3_LoadEn/MWPnCFG3_StoreEn bits in info->ctrl. We don't need to
add (1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn) unconditionally.

Otherwise we can't set read watchpoint and write watchpoint separately.

Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-26 22:21:57 +08:00
Huacai Chen
656f9aec07 LoongArch: Ensure FP/SIMD registers in the core dump file is up to date
This is a port of commit 379eb01c21 ("riscv: Ensure the value
of FP registers in the core dump file is up to date").

The values of FP/SIMD registers in the core dump file come from the
thread.fpu. However, kernel saves the FP/SIMD registers only before
scheduling out the process. If no process switch happens during the
exception handling, kernel will not have a chance to save the latest
values of FP/SIMD registers. So it may cause their values in the core
dump file incorrect. To solve this problem, force fpr_get()/simd_get()
to save the FP/SIMD registers into the thread.fpu if the target task
equals the current task.

Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-26 22:21:57 +08:00
Linus Torvalds
7d2f353b26 One clk driver fix and two clk framework fixes
- Fix an OOB access when devm_get_clk_from_child() is used and
    devm_clk_release() casts the void pointer to the wrong type
  - Move clk_rate_exclusive_{get,put}() within the correct ifdefs in
    clk.h so that the stubs are used when CONFIG_COMMON_CLK=n
  - Register the proper clk provider function depending on the value of
    #clock-cells in the TI keystone driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmTpGAYRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSUICxAAr+KgWzWYdDeSKU273oHmvZi2wflLwIri
 6vRkQBUxokV1j5Us4OPfu5/RzIla4JenrtYa7A6FmcUvj4ov/91uWUNA1rAVY87u
 q1KdlgKCOvW6yt0I1J93tgBEnBim9Dww00v5ULSrj+AqqQXbVKJv/xohgX3NfCtg
 Q177mxM3pOkuOtHeuHkb5etTiozEfJvQICX76EdDyxv4V4WatcxfZEPXgQfsyyM8
 NZ8mB/+FHlimPsV0jZZvmiq0VX/xop7FB0mbbFX9MVddT+mc6UHGzO9gnZc8msrg
 7KePXzwqw3pOV9erqg+Vz1POhWMliKq3tYQ9tEb8EogkRdThb2uLUE5Y13ppWZ42
 Tpps317DqPaoYBqYWNvv2S+7eJTeTNCjf4fHC52cJ5O2hBiOIEH2hnxuMBpFqOy5
 RC3ZvwDymztwqLTFbwsZ1Mp88f0y9Gl0sOaYEpt7mMAAR2tFSZlmOxwoO7uYBvyj
 norKnT1tnmDKylN90N+nHkIk1wHOVFqe4MM75OQRbacef8gn4btiAQXFJWOzY+xP
 HYJMGxN4aHrAEl5HsPHTkaf0MbSibWAYZ3lnA4hsMCiIqSDI4ZGDDZPQXizen5/x
 rS4tiLxNb4Wm2xJoGiNr+pjU+sQzxbFkVpkJW9Q+82hO9UXzC2xvVKaMMmDnbBVo
 NdC0tPznmEM=
 =fjuw
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "One clk driver fix and two clk framework fixes:

   - Fix an OOB access when devm_get_clk_from_child() is used and
     devm_clk_release() casts the void pointer to the wrong type

   - Move clk_rate_exclusive_{get,put}() within the correct ifdefs in
     clk.h so that the stubs are used when CONFIG_COMMON_CLK=n

   - Register the proper clk provider function depending on the value of
     #clock-cells in the TI keystone driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: Fix slab-out-of-bounds error in devm_clk_release()
  clk: Fix undefined reference to `clk_rate_exclusive_{get,put}'
  clk: keystone: syscon-clk: Fix audio refclk
2023-08-25 17:49:03 -07:00
Helge Deller
382d4cd184 lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels
The gcc compiler translates on some architectures the 64-bit
__builtin_clzll() function to a call to the libgcc function __clzdi2(),
which should take a 64-bit parameter on 32- and 64-bit platforms.

But in the current kernel code, the built-in __clzdi2() function is
defined to operate (wrongly) on 32-bit parameters if BITS_PER_LONG ==
32, thus the return values on 32-bit kernels are in the range from
[0..31] instead of the expected [0..63] range.

This patch fixes the in-kernel functions __clzdi2() and __ctzdi2() to
take a 64-bit parameter on 32-bit kernels as well, thus it makes the
functions identical for 32- and 64-bit kernels.

This bug went unnoticed since kernel 3.11 for over 10 years, and here
are some possible reasons for that:

 a) Some architectures have assembly instructions to count the bits and
    which are used instead of calling __clzdi2(), e.g. on x86 the bsr
    instruction and on ppc cntlz is used. On such architectures the
    wrong __clzdi2() implementation isn't used and as such the bug has
    no effect and won't be noticed.

 b) Some architectures link to libgcc.a, and the in-kernel weak
    functions get replaced by the correct 64-bit variants from libgcc.a.

 c) __builtin_clzll() and __clzdi2() doesn't seem to be used in many
    places in the kernel, and most likely only in uncritical functions,
    e.g. when printing hex values via seq_put_hex_ll(). The wrong return
    value will still print the correct number, but just in a wrong
    formatting (e.g. with too many leading zeroes).

 d) 32-bit kernels aren't used that much any longer, so they are less
    tested.

A trivial testcase to verify if the currently running 32-bit kernel is
affected by the bug is to look at the output of /proc/self/maps:

Here the kernel uses a correct implementation of __clzdi2():

  root@debian:~# cat /proc/self/maps
  00010000-00019000 r-xp 00000000 08:05 787324     /usr/bin/cat
  00019000-0001a000 rwxp 00009000 08:05 787324     /usr/bin/cat
  0001a000-0003b000 rwxp 00000000 00:00 0          [heap]
  f7551000-f770d000 r-xp 00000000 08:05 794765     /usr/lib/hppa-linux-gnu/libc.so.6
  ...

and this kernel uses the broken implementation of __clzdi2():

  root@debian:~# cat /proc/self/maps
  0000000010000-0000000019000 r-xp 00000000 000000008:000000005 787324  /usr/bin/cat
  0000000019000-000000001a000 rwxp 000000009000 000000008:000000005 787324  /usr/bin/cat
  000000001a000-000000003b000 rwxp 00000000 00:00 0  [heap]
  00000000f73d1000-00000000f758d000 r-xp 00000000 000000008:000000005 794765  /usr/lib/hppa-linux-gnu/libc.so.6
  ...

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 4df87bb7b6 ("lib: add weak clz/ctz functions")
Cc: Chanho Min <chanho.min@lge.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-25 13:22:10 -07:00
Linus Torvalds
6f0edbb833 18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4 issues
or aren't considered suitable for a -stable backport.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZOjuGgAKCRDdBJ7gKXxA
 jkLlAQDY9sYxhQZp1PFLirUIPeOBjEyifVy6L6gCfk9j0snLggEA2iK+EtuJt2Dc
 SlMfoTq29zyU/YgfKKwZEVKtPJZOHQU=
 =oTcj
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
  issues or aren't considered suitable for a -stable backport"

* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  shmem: fix smaps BUG sleeping while atomic
  selftests: cachestat: catch failing fsync test on tmpfs
  selftests: cachestat: test for cachestat availability
  maple_tree: disable mas_wr_append() when other readers are possible
  madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
  madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
  madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
  mm: multi-gen LRU: don't spin during memcg release
  mm: memory-failure: fix unexpected return value in soft_offline_page()
  radix tree: remove unused variable
  mm: add a call to flush_cache_vmap() in vmap_pfn()
  selftests/mm: FOLL_LONGTERM need to be updated to 0x100
  nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
  mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
  selftests: cgroup: fix test_kmem_basic less than error
  mm: enable page walking API to lock vmas during the walk
  smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
  mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
2023-08-25 11:44:43 -07:00
Linus Torvalds
4942fed84b RISC-V Fixes for 6.5-rc8
* The vector ucontext extension has been extended with vlenb.
 * The vector registers ELF core dump note type has been changed to avoid
   aliasing with the CSR type used in embedded systems.
 * Support for accessing vector registers via ptrace() has been reverted.
 * Another build fix for the ISA spec changes around Zifencei/Zicsr that
   manifests on some systems built with binutils-2.37 and gcc-11.2.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmTooW4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiXB5EACCEaqKDGfITAIII+KJAZFfvoL9UqgU
 iyywubMbFpcpmiqOM9KmRazfbLhxEU6arU5ZOPbwwf03wcS7/dyIn/fDV7wd/lDx
 K+tN8XE0BoQkehDMKfSGAT2WZSIfzBjoa3zkNIUzKCc9DgXdDe0TPrGuQdft5oaf
 /KqE18CHQmqHrWfbt0Mc+Dpq8YXhw9pOKNA994k2aX5GR9/+wphRoA3JmNa4dzHm
 rkBoOQpirWEz1F12JpGilscdFIJOeTs3WB20rt/zisUOfEZCfjzmdx5amviR+e4X
 ENPDo1TzJVYhKfJfigyYO1pPMJ8EOB3t58sVkGjbfEmy7xa4rz3DVml2rn9CYdf/
 FeazMMo7R74DukQrSOMtiBhIlCNTIz0VKIeL24N9sTNXn7HaDzq45mQL6WVI4JxJ
 RBhvdHl3sOzMfFhB8fdebgAGtRcgBZw+joqCPBu7V37Ros2w1hv8c7Ec2q4gX5Yl
 wdtbV9JLmq4DoIrMnxxr8dgMt4QGc8io0UjvK82qBOQ5tHvSv430OSydcFbicBaU
 mLtxuI3SmlqFIURBrUPjk18B/3RZvSCtoRYgz8wyKU5DKUj7CTP6p+6sKqxM3y9G
 I+rg3SlteAqKWdNk3Tc2qExSIL6hWkOXXYeXr53uSweig7TmC2uutHs7w3hThMMp
 9/iByBaT8H2+dQ==
 =bvaa
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "This is obviously not ideal, particularly for something this late in
  the cycle.

  Unfortunately we found some uABI issues in the vector support while
  reviewing the GDB port, which has triggered a revert -- probably a
  good sign we should have reviewed GDB before merging this, I guess I
  just dropped the ball because I was so worried about the context
  extension and libc suff I forgot. Hence the late revert.

  There's some risk here as we're still exposing the vector context for
  signal handlers, but changing that would have meant reverting all of
  the vector support. The issues we've found so far have been fixed
  already and they weren't absolute showstoppers, so we're essentially
  just playing it safe by holding ptrace support for another release (or
  until we get through a proper userspace code review).

  Summary:

   - The vector ucontext extension has been extended with vlenb

   - The vector registers ELF core dump note type has been changed to
     avoid aliasing with the CSR type used in embedded systems

   - Support for accessing vector registers via ptrace() has been
     reverted

   - Another build fix for the ISA spec changes around Zifencei/Zicsr
     that manifests on some systems built with binutils-2.37 and
     gcc-11.2"

* tag 'riscv-for-linus-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix build errors using binutils2.37 toolchains
  RISC-V: vector: export VLENB csr in __sc_riscv_v_state
  RISC-V: Remove ptrace support for vectors
2023-08-25 09:29:47 -07:00
Linus Torvalds
98c6b8a558 gpio fixes for v6.5
- fix an irq mapping leak in gpio-sim
 - associate the GPIO device's software node with the irq domain in gpio-sim
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmTopZgACgkQEacuoBRx
 13JechAArx/BRwog5ao3xHImAeS/C9/tHAOrWN2q83lrukHNBbyE9w4Bt+fjRtb+
 YT+8zBoImQEI3S1hN3rrGgyRpxuTFIZBiQMhua4aJ5aj3HSUq+daTickwBIhkM65
 nEqqL1TViQBBEJDE2Cy34z+5tBj9+iW3L09LCEGunCQfd4ElZttpVJQpk5dgS5Lk
 OGHSq6JhnXKz0iEJehTWfS9OhcfPj5s8E2++CKkJoFqCYoAuCMn1uifhDBPvMjwD
 0Bmduc6XVk+1pFJGVGgkgTOHljAcEfZWsck/u5CDQh706kPtLRaa9gsaE7J8cOne
 TA6XPlJLepKqFnHv6XJKpZXg/hGFeoy0rDznLywCA52paI5KkZg/V6yTiVOy9ZNs
 WQSvI+7gyHa8PT9KrBF9N3dVAw2ybuMC1YlnidWaXz5v6pbbFb0FadFz+CcO90NT
 /ToAkVjkaGJQPcqXywV0RNsVpZynwCCJ9VG75sMCLXcEjhkNMSm7pP/x4YTJU4hK
 RnQXg7E8G3JdY5sXOzeEWpTKgvuJhi6xJRmci8OBBImMGHQ1bmJMUx/B4/WX6PB5
 jfTEEr1TDlY6AKcB6nCqxyE8QsynJ4SJltjgsh6Hj1maAK2O3W/xXAaQtUkloIdD
 tONu5EShYqaOykFCSCd7ewDOOdDuPN4amtYlRZQwpSJsndWrRN0=
 =v6bG
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix an irq mapping leak in gpio-sim

 - associate the GPIO device's software node with the irq domain in
   gpio-sim

* tag 'gpio-fixes-for-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: pass the GPIO device's software node to irq domain
  gpio: sim: dispose of irq mappings before destroying the irq_sim domain
2023-08-25 09:18:22 -07:00
Linus Torvalds
a87eaffbb2 Pin control fixes for the v6.5 kernel:
- Fix DT parsing and related locking in the Renesas driver.
 
 - Fix wakeup IRQs in the AMD driver once again. Really tricky
   this one.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmToYvkACgkQQRCzN7AZ
 XXMcFxAAoXnZqSto0WF8+yLWhdPtULrq6hfB39+Qj3AhVw3FjZpHOXISPsxer/mP
 uqXEfKvMefvvUEQo0jHKd8yd1ef9Luy/m41jz/cAMEyA/Mwe6udTXsq+q0FgS/1j
 Ciz/E2EgUwMxI1IlVaZM78CEPhCe/NgoUN9sw9bSuPigWesDATaefA+Zzk+KUdxO
 HQkpObb/B/kAROLcT8MBA9vKGUJJC67mH93ujBfk7R/S6P9YOF23n0a5FAwu+vmP
 CapyK5Cyz4Q7uNmMaIaNSjc0OmDUtu8ZzETUc9zLgVWvt8H1DuLn2FrgHFCEmo3d
 dAlH5pjpu7tFt1G2T75vdu4dsEyfsbydmauThLDfbkGOx20vr4KA6FaW7FfdmyCz
 M9T8hVrvSydAQ0YFd5Q8FPTMD+Gt89IYSibkJ8ep5oUAFt2pchxbuBfQKX2QkCGI
 eDaOQsuGJxrx9NydSKkgfdxXMmUG4auK+QlaKfx/xCLBErHJROaXDCyBBMRum/h4
 0DfHQdD/pzohoibLzA0EONPIpCE5PvqOfzufjbX9g/FArT8UAQNcJ+DD1K6EuY4o
 kO5iTm0+Mn80ZgV3IADfkC1rl8zkgQul/3KuA/k9KlyWZhA8PTmbVlJD2taKhf8N
 CIeA88rjHSqlX2LRbwwfh4SoamF/vAIMfvLzjfqaDZEPgr0BgvY=
 =vbK8
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Here are some Renesas and AMD driver fixes, the AMD fix affects
  important laptops in the wild so this one is pretty important. It
  seems a bit tough to get this right.

   - Fix DT parsing and related locking in the Renesas driver.

   - Fix wakeup IRQs in the AMD driver once again. Really tricky this
     one"

* tag 'pinctrl-v6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: Mask wake bits on probe again
  pinctrl: renesas: rza2: Add lock around pinctrl_generic{{add,remove}_group,{add,remove}_function}
  pinctrl: renesas: rzv2m: Fix NULL pointer dereference in rzv2m_dt_subnode_to_map()
  pinctrl: renesas: rzg2l: Fix NULL pointer dereference in rzg2l_dt_subnode_to_map()
2023-08-25 09:10:16 -07:00
Linus Torvalds
ced5bf2493 sound fixes for 6.5
Hopefully the last bits for 6.5.  It's slightly higher LOCs than
 wished, but it doesn't look scary.
 
 The biggest change is MAINTAINERS update for TI; it's good to have
 the update before the final release, so that people can contact to
 the right persons for bug reports (which shouldn't happen of course!)
 
 The rest are all device-specific fixes and quirks, most for various
 ASoC platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmToW88OHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE+sfw//XFkB7lB6zudDCWY5orGhTKF8kVefgm17vvIi
 ya4KkSpo8C1K756s7G7LKpUb6GVXcSnvThVUPnNmP2cZNGEkK4vWS7jmQZhEX7fU
 tF0s+6qSY1wCBq9feOvV8KARVNyAXBEOC+4Xz+zpkipypnS+by2r1C1o5GALxiUs
 MGCAuAi9G8sLgzoELuFRM5cv6fbiBGmLd3PMaZ8RVNXCXr2vfrgZhDUrItYelKAU
 L64L2nsOR6ZATzEDUbdM7gBy1U7u3dwI+hpX06WsdqN4kWkjV+jCrvWaQtpEtntY
 zjlUGrElVnQFy0KUtFRRYvQCDwjj+E6HFGhp8dLe4DLl61da1kD1m47i0jB9pS/Q
 AUVOzaZfjZJ6goBFg4Y9i+4UqTjQ4sOIB/L1mEmI9SC75hLE1fMUfZFalLqUmkLd
 U61ZmE9sQ5Y6g4f6fcde0TKh5Fq4yjCgtVa0D/KvWAZ0H+1O9ECI8qbgLg5xIC45
 yDQaOSv6Q8xx5FtB/44QK06Hf3Jkg7ajvY/+eyb6wA2VXF6Ntzd/GGzcepeOI/dh
 M5FdJVyGqrFI9dUgcEsGWFWijUYQFRSHLdRtqHToBi/GUlS7IwbC18t5G3Dd5MQv
 A74Xat7IfLjHboR3Iv28WtUZEGwzoGsnuSBMT/Qyuno7oFlPHJ2iVDnWftt1BWCK
 ZADn3GI=
 =9D1D
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Hopefully the last bits for 6.5. It's slightly higher LOCs than
  wished, but it doesn't look scary.

  The biggest change is MAINTAINERS update for TI; it's good to have the
  update before the final release, so that people can contact to the
  right persons for bug reports (which shouldn't happen of course!)

  The rest are all device-specific fixes and quirks, most for various
  ASoC platforms"

* tag 'sound-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: amd: yc: Fix a non-functional mic on Lenovo 82SJ
  ALSA: ymfpci: Fix the missing snd_card_free() call at probe error
  ASoC: cs35l41: Correct amp_gain_tlv values
  ASoC: amd: yc: Add VivoBook Pro 15 to quirks list for acp6x
  ASoC: tas2781: fixed register access error when switching to other chips
  ASoC: cs35l56: Add an ACPI match table
  ASoC: cs35l56: Read firmware uuid from a device property instead of _SUB
  ASoC: SOF: ipc4-pcm: fix possible null pointer deference
  MAINTAINERS: Add entries for TEXAS INSTRUMENTS ASoC DRIVERS
2023-08-25 08:48:14 -07:00
Tiezhu Yang
c337c849ab LoongArch: Put the body of play_dead() into arch_cpu_idle_dead()
The initial aim is to silence the following objtool warning:

arch/loongarch/kernel/process.o: warning: objtool: arch_cpu_idle_dead() falls through to next function start_thread()

According to tools/objtool/Documentation/objtool.txt, this is because
the last instruction of arch_cpu_idle_dead() is a call to a noreturn
function play_dead(). In order to silence the warning, one simple way
is to add the noreturn function play_dead() to objtool's hard-coded
global_noreturns array, that is to say, just put "NORETURN(play_dead)"
into tools/objtool/noreturns.h, it works well.

But I noticed that play_dead() is only defined once and only called by
arch_cpu_idle_dead(), so put the body of play_dead() into the caller
arch_cpu_idle_dead(), then remove the noreturn function play_dead() is
an alternative way which can reduce the overhead of the function call
at the same time.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:38 +08:00
Tiezhu Yang
8879515e12 LoongArch: Add identifier names to arguments of die() declaration
Add identifier names to arguments of die() declaration in ptrace.h
to fix the following checkpatch warnings:

  WARNING: function definition argument 'const char *' should also have an identifier name
  WARNING: function definition argument 'struct pt_regs *' should also have an identifier name

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Tiezhu Yang
a038ae7148 LoongArch: Return earlier in die() if notify_die() returns NOTIFY_STOP
After the call to oops_exit(), it should not panic or execute
the crash kernel if the oops is to be suppressed.

Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Tiezhu Yang
6933c11fb5 LoongArch: Do not kill the task in die() if notify_die() returns NOTIFY_STOP
If notify_die() returns NOTIFY_STOP, honor the return value from the
handler chain invocation in die() and return without killing the task
as, through a debugger, the fault may have been fixed. It makes sense
even if ignoring the event will make the system unstable: by allowing
access through a debugger it has been compromised already anyway. It
makes our port consistent with x86, arm64, riscv and csky.

Commit 20c0d2d440 ("[PATCH] i386: pass proper trap numbers to die
chain handlers") may be the earliest of similar changes.

Link: https://lore.kernel.org/r/43DDF02E.76F0.0078.0@novell.com/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Masahiro Yamada
a746ceb1f3 LoongArch: Remove <asm/export.h>
All *.S files under arch/loongarch/ have been converted to include
<linux/export.h> instead of <asm/export.h>.

Remove <asm/export.h>.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Masahiro Yamada
55b46ff939 LoongArch: Replace #include <asm/export.h> with #include <linux/export.h>
Commit ddb5cdbafa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.

Replace #include <asm/export.h> with #include <linux/export.h>.

After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Masahiro Yamada
347aa8dec2 LoongArch: Remove unneeded #include <asm/export.h>
There is no EXPORT_SYMBOL() line there, hence #include <asm/export.h>
is unneeded.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
WANG Xuerui
3f301dc292 LoongArch: Replace -ffreestanding with finer-grained -fno-builtin's
As explained by Nick in the original issue: the kernel usually does a
good job of providing library helpers that have similar semantics as
their ordinary userspace libc equivalents, but -ffreestanding disables
such libcall optimization and other related features in the compiler,
which can lead to unexpected things such as CONFIG_FORTIFY_SOURCE not
working (!).

However, due to the desire for better control over unaligned accesses
with respect to CONFIG_ARCH_STRICT_ALIGN, and also for avoiding the
GCC bug https://gcc.gnu.org/PR109465, we do want to still disable
optimizations for the memory libcalls (memcpy, memmove and memset for
now). Use finer-grained -fno-builtin-* toggles to achieve this without
losing source fortification and other libcall optimizations.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1897
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Xi Ruoyao
b8e2771b7f LoongArch: Remove redundant "source drivers/firmware/Kconfig"
In drivers/Kconfig, drivers/firmware/Kconfig is sourced for all ports so
there is no need to source it in the port-specific Kconfig file.  And
sourcing it here also caused the "Firmware Drivers" menu appeared two
times: one in the "Device Drivers" menu, another in the toplevel menu.
This is really puzzling so remove it.

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25 23:40:26 +08:00
Linus Torvalds
beaa71d6e6 drm fixes for 6.5-rc8
core:
 - add a HPD poll helper
 
 i915:
 - fix regression in i915 polling
 - fix docs build warning
 - fix DG2 idle power consumption
 
 bridge:
 - samsung-dsim: init fix
 
 panfrost:
 - fix speed binning issue
 
 dma-buf:
 - fix recursive lock in fence signal
 
 vmwgfx:
 - fix shader stage validation
 - fix NULL ptr derefs in gem put
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmToGlAACgkQDHTzWXnE
 hr677A//effILD7P6UcIppk9NdlB5oIKI+4JSwHH8fczUuPQFQJeapGsEYEhcyvt
 c2antucbC1pNuJ4xWwasWFQFf5hQGxITpCl+0ZaxcmwRU8lRxvWA4R8eqN9gnL+g
 Do2l9ObDRN2jSfS5z0Gbn21hSlji93k5M+Xa4CgAI2bnKFNkECVxY3nUJS7eMgQd
 wlsE70gQwbaNW9cA+sg43XS/6wwy78Mn876AWFoCb7IWKfmlvq53KhQykoWuZYEv
 2dRTQmnQWeftQcADuX9fHMVVmIcETcSv1v7QmnlGAty0Pvx0UbYMgdYuIistwzgg
 HxwgWJYTlPAXnVU2+/yRGJkstqB3yUvshKsaILPwQ/Xhm+vJ6/d0tV4WocfBGjm4
 AEvzwrNWqVl8ArjvoUHpUaHxo4OD64buR4oEGl5TpaLjQsGgtSs32JcZzioFq8kk
 8ETtcS8rrzRFiy3+5bGTD6TlBm9177UdIUXv4dJUUNrPKgqHJzpCBNaewZkGKU2h
 Zp1hkFbzcEl8d2/QMzqtv6Thn873CEvekIG65U4xxDq5Q1WO6OcghV7ctt67+4Qz
 xJferKo9tmns8zuZMfyC6/JP12roN7Q3LdFB0JM0Sbw6G+pmy3Sa/ye4c/IMOM2M
 3xjCSK76f6oVgEiSkNp83Gf6hHvllM4vWzIZu3/fSRhUId3o8fY=
 =ZLyj
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2023-08-25' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "A bit bigger than I'd care for, but it's mostly a single vmwgfx fix
  and a fix for an i915 hotplug probing. Otherwise misc i915, bridge,
  panfrost and dma-buf fixes.

  core:
   - add a HPD poll helper

  i915:
   - fix regression in i915 polling
   - fix docs build warning
   - fix DG2 idle power consumption

  bridge:
   - samsung-dsim: init fix

  panfrost:
   - fix speed binning issue

  dma-buf:
   - fix recursive lock in fence signal

  vmwgfx:
   - fix shader stage validation
   - fix NULL ptr derefs in gem put"

* tag 'drm-fixes-2023-08-25' of git://anongit.freedesktop.org/drm/drm:
  drm/i915: Fix HPD polling, reenabling the output poll work as needed
  drm: Add an HPD poll helper to reschedule the poll work
  drm/vmwgfx: Fix possible invalid drm gem put calls
  drm/vmwgfx: Fix shader stage validation
  dma-buf/sw_sync: Avoid recursive lock during fence signal
  drm/i915: fix Sphinx indentation warning
  drm/i915/dgfx: Enable d3cold at s2idle
  drm/display/dp: Fix the DP DSC Receiver cap size
  drm/panfrost: Skip speed binning on EOPNOTSUPP
  drm: bridge: samsung-dsim: Fix init during host transfer
2023-08-25 08:38:40 -07:00
Takashi Iwai
37e44d60cb ASoC: Quirk for v6.5
One additional fix for v6.5, an additional quirk.  As with the other
 fixes this could wait for the merge window.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmTn0QgACgkQJNaLcl1U
 h9Al3Af9H6ZmSqK8N0KqLriCo/5dw49w32+85dUC1byVE0az7VXxzTKpzn+JPEii
 xGygWWEUvflbAOm5A8zp11GBizTr9IalnlLnnHFEIt0Zii8YfVrhe+eaRuZLFzxu
 c3rjNhRWoIIEVAnitF9cZrBk+eQC/pjLDP/1VQLaaYGlDkB3OKhbhWUMrCIaX2Um
 Y5XeBHbUnvBtKV2w2UQRoS8dCfT/OrtckmQ7I7U73PbFVtjg+TPzoYe58uuPV21n
 4GT/qpYA/Pb5walxOw2bP3JP+1MBFrK/MenzihZ/Eb4JAlzBSdfFn6uyOvJGmQL1
 nzBjKWE67xen4E64rTiDF6jYD7bSrg==
 =reaf
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v6.5-rc7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Quirk for v6.5

One additional fix for v6.5, an additional quirk.  As with the other
fixes this could wait for the merge window.
2023-08-25 09:43:49 +02:00
Linus Torvalds
4f9e7fabf8 Tracing fixes for 6.5:
- Fix ring buffer being permanently disabled due to missed record_disabled()
   Changing the trace cpu mask will disable the ring buffers for the CPUs no
   longer in the mask. But it fails to update the snapshot buffer. If a snapshot
   takes place, the accounting for the ring buffer being disabled is corrupted
   and this can lead to the ring buffer being permanently disabled.
 
 - Add test case for snapshot and cpu mask working together
 
 - Fix memleak by the function graph tracer not getting closed properly.
   The iterator is used to read the ring buffer. When it opens, it calls
   the open function of a tracer, and when it is closed, it calls the close
   iteration. While a trace is being read, it is still possible to change
   the tracer. If this happens between the function graph tracer and the
   wakeup tracer (which uses function graph tracing), the tracers are not
   closed properly during when the iterator sees the switch, and the wakeup
   function did not initialize its private pointer to NULL, which is used
   to know if the function graph tracer was the last tracer. It could be
   fooled in thinking it is, but then on exit it does not call the close
   function of the function graph tracer to clean up its data.
 
 - Fix synthetic events on big endian machines, by introducing a union
   that does the conversions properly.
 
 - Fix synthetic events from printing out the number of elements in the
   stacktrace when it shouldn't.
 
 - Fix synthetic events stacktrace to not print a bogus value at the end.
 
 - Introduce a pipe_cpumask that prevents the trace_pipe files from being
   opened by more than one task (file descriptor). There was a race found
   where if splice is called, the iter->ent could become stale and events
   could be missed. There's no point reading a producer/consumer file by
   more than one task as they will corrupt each other anyway. Add a cpumask
   that keeps track of the per_cpu trace_pipe files as well as the global
   trace_pipe file that prevents more than one open of a trace_pipe file
   that represents the same ring buffer. This prevents the race from
   happening.
 
 - Fix ftrace samples for arm64 to work with older compilers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEXtmkj8VMCiLR0IBM68Js21pW3nMFAmTn7hsUHHJvc3RlZHRA
 Z29vZG1pcy5vcmcACgkQ68Js21pW3nOSVg/9HVFTUX52yqvT9YLv8b1QZYMo2I5n
 to1nSbF9UUYIAMqkyNHuXU2TBj1I77JVkbsPEFsjYTN97CzJ/Zc8jNa6p7HgVure
 1xUuaCgtOFPDjU6OpTa6wFRt4usU1UM8Noc/ii0WDUGsA+RKBAKmGUsNmuDHx1Ae
 a3uJTLR4VHMeAkbtth/8f6RHBVocDVSYPQbKnC0PksGW1wg9e9PU2G6+o3069I1Q
 qbOZJqvGDpeaapjyG2LYrDwkVGOPSvUJuJNfcjcNcyKoHSqxZzBb+feY16NhWDm9
 idpyHLE4qF1nvlh1SIpErFl8Bu5MG9CN8a+xrx3ufd6i1jO4bcHyRD9XmjG4p72+
 zVshoS86DI4KCK9wHxJ/5/SU6XuL6JoNTP7NhmDIX83QCuZwgTOa8C2xzHKHu58F
 In13IhJqS5ob6Jy25a/bAy0CbdTl0cjQvMfXrrYK0ZWuEBWgBUDqwB/eKY6oq79D
 oTKmFNOZsuiLAhPywAoY5cAqQtftpy5Ul8o6ed3RAw3th0WC4EeXIk5eUu5g/ABI
 1ZfnpeY5al/JROFGXWxyn964RRjoVpbC1M6NVTz33e7No+r2KSRlLb5ZEWZVVIcZ
 My96QiZXamBZ/EOR7x72yFWxXBSuACu9nvbSSZRnbEGujoKwDb89xtgWzSNuR/wi
 GexDcj6qWfODrfc=
 =CI9f
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix ring buffer being permanently disabled due to missed
   record_disabled()

   Changing the trace cpu mask will disable the ring buffers for the
   CPUs no longer in the mask. But it fails to update the snapshot
   buffer. If a snapshot takes place, the accounting for the ring buffer
   being disabled is corrupted and this can lead to the ring buffer
   being permanently disabled.

 - Add test case for snapshot and cpu mask working together

 - Fix memleak by the function graph tracer not getting closed properly.

   The iterator is used to read the ring buffer. When it opens, it calls
   the open function of a tracer, and when it is closed, it calls the
   close iteration. While a trace is being read, it is still possible to
   change the tracer.

   If this happens between the function graph tracer and the wakeup
   tracer (which uses function graph tracing), the tracers are not
   closed properly during when the iterator sees the switch, and the
   wakeup function did not initialize its private pointer to NULL, which
   is used to know if the function graph tracer was the last tracer. It
   could be fooled in thinking it is, but then on exit it does not call
   the close function of the function graph tracer to clean up its data.

 - Fix synthetic events on big endian machines, by introducing a union
   that does the conversions properly.

 - Fix synthetic events from printing out the number of elements in the
   stacktrace when it shouldn't.

 - Fix synthetic events stacktrace to not print a bogus value at the
   end.

 - Introduce a pipe_cpumask that prevents the trace_pipe files from
   being opened by more than one task (file descriptor).

   There was a race found where if splice is called, the iter->ent could
   become stale and events could be missed. There's no point reading a
   producer/consumer file by more than one task as they will corrupt
   each other anyway. Add a cpumask that keeps track of the per_cpu
   trace_pipe files as well as the global trace_pipe file that prevents
   more than one open of a trace_pipe file that represents the same ring
   buffer. This prevents the race from happening.

 - Fix ftrace samples for arm64 to work with older compilers.

* tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  samples: ftrace: Replace bti assembly with hint for older compiler
  tracing: Introduce pipe_cpumask to avoid race on trace_pipes
  tracing: Fix memleak due to race between current_tracer and trace
  tracing/synthetic: Allocate one additional element for size
  tracing/synthetic: Skip first entry for stack traces
  tracing/synthetic: Use union instead of casts
  selftests/ftrace: Add a basic testcase for snapshot
  tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
2023-08-24 19:39:20 -07:00
Zhu Wang
1bd3a76880 scsi: snic: Fix double free in snic_tgt_create()
Commit 41320b18a0 ("scsi: snic: Fix possible memory leak if device_add()
fails") fixed the memory leak caused by dev_set_name() when device_add()
failed. However, it did not consider that 'tgt' has already been released
when put_device(&tgt->dev) is called. Remove kfree(tgt) in the error path
to avoid double free of 'tgt' and move put_device(&tgt->dev) after the
removed kfree(tgt) to avoid a use-after-free.

Fixes: 41320b18a0 ("scsi: snic: Fix possible memory leak if device_add() fails")
Signed-off-by: Zhu Wang <wangzhu9@huawei.com>
Link: https://lore.kernel.org/r/20230819083941.164365-1-wangzhu9@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-24 22:30:32 -04:00
Linus Torvalds
14ddccc8a6 media fixes for v6.5-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmTn7TIACgkQCF8+vY7k
 4RUDzw/7BgvrQGGQ3S1kGN1UCmburYxZCTeKVBLgJCwQtKzZ68SGL0ZSw1EzSk/9
 TfmNMt9dSvnT7KM2BIMyKJWMfEWSjvv8Osvzf8Pd7IOMm6R4uY6wZ10La2u73Khg
 v/z9Dqe7lBOfqvNsIFKeK8XOLNz5l85deS3x+6hGOVTknPkT6zzeGK1AbRXbkWaL
 f0fOAW+w2cwt8N8Gfx+bHkGlg13/4Wij3/v22JFg/v8rgLhl3838gO1+RXboHT4f
 SvwXd7t6jkkpKtsjSnf2g2zDTPW7LuMYGh3FHlcpy+PkmBGyRhPUg+NsMrKXeQnD
 g6vmRNqvbmSnEHIrJiPSCWCc3pRRbul6bCt2RVZ2m3298y53CB8DZ3e3BsQrZmVe
 l4/bq0tsF4SNdNO0XpqKmamNd6uEYVkx2TLXJi2koPQWzCR8ddpUpBrogFjl3M8A
 gAwf/hnNV7pO8J1AaMOKv1Ef/O9g1TyZ2DMjavcAUjwBomNch56e5fSDzmMAwHzE
 0bCankzhQe6sDOIIqGb51mvn445ni4oVovuuaoG1Sx4PhRB23njGhap1/qPVyiAr
 ZtHp3eY5z0pKpaa8t4CGtprrqY2V5UxXb/Hwr/FuSQFdJtCDzwZau3bJsek5LY2+
 Fl+gVUbgZvt0G1SFeMZAPRNEWUDuiqo67KTOmGmlQZY7LJgPuH8=
 =7QYn
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
 "Fix a potential array out-of-bounds in the mediatek vcodec driver"

* tag 'media/v6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: vcodec: Fix potential array out-of-bounds in encoder queue_setup
2023-08-24 19:10:53 -07:00
Zhu Wang
60c5fd2e8f scsi: core: raid_class: Remove raid_component_add()
The raid_component_add() function was added to the kernel tree via patch
"[SCSI] embryonic RAID class" (2005). Remove this function since it never
has had any callers in the Linux kernel. And also raid_component_release()
is only used in raid_component_add(), so it is also removed.

Signed-off-by: Zhu Wang <wangzhu9@huawei.com>
Link: https://lore.kernel.org/r/20230822015254.184270-1-wangzhu9@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Fixes: 04b5b5cb01 ("scsi: core: Fix possible memory leak if device_add() fails")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-24 21:34:28 -04:00
Dave Airlie
59fe2029b9 - Fix power consumption at s2idle on DG2 (Anshuman)
- Fix documentation build warning (Jani)
 - Fix Display HPD (Imre)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmTnTrYACgkQ+mJfZA7r
 E8rxgQf/YgB/haPZF+eESWUjwp/TRW1NphcQEv3VtQhDayJZofQrL6ouZu+SSKyB
 yrAcQGQwJ831T2l7ms3LIzLvx7eDW5o1Q7fW1dTIhv6LHyI7rFufLXjb2Pv8cvdt
 zGBzPfcmmxVBpe3gDvJxbtP950yzRcwWgK1YVPpenlAs86R608THuq27Z6tW0Ztv
 in7IlN7FMB56ReepHOVYq+QfnnS7Goah64RLav7ioi/8xLY3SFLezEzGfRcz9ylM
 S5Qz+FWMqf0HhJGYtLyGKnUcLiWjH3Oz/mTwMEC0ChNfTirxZQ4P0BY6P2tgU0Ib
 h6u1DKOXWDaptcnGdFYqvXKKgSkihQ==
 =hKF2
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-fixes-2023-08-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix power consumption at s2idle on DG2 (Anshuman)
- Fix documentation build warning (Jani)
- Fix Display HPD (Imre)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZOdPRFSJpo0ErPX/@intel.com
2023-08-25 09:12:02 +10:00
Hugh Dickins
e5548f85b4 shmem: fix smaps BUG sleeping while atomic
smaps_pte_hole_lookup() is calling shmem_partial_swap_usage() with page
table lock held: but shmem_partial_swap_usage() does cond_resched_rcu() if
need_resched(): "BUG: sleeping function called from invalid context".

Since shmem_partial_swap_usage() is designed to count across a range, but
smaps_pte_hole_lookup() only calls it for a single page slot, just break
out of the loop on the last or only page, before checking need_resched().

Link: https://lkml.kernel.org/r/6fe3b3ec-abdf-332f-5c23-6a3b3a3b11a9@google.com
Fixes: 2301003215 ("mm/smaps: simplify shmem handling of pte holes")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>	[5.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Andre Przywara
f84f62e699 selftests: cachestat: catch failing fsync test on tmpfs
The cachestat kselftest runs a test on a normal file, which is created
temporarily in the current directory.  Among the tests it runs there is a
call to fsync(), which is expected to clean all dirty pages used by the
file.

However the tmpfs filesystem implements fsync() as noop_fsync(), so the
call will not even attempt to clean anything when this test file happens
to live on a tmpfs instance.  This happens in an initramfs, or when the
current directory is in /dev/shm or sometimes /tmp.

To avoid this test failing wrongly, use statfs() to check which filesystem
the test file lives on.  If that is "tmpfs", we skip the fsync() test.

Since the fsync test is only one part of the "normal file" test, we now
execute this twice, skipping the fsync part on the first call.  This way
only the second test, including the fsync part, would be skipped.

Link: https://lkml.kernel.org/r/20230821160534.3414911-3-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Andre Przywara
5e56982dd0 selftests: cachestat: test for cachestat availability
Patch series "selftests: cachestat: fix run on older kernels", v2.

I ran all kernel selftests on some test machine, and stumbled upon
cachestat failing (among others).  These patches fix the run on older
kernels and when the current directory is on a tmpfs instance.


This patch (of 2):

As cachestat is a new syscall, it won't be available on older kernels, for
instance those running on a development machine.  At the moment the test
reports all tests as "not ok" in this case.

Test for the cachestat syscall availability first, before doing further
tests, and bail out early with a TAP SKIP comment.

This also uses the opportunity to add the proper TAP headers, and add one
check for proper error handling (illegal file descriptor).

Link: https://lkml.kernel.org/r/20230821160534.3414911-1-andre.przywara@arm.com
Link: https://lkml.kernel.org/r/20230821160534.3414911-2-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Liam R. Howlett
cfeb6ae8bc maple_tree: disable mas_wr_append() when other readers are possible
The current implementation of append may cause duplicate data and/or
incorrect ranges to be returned to a reader during an update.  Although
this has not been reported or seen, disable the append write operation
while the tree is in rcu mode out of an abundance of caution.

During the analysis of the mas_next_slot() the following was
artificially created by separating the writer and reader code:

Writer:                                 reader:
mas_wr_append
    set end pivot
    updates end metata
    Detects write to last slot
    last slot write is to start of slot
    store current contents in slot
    overwrite old end pivot
                                        mas_next_slot():
                                                read end metadata
                                                read old end pivot
                                                return with incorrect range
    store new value

Alternatively:

Writer:                                 reader:
mas_wr_append
    set end pivot
    updates end metata
    Detects write to last slot
    last lost write to end of slot
    store value
                                        mas_next_slot():
                                                read end metadata
                                                read old end pivot
                                                read new end pivot
                                                return with incorrect range
    set old end pivot

There may be other accesses that are not safe since we are now updating
both metadata and pointers, so disabling append if there could be rcu
readers is the safest action.

Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Yin Fengwei
0e0e9bd5f7 madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
Commit 98b211d641 ("madvise: convert madvise_free_pte_range() to use a
folio") replaced the page_mapcount() with folio_mapcount() to check
whether the folio is shared by other mapping.

It's not correct for large folios. folio_mapcount() returns the total
mapcount of large folio which is not suitable to detect whether the folio
is shared.

Use folio_estimated_sharers() which returns a estimated number of shares.
That means it's not 100% correct. It should be OK for madvise case here.

User-visible effects is that the THP is skipped when user call madvise.
But the correct behavior is THP should be split and processed then.

NOTE: this change is a temporary fix to reduce the user-visible effects
before the long term fix from David is ready.

Link: https://lkml.kernel.org/r/20230808020917.2230692-4-fengwei.yin@intel.com
Fixes: 98b211d641 ("madvise: convert madvise_free_pte_range() to use a folio")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:46 -07:00
Yin Fengwei
20b18aada1 madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
Commit fc986a38b6 ("mm: huge_memory: convert madvise_free_huge_pmd to
use a folio") replaced the page_mapcount() with folio_mapcount() to check
whether the folio is shared by other mapping.

It's not correct for large folios. folio_mapcount() returns the total
mapcount of large folio which is not suitable to detect whether the folio
is shared.

Use folio_estimated_sharers() which returns a estimated number of shares.
That means it's not 100% correct. It should be OK for madvise case here.

User-visible effects is that the THP is skipped when user call madvise.
But the correct behavior is THP should be split and processed then.

NOTE: this change is a temporary fix to reduce the user-visible effects
before the long term fix from David is ready.

Link: https://lkml.kernel.org/r/20230808020917.2230692-3-fengwei.yin@intel.com
Fixes: fc986a38b6 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:46 -07:00