A previous commit moved the notifications and end-write handling, but
it is now missing a few spots where we also want to call both of those.
Without that, we can potentially be missing file notifications, and
more importantly, have an imbalance in the super_block writers sem
accounting.
Fixes: b000145e99 ("io_uring/rw: defer fsnotify calls to task context")
Reported-by: Dave Chinner <david@fromorbit.com>
Link: https://lore.kernel.org/all/20221010050319.GC2703033@dread.disaster.area/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This fixes the shadowing of the outer variable rw in the function
io_write(). No issue is caused by this, but let's silence the shadowing
warning anyway.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Link: https://lore.kernel.org/r/20221010234330.244244-1-shr@devkernel.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The msg variants of sending aren't audited separately, so we should not
be setting audit_skip for the zerocopy sendmsg variant either.
Fixes: 493108d95f ("io_uring/net: zerocopy sendmsg")
Reported-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Running local task_work requires taking uring_lock, for submit + wait we
can try to run them right after submit while we still hold the lock and
save one lock/unlokc pair. The optimisation was implemented in the first
local tw patches but got dropped for simplicity.
Suggested-by: Dylan Yudaken <dylany@fb.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/281fc79d98b5d91fe4778c5137a17a2ab4693e5c.1665088876.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_cqring_wake() needs a barrier for the waitqueue_active() check.
However, in the case of io_req_local_work_add(), we call llist_add()
first, which implies an atomic. Hence we can replace smb_mb() with
smp_mb__after_atomic().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/43983bc8bc507172adda7a0f00cab1aff09fd238.1665018309.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We treat EINPROGRESS like EAGAIN, but if we're retrying post getting
EINPROGRESS, then we just need to check the socket for errors and
terminate the request.
This was exposed on a bluetooth connection request which ends up
taking a while and hitting EINPROGRESS, and yields a CQE result of
-EBADFD because we're retrying a connect on a socket that is now
connected.
Cc: stable@vger.kernel.org
Fixes: 87f80d623c ("io_uring: handle connect -EINPROGRESS like -EAGAIN")
Link: https://github.com/axboe/liburing/issues/671
Reported-by: Aidan Sun <aidansun05@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of putting io_uring's registered files in unix_gc() we want it
to be done by io_uring itself. The trick here is to consider io_uring
registered files for cycle detection but not actually putting them down.
Because io_uring can't register other ring instances, this will remove
all refs to the ring file triggering the ->release path and clean up
with io_ring_ctx_free().
Cc: stable@vger.kernel.org
Fixes: 6b06314c47 ("io_uring: add file set registration")
Reported-and-tested-by: David Bouman <dbouman03@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
[axboe: add kerneldoc comment to skb, fold in skb leak fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This second KUnit update for Linux 6.1-rc1 consists of features and
fixes:
- simplifying resource use.
- make kunit_malloc() and kunit_free() allocations and frees consistent.
kunit_free() frees only the memory allocated by kunit_malloc().
- stop downloading risc-v opensbi binaries using wget.
- other fixes and improvements to tool and KUnit framework.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmNG/4EACgkQCwJExA0N
QxxznhAAqtXbCYxIxerdiAHwYifnsrLcCMm/Ol2yuFJhmTn6sZh7w4S8bRBt0RlX
+1IfqtzOi1K1fTpmWQqnq0/fH8gNZrhZHHqXxx3c353pG0BfrC3vODx1VzxuPCMi
nr/OHqAQ0VSTuxgWxsIr0SuhOM4LFDjhBcLDoCDoBF5aQSJricpa++ixiYsVgaUt
nG+E1i7I/hvEYwqqUqtJLp9fOD6LK2IeiOP4oH2PwYBIpFO+BXwk0Gbs/ISL+fRP
F8pph2Qm2jxCJ4kRDvs/N41mkIvG9PwC1h7fW4vDXix0zryJdh0TbilFQFFwiuW3
S8kFE1tarMBWyqEZU/2cln9MFdZpxXAWtJu1/B8dqOvLA06mBOaNbB4tOXzfyriE
QBOnEJNqgT0wqnwWONvrljz7L+YaFAkJAGxbub1cGIUa/t5HHs0WX5XncctGfsaE
Ec6bLOXMgemb3dm35fDpBHyN6np9K5BMmz8Ggv02+V8FH8nrXAzblOW/CN8KgXiG
R5+1vd3SxaLq7npal4S88LmNRoJCVCSWnNPItBTgWFXy6Ni2T5WEoi6rSdqJNX+/
bpPM4G47IO5BH0YEbl9IPvKLfDGczVB4TVLpIt61QST4rf+puUhysr76ZweqoU6f
sOyEenr3YZ7C3EpSbcAztzgyPomPAacR/lNbG5lezcEPRSo184I=
=FgDN
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull more KUnit updates from Shuah Khan:
"Features and fixes:
- simplify resource use
- make kunit_malloc() and kunit_free() allocations and frees
consistent. kunit_free() frees only the memory allocated by
kunit_malloc()
- stop downloading risc-v opensbi binaries using wget
- other fixes and improvements to tool and KUnit framework"
* tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Documentation: kunit: Update description of --alltests option
kunit: declare kunit_assert structs as const
kunit: rename base KUNIT_ASSERTION macro to _KUNIT_FAILED
kunit: remove format func from struct kunit_assert, get it to 0 bytes
kunit: tool: Don't download risc-v opensbi firmware with wget
kunit: make kunit_kfree(NULL) a no-op to match kfree()
kunit: make kunit_kfree() not segfault on invalid inputs
kunit: make kunit_kfree() only work on pointers from kunit_malloc() and friends
kunit: drop test pointer in string_stream_fragment
kunit: string-stream: Simplify resource use
This second Kselftest update for Linux 6.1-rc1 consists of fixes
and improvements to memory-hotplug test and a minor spelling fix
to ftrace test.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmNG4hgACgkQCwJExA0N
QxwkPBAAyPd0ZUHlF7JjzdV2obHDGxbjMzi0x8Di8md4B24gE0PvGY79E7eM/uKd
pBsop5cnvwGBZGuoBM0E/1J7UB/Lgedl2iYDUFXQe8JoPlOgvmBMbJCdZ3Zv8gxp
sk5yIrLakgyp2WZng0QyQwZQY4nvq8Lf/f50T8/3+g8OBqF+xTo60DyEpsaDNHS4
3SddH8/jJ6TkG/5lRoEOlfYFrhCDuxq1e8R0jts1vgnpdhpSD9JZPr26VNGVcygB
dkp4icsQFWAaZjNO6+7scgp1yfxBFJ2Fh/gDdfWqEAYvZtvnnr2XhwlYK+O7JZRp
DuglF4Lo/AN3betWuAz4rWyqAYoBZxrUTxrsIVyzb3FqpRAlR32YPFfMo6iWYYn4
638E6cYvkNbbbhCEEgHJJiFZzUB/xbLR/Y8gD4Que/Y+Ck7+zuvQMzZWHQNJfsGx
OhhfUcJlw/VzRpdZx1UToT++DqOqJLBL7DVMATbiXd2rDGKbnEw2pKkeuURXVged
1nis9odge5yY42Q5I3doyPHO7rENOAP2wmlKvJqFDKZFoD23MsGv/m6gpg/HaS1Q
T27L1hHFXPrAZ14MxGva1DTVTPU8D/ciHcqjCWWmnHp8M359JvjOsDL7PyMUcGlm
bVSqcciy71utp1XaFfaF7kT1bwnNeGgGqs0EXcmj4xE8CyOtat8=
=GXpd
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-next-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull more Kselftest updates from Shuah Khan:
"This consists of fixes and improvements to memory-hotplug test and a
minor spelling fix to ftrace test"
* tag 'linux-kselftest-next-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
docs: notifier-error-inject: Correct test's name
selftests/memory-hotplug: Adjust log info for maintainability
selftests/memory-hotplug: Restore memory before exit
selftests/memory-hotplug: Add checking after online or offline
selftests/ftrace: func_event_triggers: fix typo in user message
- Prune private items from vfio_pci_core.h to a new internal header,
fix missed function rename, and refactor vfio-pci interrupt defines.
(Jason Gunthorpe)
- Create consistent naming and handling of ioctls with a function per
ioctl for vfio-pci and vfio group handling, use proper type args
where available. (Jason Gunthorpe)
- Implement a set of low power device feature ioctls allowing userspace
to make use of power states such as D3cold where supported.
(Abhishek Sahu)
- Remove device counter on vfio groups, which had restricted the page
pinning interface to singleton groups to account for limitations in
the type1 IOMMU backend. Document usage as limited to emulated IOMMU
devices, ie. traditional mdev devices where this restriction is
consistent. (Jason Gunthorpe)
- Correct function prefix in hisi_acc driver incurred during previous
refactoring. (Shameer Kolothum)
- Correct typo and remove redundant warning triggers in vfio-fsl driver.
(Christophe JAILLET)
- Introduce device level DMA dirty tracking uAPI and implementation in
the mlx5 variant driver (Yishai Hadas & Joao Martins)
- Move much of the vfio_device life cycle management into vfio core,
simplifying and avoiding duplication across drivers. This also
facilitates adding a struct device to vfio_device which begins the
introduction of device rather than group level user support and fills
a gap allowing userspace identify devices as vfio capable without
implicit knowledge of the driver. (Kevin Tian & Yi Liu)
- Split vfio container handling to a separate file, creating a more
well defined API between the core and container code, masking IOMMU
backend implementation from the core, allowing for an easier future
transition to an iommufd based implementation of the same.
(Jason Gunthorpe)
- Attempt to resolve race accessing the iommu_group for a device
between vfio releasing DMA ownership and removal of the device from
the IOMMU driver. Follow-up with support to allow vfio_group to
exist with NULL iommu_group pointer to support existing userspace
use cases of holding the group file open. (Jason Gunthorpe)
- Fix error code and hi/lo register manipulation issues in the hisi_acc
variant driver, along with various code cleanups. (Longfang Liu)
- Fix a prior regression in GVT-g group teardown, resulting in
unreleased resources. (Jason Gunthorpe)
- A significant cleanup and simplification of the mdev interface,
consolidating much of the open coded per driver sysfs interface
support into the mdev core. (Christoph Hellwig)
- Simplification of tracking and locking around vfio_groups that
fall out from previous refactoring. (Jason Gunthorpe)
- Replace trivial open coded f_ops tests with new helper.
(Alex Williamson)
-----BEGIN PGP SIGNATURE-----
iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmNGz2AbHGFsZXgud2ls
bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiatYQAI+7bFjVsTKwCnWUhp/A
WnFmLpnh/OsBIYiXRbXGZBgIO4iPmMyFkxqjnv6e8H1WnKhLbuPy/xCaAvPrtI8b
YKCpzdrDnfrPfB4+0cyGLJx15Jqd3sOZy097kl2lQJTscELTjJxTl0uB/Fbf/s38
t1K2nIhBm+sGK3rTf3JjY4Jc7vDbwX7HQt6rUVEbd3NoyLJV1T/HdeSgwSMdyiED
WwkRZ0z/vU0hEDk5wk1ZyltkiUzdCSws3C8T0J39xRObPLHR1vYgKO8aeZhfQb4p
luD1fzGRMt3JinSXCPPm5HfADXq2Rozx7Y7a454fvCa7lpX4MNAgaQdfIzI64lZj
cMgSYAIskVq4vxCkO4bKec4FYrzJoxBMJwiXZvOZ4mF5SL4UIDwerMqQTA3fvtQ+
puS6x+/DF9XXHrEewEX7teg6QYPQueneSS+fWeFpMGzDXSjdQB6qV+rMWS297t+4
1KyITxkOxcZQ4+j1OLPGtxsRLKtWApawoNTpRMlaD+hSExxHLbUmKexOLXzuAoVP
nhbjud+jzEbpCnwps24Og/iEBdRYJcl2KwEeSRPI856YRDrNa9jPtiDlsAtKZOK2
gJnOixSss6R+wgVVYIyMDZ8tsvO+UDQruvqQ2kFku1FOlO86pvwD6UUVuTVosdNc
fktw6Dx90N3fdb/o8jjAjssx
=Z8+P
-----END PGP SIGNATURE-----
Merge tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Prune private items from vfio_pci_core.h to a new internal header,
fix missed function rename, and refactor vfio-pci interrupt defines
(Jason Gunthorpe)
- Create consistent naming and handling of ioctls with a function per
ioctl for vfio-pci and vfio group handling, use proper type args
where available (Jason Gunthorpe)
- Implement a set of low power device feature ioctls allowing userspace
to make use of power states such as D3cold where supported (Abhishek
Sahu)
- Remove device counter on vfio groups, which had restricted the page
pinning interface to singleton groups to account for limitations in
the type1 IOMMU backend. Document usage as limited to emulated IOMMU
devices, ie. traditional mdev devices where this restriction is
consistent (Jason Gunthorpe)
- Correct function prefix in hisi_acc driver incurred during previous
refactoring (Shameer Kolothum)
- Correct typo and remove redundant warning triggers in vfio-fsl driver
(Christophe JAILLET)
- Introduce device level DMA dirty tracking uAPI and implementation in
the mlx5 variant driver (Yishai Hadas & Joao Martins)
- Move much of the vfio_device life cycle management into vfio core,
simplifying and avoiding duplication across drivers. This also
facilitates adding a struct device to vfio_device which begins the
introduction of device rather than group level user support and fills
a gap allowing userspace identify devices as vfio capable without
implicit knowledge of the driver (Kevin Tian & Yi Liu)
- Split vfio container handling to a separate file, creating a more
well defined API between the core and container code, masking IOMMU
backend implementation from the core, allowing for an easier future
transition to an iommufd based implementation of the same (Jason
Gunthorpe)
- Attempt to resolve race accessing the iommu_group for a device
between vfio releasing DMA ownership and removal of the device from
the IOMMU driver. Follow-up with support to allow vfio_group to exist
with NULL iommu_group pointer to support existing userspace use cases
of holding the group file open (Jason Gunthorpe)
- Fix error code and hi/lo register manipulation issues in the hisi_acc
variant driver, along with various code cleanups (Longfang Liu)
- Fix a prior regression in GVT-g group teardown, resulting in
unreleased resources (Jason Gunthorpe)
- A significant cleanup and simplification of the mdev interface,
consolidating much of the open coded per driver sysfs interface
support into the mdev core (Christoph Hellwig)
- Simplification of tracking and locking around vfio_groups that fall
out from previous refactoring (Jason Gunthorpe)
- Replace trivial open coded f_ops tests with new helper (Alex
Williamson)
* tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio: (77 commits)
vfio: More vfio_file_is_group() use cases
vfio: Make the group FD disassociate from the iommu_group
vfio: Hold a reference to the iommu_group in kvm for SPAPR
vfio: Add vfio_file_is_group()
vfio: Change vfio_group->group_rwsem to a mutex
vfio: Remove the vfio_group->users and users_comp
vfio/mdev: add mdev available instance checking to the core
vfio/mdev: consolidate all the description sysfs into the core code
vfio/mdev: consolidate all the available_instance sysfs into the core code
vfio/mdev: consolidate all the name sysfs into the core code
vfio/mdev: consolidate all the device_api sysfs into the core code
vfio/mdev: remove mtype_get_parent_dev
vfio/mdev: remove mdev_parent_dev
vfio/mdev: unexport mdev_bus_type
vfio/mdev: remove mdev_from_dev
vfio/mdev: simplify mdev_type handling
vfio/mdev: embedd struct mdev_parent in the parent data structure
vfio/mdev: make mdev.h standalone includable
drm/i915/gvt: simplify vgpu configuration management
drm/i915/gvt: fix a memory leak in intel_gvt_init_vgpu_types
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY0ZjFAAKCRCAXGG7T9hj
vjEsAP4rFMnqc6AXy4Mpvv8cxBtEuQZbwEqgBrMJUvK1jZQrBQD/dOJK2GBCVcfD
2yaVlefFiJGTw5WUlbPeohUlTZ8pJwg=
=xsHV
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- Some minor typo fixes
- A fix of the Xen pcifront driver for supporting the device model to
run in a Linux stub domain
- A cleanup of the pcifront driver
- A series to enable grant-based virtio with Xen on x86
- A cleanup of Xen PV guests to distinguish between safe and faulting
MSR accesses
- Two fixes of the Xen gntdev driver
- Two fixes of the new xen grant DMA driver
* tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum"
xen/pv: support selecting safe/unsafe msr accesses
xen/pv: refactor msr access functions to support safe and unsafe accesses
xen/pv: fix vendor checks for pmu emulation
xen/pv: add fault recovery control to pmu msr accesses
xen/virtio: enable grant based virtio on x86
xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT
xen/virtio: restructure xen grant dma setup
xen/pcifront: move xenstore config scanning into sub-function
xen/gntdev: Accommodate VMA splitting
xen/gntdev: Prevent leaking grants
xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices
xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page()
xen/xenbus: Fix spelling mistake "hardward" -> "hardware"
xen-pcifront: Handle missed Connected state
not.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0YhtwAKCRDdBJ7gKXxA
juJLAQDCa0g8sfe9cTw3PT1gRnn8gWLHEkMgUWVC/aBaqYFGeQEAta+g8muv9Tpd
qODv0JARH4cwONKEA24Oql+A5RnI6gQ=
=QZnW
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"Five hotfixes - three for nilfs2, two for MM. For are cc:stable, one
is not"
* tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: fix leak of nilfs_root in case of writer thread creation failure
nilfs2: fix NULL pointer dereference at nilfs_bmap_lookup_at_level()
nilfs2: fix use-after-free bug of struct nilfs_root
mm/damon/core: initialize damon_target->list in damon_new_target()
mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
- Valentin Schneider makes crash-kexec work properly when invoked from
an NMI-time panic.
- ntfs bugfixes from Hawkins Jiawei
- Jiebin Sun improves IPC msg scalability by replacing atomic_t's with
percpu counters.
- nilfs2 cleanups from Minghao Chi
- lots of other single patches all over the tree!
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0Yf0gAKCRDdBJ7gKXxA
joapAQDT1d1zu7T8yf9cQXkYnZVuBKCjxKE/IsYvqaq1a42MjQD/SeWZg0wV05B8
DhJPj9nkEp6R3Rj3Mssip+3vNuceAQM=
=lUQY
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- hfs and hfsplus kmap API modernization (Fabio Francesco)
- make crash-kexec work properly when invoked from an NMI-time panic
(Valentin Schneider)
- ntfs bugfixes (Hawkins Jiawei)
- improve IPC msg scalability by replacing atomic_t's with percpu
counters (Jiebin Sun)
- nilfs2 cleanups (Minghao Chi)
- lots of other single patches all over the tree!
* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
proc: test how it holds up with mapping'less process
mailmap: update Frank Rowand email address
ia64: mca: use strscpy() is more robust and safer
init/Kconfig: fix unmet direct dependencies
ia64: update config files
nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
fork: remove duplicate included header files
init/main.c: remove unnecessary (void*) conversions
proc: mark more files as permanent
nilfs2: remove the unneeded result variable
nilfs2: delete unnecessary checks before brelse()
checkpatch: warn for non-standard fixes tag style
usr/gen_init_cpio.c: remove unnecessary -1 values from int file
ipc/msg: mitigate the lock contention with percpu counter
percpu: add percpu_counter_add_local and percpu_counter_sub_local
fs/ocfs2: fix repeated words in comments
relay: use kvcalloc to alloc page array in relay_alloc_page_array
proc: make config PROC_CHILDREN depend on PROC_FS
fs: uninline inode_maybe_inc_iversion()
...
The follow commands caused a crash:
# cd /sys/kernel/tracing
# echo 's:open char file[]' > dynamic_events
# echo 'hist:keys=common_pid:file=filename:onchange($file).trace(open,$file)' > events/syscalls/sys_enter_openat/trigger'
# echo 1 > events/synthetic/open/enable
BOOM!
The problem is that the synthetic event field "char file[]" will read
the value given to it as a string without any memory checks to make sure
the address is valid. The above example will pass in the user space
address and the sythetic event code will happily call strlen() on it
and then strscpy() where either one will cause an oops when accessing
user space addresses.
Use the helper functions from trace_kprobe and trace_eprobe that can
read strings safely (and actually succeed when the address is from user
space and the memory is mapped in).
Now the above can show:
packagekitd-1721 [000] ...2. 104.597170: open: file=/usr/lib/rpm/fileattrs/cmake.attr
in:imjournal-978 [006] ...2. 104.599642: open: file=/var/lib/rsyslog/imjournal.state.tmp
packagekitd-1721 [000] ...2. 104.626308: open: file=/usr/lib/rpm/fileattrs/debuginfo.attr
Link: https://lkml.kernel.org/r/20221012104534.826549315@goodmis.org
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Fixes: bd82631d7c ("tracing: Add support for dynamic strings to synthetic events")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Have the specific functions for kernel probes that read strings to inject
the "(fault)" name directly. trace_probes.c does this too (for uprobes)
but as the code to read strings are going to be used by synthetic events
(and perhaps other utilities), it simplifies the code by making sure those
other uses do not need to implement the "(fault)" name injection as well.
Link: https://lkml.kernel.org/r/20221012104534.644803645@goodmis.org
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Fixes: bd82631d7c ("tracing: Add support for dynamic strings to synthetic events")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The functions:
fetch_store_strlen_user()
fetch_store_strlen()
fetch_store_string_user()
fetch_store_string()
are identical in both trace_kprobe.c and trace_eprobe.c. Move them into
a new header file trace_probe_kernel.h to share it. This code will later
be used by the synthetic events as well.
Marked for stable as a fix for a crash in synthetic events requires it.
Link: https://lkml.kernel.org/r/20221012104534.467668078@goodmis.org
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Fixes: bd82631d7c ("tracing: Add support for dynamic strings to synthetic events")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
- Core code:
- Provide a generic wrapper which can be utilized in drivers to handle
the problem of force threaded demultiplex interrupts on RT enabled
kernels. This avoids conditionals and horrible quirks in drivers all
over the place.
- Fix up affected pinctrl and GPIO drivers to make them cleanly RT safe.
- Interrupt drivers:
- A new driver for the FSL MU platform specific MSI implementation.
- Make irqchip_init() available for pure ACPI based systems.
- Provide a functional DT binding for the Realtek RTL interrupt chip.
- The usual DT updates and small code improvements all over the place.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmNGxRYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWJyD/0emJAlIuD0DzkEkoAtnHSq7eyGFMpI
PFMyZ0IYXlVWuxEmQMyd7E9M+fmlRqnnhErg6x7jPW1bKzoyIn1A7eNE/cvhXPru
BiTy6g2o7pNegUh5bQrE8p0Yyq6/HsVO4YyE3RGxpUQVh/qwB+RKnzUY6RfDj87z
naQx10+15b+76SXvTQpIrvQTWhfTswk9un2MYDkjHctfVgjcnb/8dTPQuXsZrdTQ
VBWWwjLpCKcqqQS1e9MQqmQKpVqGs/DGW8XNTPk3jI4QF1fIHjhNdcoI51/lM4Ri
r912FPE8R48FS9g0dQgpMxGmHjikYpf3rXXosn8uyWkt5zNy6CXOEEg3DRIoAIdg
czKve+bgZZXUK/QcSSdPuPthBoLKQCG5MZsVFNF8IArmPCHaiYcOQBe7pel3U4cc
MpQe9yUXJI40XgwTAyAOlidjmD69384nEhzbI5d/AfJI5ssdXcBMrFN/xEeBDWdz
Dg2+Yle9HNglxBA6E3GX3yiaCQJxHFhKMnqd1zhxWjXFRzkfGF7bBpRj1j+vXnzN
ap/wMQuMlOWriWsH3UkZtFrC4PvgByGVfzlzYA076CjutyYfQolQ8k0bLHnp2VSu
VWUn4WATfaxJcqij7vyI9BYtFXdrB/yYhFasDBepQbDgiy8WEAmX+bObvXWs9XYa
UGVCNGsYx2TKMA==
=2ok5
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt updates from Thomas Gleixner:
"Core code:
- Provide a generic wrapper which can be utilized in drivers to
handle the problem of force threaded demultiplex interrupts on RT
enabled kernels. This avoids conditionals and horrible quirks in
drivers all over the place
- Fix up affected pinctrl and GPIO drivers to make them cleanly RT
safe
Interrupt drivers:
- A new driver for the FSL MU platform specific MSI implementation
- Make irqchip_init() available for pure ACPI based systems
- Provide a functional DT binding for the Realtek RTL interrupt chip
- The usual DT updates and small code improvements all over the
place"
* tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
irqchip: IMX_MU_MSI should depend on ARCH_MXC
irqchip/imx-mu-msi: Fix wrong register offset for 8ulp
irqchip/ls-extirq: Fix invalid wait context by avoiding to use regmap
dt-bindings: irqchip: Describe the IMX MU block as a MSI controller
irqchip: Add IMX MU MSI controller driver
dt-bindings: irqchip: renesas,irqc: Add r8a779g0 support
irqchip/gic-v3: Fix typo in comment
dt-bindings: interrupt-controller: ti,sci-intr: Fix missing reg property in the binding
dt-bindings: irqchip: ti,sci-inta: Fix warning for missing #interrupt-cells
irqchip: Allow extra fields to be passed to IRQCHIP_PLATFORM_DRIVER_END
platform-msi: Export symbol platform_msi_create_irq_domain()
irqchip/realtek-rtl: use parent interrupts
dt-bindings: interrupt-controller: realtek,rtl-intc: require parents
irqchip/realtek-rtl: use irq_domain_add_linear()
irqchip: Make irqchip_init() usable on pure ACPI systems
bcma: gpio: Use generic_handle_irq_safe()
gpio: mlxbf2: Use generic_handle_irq_safe()
platform/x86: intel_int0002_vgpio: Use generic_handle_irq_safe()
ssb: gpio: Use generic_handle_irq_safe()
pinctrl: amd: Use generic_handle_irq_safe()
...
Fix the interrupt order of the charger in the binding example.
Fixes: 76f52f815f1a ("dt-bindings: mfd: Add MediaTek MT6370")
Signed-off-by: ChiaEn Wu <chiaen_wu@richtek.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/fcf4e7e7594070a8698dc0d4b96e031bcaa9b3a3.1665585952.git.chiaen_wu@richtek.com
Signed-off-by: Rob Herring <robh@kernel.org>
Add '$ref' and 'unevaluatedProperties: false' in 'multi-led', and remove
unused 'allOf' property.
Fixes: 440c57dabb ("dt-bindings: leds: mt6370: Add MediaTek MT6370 current sink type LED indicator")
Signed-off-by: ChiaEn Wu <chiaen_wu@richtek.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/435f6888ebc20c5abae63eb9cb3a055b60db2ed1.1665050503.git.chiaen_wu@richtek.com
Signed-off-by: Rob Herring <robh@kernel.org>
- add NVME_QUIRK_BOGUS_NID for Lexar NM760 (Abhijit)
- avoid the deepest sleep state on ZHITAI TiPro5000 SSDs (Xi Ruoyao)
- fix possible hang caused during ctrl deletion (Sagi Grimberg)
- fix possible hang in live ns resize with ANA access (Sagi Grimberg)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmNGqA0LHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPHwg/9F8fl/kIgsYlyGnJ8moSoX3jn4G2iodZalvQ5xOkX
ItVieh/BuuZwvcCFKRcIFT51E6+Wg0a+FFiJwMwuUG+mgS9pOY8hEFkVCblL4C5u
w81ZXeOuMBwmVqowP3oJzmWeRAW94Si/Cnut33l2oBn+Zg0wVVSD/paSHRhqZ4Uo
TgsKppSkkwSBlFZhiGD/Rkj5TCVrNCZ2xpbPUCI1hPreQrLSnFLDhb0Pu1v9xaDS
1N5+IyoB4vp1c+sNIeCRzyudipea/W3uUIwma3LX8/tgHbghpJV9lW8vq59RMefc
GBGk/Ab50pTaWLGzLq3l3CHxiT0N7+oD8+D//pYi+neaXWM6MIbBAjcWLu7rTw1j
foL8pjez6XdrDpnq6rVDc71RgOSb3nqBv+HYktg6O5b/oYa9x0kuuAgB16qSt6cA
nx/I1qO6gdqnZJXpuh7/2UDTBhVSm+vPLZTLgT/lQRqAsPrXz0PXJxXAMaOiQrDc
VQAPBZTxqBjCLnWeATdphbf5ZJeWj8bG2iDrOsqG2u49AjBBV+GhcjKs59MENamP
IGOCMd6ty+uXCbDX4fk4LxLlqJO4lSTm/lTTs6oWcN2F1P2SGBoQKrMvbbMN+kB2
AT5SxGHGARmGfdAucgzpZwWMcwaLbzYgpSF6hd8ItABFGuUwITB2tLKU7XuShOJ8
oOw=
=z547
-----END PGP SIGNATURE-----
Merge tag 'nvme-6.1-2022-10-12' of git://git.infradead.org/nvme into block-6.1
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.1
- add NVME_QUIRK_BOGUS_NID for Lexar NM760 (Abhijit)
- avoid the deepest sleep state on ZHITAI TiPro5000 SSDs (Xi Ruoyao)
- fix possible hang caused during ctrl deletion (Sagi Grimberg)
- fix possible hang in live ns resize with ANA access (Sagi Grimberg)"
* tag 'nvme-6.1-2022-10-12' of git://git.infradead.org/nvme:
nvme-multipath: fix possible hang in live ns resize with ANA access
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760
nvme-tcp: fix possible hang caused during ctrl deletion
nvme-rdma: fix possible hang caused during ctrl deletion
kernel/trace/ring_buffer.c:895: warning: expecting prototype for ring_buffer_nr_pages_dirty(). Prototype was for ring_buffer_nr_dirty_pages() instead.
kernel/trace/ring_buffer.c:5313: warning: expecting prototype for ring_buffer_reset_cpu(). Prototype was for ring_buffer_reset_online_cpus() instead.
kernel/trace/ring_buffer.c:5382: warning: expecting prototype for rind_buffer_empty(). Prototype was for ring_buffer_empty() instead.
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2340
Link: https://lkml.kernel.org/r/20221009020642.12506-1-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, we have a bug where a simultaneous DROPTAG ioctl and socket
close may race, as we attempt to remove a key from lists twice, and
perform an unref for each removal operation. This may result in a uaf
when we attempt the second unref.
This change fixes the race by making __mctp_key_remove tolerant to being
called on a key that has already been removed from the socket/net lists,
and only performs the unref when we do the actual remove. We also need
to hold the list lock on the ioctl cleanup path.
This fix is based on a bug report and comprehensive analysis from
butt3rflyh4ck <butterflyhuangxx@gmail.com>, found via syzkaller.
Cc: stable@vger.kernel.org
Fixes: 63ed1aab3d ("mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control")
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal says:
====================
netfilter fixes for net
This series from Phil Sutter for the *net* tree fixes a problem with a change
from the 6.1 development phase: the change to nft_fib should have used
the more recent flowic_l3mdev field. Pointed out by Guillaume Nault.
This also makes the older iptables module follow the same pattern.
Also add selftest case and avoid test failure in nft_fib.sh when the
host environment has set rp_filter=1.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If net.ipv4.conf.all.rp_filter is set, it overrides the per-interface
setting and thus defeats the fix from bbe4c0896d ("selftests:
netfilter: disable rp_filter on router"). Unset it as well to cover that
case.
Fixes: bbe4c0896d ("selftests: netfilter: disable rp_filter on router")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
Use the introduced field for correct operation with VRF devices instead
of conditionally overwriting flowic_oif. This is a partial revert of
commit b575b24b8e ("netfilter: Fix rpfilter dropping vrf packets by
mistake"), implementing a simpler solution.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Test reverse path (filter) matches in iptables, ip6tables and nftables.
Both with a regular interface and a VRF.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
When ftrace bug happened, following log shows every hex data in
problematic ip address:
actual: ffffffe8:6b:ffffffd9:01:21
But so many 'f's seem a little confusing, and that is because format
'%x' being used to print signed chars in array 'ins'. As suggested
by Joe, change to use format "%*phC" to print array 'ins'.
After this patch, the log is like:
actual: e8:6b:d9:01:21
Link: https://lkml.kernel.org/r/20221011120352.1878494-1-zhengyejian1@huawei.com
Fixes: 6c14133d2d ("ftrace: Do not blindly read the ip address in ftrace_bug()")
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
When we revalidate paths as part of ns size change (as of commit
e7d65803e2), it is possible that during the path revalidation, the
only paths that is IO capable (i.e. optimized/non-optimized) are the
ones that ns resize was not yet informed to the host, which will cause
inflight requests to be requeued (as we have available paths but none
are IO capable). These requests on the requeue list are waiting for
someone to resubmit them at some point.
The IO capable paths will eventually notify the ns resize change to the
host, but there is nothing that will kick the requeue list to resubmit
the queued requests.
Fix this by always kicking the requeue list, and if no IO capable path
exists, these requests will be queued again.
A typical log that indicates that IOs are requeued:
--
nvme nvme1: creating 4 I/O queues.
nvme nvme1: new ctrl: "testnqn1"
nvme nvme2: creating 4 I/O queues.
nvme nvme2: mapped 4/0/0 default/read/poll queues.
nvme nvme2: new ctrl: NQN "testnqn1", addr 127.0.0.1:8009
nvme nvme1: rescanning namespaces.
nvme1n1: detected capacity change from 2097152 to 4194304
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
nvme nvme2: rescanning namespaces.
--
Reported-by: Yogev Cohen <yogev@lightbitslabs.com>
Fixes: e7d65803e2 ("nvme-multipath: revalidate paths during rescan")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Cc: <stable@vger.kernel.org> # v5.15+
Signed-off-by: Christoph Hellwig <hch@lst.de>
When we delete a controller, we execute the following:
1. nvme_stop_ctrl() - stop some work elements that may be
inflight or scheduled (specifically also .stop_ctrl
which cancels ctrl error recovery work)
2. nvme_remove_namespaces() - which first flushes scan_work
to avoid competing ns addition/removal
3. continue to teardown the controller
However, if err_work was scheduled to run in (1), it is designed to
cancel any inflight I/O, particularly I/O that is originating from ns
scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
prevent forward progress of (2) as ns scanning is blocking on I/O
(that will never be cancelled).
The race is:
1. transport layer error observed -> err_work is scheduled
2. scan_work executes, discovers ns, generate I/O to it
3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
- err_work never executed
4. nvme_remove_namespaces() -> flush_work(scan_work)
--> deadlock, because scan_work is blocked on I/O that was supposed
to be cancelled by err_work, but was cancelled before executing (see
stack trace [1]).
Fix this by flushing err_work instead of cancelling it, to force it
to execute and cancel all inflight I/O.
[1]:
--
Call Trace:
<TASK>
__schedule+0x390/0x910
? scan_shadow_nodes+0x40/0x40
schedule+0x55/0xe0
io_schedule+0x16/0x40
do_read_cache_page+0x55d/0x850
? __page_cache_alloc+0x90/0x90
read_cache_page+0x12/0x20
read_part_sector+0x3f/0x110
amiga_partition+0x3d/0x3e0
? osf_partition+0x33/0x220
? put_partition+0x90/0x90
bdev_disk_changed+0x1fe/0x4d0
blkdev_get_whole+0x7b/0x90
blkdev_get_by_dev+0xda/0x2d0
device_add_disk+0x356/0x3b0
nvme_mpath_set_live+0x13c/0x1a0 [nvme_core]
? nvme_parse_ana_log+0xae/0x1a0 [nvme_core]
nvme_update_ns_ana_state+0x3a/0x40 [nvme_core]
nvme_mpath_add_disk+0x120/0x160 [nvme_core]
nvme_alloc_ns+0x594/0xa00 [nvme_core]
nvme_validate_or_alloc_ns+0xb9/0x1a0 [nvme_core]
? __nvme_submit_sync_cmd+0x1d2/0x210 [nvme_core]
nvme_scan_work+0x281/0x410 [nvme_core]
process_one_work+0x1be/0x380
worker_thread+0x37/0x3b0
? process_one_work+0x380/0x380
kthread+0x12d/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
INFO: task nvme:6725 blocked for more than 491 seconds.
Not tainted 5.15.65-f0.el7.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:nvme state:D
stack: 0 pid: 6725 ppid: 1761 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x390/0x910
? sched_clock+0x9/0x10
schedule+0x55/0xe0
schedule_timeout+0x24b/0x2e0
? try_to_wake_up+0x358/0x510
? finish_task_switch+0x88/0x2c0
wait_for_completion+0xa5/0x110
__flush_work+0x144/0x210
? worker_attach_to_pool+0xc0/0xc0
flush_work+0x10/0x20
nvme_remove_namespaces+0x41/0xf0 [nvme_core]
nvme_do_delete_ctrl+0x47/0x66 [nvme_core]
nvme_sysfs_delete.cold.96+0x8/0xd [nvme_core]
dev_attr_store+0x14/0x30
sysfs_kf_write+0x38/0x50
kernfs_fop_write_iter+0x146/0x1d0
new_sync_write+0x114/0x1b0
? intel_pmu_handle_irq+0xe0/0x420
vfs_write+0x18d/0x270
ksys_write+0x61/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x61/0xcb
--
Fixes: 3f2304f8c6 ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Jonathan Nicklin <jnicklin@blockbridge.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Jonathan Nicklin <jnicklin@blockbridge.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When we delete a controller, we execute the following:
1. nvme_stop_ctrl() - stop some work elements that may be
inflight or scheduled (specifically also .stop_ctrl
which cancels ctrl error recovery work)
2. nvme_remove_namespaces() - which first flushes scan_work
to avoid competing ns addition/removal
3. continue to teardown the controller
However, if err_work was scheduled to run in (1), it is designed to
cancel any inflight I/O, particularly I/O that is originating from ns
scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
prevent forward progress of (2) as ns scanning is blocking on I/O
(that will never be cancelled).
The race is:
1. transport layer error observed -> err_work is scheduled
2. scan_work executes, discovers ns, generate I/O to it
3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
- err_work never executed
4. nvme_remove_namespaces() -> flush_work(scan_work)
--> deadlock, because scan_work is blocked on I/O that was supposed
to be cancelled by err_work, but was cancelled before executing.
Fix this by flushing err_work instead of cancelling it, to force it
to execute and cancel all inflight I/O.
Fixes: b435ecea2a ("nvme: Add .stop_ctrl to nvme ctrl ops")
Fixes: f6c8e432cb ("nvme: flush namespace scanning work just before removing namespaces")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This add ACPI-based generic laptop driver for Loongson-3. Some of the
codes are derived from drivers/platform/x86/thinkpad_acpi.c.
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
BPF programs are normally handled by a BPF interpreter, add BPF JIT
support for LoongArch to allow the kernel to generate native code when
a program is loaded into the kernel. This will significantly speed-up
processing of BPF programs.
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
{signed,unsigned}_imm_check() will also be used in the bpf jit, so move
them from module.c to inst.h, this is preparation for later patches.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch adds support for kdump. In kdump case the normal kernel will
reserve a region for the crash kernel and jump there on panic.
Arch-specific functions are added to allow for implementing a crash dump
file interface, /proc/vmcore, which can be viewed as a ELF file.
A user-space tool, such as kexec-tools, is responsible for allocating a
separate region for the core's ELF header within the crash kdump kernel
memory and filling it in when executing kexec_load().
Then, its location will be advertised to the crash dump kernel via a
command line argument "elfcorehdr=", and the crash dump kernel will
preserve this region for later use with arch_reserve_vmcore() at boot
time.
At the same time, the crash kdump kernel is also limited within the
"crashkernel" area via a command line argument "mem=", so as not to
destroy the original kernel dump data.
In the crash dump kernel environment, /proc/vmcore is used to access the
primary kernel's memory with copy_oldmem_page().
I tested kdump on LoongArch machines (Loongson-3A5000) and it works as
expected (suggested crashkernel parameter is "crashkernel=512M@2560M"),
you may test it by triggering a crash through /proc/sysrq-trigger:
$ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1"
# echo c > /proc/sysrq-trigger
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to
the LoongArch architecture, so as to add support for the kexec re-boot
mechanism (CONFIG_KEXEC) on LoongArch platforms.
Kexec supports loading vmlinux.elf in ELF format and vmlinux.efi in PE
format.
I tested kexec on LoongArch machines (Loongson-3A5000) and it works as
expected:
$ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline
$ sudo kexec -e
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Inspired by commit 9fb7410f955("arm64/BUG: Use BRK instruction for
generic BUG traps"), do similar for LoongArch to use generic BUG()
handler.
This patch uses the BREAK software breakpoint instruction to generate
a trap instead, similarly to most other arches, with the generic BUG
code generating the dmesg boilerplate.
This allows bug metadata to be moved to a separate table and reduces
the amount of inline code at BUG() and WARN() sites. This also avoids
clobbering any registers before they can be dumped.
To mitigate the size of the bug table further, this patch makes use of
the existing infrastructure for encoding addresses within the bug table
as 32-bit relative pointers instead of absolute pointers.
(Note: this limits the max kernel size to 2GB.)
Before patch:
[ 3018.338013] lkdtm: Performing direct entry BUG
[ 3018.342445] Kernel bug detected[#5]:
[ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35
After patch:
[ 125.585985] lkdtm: Performing direct entry BUG
[ 125.590433] ------------[ cut here ]------------
[ 125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
[ 125.600211] Oops - BUG[#1]:
[ 125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36
Out-of-line file/line data information obtained compared to before.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The perf events infrastructure of LoongArch is very similar to old MIPS-
based Loongson, so most of the codes are derived from MIPS.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can support more cache attributes (e.g., CC, SUC and WUC) and page
protection when we use TLB for ioremap(). The implementation is based
on GENERIC_IOREMAP.
The existing simple ioremap() implementation has better performance so
we keep it and introduce ARCH_IOREMAP to control the selection.
We move pagetable_init() earlier to make early ioremap() works, and we
modify the PCI ecam mapping because the TLB-based version of ioremap()
will actually take the size into account.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>