Fix:
Julia Lawall pointed out a null pointer dereference.
Cleanup:
Vlastimil Babka sent me a patch to remove some SLAB related code.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEIGSFVdO6eop9nER2z0QOqevODb4FAmXSUcsACgkQz0QOqevO
Db5VmxAAiQlsxX2Ki3q00rMgaXFS4yTwSPsGsTrC5eQlyJfq4xEGMTWJGjjRS56k
L2FGioP7OmIlo2VrzCc9Ms4ve/NyQjXpaoDMnsEUWSUfd8OHSJkBrpeVUWWfcHHk
zLEmxNb2AETcJupAgoJOWOoSb59ggCKpCqmLezYoSZmlnb9qg6lhFbnWtkVC6q+p
AREOfByoLIrJUtVh4Bmexo4nO5w3F84cfAV2WAmLMnXKjGnyFLGkqQvy8yXW0sA1
hsZW+VjmoRaCG78M6OX+Sl3ok0V8i2AcOkguWPj+dkCPb8XLkxhGwjnqLziJ54Z5
aFrtSzeZiNQOqy7b6cj6+x2KcWE5FhphAKjX/psEZrZNa0e6ZvNfby3yJ4TzNWaN
eajtOtcq+Ec9IruWXt/WCsm0zYwW1HumUhga5QCHjQRRjOt36ua4QC02iCx2sYuX
SBnsBCgQo1xxAta3uOMj2sG38lUwYoH0U5wlPsqrGh1nsbGbc49Ok7BYX/wWF8os
CYnT5t2KR9yUvblV+dH9XTj2EwqgINMRYBW7uBjZqY9gq2v/RKrtQCjnedAAA+yx
B6UUob/naV5VpXfhwpXiw2oJrQy/kqQuwOEcgY/a6cwmENd9RuwGXBiBI6hrAOzl
ftxiUcByW8/hS13G04qJ7pGTACs4njteMvvg+Y68nWUmPWMZTio=
=/gAC
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"One fix, one cleanup...
Fix: Julia Lawall pointed out a null pointer dereference.
Cleanup: Vlastimil Babka sent me a patch to remove some SLAB related
code"
* tag 'for-linus-6.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
Julia Lawall reported this null pointer dereference, this should fix it.
fs/orangefs: remove ORANGEFS_CACHE_CREATE_FLAGS
In this round, there are a number of updates on mainly two areas: Zoned block
device support and Per-file compression. For example, we've found several issues
to support Zoned block device especially having large sections regarding to GC
and file pinning used for Android devices. In compression side, we've fixed many
corner race conditions that had broken the design assumption.
Enhancement:
- Support file pinning for Zoned block device having large section
- Enhance the data recovery after sudden power cut on Zoned block device
- Add more error injection cases to easily detect the kernel panics
- add a proc entry show the entire disk layout
- Improve various error paths paniced by BUG_ON in block allocation and GC
- support SEEK_DATA and SEEK_HOLE for compression files
Bug fix:
- fix to avoid use-after-free issue in f2fs_filemap_fault
- fix some race conditions to break the atomic write design assumption
- fix to truncate meta inode pages forcely
- resolve various per-file compression issues wrt the space management and
compression policies
- fix some swap-related bugs
In addition, we removed deprecated codes such as io_bits and heap_allocation,
and also fixed minor error handling routines with neat debugging messages.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmX4gS0ACgkQQBSofoJI
UNLmgBAAg4mvbWjmJ5VbXs4zGLOgLRJYcY1sZRO5Ufg4LhWzoGRxL1Dru+TELw0t
1Ck2EQvP91XZ5weA5AZOfWbxcijy4+8L3P8L7ohOShudfACci0wQsx6IaUUWWylC
ILA4+DkovpZrlu6th12Gj9QAM6TN9gdy3V1VLT5O/KmE1x6Pekwp2hQoIvVJRH5L
I3KxOf5fTe3oWLvEN6m7yCz/8qGqz8+w0ae90UG0fqi0wVEuZJ99zsVPnuhu6uBo
riFm2A6ra0I/JqoPyqn2QM6ApItM867ULo9EoyQVgq56Q1w31ENOJXsU9N7N4Wxt
olgujH1SijkWk9ni57iKtMhR68e3Rs+pVsuNFmJuOPq0HASoggB66QRrVvCgM9JG
z3D//CB2ONtX2XiKJMiTcX9VqIqrMw6L1eVxEZu0P96C3CS70MoBU69mdSR9Og2S
5nQXja3yzFhdk3thp6+wAJ3I04ZQkf3qoHZB+0chU2Xl1pV+5NIkBgBsSw8g/TY3
EIHMfK+TX0SBSNCvkUDEJ+Z8ZRID6tcbAquTSsBr6wxB+F9mq7onEvI8O7xwyH9W
DU8xhymOE2QUoluNtyW7ww6HK913ripXIenI9LaYJnuj0XeDAcMIoPsgR7AGU5UG
hshvirFdUdWRMTfXxNNUrvhOWI0qurQSVx+VV6Qb62DGqR5ofOw=
=Qpvy
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs update from Jaegeuk Kim:
"In this round, there are a number of updates on mainly two areas:
Zoned block device support and Per-file compression. For example,
we've found several issues to support Zoned block device especially
having large sections regarding to GC and file pinning used for
Android devices. In compression side, we've fixed many corner race
conditions that had broken the design assumption.
Enhancements:
- Support file pinning for Zoned block device having large section
- Enhance the data recovery after sudden power cut on Zoned block
device
- Add more error injection cases to easily detect the kernel panics
- add a proc entry show the entire disk layout
- Improve various error paths paniced by BUG_ON in block allocation
and GC
- support SEEK_DATA and SEEK_HOLE for compression files
Bug fixes:
- avoid use-after-free issue in f2fs_filemap_fault
- fix some race conditions to break the atomic write design
assumption
- fix to truncate meta inode pages forcely
- resolve various per-file compression issues wrt the space
management and compression policies
- fix some swap-related bugs
In addition, we removed deprecated codes such as io_bits and
heap_allocation, and also fixed minor error handling routines with
neat debugging messages"
* tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
f2fs: fix to avoid use-after-free issue in f2fs_filemap_fault
f2fs: truncate page cache before clearing flags when aborting atomic write
f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag
f2fs: prevent atomic write on pinned file
f2fs: fix to handle error paths of {new,change}_curseg()
f2fs: unify the error handling of f2fs_is_valid_blkaddr
f2fs: zone: fix to remove pow2 check condition for zoned block device
f2fs: fix to truncate meta inode pages forcely
f2fs: compress: fix reserve_cblocks counting error when out of space
f2fs: compress: relocate some judgments in f2fs_reserve_compress_blocks
f2fs: add a proc entry show disk layout
f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
f2fs: fix to check return value of f2fs_gc_range
f2fs: fix to check return value __allocate_new_segment
f2fs: fix to do sanity check in update_sit_entry
f2fs: fix to reset fields for unloaded curseg
f2fs: clean up new_curseg()
f2fs: relocate f2fs_precache_extents() in f2fs_swap_activate()
f2fs: fix blkofs_end correctly in f2fs_migrate_blocks()
f2fs: ro: don't start discard thread for readonly image
...
There are reports that since version 6.7 update-grub fails to find the
device of the root on systems without initrd and on a single device.
This looks like the device name changed in the output of
/proc/self/mountinfo:
6.5-rc5 working
18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ...
6.7 not working:
17 1 0:15 / / rw,noatime - btrfs /dev/root ...
and "update-grub" shows this error:
/usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?)
This looks like it's related to the device name, but grub-probe
recognizes the "/dev/root" path and tries to find the underlying device.
However there's a special case for some filesystems, for btrfs in
particular.
The generic root device detection heuristic is not done and it all
relies on reading the device infos by a btrfs specific ioctl. This ioctl
returns the device name as it was saved at the time of device scan (in
this case it's /dev/root).
The change in 6.7 for temp_fsid to allow several single device
filesystem to exist with the same fsid (and transparently generate a new
UUID at mount time) was to skip caching/registering such devices.
This also skipped mounted device. One step of scanning is to check if
the device name hasn't changed, and if yes then update the cached value.
This broke the grub-probe as it always read the device /dev/root and
couldn't find it in the system. A temporary workaround is to create a
symlink but this does not survive reboot.
The right fix is to allow updating the device path of a mounted
filesystem even if this is a single device one.
In the fix, check if the device's major:minor number matches with the
cached device. If they do, then we can allow the scan to happen so that
device_list_add() can take care of updating the device path. The file
descriptor remains unchanged.
This does not affect the temp_fsid feature, the UUID of the mounted
filesystem remains the same and the matching is based on device major:minor
which is unique per mounted filesystem.
This covers the path when the device (that exists for all mounted
devices) name changes, updating /dev/root to /dev/sdx. Any other single
device with filesystem and is not mounted is still skipped.
Note that if a system is booted and initial mount is done on the
/dev/root device, this will be the cached name of the device. Only after
the command "btrfs device scan" it will change as it triggers the
rename.
The fix was verified by users whose systems were affected.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353
Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/
Fixes: bc27d6f0aa ("btrfs: scan but don't register device on single device filesystem")
CC: stable@vger.kernel.org # 6.7+
Tested-by: Alex Romosan <aromosan@gmail.com>
Tested-by: CHECK_1234543212345@protonmail.com
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some architectures, like aarch64 ones, need a dtb file to configure the
hardware. The default dtb file can be preloaded from u-boot, but the final
and/or more complete dtb file needs to be able to be loaded later from
rootfs.
Add the possible dtb files to the kernel rpm and mimic Fedora shipping
process, storing the dtb files in the module directory. These dtb files
will be copied to /boot directory by the install scripts, but add fallback
just in case, checking if the content in /boot directory is correct.
Mark the files installed to /boot as %ghost to make sure they will be
removed when the package is uninstalled.
Tested with Fedora Rawhide (x86_64 and aarch64) with dnf and rpm tools.
In addition, fallback was also tested after modifying the install scripts.
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
When the condition 'sym == NULL' is met, the code will reach the
'next_menu' label regardless of the return value from menu_is_visible().
menu_is_visible() calculates some symbol values as a side-effect, for
instance by calling expr_calc_value(menu->visibility), but all the
symbol values will be calculated eventually.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Remove inputbox_order, searchbox, searchbox_title, searchbox_border
because they are initialized, but not used anywhere.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
For MENUCONFIG_COLOR=blackbg, the text in inactive buttons is invisible
because both the foreground and background are black.
Change the foreground color to white and remove the highlighting.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
If the find_fromsym() call fails and returns NULL, the warn() call
will dereference this NULL pointer and cause the program to crash.
This happened when I tried to build with "test_user_copy" module.
With this fix, it prints lots of warnings like this:
WARNING: modpost: lib/test_user_copy: section mismatch in reference: (unknown)+0x4 (section: .text.fixup) -> (unknown) (section: .init.text)
masahiroy@kernel.org:
The issue is reproduced with ARCH=arm allnoconfig + CONFIG_MODULES=y +
CONFIG_RUNTIME_TESTING_MENU=y + CONFIG_TEST_USER_COPY=m
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZfglxgAKCRCRxhvAZXjc
ovK9APsF7/TMFhNbtW+JsghSyrEk0cOVPizi8JkRDDWNW3qY+wEAxtydhbmWpbKq
MpIjMHqwjPx3zXBL8Ec/b4vAoJqpJwQ=
=NgvO
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.9-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"This contains a few small fixes for this merge window:
- Undo the hiding of silly-rename files in afs. If they're hidden
they can't be deleted by rm manually anymore causing regressions
- Avoid caching the preferred address for an afs server to avoid
accidently overriding an explicitly specified preferred server
address
- Fix bad stat() and rmdir() interaction in afs
- Take a passive reference on the superblock when opening a block
device so the holder is available to concurrent callers from the
block layer
- Clear private data pointer in fscache_begin_operation() to avoid it
being falsely treated as valid"
* tag 'vfs-6.9-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fscache: Fix error handling in fscache_begin_operation()
fs,block: get holder during claim
afs: Fix occasional rmdir-then-VNOVNODE with generic/011
afs: Don't cache preferred address
afs: Revert "afs: Hide silly-rename files from userspace"
Two regression fixes that had been introduced in the previous PR,
additional HD-audio quirks, and a further enhancement for the new
kunit.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmX4AQYOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE9/kRAAmxKK49XzcyvnR2hPAqjaBQAijigw+YWwn0kH
wtEmx3NYHmd3TIKI8D9YpfGWQjtKxn0UfvAEJx9KVEKvfpU5oADDCsgDPDuVh7PI
eYML4UZLbuOyIavtEhczjmT2wbEVbhEJbXPGmf/YQ/BXOZbf0xGSxZx46NmVADxY
2Mp7gBtX6gubcFuvzJXNNsi+afMypRRclOHBdsITw1wWOHkj1HIvFuPd7ALPcHGY
lfSpBmKYP+lgfW4pWUSOZcC7e0mnvhxFzmqk5WiCmrSerkiTriQUnSug5Xwpo+O1
m3Febyo6OUaK2QTXc681lNLilfL14+9mJZC5nLxFGvMid+5OwKCv7urHpTbo8f50
YkAR3O9ZbfiNax3P2IkArC6HMAJ0KHJKbSF08t+P9VIwAP3VWTuGqpvzytpI+k60
9+SuV16UOb1pB+hfgkz07EYHCqEJhU4pw0+CZff5YrB3kbsoDfGvuwtz2ps+Q6d7
fDBD27dPtwbHDp66w8U0QrM9uC5P+P2q7r95r+uQ5NrlT3th9JJSWZ1Q4Va60bRu
3DbEChk9ylxLy0bFCrn4D+7rorfrM40jR+pgbPG0/FIaobhp31NvxyGyAy7D9qKD
t5tuvO1O1Yl9p97vuP4pedIYDN4m4gKrdFk0skFYSV4Fyi2Q8TjJnQ9jgX4W5jzD
6pYlKwk=
=Tz+i
-----END PGP SIGNATURE-----
Merge tag 'sound-fix-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Two regression fixes that had been introduced in this merge window,
additional HD-audio quirks, and a further enhancement for the new
kunit"
* tag 'sound-fix-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: core: add kunitconfig
ALSA: hda/realtek: add in quirk for Acer Swift Go 16 - SFG16-71
Revert "ALSA: usb-audio: Name feature ctl using output if input is PCM"
ALSA: timer: Fix missing irq-disable at closing
ALSA: hda/realtek: Add quirk for Lenovo Yoga 9 14IMH9
The __string() helper macro of the TRACE_EVENT() macro is used to
determine how much of the ring buffer needs to be allocated to fit the
given source string. Some trace events have a string that is dependent on
another variable that could be NULL, and in those cases the string is
passed in to be NULL.
The __string() macro can handle being passed in a NULL pointer for which
it will turn it into "(null)". It does that with:
strlen((src) ? (const char *)(src) : "(null)") + 1
But if src itself has the same conditional type it can confuse the
compiler. That is:
__string(r ? dev(r)->name : NULL)
Would turn into:
strlen((r ? dev(r)->name : NULL) ? (r ? dev(r)->name : NULL) : "(null)" + 1
For which the compiler thinks that NULL is being passed to strlen() and
gives this kind of warning:
./include/trace/stages/stage5_get_offsets.h:50:21: warning: argument 1 null where non-null expected [-Wnonnull]
50 | strlen((src) ? (const char *)(src) : "(null)") + 1)
Instead, create a static inline function that takes the src string and
will return the string if it is not NULL and will return "(null)" if it
is. This will then make the strlen() line:
strlen(__string_src(src)) + 1
Where the compiler can see that strlen() will not end up with NULL and
does not warn about it.
Note that this depends on commit 51270d573a ("tracing/net_sched: Fix
tracepoints that save qdisc_dev() as a string") being applied, as passing
the qdisc_dev() into __string_src() will give an error.
Link: https://lore.kernel.org/all/ZfNmfCmgCs4Nc+EH@aschofie-mobl2/
Link: https://lore.kernel.org/linux-trace-kernel/20240314232754.345cea82@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reported-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The WARN_ON() check in __assign_str() to catch where the source variable
to the macro doesn't match the source variable to __string() gives an
error in clang:
>> include/trace/events/sunrpc.h:703:4: warning: result of comparison against a string literal is unspecified (use an explicit string comparison function instead) [-Wstring-compare]
670 | __assign_str(progname, "unknown");
That's because the __assign_str() macro has:
WARN_ON_ONCE((src) != __data_offsets.dst##_ptr_);
Where "src" is a string literal. Clang warns when comparing a string
literal directly as it is undefined to what the value of the literal is.
Since this is still to make sure the same string that goes to __string()
is the same as __assign_str(), for string literals do a test for that and
then use strcmp() in those cases
Note that this depends on commit 51270d573a ("tracing/net_sched: Fix
tracepoints that save qdisc_dev() as a string") being applied, as this was
what found that bug.
Link: https://lore.kernel.org/linux-trace-kernel/20240312113002.00031668@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402292111.KIdExylU-lkp@intel.com/
Fixes: 433e1d88a3be ("tracing: Add warning if string in __assign_str() does not match __string()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
There are two WARN_ON*() warnings in tracepoint.h that deal with RCU
usage. But when they trigger, especially from using a TRACE_EVENT()
macro, the information is not very helpful and is confusing:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at include/trace/events/lock.h:24 lock_acquire+0x2b2/0x2d0
Where the above warning takes you to:
TRACE_EVENT(lock_acquire, <<<--- line 24 in lock.h
TP_PROTO(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
struct lockdep_map *next_lock, unsigned long ip),
[..]
Change the WARN_ON_ONCE() to WARN_ONCE() and add a string that allows
someone to search for exactly where the bug happened.
Link: https://lore.kernel.org/linux-trace-kernel/20240228133112.0d64fb1b@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Fixes Coccinelle/coccicheck warnings reported by do_div.cocci.
Compared to do_div(), div64_u64() does not implicitly cast the divisor and
does not unnecessarily calculate the remainder.
Link: https://lore.kernel.org/linux-trace-kernel/20240225164507.232942-2-thorsten.blum@toblux.com
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently ftrace only dumps the global trace buffer on an OOPs. For
debugging a production usecase, instance trace will be helpful to
check specific problems since global trace buffer may be used for
other purposes.
This patch extend the ftrace_dump_on_oops parameter to dump a specific
or multiple trace instances:
- ftrace_dump_on_oops=0: as before -- don't dump
- ftrace_dump_on_oops[=1]: as before -- dump the global trace buffer
on all CPUs
- ftrace_dump_on_oops=2 or =orig_cpu: as before -- dump the global
trace buffer on CPU that triggered the oops
- ftrace_dump_on_oops=<instance_name>: new behavior -- dump the
tracing instance matching <instance_name>
- ftrace_dump_on_oops[=2/orig_cpu],<instance1_name>[=2/orig_cpu],
<instrance2_name>[=2/orig_cpu]: new behavior -- dump the global trace
buffer and multiple instance buffer on all CPUs, or only dump on CPU
that triggered the oops if =2 or =orig_cpu is given
Also, the sysctl node can handle the input accordingly.
Link: https://lore.kernel.org/linux-trace-kernel/20240223083126.1817731-1-quic_hyiwei@quicinc.com
Cc: Ross Zwisler <zwisler@google.com>
Cc: <mhiramat@kernel.org>
Cc: <mark.rutland@arm.com>
Cc: <mcgrof@kernel.org>
Cc: <keescook@chromium.org>
Cc: <j.granados@samsung.com>
Cc: <mathieu.desnoyers@efficios.com>
Cc: <corbet@lwn.net>
Signed-off-by: Huang Yiwei <quic_hyiwei@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The second parameter of __assign_rel_str() is no longer used. It can be removed.
Note, the only real users of rel_string is user events. This code is just
in the sample code for testing purposes.
This makes __assign_rel_str() different than __assign_str() but that's
fine. __assign_str() is used over 700 places and has a larger impact. That
change will come later.
Link: https://lore.kernel.org/linux-trace-kernel/20240223162519.2beb8112@gandalf.local.home
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
In preparation to remove the second parameter of __assign_str(), make sure
it is really a duplicate of __string() by adding a WARN_ON_ONCE().
Link: https://lore.kernel.org/linux-trace-kernel/20240223161356.63b72403@gandalf.local.home
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
There's no example code that uses __string_len(), and since the sample
code is used for testing the event logic, add a use case.
Link: https://lore.kernel.org/linux-trace-kernel/20240223152827.5f9f78e2@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Now that __assign_str() gets the length from the __string() (and
__string_len()) macros, there's no reason to have a separate
__assign_str_len() macro as __assign_str() can get the length of the
string needed.
Also remove __assign_rel_str() although it had no users anyway.
Link: https://lore.kernel.org/linux-trace-kernel/20240223152206.0b650659@gandalf.local.home
Cc: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reduce the number of kernel-doc warnings from 52 down to 10, i.e.,
fix 42 kernel-doc warnings by (a) using the Returns: format for
function return values or (b) using "@var:" instead of "@var -"
for function parameter descriptions.
Fix one return values list so that it is formatted correctly when
rendered for output.
Spell "non-zero" with a hyphen in several places.
Link: https://lore.kernel.org/linux-trace-kernel/20240223054833.15471-1-rdunlap@infradead.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/oe-kbuild-all/202312180518.X6fRyDSN-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Running the ftrace selftests caused the ring buffer mapping test to fail.
Investigating, I found that the snapshot counter would be incremented
every time a snapshot trigger was added, even if that snapshot trigger
failed.
# cd /sys/kernel/tracing
# echo "snapshot" > events/sched/sched_process_fork/trigger
# echo "snapshot" > events/sched/sched_process_fork/trigger
-bash: echo: write error: File exists
That second one that fails increments the snapshot counter but doesn't
decrement it. It needs to be decremented when the snapshot fails.
Link: https://lore.kernel.org/linux-trace-kernel/20240223013344.729055907@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: 16f7e48ffc53a ("tracing: Add snapshot refcount")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Running the ftrace selftests caused the ring buffer mapping test to fail.
Investigating, I found that the snapshot counter would be incremented
every time a tracer that uses the snapshot is enabled even if the snapshot
was used by the previous tracer.
That is:
# cd /sys/kernel/tracing
# echo wakeup_rt > current_tracer
# echo wakeup_dl > current_tracer
# echo nop > current_tracer
would leave the snapshot counter at 1 and not zero. That's because the
enabling of wakeup_dl would increment the counter again but the setting
the tracer to nop would only decrement it once.
Do not arm the snapshot for a tracer if the previous tracer already had it
armed.
Link: https://lore.kernel.org/linux-trace-kernel/20240223013344.570525723@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: 16f7e48ffc53a ("tracing: Add snapshot refcount")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The TRACE_EVENT macros has some dependency if a __string() field is NULL,
where it will save "(null)" as the string. This string is also used by
__assign_str(). It's better to create a single macro instead of having
something that will not be caught by the compiler if there is an
unfortunate typo.
Link: https://lore.kernel.org/linux-trace-kernel/20240222211443.106216915@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The TRACE_EVENT() macro handles dynamic strings by having:
TP_PROTO(struct some_struct *s),
TP_ARGS(s),
TP_STRUCT__entry(
__string(my_string, s->string)
),
TP_fast_assign(
__assign_str(my_string, s->string);
)
TP_printk("%s", __get_str(my_string))
There's even some code that may call a function helper to find the
s->string value. The problem with the above is that the work to get the
s->string is done twice. Once at the __string() and again in the
__assign_str().
The length of the string is calculated via a strlen(), not once, but
twice. Once during the __string() macro and again in __assign_str(). But
the length is actually already recorded in the data location and here's no
reason to call strlen() again.
Just use the saved length that was saved in the __string() code for the
__assign_str() code.
Link: https://lore.kernel.org/linux-trace-kernel/20240222211442.793074999@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The TRACE_EVENT() macro handles dynamic strings by having:
TP_PROTO(struct some_struct *s),
TP_ARGS(s),
TP_STRUCT__entry(
__string(my_string, s->string)
),
TP_fast_assign(
__assign_str(my_string, s->string);
)
TP_printk("%s", __get_str(my_string))
There's even some code that may call a function helper to find the
s->string value. The problem with the above is that the work to get the
s->string is done twice. Once at the __string() and again in the
__assign_str().
But the __string() uses dynamic_array() which has a helper structure that
is created holding the offsets and length of the string fields. Instead of
finding the string twice, just save it off in another field from that
helper structure, and have __assign_str() use that instead.
Note, this also means that the second parameter of __assign_str() isn't
even used anymore, and may be removed in the future.
Link: https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The TP_STRUCT__entry that gets assigned the region name, or an
empty string if no region is present, is erroneously initialized
to the cxl_region pointer. It needs to be properly initialized
otherwise it's length is wrong and garbage chars can appear in
the kernel trace output: /sys/kernel/tracing/trace
The bad initialization was due in part to a naming conflict with
the parameter: struct cxl_region *region. The field 'region' is
already exposed externally as the region name, so changing that
to something logical, like 'region_name' is not an option. Instead
rename the internal only struct cxl_region to the commonly used
'cxlr'.
Impact is that tooling depending on that trace data can miss
picking up a valid event when searching by region name. The
TP_printk() output, if enabled, does emit the correct region
names in the dmesg log.
This was found during testing of the cxl-list option to report
media-errors for a region.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: stable@vger.kernel.org
Fixes: ddf49d57b8 ("cxl/trace: Add TRACE support for CXL media-error records")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The __string() and __assign_str() helper macros of the TRACE_EVENT() macro
are going through some optimizations where only the source string of
__string() will be used and the __assign_str() source will be ignored and
later removed.
To make sure that there's no issues, a new check is added between the
__string() src argument and the __assign_str() src argument that does a
strcmp() to make sure they are the same string.
The hclgevf trace events have:
__assign_str(devname, &hdev->nic.kinfo.netdev->name);
Which triggers the warning:
hclgevf_trace.h:34:39: error: passing argument 1 of ‘strcmp’ from incompatible pointer type [-Werror=incompatible-pointer-types]
34 | __assign_str(devname, &hdev->nic.kinfo.netdev->name);
[..]
arch/x86/include/asm/string_64.h:75:24: note: expected ‘const char *’ but argument is of type ‘char (*)[16]’
75 | int strcmp(const char *cs, const char *ct);
| ~~~~~~~~~~~~^~
Because __assign_str() now has:
WARN_ON_ONCE(__builtin_constant_p(src) ? \
strcmp((src), __data_offsets.dst##_ptr_) : \
(src) != __data_offsets.dst##_ptr_); \
The problem is the '&' on hdev->nic.kinfo.netdev->name. That's because
that name is:
char name[IFNAMSIZ]
Where passing an address '&' of a char array is not compatible with strcmp().
The '&' is not necessary, remove it.
Link: https://lore.kernel.org/linux-trace-kernel/20240313093454.3909afe7@gandalf.local.home
Cc: netdev <netdev@vger.kernel.org>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Yufeng Mo <moyufeng@huawei.com>
Cc: Huazhong Tan <tanhuazhong@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Fixes: d8355240cf ("net: hns3: add trace event support for PF/VF mailbox")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
I'm working on improving the __assign_str() and __string() macros to be
more efficient, and removed some unneeded semicolons. This triggered a bug
in the build as some of the __assign_str() macros in intel_display_trace
was missing a terminating semicolon.
Link: https://lore.kernel.org/linux-trace-kernel/20240222133057.2af72a19@gandalf.local.home
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Cc: stable@vger.kernel.org
Fixes: 2ceea5d880 ("drm/i915: Print plane name in fbc tracepoints")
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
I'm working on restructuring the __string* macros so that it doesn't need
to recalculate the string twice. That is, it will save it off when
processing __string() and the __assign_str() will not need to do the work
again as it currently does.
Currently __string_len(item, src, len) doesn't actually use "src", but my
changes will require src to be correct as that is where the __assign_str()
will get its value from.
The event class nfsd_clid_class has:
__string_len(name, name, clp->cl_name.len)
But the second "name" does not exist and causes my changes to fail to
build. That second parameter should be: clp->cl_name.data.
Link: https://lore.kernel.org/linux-trace-kernel/20240222122828.3d8d213c@gandalf.local.home
Cc: Neil Brown <neilb@suse.de>
Cc: Olga Kornievskaia <kolga@netapp.com>
Cc: Dai Ngo <Dai.Ngo@oracle.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: stable@vger.kernel.org
Fixes: d27b74a867 ("NFSD: Use new __string_len C macros for nfsd_clid_class")
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Instead of using UTS_RELEASE, use init_utsname()->release, which means that
we don't need to rebuild the code just for the git head commit changing.
Link: https://lore.kernel.org/linux-trace-kernel/20240222124639.65629-1-john.g.garry@oracle.com
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
User programs can now ask user_events to handle the synchronization of
multiple different formats for an event with the same name via the new
USER_EVENT_REG_MULTI_FORMAT flag.
Add a section for USER_EVENT_REG_MULTI_FORMAT that explains the intended
purpose and caveats of using it. Explain how deletion works in these
cases and how to use /sys/kernel/tracing/dynamic_events for per-version
deletion.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-5-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
User_events now has multi-format events which allow for the same
register name, but with different formats. When this occurs, different
tracepoints are created with unique names.
Add a new test that ensures the same name can be used for two different
formats. Ensure they are isolated from each other and that name and arg
matching still works if yet another register comes in with the same
format as one of the two.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-4-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently user_events supports 1 event with the same name and must have
the exact same format when referenced by multiple programs. This opens
an opportunity for malicious or poorly thought through programs to
create events that others use with different formats. Another scenario
is user programs wishing to use the same event name but add more fields
later when the software updates. Various versions of a program may be
running side-by-side, which is prevented by the current single format
requirement.
Add a new register flag (USER_EVENT_REG_MULTI_FORMAT) which indicates
the user program wishes to use the same user_event name, but may have
several different formats of the event. When this flag is used, create
the underlying tracepoint backing the user_event with a unique name
per-version of the format. It's important that existing ABI users do
not get this logic automatically, even if one of the multi format
events matches the format. This ensures existing programs that create
events and assume the tracepoint name will match exactly continue to
work as expected. Add logic to only check multi-format events with
other multi-format events and single-format events to only check
single-format events during find.
Change system name of the multi-format event tracepoint to ensure that
multi-format events are isolated completely from single-format events.
This prevents single-format names from conflicting with multi-format
events if they end with the same suffix as the multi-format events.
Add a register_name (reg_name) to the user_event struct which allows for
split naming of events. We now have the name that was used to register
within user_events as well as the unique name for the tracepoint. Upon
registering events ensure matches based on first the reg_name, followed
by the fields and format of the event. This allows for multiple events
with the same registered name to have different formats. The underlying
tracepoint will have a unique name in the format of {reg_name}.{unique_id}.
For example, if both "test u32 value" and "test u64 value" are used with
the USER_EVENT_REG_MULTI_FORMAT the system would have 2 unique
tracepoints. The dynamic_events file would then show the following:
u:test u64 count
u:test u32 count
The actual tracepoint names look like this:
test.0
test.1
Both would be under the new user_events_multi system name to prevent the
older ABI from being used to squat on multi-formatted events and block
their use.
Deleting events via "!u:test u64 count" would only delete the first
tracepoint that matched that format. When the delete ABI is used all
events with the same name will be attempted to be deleted. If
per-version deletion is required, user programs should either not use
persistent events or delete them via dynamic_events.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-3-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The current code for finding and deleting events assumes that there will
never be cases when user_events are registered with the same name, but
different formats. Scenarios exist where programs want to use the same
name but have different formats. An example is multiple versions of a
program running side-by-side using the same event name, but with updated
formats in each version.
This change does not yet allow for multi-format events. If user_events
are registered with the same name but different arguments the programs
see the same return values as before. This change simply makes it
possible to easily accommodate for this.
Update find_user_event() to take in argument parameters and register
flags to accommodate future multi-format event scenarios. Have find
validate argument matching and return error pointers to cover when
an existing event has the same name but different format. Update
callers to handle error pointer logic.
Move delete_user_event() to use hash walking directly now that
find_user_event() has changed. Delete all events found that match the
register name, stop if an error occurs and report back to the user.
Update user_fields_match() to cover list_empty() scenarios now that
find_user_event() uses it directly. This makes the logic consistent
across several callsites.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-2-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
When a ring-buffer is memory mapped by user-space, no trace or
ring-buffer swap is possible. This means the snapshot feature is
mutually exclusive with the memory mapping. Having a refcount on
snapshot users will help to know if a mapping is possible or not.
Instead of relying on the global trace_types_lock, a new spinlock is
introduced to serialize accesses to trace_array->snapshot. This intends
to allow access to that variable in a context where the mmap lock is
already held.
Link: https://lore.kernel.org/linux-trace-kernel/20240220202310.2489614-4-vdonnefort@google.com
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The default behavior of ring_buffer_wait() when passed a NULL "cond"
parameter is to exit the function the first time it is woken up. The
current implementation uses a counter that starts at zero and when it is
greater than one it exits the wait_event_interruptible().
But this relies on the internal working of wait_event_interruptible() as
that code basically has:
if (cond)
return;
prepare_to_wait();
if (!cond)
schedule();
finish_wait();
That is, cond is called twice before it sleeps. The default cond of
ring_buffer_wait() needs to account for that and wait for its counter to
increment twice before exiting.
Instead, use the seq/atomic_inc logic that is used by the tracing code
that calls this function. Add an atomic_t seq to rb_irq_work and when cond
is NULL, have the default callback take a descriptor as its data that
holds the rbwork and the value of the seq when it started.
The wakeups will now increment the rbwork->seq and the cond callback will
simply check if that number is different, and no longer have to rely on
the implementation of wait_event_interruptible().
Link: https://lore.kernel.org/linux-trace-kernel/20240315063115.6cb5d205@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 7af9ded0c2 ("ring-buffer: Use wait_event_interruptible() in ring_buffer_wait()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This reverts commit 885c36e59f.
The patch currently broke the bpf selftest test_tc_dtime because
uapi field __sk_buff->tstamp_type depends on skb->mono_delivery_time which
does not necessarily mean mono with the original fix as the bit was re-used
for userspace timestamp as well to avoid tstamp reset in the forwarding
path. To solve this we need to keep mono_delivery_time as is and
introduce another bit called user_delivery_time and fall back to the
initial proposal of setting the user_delivery_time bit based on
sk_clockid set from userspace.
Fixes: 885c36e59f ("net: Re-use and set mono_delivery_time bit for userspace tstamp packets")
Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On MT7530, the HT_XTAL_FSEL field of the HWTRAP register stores a 2-bit
value that represents the frequency of the crystal oscillator connected to
the switch IC. The field is populated by the state of the ESW_P4_LED_0 and
ESW_P4_LED_0 pins, which is done right after reset is deasserted.
ESW_P4_LED_0 ESW_P3_LED_0 Frequency
-----------------------------------------
0 0 Reserved
0 1 20MHz
1 0 40MHz
1 1 25MHz
On MT7531, the XTAL25 bit of the STRAP register stores this. The LAN0LED0
pin is used to populate the bit. 25MHz when the pin is high, 40MHz when
it's low.
These pins are also used with LEDs, therefore, their state can be set to
something other than the bootstrapping configuration. For example, a link
may be established on port 3 before the DSA subdriver takes control of the
switch which would set ESW_P3_LED_0 to high.
Currently on mt7530_setup() and mt7531_setup(), 1000 - 1100 usec delay is
described between reset assertion and deassertion. Some switch ICs in real
life conditions cannot always have these pins set back to the bootstrapping
configuration before reset deassertion in this amount of delay. This causes
wrong crystal frequency to be selected which puts the switch in a
nonfunctional state after reset deassertion.
The tests below are conducted on an MT7530 with a 40MHz crystal oscillator
by Justin Swartz.
With a cable from an active peer connected to port 3 before reset, an
incorrect crystal frequency (0b11 = 25MHz) is selected:
[1] [3] [5]
: : :
_____________________________ __________________
ESW_P4_LED_0 |_______|
_____________________________
ESW_P3_LED_0 |__________________________
: : : :
: : [4]...:
: :
[2]................:
[1] Reset is asserted.
[2] Period of 1000 - 1100 usec.
[3] Reset is deasserted.
[4] Period of 315 usec. HWTRAP register is populated with incorrect
XTAL frequency.
[5] Signals reflect the bootstrapped configuration.
Increase the delay between reset_control_assert() and
reset_control_deassert(), and gpiod_set_value_cansleep(priv->reset, 0) and
gpiod_set_value_cansleep(priv->reset, 1) to 5000 - 5100 usec. This amount
ensures a higher possibility that the switch IC will have these pins back
to the bootstrapping configuration before reset deassertion.
With a cable from an active peer connected to port 3 before reset, the
correct crystal frequency (0b10 = 40MHz) is selected:
[1] [2-1] [3] [5]
: : : :
_____________________________ __________________
ESW_P4_LED_0 |_______|
___________________ _______
ESW_P3_LED_0 |_________| |__________________
: : : : :
: [2-2]...: [4]...:
[2]................:
[1] Reset is asserted.
[2] Period of 5000 - 5100 usec.
[2-1] ESW_P3_LED_0 goes low.
[2-2] Remaining period of 5000 - 5100 usec.
[3] Reset is deasserted.
[4] Period of 310 usec. HWTRAP register is populated with bootstrapped
XTAL frequency.
[5] Signals reflect the bootstrapped configuration.
ESW_P3_LED_0 low period before reset deassertion:
5000 usec
- 5100 usec
TEST RESET HOLD
# (usec)
---------------------
1 5410
2 5440
3 4375
4 5490
5 5475
6 4335
7 4370
8 5435
9 4205
10 4335
11 3750
12 3170
13 4395
14 4375
15 3515
16 4335
17 4220
18 4175
19 4175
20 4350
Min 3170
Max 5490
Median 4342.500
Avg 4466.500
Revert commit 2920dd92b9 ("net: dsa: mt7530: disable LEDs before reset").
Changing the state of pins via reset assertion is simpler and more
efficient than doing so by setting the LED controller off.
Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Fixes: c288575f78 ("net: dsa: mt7530: Add the support of MT7531 switch")
Co-developed-by: Justin Swartz <justin.swartz@risingedge.co.za>
Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ignat Korchagin says:
====================
net: veth: ability to toggle GRO and XDP independently
It is rather confusing that GRO is automatically enabled, when an XDP program
is attached to a veth interface. Moreover, it is not possible to disable GRO
on a veth, if an XDP program is attached (which might be desirable in some use
cases).
Make GRO and XDP independent for a veth interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We should be able to independently flip either XDP or GRO states and toggling
one should not affect the other.
Adjust other tests as well that had implicit expectation that GRO would be
automatically enabled.
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit d3256efd8e ("veth: allow enabling NAPI even without XDP") tried to fix
the fact that GRO was not possible without XDP, because veth did not use NAPI
without XDP. However, it also introduced the behaviour that GRO is always
enabled, when XDP is enabled.
While it might be desired for most cases, it is confusing for the user at best
as the GRO flag suddenly changes, when an XDP program is attached. It also
introduces some complexities in state management as was partially addressed in
commit fe9f801355 ("net: veth: clear GRO when clearing XDP even when down").
But the biggest problem is that it is not possible to disable GRO at all, when
an XDP program is attached, which might be needed for some use cases.
Fix this by not touching the GRO flag on XDP enable/disable as the code already
supports switching to NAPI if either GRO or XDP is requested.
Link: https://lore.kernel.org/lkml/20240311124015.38106-1-ignat@cloudflare.com/
Fixes: d3256efd8e ("veth: allow enabling NAPI even without XDP")
Fixes: fe9f801355 ("net: veth: clear GRO when clearing XDP even when down")
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The missing check of x->encap caused to the situation where GSO packets
were created with UDP encapsulation.
As a solution return the encap check for non-offloaded SA.
Fixes: 983a73da1f ("xfrm: Pass UDP encapsulation in TX packet offload")
Closes: https://lore.kernel.org/all/a650221ae500f0c7cf496c61c96c1b103dcb6f67.camel@redhat.com
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
When the skb is reorganized during esp_output (!esp->inline), the pages
coming from the original skb fragments are supposed to be released back
to the system through put_page. But if the skb fragment pages are
originating from a page_pool, calling put_page on them will trigger a
page_pool leak which will eventually result in a crash.
This leak can be easily observed when using CONFIG_DEBUG_VM and doing
ipsec + gre (non offloaded) forwarding:
BUG: Bad page state in process ksoftirqd/16 pfn:1451b6
page:00000000de2b8d32 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1451b6000 pfn:0x1451b6
flags: 0x200000000000000(node=0|zone=2)
page_type: 0xffffffff()
raw: 0200000000000000 dead000000000040 ffff88810d23c000 0000000000000000
raw: 00000001451b6000 0000000000000001 00000000ffffffff 0000000000000000
page dumped because: page_pool leak
Modules linked in: ip_gre gre mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay zram zsmalloc fuse [last unloaded: mlx5_core]
CPU: 16 PID: 96 Comm: ksoftirqd/16 Not tainted 6.8.0-rc4+ #22
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x36/0x50
bad_page+0x70/0xf0
free_unref_page_prepare+0x27a/0x460
free_unref_page+0x38/0x120
esp_ssg_unref.isra.0+0x15f/0x200
esp_output_tail+0x66d/0x780
esp_xmit+0x2c5/0x360
validate_xmit_xfrm+0x313/0x370
? validate_xmit_skb+0x1d/0x330
validate_xmit_skb_list+0x4c/0x70
sch_direct_xmit+0x23e/0x350
__dev_queue_xmit+0x337/0xba0
? nf_hook_slow+0x3f/0xd0
ip_finish_output2+0x25e/0x580
iptunnel_xmit+0x19b/0x240
ip_tunnel_xmit+0x5fb/0xb60
ipgre_xmit+0x14d/0x280 [ip_gre]
dev_hard_start_xmit+0xc3/0x1c0
__dev_queue_xmit+0x208/0xba0
? nf_hook_slow+0x3f/0xd0
ip_finish_output2+0x1ca/0x580
ip_sublist_rcv_finish+0x32/0x40
ip_sublist_rcv+0x1b2/0x1f0
? ip_rcv_finish_core.constprop.0+0x460/0x460
ip_list_rcv+0x103/0x130
__netif_receive_skb_list_core+0x181/0x1e0
netif_receive_skb_list_internal+0x1b3/0x2c0
napi_gro_receive+0xc8/0x200
gro_cell_poll+0x52/0x90
__napi_poll+0x25/0x1a0
net_rx_action+0x28e/0x300
__do_softirq+0xc3/0x276
? sort_range+0x20/0x20
run_ksoftirqd+0x1e/0x30
smpboot_thread_fn+0xa6/0x130
kthread+0xcd/0x100
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x31/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
The suggested fix is to introduce a new wrapper (skb_page_unref) that
covers page refcounting for page_pool pages as well.
Cc: stable@vger.kernel.org
Fixes: 6a5bcd84e8 ("page_pool: Allow drivers to hint on SKB recycling")
Reported-and-tested-by: Anatoli N.Chechelnickiy <Anatoli.Chechelnickiy@m.interpipe.biz>
Reported-by: Ian Kumlien <ian.kumlien@gmail.com>
Link: https://lore.kernel.org/netdev/CAA85sZvvHtrpTQRqdaOx6gd55zPAVsqMYk_Lwh4Md5knTq7AyA@mail.gmail.com
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Remove a unnecessary level of indenting in some areas of the reference
section. No text changes.
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <01f1a407e92b92d9f8614bd34882956694bab123.1710750972.git.linux@leemhuis.info>
A bunch of minor fixes and improvements and two other things:
- Explain the 'v' version prefix when it's first used, but drop it
everywhere in the text for consistency. Also drop single quotes around
a few version numbers.
- Point out that testing a stable/longterm kernel only makes sense if
the series is still supported.
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <f13d203d5975419608996300992eaa2e4fcc2dc1.1710750972.git.linux@leemhuis.info>
Instruct readers to check the taint flag, as the reason why it's set
might directly or indirectly cause the bug or interfere with testing.
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <8fcaffa8e85f36d51178d61061355c3c8bc85a0f.1710750972.git.linux@leemhuis.info>