linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-25 13:41:51 +00:00

Linus Torvalds af9c191ac2

ring-buffer: Updates for v6.12:

- Merged v6.11-rc3 into trace/ring-buffer/core

  The v6.10 ring buffer pull request was not made due to Mathieu Desnoyers making a comment to the pull request. Mathieu and I resolved it on IRC, but we did not let Linus know that it was resolved. Linus did not do the pull thinking it still had some unresolved issues.

  The ring buffer work for 6.12 was dependent on both this pull request as well as the reserve_mem kernel command line option that was going upstream through the memory management tree. The ring buffer repo was being used by others so it could not be rebased. In order to continue the work, the v6.11-rc3 branch was pulled in to get access to the reserve_mem work.

  This has the 6.11 pull request that did not make it into 6.11, which was:

    tracing/ring-buffer: Have persistent buffer across reboots

    This allows for the tracing instance ring buffer to stay persistent across reboots. The way this is done is by adding to the kernel command line:

      trace_instance=boot_map@0x285400000:12M

    This will reserve 12 megabytes at the address 0x285400000, and then map the tracing instance "boot_map" ring buffer to that memory. This will appear as a normal instance in the tracefs system:

      /sys/kernel/tracing/instances/boot_map

    A user could enable tracing in that instance, and on reboot or kernel crash, if the memory is not wiped by the firmware, it will recreate the trace in that instance. For example, if one was debugging a shutdown of a kernel reboot:

      # cd /sys/kernel/tracing
      # echo function > instances/boot_map/current_tracer
      # reboot
      [..]
      # cd /sys/kernel/tracing
      # tail instances/boot_map/trace
      swapper/0-1 [000] d..1. 164.549800: restore_boot_irq_mode <-native_machine_shutdown
      swapper/0-1 [000] d..1. 164.549801: native_restore_boot_irq_mode <-native_machine_shutdown
      swapper/0-1 [000] d..1. 164.549802: disconnect_bsp_APIC <-native_machine_shutdown
      swapper/0-1 [000] d..1. 164.549811: hpet_disable <-native_machine_shutdown
      swapper/0-1 [000] d..1. 164.549812: iommu_shutdown_noop <-native_machine_restart
      swapper/0-1 [000] d..1. 164.549813: native_machine_emergency_restart <-__do_sys_reboot
      swapper/0-1 [000] d..1. 164.549813: tboot_shutdown <-native_machine_emergency_restart
      swapper/0-1 [000] d..1. 164.549820: acpi_reboot <-native_machine_emergency_restart
      swapper/0-1 [000] d..1. 164.549821: acpi_reset <-acpi_reboot
      swapper/0-1 [000] d..1. 164.549822: acpi_os_write_port <-acpi_reboot

    On reboot, the buffer is examined to make sure it is valid. The validation check even steps through every event to make sure the meta data of the event is correct. If any test fails, it will simply reset the buffer, and the buffer will be empty on boot.

The new changes for 6.12 are:

- Allow the tracing persistent boot buffer to use the "reserve_mem" option

  Instead of having the admin find a physical address to store the persistent buffer, which can be very tedious if they have to administrate several different machines, allow them to use the "reserve_mem" option that will find a location for them. It is not as reliable because of KASLR, as the loading of the kernel in different locations can cause the memory allocated to be inconsistent. Booting with "nokaslr" can make reserve_mem more reliable.

- Have function graph tracer handle offsets from a previous boot.

  The ring buffer output from a previous boot may have different addresses due to kaslr. Have the function graph tracer handle these by using the delta from the previous boot to the new boot address space.

- Only reset the saved meta offset when the buffer is started or reset

  In the persistent memory meta data, it holds the previous address space information, so that it can calculate the delta to have function tracing work. But this gets updated after being read to hold the new address space. But if the buffer isn't used for that boot, on reboot, the delta is now calculated from the previous boot and not the boot that holds the data in the ring buffer. This causes the functions not to be shown. Do not save the address space information of the current kernel until it is being recorded.

- Add a magic variable to test the valid meta data

  Add a magic variable in the meta data that can also be used for validation. The validator of the previous buffer doesn't need this magic data, but it can be used if the meta data is changed by a new kernel, which may have the same format that passes the validator but is used differently. This magic number can also be used as a "versioning" of the meta data.

- Align user space mapped ring buffer sub buffers to improve TLB entries

  Linus mentioned that the mapped ring buffer sub buffers were misaligned between the meta page and the sub-buffers, so that if the sub-buffers were bigger than PAGE_SIZE, it wouldn't allow the TLB to use bigger entries.

- Add new kernel command line "traceoff" to disable tracing on boot for instances

  If tracing is enabled for a boot instance, there needs a way to be able to disable it on boot so that new events do not get entered into the ring buffer and be mixed with events from a previous boot, as that can be confusing.

- Allow trace_printk() to go to other instances

  Currently, trace_printk() can only go to the top level instance. When debugging with a persistent buffer, it is really useful to be able to add trace_printk() to go to that buffer, so that you have access to them after a crash.

- Do not use "bin_printk()" for traces to a boot instance

  The bin_printk() saves only a pointer to the printk format in the ring buffer, as the reader of the buffer can still have access to it. But this is not the case if the buffer is from a previous boot. If the trace_printk() is going to a "persistent" buffer, it will use the slower version that writes the printk format into the buffer.

- Add command line option to allow trace_printk() to go to an instance

  Allow the kernel command line to define which instance the trace_printk() goes to, instead of forcing the admin to set it for every boot via the tracefs options.

- Start a document that explains how to use tracefs to debug the kernel

- Add some more kernel selftests to test user mapped ring buffer

Merge tag 'trace-ring-buffer-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer updates from Steven Rostedt.

* tag 'trace-ring-buffer-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (28 commits)
  selftests/ring-buffer: Handle meta-page bigger than the system
  selftests/ring-buffer: Verify the entire meta-page padding
  tracing/Documentation: Start a document on how to debug with tracing
  tracing: Add option to set an instance to be the trace_printk destination
  tracing: Have trace_printk not use binary prints if boot buffer
  tracing: Allow trace_printk() to go to other instance buffers
  tracing: Add "traceoff" flag to boot time tracing instances
  ring-buffer: Align meta-page to sub-buffers for improved TLB usage
  ring-buffer: Add magic and struct size to boot up meta data
  ring-buffer: Don't reset persistent ring-buffer meta saved addresses
  tracing/fgraph: Have fgraph handle previous boot function addresses
  tracing: Allow boot instances to use reserve_mem boot memory
  tracing: Fix ifdef of snapshots to not prevent last_boot_info file
  ring-buffer: Use vma_pages() helper function
  tracing: Fix NULL vs IS_ERR() check in enable_instances()
  tracing: Add last boot delta offset for stack traces
  tracing: Update function tracing output for previous boot buffer
  tracing: Handle old buffer mappings for event strings and functions
  tracing/ring-buffer: Add last_boot_info file to boot instance
  ring-buffer: Save text and data locations in mapped meta data
  ...

2024-09-22 09:47:16 -07:00
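
The commit message says that function addresses recorded on a previous boot are handled "by using the delta from the previous boot to the new boot address space". The arithmetic behind that idea might look like the sketch below. This is an illustration only, not the kernel's implementation: `BootMeta`, `translate`, and every address used here are hypothetical names and made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootMeta:
    """Hypothetical stand-in for the address info saved in the buffer meta data."""
    text_addr: int  # where kernel text was loaded on the boot that wrote the buffer


def translate(addr: int, saved: BootMeta, current_text_addr: int) -> int:
    """Map an address recorded on the previous boot into the current address space."""
    # The delta is the shift between the two boots' kernel text locations.
    delta = current_text_addr - saved.text_addr
    return addr + delta


# Made-up addresses, chosen only to show a KASLR-style shift between boots.
saved = BootMeta(text_addr=0xFFFFFFFF9A000000)
current_text = 0xFFFFFFFF81000000
old_func = saved.text_addr + 0x1234  # a function 0x1234 bytes into kernel text

new_func = translate(old_func, saved, current_text)
print(hex(new_func))
```

The sketch also shows why the saved base must belong to the boot that actually wrote the events, which is the point of the "Only reset the saved meta offset" change: if `saved.text_addr` came from a later boot that recorded nothing, the delta would be computed against the wrong address space and the translated addresses would not match any function.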
bpf	bpf-next-6.12	2024-09-21 09:27:50 -07:00
cgroup	ALong with the usual shower of singleton patches, notable patch series in	2024-09-21 07:29:05 -07:00
configs	mm/slab: Plumb kmem_buckets into __do_kmalloc_node()	2024-07-03 12:24:19 +02:00
debug	kdb: Get rid of redundant kdb_curr_task()	2024-06-21 15:49:29 +01:00
dma	dma-mapping: reflow dma_supported	2024-09-12 16:28:00 +02:00
entry	treewide: context_tracking: Rename CONTEXT_* into CT_STATE_*	2024-07-29 07:33:10 +05:30
events	perf: Fix topology_sibling_cpumask check warning on ARM	2024-09-22 09:03:22 -07:00
futex	fault-inject: improve build for CONFIG_FAULT_INJECTION=n	2024-09-01 20:43:33 -07:00
gcov	gcov: add support for GCC 14	2024-06-15 10:43:06 -07:00
irq	Updates for the interrupt subsystem:	2024-09-17 07:09:17 +02:00
kcsan	kcsan: Use min() to fix Coccinelle warning	2024-08-01 16:40:44 -07:00
livepatch	livepatch: Replace snprintf() with sysfs_emit()	2024-07-02 16:56:18 +02:00
locking	Many singleton patches - please see the various changelogs for details.	2024-09-21 08:20:50 -07:00
module	Updates for KCOV instrumentation on x86:	2024-09-17 12:40:34 +02:00
power	PM: hibernate: Remove unused stub for saveable_highmem_page()	2024-09-10 20:11:40 +02:00
printk	drm next for 6.12-rc1	2024-09-19 10:18:15 +02:00
rcu	slab updates for 6.12	2024-09-18 08:53:53 +02:00
sched	sched_ext: Initial pull request for v6.12	2024-09-21 09:44:57 -07:00
time	In the v6.12 scheduler development cycle we had 63 commits from 18 contributors:	2024-09-19 15:55:58 +02:00
trace	ring-buffer: Updates for v6.12:	2024-09-22 09:47:16 -07:00
.gitignore
acct.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
async.c	async: Use a dedicated unbound workqueue with raised min_active	2024-02-09 11:13:59 -10:00
audit_fsnotify.c
audit_tree.c	fsnotify: create a wrapper fsnotify_find_inode_mark()	2024-04-04 16:24:16 +02:00
audit_watch.c	fsnotify: create a wrapper fsnotify_find_inode_mark()	2024-04-04 16:24:16 +02:00
audit.c	audit: Make use of str_enabled_disabled() helper	2024-09-03 16:35:16 -04:00
audit.h
auditfilter.c	audit: use task_tgid_nr() instead of task_pid_nr()	2024-08-28 16:48:28 -04:00
auditsc.c	audit: use task_tgid_nr() instead of task_pid_nr()	2024-08-28 16:48:28 -04:00
backtracetest.c	backtracetest: add MODULE_DESCRIPTION()	2024-06-24 22:24:55 -07:00
bounds.c	bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS	2024-04-29 08:29:29 -07:00
capability.c
cfi.c
compat.c
configs.c
context_tracking.c	context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching	2024-08-15 21:30:43 +05:30
cpu_pm.c
cpu.c	Updates for timers and timekeeping:	2024-09-17 07:25:37 +02:00
crash_core.c	Document/kexec: generalize crash hotplug description	2024-09-01 20:43:37 -07:00
crash_reserve.c	crash: fix crash memory reserve exceed system memory bug	2024-09-01 20:43:30 -07:00
cred.c	cred: Use KMEM_CACHE() instead of kmem_cache_create()	2024-02-23 17:33:31 -05:00
delayacct.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
dma.c
elfcorehdr.c	crash: remove dependency of FA_DUMP on CRASH_DUMP	2024-02-23 17:48:22 -08:00
exec_domain.c
exit.c	ALong with the usual shower of singleton patches, notable patch series in	2024-09-21 07:29:05 -07:00
exit.h	exit: add internal include file with helpers	2023-09-21 12:03:50 -06:00
extable.c
fail_function.c
fork.c	sched_ext: Initial pull request for v6.12	2024-09-21 09:44:57 -07:00
freezer.c	sched,freezer: Mark TASK_FROZEN special	2024-08-17 11:06:44 +02:00
gen_kheaders.sh	kheaders: use `command -v` to test for existence of `cpio`	2024-05-30 01:13:20 +09:00
groups.c	groups: Convert group_info.usage to refcount_t	2023-09-29 11:28:39 -07:00
hung_task.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
iomem.c
irq_work.c
jump_label.c	jump_label: Fix the fix, brown paper bags galore	2024-07-31 12:57:39 +02:00
kallsyms_internal.h	kallsyms: get rid of code for absolute kallsyms	2024-07-20 16:33:21 +09:00
kallsyms_selftest.c	kallsyms: Match symbols exactly with CONFIG_LTO_CLANG	2024-08-15 09:33:35 -07:00
kallsyms_selftest.h
kallsyms.c	kallsyms: Match symbols exactly with CONFIG_LTO_CLANG	2024-08-15 09:33:35 -07:00
kcmp.c	file: convert to SLAB_TYPESAFE_BY_RCU	2023-10-19 11:02:48 +02:00
Kconfig.freezer
Kconfig.hz
Kconfig.kexec	crash: clean up kdump related config items	2024-02-23 17:48:22 -08:00
Kconfig.locks
Kconfig.preempt	sched_ext: Build fix on !CONFIG_STACKTRACE[_SUPPORT]	2024-08-01 07:08:01 -10:00
kcov.c	Updates for KCOV instrumentation on x86:	2024-09-17 12:40:34 +02:00
kexec_core.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
kexec_elf.c
kexec_file.c	kexec_file: fix elfcorehdr digest exclusion when CONFIG_CRASH_HOTPLUG=y	2024-09-01 17:59:01 -07:00
kexec_internal.h	kexec: use atomic_try_cmpxchg_acquire() in kexec_trylock()	2024-09-01 20:43:23 -07:00
kexec.c	crash: add a new kexec flag for hotplug support	2024-04-23 14:59:01 +10:00
kheaders.c
kprobes.c	kprobes: Fix to check symbol prefixes correctly	2024-08-05 14:04:03 +09:00
ksyms_common.c
ksysfs.c	profiling: remove prof_cpu_mask	2024-07-29 10:45:54 -07:00
kthread.c	kthread: Fix task state in kthread worker if being frozen	2024-09-10 09:51:14 +02:00
latencytop.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
Makefile	mm: move kernel/numa.c to mm/	2024-09-03 21:15:26 -07:00
module_signature.c
notifier.c
nsproxy.c	pidfd: add pidfs	2024-03-01 12:23:37 +01:00
padata.c	This update includes the following changes:	2024-09-16 06:28:28 +02:00
panic.c	drm next for 6.12-rc1	2024-09-19 10:18:15 +02:00
params.c	params: Fix multi-line comment style	2023-12-01 09:51:44 -08:00
pid_namespace.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
pid_sysctl.h	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
pid.c	pidfs: remove config option	2024-03-13 12:53:53 -07:00
profile.c	profiling: remove profile=sleep support	2024-08-04 13:36:28 -07:00
ptrace.c	ptrace_attach: shift send(SIGSTOP) into ptrace_set_stopped()	2024-02-22 15:38:52 -08:00
range.c
reboot.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
regset.c	regset: use kvzalloc() for regset_get_alloc()	2024-04-25 21:07:03 -07:00
relay.c	kernel: relay: remove relay_file_splice_read dead code, doesn't work	2023-12-29 12:22:27 -08:00
resource_kunit.c	resource, kunit: add test case for region_intersects()	2024-09-17 01:07:00 -07:00
resource.c	ALong with the usual shower of singleton patches, notable patch series in	2024-09-21 07:29:05 -07:00
rseq.c
scftorture.c	scftorture: Make torture_type static	2024-05-30 15:31:51 -07:00
scs.c
seccomp.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
signal.c	execve updates for v6.12-rc1	2024-09-18 11:53:31 +02:00
smp.c	smp: print only local CPU info when sched_clock goes backward	2024-08-15 00:06:48 +05:30
smpboot.c	kthread: add kthread_stop_put	2023-10-04 10:41:57 -07:00
smpboot.h
softirq.c	softirq: Remove unused 'action' parameter from action callback	2024-08-20 17:13:40 +02:00
stackleak.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
stacktrace.c	stacktrace: fix kernel-doc typo	2023-12-29 12:22:29 -08:00
static_call_inline.c
static_call.c
stop_machine.c	rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()	2024-08-15 21:30:42 +05:30
sys_ni.c	Probes updates for v6.11:	2024-07-18 12:19:20 -07:00
sys.c	In the v6.12 scheduler development cycle we had 63 commits from 18 contributors:	2024-09-19 15:55:58 +02:00
sysctl-test.c	sysctl: Add module description to sysctl-testing	2024-06-03 15:20:37 +02:00
sysctl.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
task_work.c	task_work: make TWA_NMI_CURRENT handling conditional on IRQ_WORK	2024-07-29 12:05:06 -07:00
taskstats.c	taskstats: fill_stats_for_tgid: use for_each_thread()	2023-10-04 10:41:57 -07:00
torture.c	torture: Add MODULE_DESCRIPTION()	2024-05-30 15:31:38 -07:00
tracepoint.c
tsacct.c	tsacct: replace strncpy() with strscpy()	2024-07-12 16:39:53 -07:00
ucount.c	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
uid16.c
uid16.h
umh.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
up.c	smp: Change function signatures to use call_single_data_t	2023-09-13 14:59:24 +02:00
user_namespace.c	user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation	2024-09-09 16:47:42 -07:00
user-return-notifier.c
user.c	uidgid: make sure we fit into one cacheline	2024-09-12 12:16:09 +02:00
usermode_driver.c
utsname_sysctl.c	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
utsname.c
vhost_task.c	vhost_task: Handle SIGKILL by flushing work and exiting	2024-05-22 08:31:15 -04:00
vmcore_info.c	mm: support only one page_type per page	2024-09-03 21:15:43 -07:00
watch_queue.c	watch_queue: fix kcalloc() arguments order	2023-12-21 13:17:54 +01:00
watchdog_buddy.c
watchdog_perf.c	watchdog/perf: properly initialize the turbo mode timestamp and rearm counter	2024-07-17 21:11:34 -07:00
watchdog.c	watchdog: handle the ENODEV failure case of lockup_detector_delay_init() separately	2024-09-01 20:43:32 -07:00
workqueue_internal.h
workqueue.c	workqueue: Changes for v6.12	2024-09-18 06:59:44 +02:00