linux/kernel/trace
Zheng Yejian 7e42907f3a ring-buffer: Fix deadloop issue on reading trace_pipe
Soft lockup occurs when reading file 'trace_pipe':

  watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
  [...]
  RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
  RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
  RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
  RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
  RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
  R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
  R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
  [...]
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   __find_next_entry+0x1a8/0x4b0
   ? peek_next_entry+0x250/0x250
   ? down_write+0xa5/0x120
   ? down_write_killable+0x130/0x130
   trace_find_next_entry_inc+0x3b/0x1d0
   tracing_read_pipe+0x423/0xae0
   ? tracing_splice_read_pipe+0xcb0/0xcb0
   vfs_read+0x16b/0x490
   ksys_read+0x105/0x210
   ? __ia32_sys_pwrite64+0x200/0x200
   ? switch_fpu_return+0x108/0x220
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x61/0xc6

Through the vmcore, I found it's because in tracing_read_pipe(),
ring_buffer_empty_cpu() found some buffer is not empty but then it
cannot read anything due to "rb_num_of_entries() == 0" always true,
Then it infinitely loop the procedure due to user buffer not been
filled, see following code path:

  tracing_read_pipe() {
    ... ...
    waitagain:
      tracing_wait_pipe() // 1. find non-empty buffer here
      trace_find_next_entry_inc()  // 2. loop here try to find an entry
        __find_next_entry()
          ring_buffer_empty_cpu();  // 3. find non-empty buffer
          peek_next_entry()  // 4. but peek always return NULL
            ring_buffer_peek()
              rb_buffer_peek()
                rb_get_reader_page()
                  // 5. because rb_num_of_entries() == 0 always true here
                  //    then return NULL
      // 6. user buffer not been filled so goto 'waitgain'
      //    and eventually leads to an deadloop in kernel!!!
  }

By some analyzing, I found that when resetting ringbuffer, the 'entries'
of its pages are not all cleared (see rb_reset_cpu()). Then when reducing
the ringbuffer, and if some reduced pages exist dirty 'entries' data, they
will be added into 'cpu_buffer->overrun' (see rb_remove_pages()), which
cause wrong 'overrun' count and eventually cause the deadloop issue.

To fix it, we need to clear every pages in rb_reset_cpu().

Link: https://lore.kernel.org/linux-trace-kernel/20230708225144.3785600-1-zhengyejian1@huawei.com

Cc: stable@vger.kernel.org
Fixes: a5fb833172 ("ring-buffer: Fix uninitialized read_stamp")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-12 12:15:26 -04:00
..
rv tracing/tools: Updates for 6.4 2023-04-28 16:11:26 -07:00
blktrace.c block: remove more NULL checks after bdev_get_queue() 2023-02-21 09:23:22 -07:00
bpf_trace.c Probes updates for v6.5: 2023-06-30 10:44:53 -07:00
bpf_trace.h
error_report-traces.c
fgraph.c tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
fprobe.c tracing/probes: Add fprobe events for tracing function entry and exit. 2023-06-06 21:39:55 +09:00
ftrace_internal.h tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
ftrace.c Tracing updates for 6.5: 2023-06-30 10:33:17 -07:00
Kconfig Probes updates for v6.5: 2023-06-30 10:44:53 -07:00
kprobe_event_gen_test.c tracing: Fix wrong return in kprobe_event_gen_test.c 2023-03-19 12:20:48 -04:00
Makefile tracing/probes: Add fprobe events for tracing function entry and exit. 2023-06-06 21:39:55 +09:00
pid_list.c tracing: Cleanup double word in comment 2022-04-26 17:58:50 -04:00
pid_list.h tracing: Create a sparse bitmask for pid filtering 2021-10-05 17:38:45 -04:00
power-traces.c
preemptirq_delay_test.c
rethook.c fprobe: Pass return address to the handlers 2023-06-06 21:39:55 +09:00
ring_buffer_benchmark.c ring_buffer: Remove unused "event" parameter 2022-11-23 19:08:30 -05:00
ring_buffer.c ring-buffer: Fix deadloop issue on reading trace_pipe 2023-07-12 12:15:26 -04:00
rpm-traces.c
synth_event_gen_test.c tracing: Always use canonical ftrace path 2023-02-18 14:34:09 -05:00
trace_benchmark.c tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_benchmark.h tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_boot.c tracing/boot: Test strscpy() against less than zero for error 2023-07-05 10:30:49 -04:00
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: Free buffers when a used dynamic event is removed 2022-11-23 19:07:12 -05:00
trace_dynevent.h
trace_entries.h function_graph: Support recording and printing the return value of function 2023-06-20 18:38:37 -04:00
trace_eprobe.c tracing/probes: Move event parameter fetching code to common parser 2023-06-06 21:39:56 +09:00
trace_event_perf.c tracing/perf: Use strndup_user instead of kzalloc/strncpy_from_user 2022-11-23 19:08:31 -05:00
trace_events_filter_test.h
trace_events_filter.c tracing: Add a way to filter function addresses to function names 2023-01-25 10:31:11 -05:00
trace_events_hist.c tracing: Rename stacktrace field to common_stacktrace 2023-05-23 23:38:23 -04:00
trace_events_inject.c tracing: Replace all non-returning strlcpy with strscpy 2023-05-26 13:52:19 -07:00
trace_events_synth.c tracing/synthetic: Make lastcmd_mutex static 2023-04-06 15:08:18 -04:00
trace_events_trigger.c tracing: Do not synchronize freeing of trigger filter on boot up 2022-12-14 08:50:56 -05:00
trace_events_user.c tracing/user_events: Fix struct arg size match check 2023-07-10 21:38:13 -04:00
trace_events.c hardening updates for v6.5-rc1 2023-06-27 21:24:18 -07:00
trace_export.c tracing: Fix TASK_COMM_LEN in trace event format file 2023-02-12 10:23:39 -05:00
trace_fprobe.c tracing/probes: Add $arg* meta argument for all function args 2023-06-06 21:39:56 +09:00
trace_functions_graph.c function_graph: Support recording and printing the return value of function 2023-06-20 18:38:37 -04:00
trace_functions.c ftrace: disable preemption when recursion locked 2021-10-27 11:21:49 -04:00
trace_hwlat.c tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr 2023-03-19 13:23:22 -04:00
trace_irqsoff.c
trace_kdb.c
trace_kprobe_selftest.c tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
trace_kprobe_selftest.h
trace_kprobe.c Probes updates for v6.5: 2023-06-30 10:44:53 -07:00
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing/timerlat: Add user-space interface 2023-06-22 10:39:56 -04:00
trace_output.c Tracing updates for 6.5: 2023-06-30 10:33:17 -07:00
trace_output.h tracing: Add "fields" option to show raw trace event fields 2023-03-29 06:52:08 -04:00
trace_preemptirq.c cpuidle: tracing, preempt: Squash _rcuidle tracing 2023-01-31 15:01:46 +01:00
trace_printk.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_probe_kernel.h kernel/trace: Provide default impelentations defined in trace_probe_tmpl.h 2023-02-24 09:44:27 +09:00
trace_probe_tmpl.h kernel/trace: extract common part in process_fetch_insn 2023-02-24 10:04:13 +09:00
trace_probe.c Probes updates for v6.5: 2023-06-30 10:44:53 -07:00
trace_probe.h tracing/probes: Add BTF retval type support 2023-06-06 21:39:56 +09:00
trace_recursion_record.c tracing: Use trace_create_file() to simplify creation of tracefs entries 2022-05-26 21:12:52 -04:00
trace_sched_switch.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_sched_wakeup.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_selftest_dynamic.c
trace_selftest.c tracing: Have function_graph selftest call cond_resched() 2023-05-28 21:15:46 -04:00
trace_seq.c tracing: Acquire buffer from temparary trace sequence 2023-02-07 12:42:54 -05:00
trace_stack.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_stat.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_stat.h
trace_synth.h tracing: Allow synthetic events to pass around stacktraces 2023-01-25 10:31:24 -05:00
trace_syscalls.c tracing: Remove unused __bad_type_size() method 2022-11-17 20:21:06 -05:00
trace_uprobe.c tracing/probes: Move event parameter fetching code to common parser 2023-06-06 21:39:56 +09:00
trace.c Tracing fixes for 6.5: 2023-07-06 19:07:15 -07:00
trace.h Probes updates for v6.5: 2023-06-30 10:44:53 -07:00
tracing_map.c tracing: Remove unused variable 'dups' 2022-10-03 12:20:31 -04:00
tracing_map.h