linux/kernel/trace
Martin KaFai Lau a7636d9ecf kprobes: Optimize hot path by using percpu counter to collect 'nhit' statistics
When doing ebpf+kprobe on some hot TCP functions (e.g.
tcp_rcv_established), the kprobe_dispatcher() function
shows up in 'perf report'.

In kprobe_dispatcher(), there is a lot of cache bouncing
on 'tk->nhit++'.  'tk->nhit' and 'tk->tp.flags' also share
the same cacheline.

perf report (cycles:pp):

	8.30%  ipv4_dst_check
	4.74%  copy_user_enhanced_fast_string
	3.93%  dst_release
	2.80%  tcp_v4_rcv
	2.31%  queued_spin_lock_slowpath
	2.30%  _raw_spin_lock
	1.88%  mlx4_en_process_rx_cq
	1.84%  eth_get_headlen
	1.81%  ip_rcv_finish
	~~~~
	1.71%  kprobe_dispatcher
	~~~~
	1.55%  mlx4_en_xmit
	1.09%  __probe_kernel_read

perf report after patch:

	9.15%  ipv4_dst_check
	5.00%  copy_user_enhanced_fast_string
	4.12%  dst_release
	2.96%  tcp_v4_rcv
	2.50%  _raw_spin_lock
	2.39%  queued_spin_lock_slowpath
	2.11%  eth_get_headlen
	2.03%  mlx4_en_process_rx_cq
	1.69%  mlx4_en_xmit
	1.19%  ip_rcv_finish
	1.12%  __probe_kernel_read
	1.02%  ehci_hcd_cleanup

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Kernel Team <kernel-team@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454531308-2441898-1-git-send-email-kafai@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09 11:08:58 +01:00
..
blktrace.c convert a bunch of open-coded instances of memdup_user_nul() 2016-01-04 10:26:58 -05:00
bpf_trace.c perf/bpf: Convert perf_event_array to use struct file 2016-01-29 08:35:25 +01:00
ftrace.c ftrace: Fix the race between ftrace and insmod 2016-01-07 15:56:21 -05:00
Kconfig Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-10 18:11:41 -08:00
Makefile bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable 2015-04-02 16:28:06 +02:00
power-traces.c PM / sleep: export suspend_resume trace event 2015-01-30 02:10:41 +01:00
ring_buffer_benchmark.c ring_buffer: Remove unneeded smp_wmb() before wakeup of reader benchmark 2015-11-03 16:19:02 -05:00
ring_buffer.c ring-buffer: Process commits whenever moving to a new page. 2015-11-25 15:24:05 -05:00
rpm-traces.c
trace_benchmark.c tracing: Only benchmark the time tracepoints take if tracing is on 2015-11-02 13:34:58 -05:00
trace_benchmark.h
trace_branch.c tracing: Remove {start,stop}_branch_trace 2015-10-21 10:10:09 -04:00
trace_clock.c tracing: Export tracing clock functions 2015-05-12 15:56:57 -04:00
trace_entries.h tracing: %pF is only for function pointers 2015-03-25 08:57:22 -04:00
trace_event_perf.c Not much new with tracing for this release. Mostly just clean ups and 2016-01-12 20:04:15 -08:00
trace_events_filter_test.h
trace_events_filter.c tracing: is_legal_op() can return boolean 2015-11-02 14:26:51 -05:00
trace_events_trigger.c Not much new with tracing for this release. Mostly just clean ups and 2016-01-12 20:04:15 -08:00
trace_events.c kernel/*: switch to memdup_user_nul() 2016-01-04 10:27:55 -05:00
trace_export.c tracing: ftrace_event_is_function() can return boolean 2015-11-02 14:28:05 -05:00
trace_functions_graph.c tracing: Remove unused ftrace_cpu_disabled per cpu variable 2015-11-07 13:25:14 -05:00
trace_functions.c tracing/trivial: Fix typos and make an int into a bool 2014-11-20 10:05:36 -05:00
trace_irqsoff.c tracing: report_latency() in trace_irqsoff.c can return boolean 2015-11-02 14:20:19 -05:00
trace_kdb.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_kprobe.c kprobes: Optimize hot path by using percpu counter to collect 'nhit' statistics 2016-02-09 11:08:58 +01:00
trace_mmiotrace.c tracing: Pass trace_array into trace_buffer_unlock_commit() 2015-09-25 17:38:44 -04:00
trace_nop.c tracing: Remove unneeded includes of debugfs.h and fs.h 2015-01-22 11:19:48 -05:00
trace_output.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_output.h tracing: Turn seq_print_user_ip() into a static function 2015-09-28 10:16:12 -04:00
trace_printk.c tracing: Fix setting of start_index in find_next() 2016-01-04 15:22:47 -05:00
trace_probe.c trace: Don't use __weak in header files 2015-03-25 08:57:23 -04:00
trace_probe.h kernel/trace_probe: is_good_name can be boolean 2015-09-22 13:11:30 -04:00
trace_sched_switch.c sched/core: Fix trace_sched_switch() 2015-10-06 17:08:15 +02:00
trace_sched_wakeup.c Most of the changes are clean ups and small fixes. Some of them have 2015-11-06 13:30:20 -08:00
trace_selftest_dynamic.c
trace_selftest.c Seems that Peter Zijlstra added a new check that is making old 2014-10-12 07:28:55 -04:00
trace_seq.c tracing: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
trace_stack.c tracing/stacktrace: Show entire trace if passed in function not found 2016-01-29 12:19:10 -05:00
trace_stat.c tracing: Convert the tracing facility over to use tracefs 2015-02-03 12:48:41 -05:00
trace_stat.h
trace_syscalls.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_uprobe.c tracing/uprobes: Do not print '0x (null)' when offset is 0 2015-08-26 10:43:01 -03:00
trace.c tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs() 2016-01-14 09:28:19 -05:00
trace.h tracing: Fix comment to use tracing_on over tracing_enable 2015-12-23 14:27:25 -05:00