Porting ftrace to the marker infrastructure.
Don't need to chain to the wakeup tracer from the sched tracer, because markers
support multiple probes connected.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
To support the forthcoming "immediate values" marker optimization, we must have
a way to declare markers in few code paths that does not use instruction
modification based enable. This will be the case of printk(), some traps and
eventually lockdep instrumentation.
Changelog :
- Fix reversed boolean logic of "generic".
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Denys Vlasenko <vda.linux@googlemail.com> :
> Not in this patch, but I noticed:
>
> #define __trace_mark(name, call_private, format, args...) \
> do { \
> static const char __mstrtab_##name[] \
> __attribute__((section("__markers_strings"))) \
> = #name "\0" format; \
> static struct marker __mark_##name \
> __attribute__((section("__markers"), aligned(8))) = \
> { __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
> 0, 0, marker_probe_cb, \
> { __mark_empty_function, NULL}, NULL }; \
> __mark_check_format(format, ## args); \
> if (unlikely(__mark_##name.state)) { \
> (*__mark_##name.call) \
> (&__mark_##name, call_private, \
> format, ## args); \
> } \
> } while (0)
>
> In this call:
>
> (*__mark_##name.call) \
> (&__mark_##name, call_private, \
> format, ## args); \
>
> you make gcc allocate duplicate format string. You can use
> &__mstrtab_##name[sizeof(#name)] instead since it holds the same string,
> or drop ", format," above and "const char *fmt" from here:
>
> void (*call)(const struct marker *mdata, /* Probe wrapper */
> void *call_private, const char *fmt, ...);
>
> since mdata->format is the same and all callees which need it can take it there.
Very good point. I actually thought about dropping it, since it would
remove an unnecessary argument from the stack. And actually, since I now
have the marker_probe_cb sitting between the marker site and the
callbacks, there is no API change required. Thanks :)
Mathieu
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently there is no protection from the root user to use up all of
memory for trace buffers. If the root user allocates too many entries,
the OOM killer might start kill off all tasks.
This patch adds an algorith to check the following condition:
pages_requested > (freeable_memory + current_trace_buffer_pages) / 4
If the above is met then the allocation fails. The above prevents more
than 1/4th of freeable memory from being used by trace buffers.
To determine the freeable_memory, I made determine_dirtyable_memory in
mm/page-writeback.c global.
Special thanks goes to Peter Zijlstra for suggesting the above calculation.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Refactor code from tracing_read_pipe() and create trace_seq_to_user().
Moved trace_seq_reset() call before iter->trace->read() call so that
when all leftover data is returned, trace_seq is reset automatically.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Due to debug hooks in the kernel that can change the way smp_processor_id
works, use raw_smp_processor_id in mcount called functions (namely
ftrace_record_ip). Currently we annotate most debug functions from calling
mcount, but we should not rely on that to prevent kernel lockups.
This patch uses the raw_smp_processor_id to prevent a recusive crash
that can happen if a debug hook in smp_processor_id calls mcount.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In resetting the iterator in read_pipe, the reset of pos was
postitioned in the wrong location with respect to the memset
operation. The current code sets pos, incorrectly, to zero.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
text_poke is sleepable.
The original fix by Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds a method for open_pipe and open_read to the pluggins
so that they can add a header to the trace pipe call.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch sets up the infrastructure to record overruns of the tracing
buffer.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Without CONFIG_DYNAMIC_FTRACE, mark_rodata_ro() would mark a wrong
number of pages as no-execute. The bug was introduced in the patch
"ftrace: dont write protect kernel text". The symptom was machine reboot
after a CPU hotplug.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In cleaning up of the sched_switch code, the function trace recording
of task comms was removed. This patch adds back the recording of comms
for function trace. The output of ftrace now has the task comm instead
of <...>.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is first installment of adding documentation to the ftrace.
Expect many more patches of this kind in the near future.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently ftrace allocates a trace buffer for every possible CPU.
Work is being done to change it to only online CPUs and add hooks
to hotplug CPUS.
This patch lays out the infrastructure for such a change.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andrew Morton suggested using strict_strtoul over simple_strtoul.
This patch replaces them in ftrace.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andrew Morton mentioned some clean ups that should be done to ftrace.
This patch does some of the simple clean ups.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
With the introduction of ftrace, it is possible to recurse into
the lockdep functions via the mcount call. To prevent possible
lockups, updating the lockdep_recursion counter on grabbing the internal
lockdep_lock should prevent deadlocks.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds /debug/tracing/trace_entries that allows users to
see as well as modify the number of trace entries the buffers hold.
The number of entries only increments in ENTRIES_PER_PAGE which is
calculated by the size of an entry with the number of entries that
can fit in a page. The user does not need to use an exact size, but
the entries will be rounded to one of the increments.
Trying to set the entries to 0 will return with -EINVAL.
To avoid race conditions, the modification of the buffer size can only
be done when tracing is completely disabled (current_tracer == none).
A info message will be printed if a user tries to modify the buffer size
when not set to none.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The wakeup selftest used an internal API for setting the test task priority.
This patch fixes it to use the proper API for performing such a task.
Thanks goes to Randy Dunlap for pointing out this build failure.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch implements "NONBLOCK" for trace_pipe. If the trace_pipe is opened
with O_NONBLOCK, then the trace_pipe read will not block when buffer is empty.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Break out of while loop with EOF when the current_trace changes.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The trace iterator is reset in the read. We still need to restore the tracer
that the trace_pipe was opened with.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We expect things like "cat" to block on reads to trace_pipe. That's what
trace_pipe is for.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The conversion of task states to a character in the sched_switch tracer (part
of latency tracer infrastructure), seems to be incorrect. We currently do it
by indexing into the state_to_char array using the state value. The state
values do not map directly into the array index and are thus incorrect. The
following patch addresses this issue. This is also what is being done even
in the show_task() routine in kernel/sched.c
The patch has been compile and run tested.
Signed-off-by: Ankita Garg <ankita@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Provides kernel modules a way to register custom page fault handlers.
On every page fault this will call a list of registered functions. The
functions may handle the fault and force do_page_fault() to return
immediately.
This functionality is similar to the now removed page fault notifiers.
Custom page fault handlers are used by debugging and reverse engineering
tools. Mmiotrace is one such tool and a patch to add it into the tree
will follow.
The custom page fault handlers are called earlier in do_page_fault()
than the page fault notifiers were.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Dynamic ftrace cant work when the kernel has its text write protected.
This patch keeps the kernel from being write protected when
dynamic ftrace is in place.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The fault label to jump to on fault of updating the code was misplaced
preventing the fault from being recorded.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is a slight race condition in the selftest where the max update
of the wakeup and irqs/preemption off tests can be doing a max update as
the buffers are being tested. If this happens the system can crash with
a GPF.
This patch adds the max update spinlock around the checking of the
buffers to prevent such a race.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If the trace output changes on reading the trace files, there is a chance
that the start function will return NULL. If the start function of a sequence
returns NULL the stop equivalent is not called. In this case, all locks
that are taken must be released even if they are released in the stop function.
This patch fixes a case that a mutex was not released on return of NULL
in the start sequence function.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When dynamic ftrace fails and sets itself disabled, the ftraced daemon
will go back to sleep everytime it wakes up. The setting of the
ftraced state to UNINTERRUPTIBLE is skipped in this process, and the
daemon takes up 100% of the CPU. This patch makes sure the ftraced daemon
sets itself to UNINTERRUPTIBLE in that loop.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch removes the "notrace" annotation from lockdep and adds the debugging
files in the kernel director to those that should not be compiled with
"-pg" mcount tracing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The debug functions in spin_lock debugging pollute the output of the
function tracer. This patch adds the debug files in the lib director
to those that should not be compiled with mcount tracing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Lock debugging enabled cause huge performance problems for tracing. Having
the lock verification happening for every function that is called
because mcount calls spin_lock can cripple the system.
This patch converts the spin_locks used by ftrace into raw_spin_locks.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch changes the use of __get_cpu_var to explicitly calling
raw_smp_processor_id and using the per_cpu() macro. On some debug
configurations, the use of __get_cpu_var may cause ftrace to trigger
and this can cause problems with the irqsoff tracing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
With the adding of the configuration changes in the Makefile to prevent
tracing of functions in the ftrace code, all tracing of all the ftrace
code has been removed. Unfortunately, one of the selftests, relied on
a function to be traced. With the new change, the function was no longer
traced and the test failed.
This patch separates out the test function into its own file so that
we can add the "-pg" flag to the compilation of that function and the
adding of the mcount call to that function.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The selftest validation code checks for valid entries in the trace buffer.
TRACE_STACK and TRACE_SPECIAL have been added to the code but not to
the validator. This patch adds the two to prevent them from flagging a
failure in the selftest.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
printk called from wakeup critical timings and irqs off can
cause deadlocks since printk might do a wakeup itself. If the
call to printk happens with the runqueue lock held, it can
deadlock.
This patch protects the printk from being called in trace irqs off
with a test to see if the runqueue for the current CPU is locked.
If it is locked, the printk is skipped.
The wakeup always holds the runqueue lock, so the printk is
simply removed.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
trace_function is called by mcount and calling wake_up from that
can have unpredictable results. This patch removes the wakeup from
trace_function.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>