Commit Graph

843 Commits

Author SHA1 Message Date
Steven Rostedt (Red Hat)
0fc1b09ff1 tracing: Use temp buffer when filtering events
Filtering of events requires the data to be written to the ring buffer
before it can be decided to filter or not. This is because the parameters of
the filter are based on the result that is written to the ring buffer and
not on the parameters that are passed into the trace functions.

The ftrace ring buffer is optimized for writing into the ring buffer and
committing. The discard procedure used when filtering decides the event
should be discarded is much more heavy weight. Thus, using a temporary
filter when filtering events can speed things up drastically.

Without a temp buffer we have:

 # trace-cmd start -p nop
 # perf stat -r 10 hackbench 50
       0.790706626 seconds time elapsed ( +-  0.71% )

 # trace-cmd start -e all
 # perf stat -r 10 hackbench 50
       1.566904059 seconds time elapsed ( +-  0.27% )

 # trace-cmd start -e all -f 'common_preempt_count==20'
 # perf stat -r 10 hackbench 50
       1.690598511 seconds time elapsed ( +-  0.19% )

 # trace-cmd start -e all -f 'common_preempt_count!=20'
 # perf stat -r 10 hackbench 50
       1.707486364 seconds time elapsed ( +-  0.30% )

The first run above is without any tracing, just to get a based figure.
hackbench takes ~0.79 seconds to run on the system.

The second run enables tracing all events where nothing is filtered. This
increases the time by 100% and hackbench takes 1.57 seconds to run.

The third run filters all events where the preempt count will equal "20"
(this should never happen) thus all events are discarded. This takes 1.69
seconds to run. This is 10% slower than just committing the events!

The last run enables all events and filters where the filter will commit all
events, and this takes 1.70 seconds to run. The filtering overhead is
approximately 10%. Thus, the discard and commit of an event from the ring
buffer may be about the same time.

With this patch, the numbers change:

 # trace-cmd start -p nop
 # perf stat -r 10 hackbench 50
       0.778233033 seconds time elapsed ( +-  0.38% )

 # trace-cmd start -e all
 # perf stat -r 10 hackbench 50
       1.582102692 seconds time elapsed ( +-  0.28% )

 # trace-cmd start -e all -f 'common_preempt_count==20'
 # perf stat -r 10 hackbench 50
       1.309230710 seconds time elapsed ( +-  0.22% )

 # trace-cmd start -e all -f 'common_preempt_count!=20'
 # perf stat -r 10 hackbench 50
       1.786001924 seconds time elapsed ( +-  0.20% )

The first run is again the base with no tracing.

The second run is all tracing with no filtering. It is a little slower, but
that may be well within the noise.

The third run shows that discarding all events only took 1.3 seconds. This
is a speed up of 23%! The discard is much faster than even the commit.

The one downside is shown in the last run. Events that are not discarded by
the filter will take longer to add, this is due to the extra copy of the
event.

Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-05-03 17:59:24 -04:00
Steven Rostedt (Red Hat)
904d1857ad tracing: Remove unused function trace_current_buffer_lock_reserve()
trace_current_buffer_lock_reserve() has no more users. Remove it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 18:11:54 -04:00
Steven Rostedt (Red Hat)
33fddff24d tracing: Have trace_buffer_unlock_commit() call the _regs version with NULL
There's no real difference between trace_buffer_unlock_commit() and
trace_buffer_unlock_commit_regs() except that the former passes NULL to
ftrace_stack_trace() instead of regs. Have the former be a static inline of
the latter which passes NULL for regs.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 17:44:01 -04:00
Steven Rostedt (Red Hat)
a9fe48dcde tracing: Remove unused function trace_current_buffer_discard_commit()
The function trace_current_buffer_discard_commit() has no callers, remove
it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 16:14:13 -04:00
Steven Rostedt (Red Hat)
fa66ddb870 tracing: Move trace_buffer_unlock_commit{_regs}() to local header
The functions trace_buffer_unlock_commit() and the _regs() version are only
used within the kernel/trace directory. Move them to the local header and
remove the export as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 16:14:12 -04:00
Steven Rostedt (Red Hat)
9cbb1506ab tracing: Fold filter_check_discard() into its only user
The function filter_check_discard() is small and only called by one user,
its code can be folded into that one caller and make the code a bit less
comlplex.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 16:14:08 -04:00
Steven Rostedt (Red Hat)
65da9a0a3b tracing: Make filter_check_discard() local
Nothing outside of the tracing directory calls filter_check_discard() or
check_filter_check_discard(). They should not be called by modules. Move
their prototypes into the local tracing header and remove their
EXPORT_SYMBOL() macros.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-27 10:13:46 -04:00
Wang Xiaoqiang
4afe6495e5 tracing: Don't use the address of the buffer array name in copy_from_user
With the following code snippet:

    ...
    char buf[64];
    ...
    if (copy_from_user(&buf, ubuf, cnt))
    ...

Even though the value of "&buf" equals "buf", but there is no need
to get the address of the "buf" again. Use "buf" instead of "&buf".

Link: http://lkml.kernel.org/r/20160418152329.18b72bea@debian

Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-26 14:42:03 -04:00
Steven Rostedt (Red Hat)
205506228b tracing: Do not inherit event-fork option for instances
As the event-fork option requires doing work when enabled and disabled, it
can not be passed down to created instances. The instance must clear this
flag when it is created, and must clear it when its removed.

As more options may be created with this need, a macro ZEROED_TRACE_FLAGS is
created that holds the flags that must not be inherited by the top level
instance, and must be cleared on removal of instances.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-25 22:40:12 -04:00
Namhyung Kim
4b94f5b7b4 tracing: Add hist trigger 'log2' modifier
Allow users to have numeric fields displayed as log2 values in case
value range is very wide by appending '.log2' to field names.

For example,

  # echo 'hist:key=bytes_req' > kmalloc/trigger
  # cat kmalloc/hist

  { bytes_req:        504 } hitcount:          1
  { bytes_req:         11 } hitcount:          1
  { bytes_req:        104 } hitcount:          1
  { bytes_req:         48 } hitcount:          1
  { bytes_req:       2048 } hitcount:          1
  { bytes_req:       4096 } hitcount:          1
  { bytes_req:        240 } hitcount:          1
  { bytes_req:        392 } hitcount:          1
  { bytes_req:         13 } hitcount:          1
  { bytes_req:         28 } hitcount:          1
  { bytes_req:         12 } hitcount:          1
  { bytes_req:         64 } hitcount:          2
  { bytes_req:        128 } hitcount:          2
  { bytes_req:         32 } hitcount:          2
  { bytes_req:          8 } hitcount:         11
  { bytes_req:         10 } hitcount:         13
  { bytes_req:         24 } hitcount:         25
  { bytes_req:        160 } hitcount:         29
  { bytes_req:         16 } hitcount:         33
  { bytes_req:         80 } hitcount:         36

When using '.log2' modifier, the output looks like:

  # echo 'hist:key=bytes_req.log2' > kmalloc/trigger
  # cat kmalloc/hist

  { bytes_req: ~ 2^12 } hitcount:          1
  { bytes_req: ~ 2^11 } hitcount:          1
  { bytes_req: ~ 2^9  } hitcount:          2
  { bytes_req: ~ 2^6  } hitcount:          3
  { bytes_req: ~ 2^3  } hitcount:         13
  { bytes_req: ~ 2^5  } hitcount:         19
  { bytes_req: ~ 2^8  } hitcount:         49
  { bytes_req: ~ 2^7  } hitcount:         57
  { bytes_req: ~ 2^4  } hitcount:         74

Link: http://lkml.kernel.org/r/7ff396b246c6a881f46b979735fddf05a0d6c71a.1457029949.git.tom.zanussi@linux.intel.com

Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 18:56:03 -04:00
Tom Zanussi
5463bfda32 tracing: Add support for named hist triggers
Allow users to define 'named' hist triggers.  All triggers created
with the same 'name=xxx' option will update the same shared histogram
data.

This expands the hist trigger syntax from this:

    # echo hist:keys=xxx ... [ if filter] > event/trigger

to this:

    # echo hist:name=xxx:keys=xxx ... [ if filter] > event/trigger

Named histograms must use a 'compatible' set of keys and values, which
means each event added to a set of named triggers must have the same
names and types.

Reading the 'hist' file of any of the participating events will
produce the same output as any other participating event, which is to
be expected since they share the same data.

Link: http://lkml.kernel.org/r/1dbc84ee3322a75daaf5b3ef1d0cc0a2fb682fc7.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 18:56:01 -04:00
Tom Zanussi
52a7f16ded tracing: Add support for multiple hist triggers per event
Allow users to define any number of hist triggers per trace event.
Any number of hist triggers may be added for a given event, which may
differ by key, value, or filter.

Reading the event's 'hist' file will display the output of all the
hist triggers defined on an event concatenated in the order they were
defined.

Link: http://lkml.kernel.org/r/48a0c8dd34c344571de880fb35e211c6d9a28961.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 18:55:59 -04:00
Tom Zanussi
d0bad49bb0 tracing: Add enable_hist/disable_hist triggers
Similar to enable_event/disable_event triggers, these triggers enable
and disable the aggregation of events into maps rather than enabling
and disabling their writing into the trace buffer.

They can be used to automatically start and stop hist triggers based
on a matching filter condition.

If there's a paused hist trigger on system:event, the following would
start it when the filter condition was hit:

  # echo enable_hist:system:event [ if filter] > event/trigger

And the following would disable a running system:event hist trigger:

  # echo disable_hist:system:event [ if filter] > event/trigger

See Documentation/trace/events.txt for real examples.

Link: http://lkml.kernel.org/r/f812f086e52c8b7c8ad5443487375e03c96a601f.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 18:55:57 -04:00
Tom Zanussi
69a0200c2e tracing: Add hist trigger support for stacktraces as keys
It's often useful to be able to use a stacktrace as a hash key, for
keeping a count of the number of times a particular call path resulted
in a trace event, for instance.  Add a special key named 'stacktrace'
which can be used as key in a 'keys=' param for this purpose:

    # echo hist:keys=stacktrace ... \
               [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/87515e90b3785232a874a12156174635a348edb1.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:19:01 -04:00
Tom Zanussi
316961988b tracing: Add hist trigger 'syscall' modifier
Allow users to have syscall id fields displayed as syscall names in
the output by appending '.syscall' to field names:

   # echo hist:keys=aaa.syscall ... \
              [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/2bab1e59933d76a14b545bd2e02f80b8b08ac4d3.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:18:04 -04:00
Tom Zanussi
6b4827ad02 tracing: Add hist trigger 'execname' modifier
Allow users to have common_pid field values displayed as program names
in the output by appending '.execname' to a common_pid field name:

   # echo hist:keys=common_pid.execname ... \
              [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/e172e81f10f5b8d1f08450e3763c850f39fbf698.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:17:56 -04:00
Tom Zanussi
c6afad49d1 tracing: Add hist trigger 'sym' and 'sym-offset' modifiers
Allow users to have address fields displayed as symbols in the output
by appending '.sym' or 'sym-offset' to field names:

   # echo hist:keys=aaa.sym,bbb.sym-offset ... \
              [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/87d4935821491c0275513f0fbfb9bab8d3d3f079.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:17:51 -04:00
Tom Zanussi
0c4a6b4666 tracing: Add hist trigger 'hex' modifier for displaying numeric fields
Allow users to have numeric fields displayed as hex values in the
output by appending '.hex' to field names:

   # echo hist:keys=aaa,bbb.hex:vals=ccc.hex ... \
              [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/67bd431edda2af5798d7694818f7e8d71b6b3463.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:17:43 -04:00
Tom Zanussi
e86ae9baac tracing: Add hist trigger support for clearing a trace
Allow users to append 'clear' to an existing trigger in order to have
the hash table cleared.

This expands the hist trigger syntax from this:
    # echo hist:keys=xxx:vals=yyy:sort=zzz.descending:pause/cont \
           [ if filter] >> event/trigger

to this:

    # echo hist:keys=xxx:vals=yyy:sort=zzz.descending:pause/cont/clear \
          [ if filter] >> event/trigger

Link: http://lkml.kernel.org/r/ae15dd0d9b2f7af07a37c1ff682063e2dbcdf160.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:17:35 -04:00
Tom Zanussi
83e99914c9 tracing: Add hist trigger support for pausing and continuing a trace
Allow users to append 'pause' or 'continue' to an existing trigger in
order to have it paused or to have a paused trace continue.

This expands the hist trigger syntax from this:
    # echo hist:keys=xxx:vals=yyy:sort=zzz.descending \
          [ if filter] >> event/trigger

to this:

    # echo hist:keys=xxx:vals=yyy:sort=zzz.descending:pause or cont \
          [ if filter] >> event/trigger

Link: http://lkml.kernel.org/r/b672a92c14702cb924cdf6fc27ea1809bed04907.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:17:29 -04:00
Tom Zanussi
e62347d245 tracing: Add hist trigger support for user-defined sorting ('sort=' param)
Allow users to specify keys and/or values to sort on.  With this
addition, keys and values specified using the 'keys=' and 'vals='
keywords can be used to sort the hist trigger output via a new 'sort='
keyword.  If multiple sort keys are specified, the output will be
sorted using the second key as a secondary sort key, etc.  The default
sort order is ascending; if the user wants a different sort order,
'.descending' can be appended to the specific sort key.  Before this
addition, output was always sorted by 'hitcount' in ascending order.

This expands the hist trigger syntax from this:

    # echo hist:keys=xxx:vals=yyy \
          [ if filter] > event/trigger

to this:

    # echo hist:keys=xxx:vals=yyy:sort=zzz.descending \
          [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/b30a41db66ba486979c4f987aff5fab500ea53b3.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:17:19 -04:00
Tom Zanussi
76a3b0c8ac tracing: Add hist trigger support for compound keys
Allow users to specify multiple trace event fields to use in keys by
allowing multiple fields in the 'keys=' keyword.  With this addition,
any unique combination of any of the fields named in the 'keys'
keyword will result in a new entry being added to the hash table.

Link: http://lkml.kernel.org/r/0cfa24e6ac3b0dcece7737d94aa1f322ae3afc4b.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:16:33 -04:00
Tom Zanussi
f2606835d7 tracing: Add hist trigger support for multiple values ('vals=' param)
Allow users to specify trace event fields to use in aggregated sums
via a new 'vals=' keyword.  Before this addition, the only aggregated
sum supported was the implied value 'hitcount'.  With this addition,
'hitcount' is also supported as an explicit value field, as is any
numeric trace event field.

This expands the hist trigger syntax from this:

  # echo hist:keys=xxx [ if filter] > event/trigger

to this:

  # echo hist:keys=xxx:vals=yyy [ if filter] > event/trigger

Link: http://lkml.kernel.org/r/2a5d1adb5ba6c65d7bb2148e379f2fed47f29a68.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:16:23 -04:00
Tom Zanussi
7ef224d1d0 tracing: Add 'hist' event trigger command
'hist' triggers allow users to continually aggregate trace events,
which can then be viewed afterwards by simply reading a 'hist' file
containing the aggregation in a human-readable format.

The basic idea is very simple and boils down to a mechanism whereby
trace events, rather than being exhaustively dumped in raw form and
viewed directly, are automatically 'compressed' into meaningful tables
completely defined by the user.

This is done strictly via single-line command-line commands and
without the aid of any kind of programming language or interpreter.

A surprising number of typical use cases can be accomplished by users
via this simple mechanism.  In fact, a large number of the tasks that
users typically do using the more complicated script-based tracing
tools, at least during the initial stages of an investigation, can be
accomplished by simply specifying a set of keys and values to be used
in the creation of a hash table.

The Linux kernel trace event subsystem happens to provide an extensive
list of keys and values ready-made for such a purpose in the form of
the event format files associated with each trace event.  By simply
consulting the format file for field names of interest and by plugging
them into the hist trigger command, users can create an endless number
of useful aggregations to help with investigating various properties
of the system.  See Documentation/trace/events.txt for examples.

hist triggers are implemented on top of the existing event trigger
infrastructure, and as such are consistent with the existing triggers
from a user's perspective as well.

The basic syntax follows the existing trigger syntax.  Users start an
aggregation by writing a 'hist' trigger to the event of interest's
trigger file:

  # echo hist:keys=xxx [ if filter] > event/trigger

Once a hist trigger has been set up, by default it continually
aggregates every matching event into a hash table using the event key
and a value field named 'hitcount'.

To view the aggregation at any point in time, simply read the 'hist'
file in the same directory as the 'trigger' file:

  # cat event/hist

The detailed syntax provides additional options for user control, and
is described exhaustively in Documentation/trace/events.txt and in the
virtual tracing/README file in the tracing subsystem.

Link: http://lkml.kernel.org/r/72d263b5e1853fe9c314953b65833c3aa75479f2.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 12:16:14 -04:00
Steven Rostedt
c37775d578 tracing: Add infrastructure to allow set_event_pid to follow children
Add the infrastructure needed to have the PIDs in set_event_pid to
automatically add PIDs of the children of the tasks that have their PIDs in
set_event_pid. This will also remove PIDs from set_event_pid when a task
exits

This is implemented by adding hooks into the fork and exit tracepoints. On
fork, the PIDs are added to the list, and on exit, they are removed.

Add a new option called event_fork that when set, PIDs in set_event_pid will
automatically get their children PIDs added when they fork, as well as any
task that exits will have its PID removed from set_event_pid.

This works for instances as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19 10:28:28 -04:00
Linus Torvalds
e46b4e2b46 Nothing major this round. Mostly small clean ups and fixes.
Some visible changes:
 
  A new flag was added to distinguish traces done in NMI context.
 
  Preempt tracer now shows functions where preemption is disabled but
  interrupts are still enabled.
 
 Other notes:
 
  Updates were done to function tracing to allow better performance
  with perf.
 
  Infrastructure code has been added to allow for a new histogram
  feature for recording live trace event histograms that can be
  configured by simple user commands. The feature itself was just
  finished, but needs a round in linux-next before being pulled.
  This only includes some infrastructure changes that will be needed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW8/WPAAoJEKKk/i67LK/8wrAH/j2gU9ZfjVxTu8068TBGWRJP
 yvvzq0cK5evB3dsVuUmKKRfU52nSv4J1WcFF569X0RulSLylR0dHlcxFJMn4kkgR
 bm0AHRrqOf87ub3VimcpG146iVQij37l5A0SRoFbvSPLQx1KUW18v99x41Ji8dv6
 oWXRc6/YhdzEE7l0nUsVjmScQ4b2emsems3cxZzXOY+nRJsiim6i+VaDeatdyey1
 csLVqtRCs+x62TVtxG3+GhcLdRoPRbnHAGzrKDFIn1SrQaRXCc54wN5d2hWxjgNI
 1laOwaj070lnJiWfBLIP/K+lx+VKRx5/O0rKZX35foLUTqJJKSyjAbKXuMCcSAM=
 =2h2K
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Nothing major this round.  Mostly small clean ups and fixes.

  Some visible changes:

   - A new flag was added to distinguish traces done in NMI context.

   - Preempt tracer now shows functions where preemption is disabled but
     interrupts are still enabled.

  Other notes:

   - Updates were done to function tracing to allow better performance
     with perf.

   - Infrastructure code has been added to allow for a new histogram
     feature for recording live trace event histograms that can be
     configured by simple user commands.  The feature itself was just
     finished, but needs a round in linux-next before being pulled.

     This only includes some infrastructure changes that will be needed"

* tag 'trace-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (22 commits)
  tracing: Record and show NMI state
  tracing: Fix trace_printk() to print when not using bprintk()
  tracing: Remove redundant reset per-CPU buff in irqsoff tracer
  x86: ftrace: Fix the misleading comment for arch/x86/kernel/ftrace.c
  tracing: Fix crash from reading trace_pipe with sendfile
  tracing: Have preempt(irqs)off trace preempt disabled functions
  tracing: Fix return while holding a lock in register_tracer()
  ftrace: Use kasprintf() in ftrace_profile_tracefs()
  ftrace: Update dynamic ftrace calls only if necessary
  ftrace: Make ftrace_hash_rec_enable return update bool
  tracing: Fix typoes in code comment and printk in trace_nop.c
  tracing, writeback: Replace cgroup path to cgroup ino
  tracing: Use flags instead of bool in trigger structure
  tracing: Add an unreg_all() callback to trigger commands
  tracing: Add needs_rec flag to event triggers
  tracing: Add a per-event-trigger 'paused' field
  tracing: Add get_syscall_name()
  tracing: Add event record param to trigger_ops.func()
  tracing: Make event trigger functions available
  tracing: Make ftrace_event_field checking functions available
  ...
2016-03-24 10:52:25 -07:00
Joe Perches
a395d6a7e3 kernel/...: convert pr_warning to pr_warn
Use the more common logging method with the eventual goal of removing
pr_warning altogether.

Miscellanea:

 - Realign arguments
 - Coalesce formats
 - Add missing space between a few coalesced formats

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[kernel/power/suspend.c]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Peter Zijlstra
7e6867bf83 tracing: Record and show NMI state
The latency tracer format has a nice column to indicate IRQ state, but
this is not able to tell us about NMI state.

When tracing perf interrupt handlers (which often run in NMI context)
it is very useful to see how the events nest.

Link: http://lkml.kernel.org/r/20160318153022.105068893@infradead.org

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-22 18:04:10 -04:00
Steven Rostedt (Red Hat)
a29054d947 tracing: Fix crash from reading trace_pipe with sendfile
If tracing contains data and the trace_pipe file is read with sendfile(),
then it can trigger a NULL pointer dereference and various BUG_ON within the
VM code.

There's a patch to fix this in the splice_to_pipe() code, but it's also a
good idea to not let that happen from trace_pipe either.

Link: http://lkml.kernel.org/r/1457641146-9068-1-git-send-email-rabin@rab.in

Cc: stable@vger.kernel.org # 2.6.30+
Reported-by: Rabin Vincent <rabin.vincent@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-18 15:51:42 -04:00
Chunyu Hu
c8ca003b2f tracing: Fix return while holding a lock in register_tracer()
commit d39cdd2036 ("tracing: Make tracer_flags use the right set_flag
callback")  introduces a potential mutex deadlock issue, as it forgets to
free the mutex when allocaing the tracer_flags gets fail.

The issue was found by Dan Carpenter through Smatch static code check tool.

Link: http://lkml.kernel.org/r/1457958941-30265-1-git-send-email-chuhu@redhat.com

Fixes: d39cdd2036 ("tracing: Make tracer_flags use the right set_flag callback")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-18 10:36:21 -04:00
Chunyu Hu
d39cdd2036 tracing: Make tracer_flags use the right set_flag callback
When I was updating the ftrace_stress test of ltp. I encountered
a strange phenomemon, excute following steps:

echo nop > /sys/kernel/debug/tracing/current_tracer
echo 0 > /sys/kernel/debug/tracing/options/funcgraph-cpu
bash: echo: write error: Invalid argument

check dmesg:
[ 1024.903855] nop_test_refuse flag set to 0: we refuse.Now cat trace_options to see the result

The reason is that the trace option test will randomly setup trace
option under tracing/options no matter what the current_tracer is.
but the set_tracer_option is always using the set_flag callback
from the current_tracer. This patch adds a pointer to tracer_flags
and make it point to the tracer it belongs to. When the option is
setup, the set_flag of the right tracer will be used no matter
what the the current_tracer is.

And the old dummy_tracer_flags is used for all the tracers which
doesn't have a tracer_flags, having issue to use it to save the
pointer of a tracer. So remove it and use dynamic dummy tracer_flags
for tracers needing a dummy tracer_flags, as a result, there are no
tracers sharing tracer_flags, so remove the check code.

And save the current tracer to trace_option_dentry seems not good as
it may waste mem space when mount the debug/trace fs more than one time.

Link: http://lkml.kernel.org/r/1457444222-8654-1-git-send-email-chuhu@redhat.com

Signed-off-by: Chunyu Hu <chuhu@redhat.com>
[ Fixed up function tracer options to work with the change ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-08 11:19:08 -05:00
Steven Rostedt (Red Hat)
7717c6be69 tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs()
While cleaning the stacktrace code I unintentially changed the skip depth of
trace_buffer_unlock_commit_regs() from 0 to 6. kprobes uses this function,
and with skipping 6 call backs, it can easily produce no stack.

Here's how I tested it:

 # echo 'p:ext4_sync_fs ext4_sync_fs ' > /sys/kernel/debug/tracing/kprobe_events
 # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
 # cat /sys/kernel/debug/trace
            sync-2394  [005]   502.457060: ext4_sync_fs: (ffffffff81317650)
            sync-2394  [005]   502.457063: kernel_stack:         <stack trace>
            sync-2394  [005]   502.457086: ext4_sync_fs: (ffffffff81317650)
            sync-2394  [005]   502.457087: kernel_stack:         <stack trace>
            sync-2394  [005]   502.457091: ext4_sync_fs: (ffffffff81317650)

After putting back the skip stack to zero, we have:

            sync-2270  [000]   748.052693: ext4_sync_fs: (ffffffff81317650)
            sync-2270  [000]   748.052695: kernel_stack:         <stack trace>
 => iterate_supers (ffffffff8126412e)
 => sys_sync (ffffffff8129c4b6)
 => entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
            sync-2270  [000]   748.053017: ext4_sync_fs: (ffffffff81317650)
            sync-2270  [000]   748.053019: kernel_stack:         <stack trace>
 => iterate_supers (ffffffff8126412e)
 => sys_sync (ffffffff8129c4b6)
 => entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
            sync-2270  [000]   748.053381: ext4_sync_fs: (ffffffff81317650)
            sync-2270  [000]   748.053383: kernel_stack:         <stack trace>
 => iterate_supers (ffffffff8126412e)
 => sys_sync (ffffffff8129c4b6)
 => entry_SYSCALL_64_fastpath (ffffffff8181f0b2)

Cc: stable@vger.kernel.org # v4.4+
Fixes: 73dddbb57b "tracing: Only create stacktrace option when STACKTRACE is configured"
Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-01-14 09:28:19 -05:00
Chen Gang
e428abbbf6 tracing: #ifdef out uses of max trace when CONFIG_TRACER_MAX_TRACE is not set
tracing_max_lat_fops is used only when TRACER_MAX_TRACE enabled, so also
swith the related code. The related warning with defconfig under x86_64:

    CC      kernel/trace/trace.o
  kernel/trace/trace.c:5466:37: warning: ‘tracing_max_lat_fops’ defined but not used [-Wunused-const-variable]
   static const struct file_operations tracing_max_lat_fops = {

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-10 10:16:05 -05:00
Dmitry Safonov
03e88ae6b3 tracing: Remove unused ftrace_cpu_disabled per cpu variable
Since the ring buffer is lockless, there is no need to disable ftrace on
CPU. And no one doing so: after commit 68179686ac ("tracing: Remove
ftrace_disable/enable_cpu()") ftrace_cpu_disabled stays the same after
initialization, nothing changes it.
ftrace_cpu_disabled shouldn't be used by any external module since it
disables only function and graph_function tracers but not any other
tracer.

Link: http://lkml.kernel.org/r/1446836846-22239-1-git-send-email-0x7f454c46@gmail.com

Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-07 13:25:14 -05:00
Jiaxing Wang
8b1291994d tracing: Make tracing work when debugfs is not configured in
Currently tracing_init_dentry() returns -ENODEV when debugfs is not
configured in, which causes tracefs not populated with tracing files and
directories, so we will get an empty directory even after we manually
mount tracefs.

We can make tracing_init_dentry() return NULL if debugfs is not
configured in and can manually mount tracefs. But return -ENODEV
if debugfs is configured in but not initialized or failed to create
automount point as that would break backward compatibility with older
tools.

Link: http://lkml.kernel.org/r/1446797056-11683-1-git-send-email-hello.wjx@gmail.com

Signed-off-by: Jiaxing Wang <hello.wjx@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-06 10:02:33 -05:00
Steven Rostedt (Red Hat)
43ed384339 tracing: Put back comma for empty fields in boot string parsing
Both early_enable_events() and apply_trace_boot_options() parse a boot
string that may get parsed later on. They both use strsep() which converts a
comma into a nul character. To still allow the boot string to be parsed
again the same way, the nul character gets converted back to a comma after
the token is processed.

The problem is that these two functions check for an empty parameter (two
commas in a row ",,"), and continue the loop if the parameter is empty, but
fails to place the comma back. In this case, the second parsing will end at
this blank field, and not process fields afterward.

In most cases, users should not have an empty field, but if its going to be
checked, the code might as well be correct.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03 22:15:14 -05:00
Jiaxing Wang
a4d1e68823 tracing: Apply tracer specific options from kernel command line.
Currently, the trace_options parameter is only applied in
tracer_alloc_buffers() when global_trace.current_trace is nop_trace,
so a tracer specific option will not be applied even when the specific
tracer is also enabled from kernel command line. For example, the
'func_stack_trace' option can't be enabled with the following kernel
parameter:

  ftrace=function ftrace_filter=kfree trace_options=func_stack_trace

We can enable tracer specific options by simply apply the options again
if the specific tracer is also supplied from command line and started
in register_tracer().

To make trace_boot_options_buf can be parsed again, a comma and a space
is put back if they were replaced by strsep and strstrip respectively.

Also make register_tracer() be __init to access the __init data, and
in fact register_tracer is only called from __init code.

Link: http://lkml.kernel.org/r/1446599669-9294-1-git-send-email-hello.wjx@gmail.com

Signed-off-by: Jiaxing Wang <hello.wjx@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03 21:51:43 -05:00
Sasha Levin
919cd97999 tracing: Allow dumping traces without tracking trace started cpus
We don't init iter->started when dumping the ftrace buffer, and there's no
real need to do so - so allow skipping that check if the iter doesn't have
an initialized ->started cpumask.

Link: http://lkml.kernel.org/r/1441385156-27279-1-git-send-email-sasha.levin@oracle.com

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03 16:10:08 -05:00
Jiaxing Wang
681a4a2f45 tracing: Update instance_rmdir() to use tracefs_remove_recursive
Update instancd_rmdir to use tracefs_remove_recursive instead of
debugfs_remove_recursive.This was left in the transition from debugfs
to tracefs.

Link: http://lkml.kernel.org/r/1445169490-18315-2-git-send-email-hello.wjx@gmail.com

Cc: stable@vger.kernel.org # 4.1+
Fixes: 8434dc9340 ("tracing: Convert the tracing facility over to use tracefs")
Signed-off-by: Jiaxing Wang <hello.wjx@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-02 13:59:06 -05:00
Steven Rostedt (Red Hat)
37aea98b84 tracing: Add trace options for tracer options to instances
Add the tracer options to instances options directory as well. Only add the
options for tracers that are allowed to be enabled by an instance. But note,
that tracer options are global. That is, tracer options enabled in an
instance, also take affect at the top level and in other instances.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 15:22:58 -04:00
Steven Rostedt (Red Hat)
16270145ce tracing: Add trace options for core options to instances
Allow instances to have their own options, at least for the core options
(non tracer specific ones). There are a few global options that should not
be added to instances, like enabling of trace_printk, and the sched comm
recording, which do not have a specific trace instance associated to them.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 15:22:57 -04:00
Steven Rostedt (Red Hat)
2d34f48955 tracing: Make ftrace_trace_stack() depend on general trace_array flag
In preparation for the multi buffer instances to have their own trace_flags,
the check in ftrace_trace_stack() needs to test the trace_array descriptor
flag that is for the current event, not the global_trace descriptor.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 15:22:57 -04:00
Steven Rostedt (Red Hat)
9a38a8856f tracing: Add a method to pass in trace_array descriptor to option files
In preparation of having the multi buffer instances having their own trace
option flags, the trace option files needs a way to not only pass in the
flag they represent, but also the trace_array descriptor.

A new field is added to the trace_array descriptor called trace_flags_index,
which is a 32 byte character array representing a bit. This array is simply
filled with the index of the array, where

  index_array[n] = n;

Then the address of this array is passed to the file callbacks instead of
the index of the flag index. Then to retrieve both the flag index and the
trace_array descriptor:

  data is the passed in argument.

  index = *(unsigned char *)data;

  data -= index;

  /* Now data points to the address of the array in the trace_array */

  tr = container_of(data, struct trace_array, trace_flags_index);

Suggested-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 15:22:56 -04:00
Steven Rostedt (Red Hat)
983f938ae6 tracing: Move trace_flags from global to a trace_array field
In preparation to make trace options per instance, the global trace_flags
needs to be moved from being a global variable to a field within the trace
instance trace_array structure.

There's still more work to do, as there's some functions that use
trace_flags without passing in a way to get to the current_trace array. For
those, the global_trace is used directly (from trace.c). This includes
setting and clearing the trace_flags. This means that when a new instance is
created, it just gets the trace_flags of the global_trace and will not be
able to modify them. Depending on the functions that have access to the
trace_array, the flags of an instance may not affect parts of its trace,
where the global_trace is used. These will be fixed in future changes.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 15:22:55 -04:00
Steven Rostedt (Red Hat)
5557720415 tracing: Move sleep-time and graph-time options out of the core trace_flags
The sleep-time and graph-time options are only for the function graph tracer
and are not used by anything else. As tracer options are now visible when
the tracer is not activated, its better to move the function graph specific
tracer options into the function graph tracer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 15:22:42 -04:00
Steven Rostedt (Red Hat)
b9f9108cad tracing: Remove access to trace_flags in trace_printk.c
In the effort to move the global trace_flags to the tracing instances, the
direct access to trace_flags must be removed from trace_printk.c

Instead, add a new trace_printk_enabled boolean that is set by a new access
function trace_printk_control(), that will enable or disable trace_printk.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 04:35:18 -04:00
Steven Rostedt (Red Hat)
b5e87c0581 tracing: Add build bug if we have more trace_flags than bits
Add a enum that denotes the last bit of the trace_flags and have a
BUILD_BUG_ON(last_bit > 32).

If we add more bits than we have in trace_flags, the kernel wont build.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 04:35:18 -04:00
Steven Rostedt (Red Hat)
41d9c0becc tracing: Always show all tracer options in the options directory
There are options that are unique to a specific tracer (like function and
function graph). Currently, these options are only visible in the options
directory when the tracer is enabled.

This has been a pain, especially for something like the func_stack_trace
option that if used inappropriately, could bring the system to a crawl. But
the only way to see it, is to enable the function tracer.

For example, if one had done:

 # cd /sys/kernel/tracing
 # echo __schedule > set_ftrace_filter
 # echo 1 > options/func_stack_trace
 # echo function > current_tracer

The __schedule call will be traced and a stack trace will also be recorded
there. Now when you were done, you may do...

 # echo nop > current_tracer
 # echo > set_ftrace_filter

But you forgot to disable the func_stack_trace. The only way to disable it
is to re-enable function tracing first. If you do not add a filter to
set_ftrace_filter and just do:

 # echo function > current_tracer

Now you would be performing a stack trace on *every* function! On some
systems, that causes a live lock. Others may take a few minutes to fix your
mistake.

Having the func_stack_trace option visible allows you to check it and
disable it before enabling the funtion tracer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-30 04:34:54 -04:00
Steven Rostedt (Red Hat)
73dddbb57b tracing: Only create stacktrace option when STACKTRACE is configured
Only create the stacktrace trace option when CONFIG_STACKTRACE is
configured.

Cleaned up the ftrace_trace_stack() function call a little to allow better
encapsulation of the stacktrace trace flag.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-29 15:38:55 -04:00
Steven Rostedt (Red Hat)
8179e8a15b tracing: Do not create function tracer options when not compiled in
When the function tracer is not compiled in, do not create the option files
for it.

Fix up both the sched_wakeup and irqsoff tracers to handle the change.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-29 15:01:34 -04:00