Commit Graph

10089 Commits

Author SHA1 Message Date
Linus Torvalds
bca13ce455 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "This update is pretty big and almost exclusively includes tooling
  changes, because v4.9's LTS status forced to completion most of the
  pending kernel side hardware enablement work and because we tried to
  freeze core perf work a bit to give a time window for the fuzzing
  efforts.

  The diff is large mostly due to the JSON hardware event tables added
  for Intel and Power8 CPUs. This was a popular feature request from
  people working close to hardware and from the HPC community.

  Tree size is big because this added the CPU event tables for over a
  decade of Intel CPUs. Future changes for a CPU vendor alrady support
  should be much smaller, as events for new models are added. The new
  events are listed in 'perf list', for the CPU model the tool is
  running on. If you find an interesting event it can be used as-is:

      $ perf stat -a -e l2_lines_out.pf_clean sleep 1

      Performance counter stats for 'system wide':

            7,860,403      l2_lines_out.pf_clean

           1.000624918 seconds time elapsed

  The event lists can be searched the usual 'perf list' fashion for
  (case insensitive) substrings as well:

      $ perf list l2_lines_out

      List of pre-defined events (to be used in -e):

      cache:
        l2_lines_out.demand_clean
             [Clean L2 cache lines evicted by demand]
        l2_lines_out.demand_dirty
             [Dirty L2 cache lines evicted by demand]
        l2_lines_out.dirty_all
             [Dirty L2 cache lines filling the L2]
        l2_lines_out.pf_clean
             [Clean L2 cache lines evicted by L2 prefetch]
        l2_lines_out.pf_dirty
             [Dirty L2 cache lines evicted by L2 prefetch]

  etc.

  There's a few high level categories as well that can be listed:
  'cache', 'floating point', 'frontend', 'memory', 'pipeline', 'virtual
  memory'.

  Existing generic events and workflows should work as-is.

  The only kernel side change is a late breaking fix for an older
  regression, related to Intel BTS, LBR and PT feature interaction.

  On the tooling side there are three new tools / major features:

   - The new 'perf c2c' tool provides means for Shared Data C2C/HITM
     analysis.

     This allows you to track down cacheline contention. The tool is
     based on x86's load latency and precise store facility events
     provided by Intel CPUs.

     It was tested by Joe Mario and has proven to be useful, finding
     some cacheline contentions. Joe also wrote a blog about c2c tool
     with examples:

        https://joemario.github.io/blog/2016/09/01/c2c-blog/

     excerpt of the content on this site:

         At a high level, “perf c2c” will show you:

          * The cachelines where false sharing was detected.
          * The readers and writers to those cachelines, and the offsets where those accesses occurred.
          * The pid, tid, instruction addr, function name, binary object name for those readers and writers.
          * The source file and line number for each reader and writer.
          * The average load latency for the loads to those cachelines.
          * Which numa nodes the samples a cacheline came from and which CPUs were involved.

         Using perf c2c is similar to using the Linux perf tool today.
         First collect data with “perf c2c record”, then generate a
         report output with “perf c2c report”

     There one finds extensive details on using the tool, with tips on
     reducing the volume of samples while still capturing enough to do
     its job. (Dick Fowles, Joe Mario, Don Zickus, Jiri Olsa)

   - The new 'perf sched timehist' tool provides tailored analysis of
     scheduling events.

     Example usage:

          perf sched record -- sleep 1
          perf sched timehist

     By default it shows the individual schedule events, including the
     wait time (time between sched-out and next sched-in events for the
     task), the task scheduling delay (time between wakeup and actually
     running) and run time for the task:

            time    cpu  task name         wait time  sch delay  run time
                         [tid/pid]            (msec)     (msec)    (msec)
        -------- ------  ----------------  ---------  ---------  --------
        1.874569 [0011]  gcc[31949]            0.014      0.000     1.148
        1.874591 [0010]  gcc[31951]            0.000      0.000     0.024
        1.874603 [0010]  migration/10[59]      3.350      0.004     0.011
        1.874604 [0011]  <idle>                1.148      0.000     0.035
        1.874723 [0005]  <idle>                0.016      0.000     1.383
        1.874746 [0005]  gcc[31949]            0.153      0.078     0.022
      ...

     Times are in msec.usec. (David Ahern, Namhyung Kim)

   - Add CPU vendor hardware event tables:

     Add JSON files with vendor event naming for Intel and Power8
     processors, allowing users of tools like oprofile to keep using the
     event names they are used to, as well as people reading vendor
     documentation, where such naming is used. (Andi Kleen, Sukadev
     Bhattiprolu)

     You should see all the new events with 'perf list' and you should
     be able to search them, for example 'perf list miss' will list all
     the myriads of miss events.

  Other tooling features added were:

   - Cross-arch annotation support:

     o Improve ARM support in the annotation code, affecting 'perf
       annotate', 'perf report' and live annotation in 'perf top' (Kim
       Phillips)

     o Initial support for PowerPC in the annotation code (Ravi
       Bangoria)

     o Support AArch64 in the 'annotate' code, native/local and
       cross-arch/remote (Kim Phillips)

   - Allow considering just events in a given time interval, via the
     '--time start.s.ms,end.s.ms' command line, added to 'perf kmem',
     'perf report', 'perf sched timehist' and 'perf script' (David
     Ahern)

   - Add option to stop printing a callchain at one of a given group of
     symbol names (David Ahern)

   - Track memory freed in 'perf kmem stat' (David Ahern)

   - Allow querying and setting .perfconfig variables (Taeung Song)

   - Show branch information in callchains (predicted, TSX aborts, loop
     iteractions, etc) (Jin Yao)

   - Dynamicly change verbosity level by pressing 'V' in the 'perf
     top/report' hists TUI browser (Alexis Berlemont)

   - Implement 'perf trace --delay' in the same fashion as in 'perf
     record --delay', to skip sampling workload initialization events
     (Alexis Berlemont)

   - Make vendor named events case insensitive in 'perf list', i.e.
     'perf list LONGEST_LAT' works just the same as 'perf list
     longest_lat' (Andi Kleen)

   - Add unwinding support for jitdump (Stefano Sanfilippo)

  Tooling infrastructure changes:

   - Support linking perf with clang and LLVM libraries, initially
     statically, but this limitation will be lifted and shared
     libraries, when available, will be preferred to the static build,
     that should, as with other features, be enabled explicitly (Wang
     Nan)

   - Add initial support (and perf test entry) for tooling hooks,
     starting with 'record_start' and 'record_end', that will have as
     its initial user the eBPF infrastructure, where perf_ prefixed
     functions will be JITed and run when such hooks are called (Wang
     Nan)

   - Implement assorted libbpf improvements (Wang Nan)"

  ... and lots of other changes, features, cleanups and refactorings I
  did not list, see the shortlog and the git log for details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (220 commits)
  perf/x86: Fix exclusion of BTS and LBR for Goldmont
  perf tools: Explicitly document that --children is enabled by default
  perf sched timehist: Cleanup idle_max_cpu handling
  perf sched timehist: Handle zero sample->tid properly
  perf callchain: Introduce callchain_cursor__copy()
  perf sched: Cleanup option processing
  perf sched timehist: Improve error message when analyzing wrong file
  perf tools: Move perf build related variables under non fixdep leg
  perf tools: Force fixdep compilation at the start of the build
  perf tools: Move PERF-VERSION-FILE target into rules area
  perf build: Check LLVM version in feature check
  perf annotate: Show raw form for jump instruction with indirect target
  perf tools: Add non config targets
  perf tools: Cleanup build directory before each test
  perf tools: Move python/perf.so target into rules area
  perf tools: Move install-gtk target into rules area
  tools build: Move tabs to spaces where suitable
  tools build: Make the .cmd file more readable
  perf clang: Compile BPF script using builtin clang support
  perf clang: Support compile IR to BPF object and add testcase
  ...
2016-12-12 11:46:21 -08:00
Linus Torvalds
718c0ddd6a Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this development cycle were:

   - Miscellaneous fixes, including a change to call_rcu()'s rcu_head
     alignment check.

   - Security-motivated list consistency checks, which are disabled by
     default behind DEBUG_LIST.

   - Torture-test updates.

   - Documentation updates, yet again just simple changes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  torture: Prevent jitter from delaying build-only runs
  torture: Remove obsolete files from rcutorture .gitignore
  rcu: Don't kick unless grace period or request
  rcu: Make expedited grace periods recheck dyntick idle state
  torture: Trace long read-side delays
  rcu: RCU_TRACE enables event tracing as well as debugfs
  rcu: Remove obsolete comment from __call_rcu()
  rcu: Remove obsolete rcu_check_callbacks() header comment
  rcu: Tighten up __call_rcu() rcu_head alignment check
  Documentation/RCU: Fix minor typo
  documentation: Present updated RCU guarantee
  bug: Avoid Kconfig warning for BUG_ON_DATA_CORRUPTION
  lib/Kconfig.debug: Fix typo in select statement
  lkdtm: Add tests for struct list corruption
  bug: Provide toggle for BUG on data corruption
  list: Split list_del() debug checking into separate function
  rculist: Consolidate DEBUG_LIST for list_add_rcu()
  list: Split list_add() debug checking into separate function
2016-12-12 09:09:54 -08:00
Ingo Molnar
6de75a37b8 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:05:59 +01:00
David S. Miller
821781a9f4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-10 16:21:55 -05:00
Linus Torvalds
810ac7b755 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
 "Several fixes to the DSM (ACPI device specific method) marshaling
  implementation.

  I consider these urgent enough to send for 4.9 consideration since
  they fix the kernel's handling of ARS (Address Range Scrub) commands.
  Especially for platforms without machine-check-recovery capabilities,
  successful execution of ARS commands enables the platform to
  potentially break out of an infinite reboot problem if a media error
  is present in the boot path. There is also a one line fix for a
  device-dax read-only mapping regression.

  Commits 9a901f5495 ("acpi, nfit: fix extended status translations
  for ACPI DSMs") and 325896ffdf ("device-dax: fix private mapping
  restriction, permit read-only") are true regression fixes for changes
  introduced this cycle.

  Commit efda1b5d87 ("acpi, nfit, libnvdimm: fix / harden ars_status
  output length handling") fixes the kernel's handling of zero-length
  results, this never would have worked in the past, but we only just
  recently discovered a BIOS implementation that emits this arguably
  spec non-compliant result.

  The remaining two commits are additional fall out from thinking
  through the implications of a zero / truncated length result of the
  ARS Status command.

  In order to mitigate the risk that these changes introduce yet more
  regressions they are backstopped by a new unit test in commit
  a7de92dac9 ("tools/testing/nvdimm: unit test acpi_nfit_ctl()") that
  mocks up inputs to acpi_nfit_ctl()"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: fix private mapping restriction, permit read-only
  tools/testing/nvdimm: unit test acpi_nfit_ctl()
  acpi, nfit: fix bus vs dimm confusion in xlat_status
  acpi, nfit: validate ars_status output buffer size
  acpi, nfit, libnvdimm: fix / harden ars_status output length handling
  acpi, nfit: fix extended status translations for ACPI DSMs
2016-12-09 11:27:22 -08:00
Linus Torvalds
2b41226b39 Revert "radix tree test suite: fix compilation"
This reverts commit 53855d10f4.

It shouldn't have come in yet - it depends on the changes in linux-next
that will come in during the next merge window.  As Matthew Wilcox says,
the test suite is broken with the current state without the revert.

Requested-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-09 10:41:42 -08:00
Matthew Wilcox
53855d10f4 radix tree test suite: fix compilation
Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test
suite as it adds a call to cpuhp_setup_state_nocalls() which is not
currently emulated in the test suite.  Add it, and delete the emulation
of the old CPU hotplug mechanism.

Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07 17:10:00 -08:00
Yannick Brosseau
108a7c103b perf tools: Explicitly document that --children is enabled by default
The fact that the --children option is enabled by default is buried deep
at the end of the help page, in the overhead calculation section. This
make it explicit right where the option is listed, following the same
way other default options are described

Signed-off-by: Yannick Brosseau <scientist@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20161202160732.29058-1-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:35 -03:00
Namhyung Kim
b336352b41 perf sched timehist: Cleanup idle_max_cpu handling
It treats the idle_max_cpu little bit confusingly IMHO.  Let's make it
more straight forward.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:34 -03:00
Namhyung Kim
5d92d96a94 perf sched timehist: Handle zero sample->tid properly
Sometimes samples have tid of 0 but non-0 pid.  It ends up having a new
thread of 0 tid/pid (instead of referring idle task) since tid is used
to search matching task.  But I guess it's wrong to use 0 as a tid when
pid is set.  This patch uses tid only if it has a non-zero value or same
as pid (of 0).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:34 -03:00
Namhyung Kim
571f1eb9b9 perf callchain: Introduce callchain_cursor__copy()
The callchain_cursor__copy() function is to save current callchain
captured by a cursor.  It'll be used to keep callchains when switching
to idle task for each cpu.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:33 -03:00
Namhyung Kim
6fa94258ce perf sched: Cleanup option processing
The -D/--dump-raw-trace option is in the parent option so no need to
repeat it.  Also move -f/--force option to parent as it's common to
handle data file.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:33 -03:00
David Ahern
f45bf8d393 perf sched timehist: Improve error message when analyzing wrong file
Arnaldo reported an unhelpful error message when running perf sched
timehist on a file that did not contain sched tracepoints:

    [root@jouet ~]# perf sched timehist
    No trace sample to read. Did you call 'perf record -R'?

    [root@jouet ~]# perf evlist -v
    cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1

Change the has_traces check to look for the sched_switch event. Analysis
for perf sched timehist requires at least this event.

Now when analyzing a file without sched tracepoints you get:

    root@f21-vbox:/tmp$ perf sched timehist
    No sched_switch events found. Have you run 'perf sched record'?

Signed-off-by: David Ahern <dsahern@gmail.com>
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480451988-43673-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:32 -03:00
Dan Williams
a7de92dac9 tools/testing/nvdimm: unit test acpi_nfit_ctl()
A recent flurry of bug discoveries in the nfit driver's DSM marshalling
routine has highlighted the fact that we do not have unit test coverage
for this routine. Add a self-test of acpi_nfit_ctl() routine before
probing the "nfit_test.0" device. This mocks stimulus to acpi_nfit_ctl()
and if any of the tests fail "nfit_test.0" will be unavailable causing
the rest of the tests to not run / fail.

This unit test will also be a place to land reproductions of quirky BIOS
behavior discovered in the field and ensure the kernel does not regress
against implementations it has seen in practice.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-12-06 17:42:36 -08:00
Jiri Olsa
8ac1eb7bab perf tools: Move perf build related variables under non fixdep leg
Because there's no need for them in fixdep build.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:03 -03:00
Jiri Olsa
abb26210a3 perf tools: Force fixdep compilation at the start of the build
The fixdep tool needs to be built before everything else, because it fixes
every object dependency file.

We handle this currently by making all objects to depend on fixdep, which is
error prone and is easily forgotten when new object is added.

Instead of this, this patch force fixdep tool to be built as the first target
in the separate make session. This way we don't need to handle extra fixdep
dependencies and we are certain there's no fixdep race with any parallel make
job.

Committer notes:

Testing it:

Before:

  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep
    MKDIR    /tmp/build/perf/pmu-events/
    HOSTCC   /tmp/build/perf/pmu-events/json.o
    MKDIR    /tmp/build/perf/pmu-events/
    HOSTCC   /tmp/build/perf/pmu-events/jsmn.o
    HOSTCC   /tmp/build/perf/pmu-events/jevents.o
    HOSTLD   /tmp/build/perf/pmu-events/jevents-in.o
    PERF_VERSION = 4.9.rc8.g868cd5
    CC       /tmp/build/perf/perf-read-vdso32
  <SNIP>

After:

  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
    MKDIR    /tmp/build/perf/fd/
    CC       /tmp/build/perf/fd/array.o
    LD       /tmp/build/perf/fd/libapi-in.o
    MKDIR    /tmp/build/perf/fs/
    CC       /tmp/build/perf/event-parse.o
    CC       /tmp/build/perf/fs/fs.o
    PERF_VERSION = 4.9.rc8.g57a92f
    CC       /tmp/build/perf/event-plugin.o
    MKDIR    /tmp/build/perf/fs/
    CC       /tmp/build/perf/fs/tracing_path.o
  <SNIP>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:02 -03:00
Jiri Olsa
16e2ef4ed2 perf tools: Move PERF-VERSION-FILE target into rules area
An upcoming fixdep fix needs all targets at the same area, so they'll
fit under a signal condition block.

Moving PERF-VERSION-FILE target into rules section.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:00 -03:00
Wang Nan
a940cad331 perf build: Check LLVM version in feature check
Cancel builtin llvm and clang support when LLVM version is less than
3.9.0: following commits uses newer API.

Since Clang/LLVM's API is not guaranteed to be stable, add a
test-llvm-version.cpp feature checker, issue warning if LLVM found in
compiling environment is not tested yet.

Committer Notes:

Testing it:

Environment:

  $ cat /etc/fedora-release
  Fedora release 25 (Twenty Five)
  $ rpm -q llvm-devel clang-devel
  llvm-devel-3.8.0-1.fc25.x86_64
  clang-devel-3.8.0-2.fc25.x86_64
  $

Before:

  $  make -k LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
  Warning: tools/include/uapi/linux/bpf.h differs from kernel
  Warning: tools/arch/arm/include/uapi/asm/kvm.h differs from kernel
    INSTALL  GTK UI
    LINK     /tmp/build/perf/perf
  /tmp/build/perf/libperf.a(libperf-in.o): In function `perf::createCompilerInvocation(llvm::SmallVector<char const*, 16u>, llvm::StringRef&, clang::DiagnosticsEngine&)':
  /home/acme/git/linux/tools/perf/util/c++/clang.cpp:56: undefined reference to `clang::tooling::newInvocation(clang::DiagnosticsEngine*, llvm::SmallVector<char const*, 16u> const&)'
  /tmp/build/perf/libperf.a(libperf-in.o): In function `perf::getModuleFromSource(llvm::SmallVector<char const*, 16u>, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)':
  /home/acme/git/linux/tools/perf/util/c++/clang.cpp:68: undefined reference to `clang::CompilerInstance::CompilerInstance(std::shared_ptr<clang::PCHContainerOperations>, bool)'
  /home/acme/git/linux/tools/perf/util/c++/clang.cpp:69: undefined reference to `clang::CompilerInstance::createDiagnostics(clang::DiagnosticConsumer*, bool)'
  <SNIP>

After:

  Makefile.config:807: No suitable libLLVM found, disabling builtin clang and llvm support. Please install llvm-dev(el) (>= 3.9.0)

Updating the environment to a locally built LLVM 4.0 + clang 3.9 (forgot
to git pull, duh) combo, all works as expected, it is properly detected
and built into the resulting perf binary.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161206072230.7651-1-wangnan0@huawei.com
[ Change the warning message a bit (add 'suitable' and 'builtin'), clarifying it, see committer notes above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:21:55 -03:00
Thomas Graf
3f731d89e4 bpf: add additional verifier tests for BPF_PROG_TYPE_LWT_*
- direct packet read is allowed for LWT_*
 - direct packet write for LWT_IN/LWT_OUT is prohibited
 - direct packet write for LWT_XMIT is allowed
 - access to skb->tc_classid is prohibited for LWT_*

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 10:14:52 -05:00
Haiyang Zhang
fd7aabb062 tools: hv: Enable network manager for bonding scripts on RHEL
We found network manager is necessary on RHEL to make the synthetic
NIC, VF NIC bonding operations handled automatically. So, enabling
network manager here.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-06 10:13:55 -05:00
Jiri Slaby
69042bf200 objtool: Fix bytes check of lea's rex_prefix
For the "lea %(rsp), %rbp" case, we check if there is a rex_prefix.
But we check 'bytes' which is insn_byte_t[4] in rex_prefix (insn_field
structure). Therefore, the check is always true.

Instead, check 'nbytes' which is the right one.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161205105551.25917-1-jslaby@suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-06 09:20:59 +01:00
Ravi Bangoria
bec60e50af perf annotate: Show raw form for jump instruction with indirect target
For jump instructions that does not include target address as direct operand,
show the original disassembled line for them. This is needed for certain
powerpc jump instructions that use target address in a register (such as bctr,
btar, ...).

Before:
     ld     r12,32088(r12)
     mtctr  r12
  v  bctr   ffffffffffffca2c
     std    r2,24(r1)
     addis  r12,r2,-1

After:
     ld     r12,32088(r12)
     mtctr  r12
  v  bctr
     std    r2,24(r1)
     addis  r12,r2,-1

Committer notes:

Testing it using a perf.data file and vmlinux for powerpc64,
cross-annotating it on a x86_64 workstation:

Before:

  .__bpf_prog_run  vmlinux.powerpc
         │        std    r10,512(r9)                      ▒
         │        lbz    r9,0(r31)                        ▒
         │        rldicr r9,r9,3,60                       ▒
         │        ldx    r9,r30,r9                        ▒
         │        mtctr  r9                               ▒
  100.00 │      ↓ bctr   3fffffffffe01510                 ▒
         │        lwa    r10,4(r31)                       ▒
         │        lwz    r9,0(r31)                        ▒
  <SNIP>
  Invalid jump offset: 3fffffffffe01510

After:

  .__bpf_prog_run  vmlinux.powerpc
         │        std    r10,512(r9)                      ▒
         │        lbz    r9,0(r31)                        ▒
         │        rldicr r9,r9,3,60                       ▒
         │        ldx    r9,r30,r9                        ▒
         │        mtctr  r9                               ▒
  100.00 │      ↓ bctr                                    ▒
         │        lwa    r10,4(r31)                       ▒
         │        lwz    r9,0(r31)                        ▒
  <SNIP>
  Invalid jump offset: 3fffffffffe01510

This, in turn, uncovers another problem with jumps without operands, the
ENTER/-> operation, to jump to the target, still continues using the bogus
target :-)

BTW, this was the file used for the above tests:

  [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev
  # ========
  # captured on: Thu Nov 24 12:40:38 2016
  # hostname : pdev-f22-qemu
  # os release : 4.4.10-200.fc22.ppc64
  # perf version : 4.9.rc1.g6298ce
  # arch : ppc64
  # nrcpus online : 48
  # nrcpus avail : 48
  # cpudesc : POWER7 (architected), altivec supported
  # cpuid : 74,513
  # total memory : 4158976 kB
  # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
  # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
  # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
  # ========
  #
  [acme@jouet ravi_bangoria]$

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 17:21:57 -03:00
Jiri Olsa
d4dcadec43 perf tools: Add non config targets
Adding some missing non config targets that were for some reason
omitted.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 16:18:49 -03:00
Jiri Olsa
207da4739e perf tools: Cleanup build directory before each test
Cleanup the fixdep tool before every test.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 16:18:26 -03:00
Jiri Olsa
0b4d4b0762 perf tools: Move python/perf.so target into rules area
Following fixdep fix needs all targets at the same area, so they'll fit
under signal condition block.

Moving python/perf.so target into rules section and intentionally
removing the perl script related comment.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 16:11:50 -03:00
Jiri Olsa
5c319a67b1 perf tools: Move install-gtk target into rules area
The upcoming fixdep fix needs all targets at the same area, so they'll
fit under a signal condition block.

Move install-gtk target into the rules section.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 16:06:50 -03:00
Jiri Olsa
2fedf79b69 tools build: Move tabs to spaces where suitable
We've been hit several times by a Makefile bug where line indented by
tab was falsely considered as target command.

We prevent this by always using space indentation for everything except
for the target commands.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 16:06:03 -03:00
Jiri Olsa
a5ba0a1a5a tools build: Make the .cmd file more readable
Putting extra line between dependencies and cmd_* definition
to make it more readable.

Before:

  $ cat .builtin-top.o.cmd
  ...
  /home/jolsa/kernel/linux-perf/tools/include/linux/stringify.h \
  /home/jolsa/kernel/linux-perf/tools/include/linux/time64.h
  cmd_builtin-top.o := gcc -Wp,-MD,./.builtin-top.o.d -Wp,-MT,builtin-...
  ...

After:

  $ cat .builtin-top.o.cmd
  ...
  /home/jolsa/kernel/linux-perf/tools/include/linux/stringify.h \
  /home/jolsa/kernel/linux-perf/tools/include/linux/time64.h

  cmd_builtin-top.o := gcc -Wp,-MD,./.builtin-top.o.d -Wp,-MT,builtin-...
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1480884178-8072-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 16:04:08 -03:00
Wang Nan
edd695b032 perf clang: Compile BPF script using builtin clang support
After this patch, perf utilizes builtin clang support to build BPF
script, no longer depend on external clang, but fallbacking to it
if for some reason the builtin compiling framework fails.

Test:

  $ type clang
  -bash: type: clang: not found
  $ cat ~/.perfconfig
  $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c
  $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c
  $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin
  bpf: successfull builtin compilation
  $

Can't pass cflags so unable to include kernel headers now. Will be fixed
by following commits.

Committer notes:

Make sure '-v' comes before the '-e ./test.c' in the command line otherwise the
'verbose' variable will not be set when the bpf event is parsed and thus the
pr_debug indicating a 'successfull builtin compilation' will not be output, as
the debug level (1) will be less than what 'verbose' has at that point (0).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com
[ Spell check/reflow successfull pr_debug string ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:45 -03:00
Wang Nan
5e08a76525 perf clang: Support compile IR to BPF object and add testcase
getBPFObjectFromModule() is introduced to compile LLVM IR(Module)
to BPF object. Add new testcase for it.

Test result:
  $ ./buildperf/perf test -v clang
  51: builtin clang support                               :
  51.1: builtin clang compile C source to IR              :
  --- start ---
  test child forked, pid 21822
  test child finished with 0
  ---- end ----
  builtin clang support subtest 0: Ok
  51.2: builtin clang compile C source to ELF object      :
  --- start ---
  test child forked, pid 21823
  test child finished with 0
  ---- end ----
  builtin clang support subtest 1: Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-15-wangnan0@huawei.com
[ Remove redundant "Test" from entry descriptions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
e67d52d411 perf clang: Update test case to use real BPF script
Allow C++ code to use util.h and tests/llvm.h. Let 'perf test' compile a
real BPF script.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-14-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
a9495fe9dc perf clang: Allow passing CFLAGS to builtin clang
Improve getModuleFromSource() API to accept a cflags list. This feature
will be used to pass LINUX_VERSION_CODE and -I flags.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-13-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
77dfa84a84 perf clang: Use real file system for #include
Utilize clang's OverlayFileSystem facility, allow CompilerInstance to
access real file system.

With this patch the '#include' directive can be used.

Add a new getModuleFromSource for real file.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-12-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
00b86691c7 perf clang: Add builtin clang support ant test case
Add basic clang support in clang.cpp and test__clang() testcase. The
first testcase checks if builtin clang is able to generate LLVM IR.

tests/clang.c is a proxy. Real testcase resides in
utils/c++/clang-test.cpp in c++ and exports C interface to perf test
subsystem.

Test result:

   $ perf test -v clang
   51: builtin clang support                               :
   51.1: Test builtin clang compile C source to IR              :
   --- start ---
   test child forked, pid 13215
   test child finished with 0
   ---- end ----
   Test builtin clang support subtest 0: Ok

Committer note:

Make sure you've enabled CLANG and LLVM builtin support by setting
the LIBCLANGLLVM variable on the make command line, e.g.:

  make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin

Otherwise you'll get this when trying to do the 'perf test' call above:

  # perf test clang
  51: builtin clang support                      : Skip (not compiled in)
  #

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com
[ Removed "Test" from descriptions, redundant and already removed from all the other entries ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:43 -03:00
Wang Nan
d58ac0bf8d perf build: Add clang and llvm compile and linking support
Add necessary c++ flags and link libraries to support builtin clang and
LLVM. Add all llvm and clang libraries, so don't need to worry about
clang changes its libraries setting. However, linking perf would take
much longer than usual.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-10-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:43 -03:00
Wang Nan
c7fb4f62e2 tools build: Add feature detection for clang
Check if basic clang compiling environment is ready.

Doesn't like 'llvm-config --libs' which can returns llvm libraries in right
order and duplicates some libraries if necessary, there's no correspondence for
clang libraries (-lclangxxx). to avoid extra complexity and to avoid new clang
breaking libraries ordering, use --start-group and --end-group.

In this test case, manually identify required clang libs and hope it to be
stable. Putting all clang libraries here is possible (use make's wildcard), but
then feature checking becomes very slow.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-9-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:43 -03:00
Wang Nan
cb40d55b59 tools build: Add feature detection for LLVM
Check if basic LLVM compiling environment is ready.

Use llvm-config to detect include and library directories. Avoid using
'llvm-config --cxxflags' because its result contain some unwanted flags
like --sysroot (if LLVM is built by yocto).

Use '?=' to set LLVM_CONFIG, so explicitly passing LLVM_CONFIG to make
would override it.

Use 'llvm-config --libs BPF' to check if BPF backend is compiled in.
Since now BPF bytecode is the only required backend, no need to waste
time linking llvm and clang if BPF backend is missing. This also
introduce an implicit requirement that LLVM should be new enough.  Old
LLVM doesn't support BPF backend.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-8-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Wang Nan
2bd42de0e1 perf llvm: Extract helpers in llvm-utils.c
The following commits will use builtin clang to compile BPF scripts.

llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help
building '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros.

Doing object dumping in bpf loader, so further builtin clang compiling
needn't consider it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-7-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Wang Nan
8ad85e9e6f perf tools: Pass context to perf hook functions
Pass a pointer to perf hook functions so they receive context
information during setup.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-6-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Peter Foley
baa1973ebc tools build: Fix objtool build with clang
Clang doesn't support multiple arguments being passed to -Wp, so split
them.

  Fixes this error:
  HOSTCC   tools/objtool/fixdep.o
  cat: tools/objtool/.fixdep.o.d: No such file or directory

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161128024346.17371-1-pefoley2@pefoley.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Jiri Olsa
1cd6472e3f tools build: Make fixdep parsing wait for last target
The fixdep tool, among other things, replaces the target of the object
in the gcc generated dependency output file.

The parsing code assumes there's only single target in the rule but this
is not always the case as described in here:

  https://gcc.gnu.org/ml/gcc-help/2016-11/msg00099.html

Make the fixdep code smart enough to skip all the possible targets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Foley <pefoley2@pefoley.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161201130025.GA16430@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Gianluca Borello
3c839744b3 bpf: Preserve const register type on const OR alu ops
Occasionally, clang (e.g. version 3.8.1) translates a sum between two
constant operands using a BPF_OR instead of a BPF_ADD. The verifier is
currently not handling this scenario, and the destination register type
becomes UNKNOWN_VALUE even if it's still storing a constant. As a result,
the destination register cannot be used as argument to a helper function
expecting a ARG_CONST_STACK_*, limiting some use cases.

Modify the verifier to handle this case, and add a few tests to make sure
all combinations are supported, and stack boundaries are still verified
even with BPF_OR.

Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-05 13:40:05 -05:00
Kim Phillips
0fcb1da4ab perf annotate: AArch64 support
This is a regex converted version from the original:

	https://lkml.org/lkml/2016/5/19/461

Add basic support to recognise AArch64 assembly. This allows perf to
identify AArch64 instructions that branch to other parts within the
same function, thereby properly annotating them.

Rebased onto new cross-arch annotation bits:

	https://lkml.org/lkml/2016/11/25/546

Sample output:

security_file_permission  vmlinux
  5.80 │    ← ret                                                  ▒
       │70:   ldr    w0, [x21,#68]                                 ▒
  4.44 │    ↓ tbnz   d0                                            ▒
       │      mov    w0, #0x24                       // #36        ▒
  1.37 │      ands   w0, w22, w0                                   ▒
       │    ↑ b.eq   60                                            ▒
  1.37 │    ↓ tbnz   e4                                            ▒
       │      mov    w19, #0x20000                   // #131072    ▒
  1.02 │    ↓ tbz    ec                                            ▒
       │90:┌─→ldr    x3, [x21,#24]                                 ▒
  1.37 │   │  add    x21, x21, #0x10                               ▒
       │   │  mov    w2, w19                                       ▒
  1.02 │   │  mov    x0, x21                                       ▒
       │   │  mov    x1, x3                                        ▒
  1.71 │   │  ldr    x20, [x3,#48]                                 ▒
       │   │→ bl     __fsnotify_parent                             ▒
  0.68 │   │↑ cbnz   60                                            ▒
       │   │  mov    x2, x21                                       ▒
  1.37 │   │  mov    w1, w19                                       ▒
       │   │  mov    x0, x20                                       ▒
  0.68 │   │  mov    w5, #0x0                        // #0         ▒
       │   │  mov    x4, #0x0                        // #0         ▒
  1.71 │   │  mov    w3, #0x1                        // #1         ▒
       │   │→ bl     fsnotify                                      ▒
  1.37 │   │↑ b      60                                            ▒
       │d0:│  mov    w0, #0x0                        // #0         ▒
       │   │  ldp    x19, x20, [sp,#16]                            ▒
       │   │  ldp    x21, x22, [sp,#32]                            ▒
       │   │  ldp    x29, x30, [sp],#48                            ▒
       │   │← ret                                                  ▒
       │e4:│  mov    w19, #0x10000                   // #65536     ▒
       │   └──b      90                                            ◆
       │ec:   brk    #0x800                                        ▒
Press 'h' for help on key bindings

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092344.012e18e3e623bea395162f95@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:19 -03:00
Kim Phillips
859afa6ca9 perf annotate: Use arch->objdump.comment_char in dec__parse()
Presume neglected in commit 786c1b5 "perf annotate: Start supporting
cross arch annotation".  This doesn't fix a bug since none of the
affected arches support parsing dec/inc instructions yet.

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Ryder <chris.ryder@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092333.1cca5dd2c77e1790d61c1e9c@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:18 -03:00
David Ahern
46690a8051 perf report: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

Using the perf.data file captured via 'perf kmem record':

  # perf report --header-only
  # ========
  # captured on: Tue Nov 29 16:01:53 2016
  # hostname : jouet
  # os release : 4.8.8-300.fc25.x86_64
  # perf version : 4.9.rc6.g5a6aca
  # arch : x86_64
  # nrcpus online : 4
  # nrcpus avail : 4
  # cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
  # cpuid : GenuineIntel,6,61,4
  # total memory : 20254660 kB
  # cmdline : /home/acme/bin/perf kmem record usleep 1
  # event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ
  # event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl
  # event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type
  # event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s
  # event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } =
  # event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im
  # HEADER_CACHE info available, use -I to display
  # missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT
  # ========
  #
  # # Looking at just the histogram entries for the first event:
  #
  # perf report  | head -33
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 40  of event 'kmem:kmalloc'
  # Event count (approx.): 40
  #
  # Overhead  Trace output
  # ........  ...............................................................................................................
  #
    37.50%  call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
    10.00%  call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
     7.50%  call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
     2.50%  call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO
     2.50%  call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO

  #
  # # And then limiting using the example for 'perf kmem stat --time' used
  # # in the previous changeset committer note we see that there were no
  # # kmem:kmalloc in that last part of the file, but there were some
  # # kmem:kmem_cache_alloc ones:
  #
  # perf report --time 20119.782088, --stdio
  #
  # Total Lost Samples: 0
  #
  # Samples: 0  of event 'kmem:kmalloc'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 0  of event 'kmem:kmalloc_node'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 0  of event 'kmem:kfree'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 8  of event 'kmem:kmem_cache_alloc'
  # Event count (approx.): 8
  #
  # Overhead  Trace output
  # ........  ..................................................................................................................
  #
    75.00%  call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    12.50%  call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
    12.50%  call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:10 -03:00
David Ahern
2a865bd8dd perf kmem: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

  # perf kmem record usleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.540 MB perf.data (2049 samples) ]
  # perf evlist
  kmem:kmalloc
  kmem:kmalloc_node
  kmem:kfree
  kmem:kmem_cache_alloc
  kmem:kmem_cache_alloc_node
  kmem:kmem_cache_free
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #
  # # Use 'perf script' to get a first approach, select a chunk for then using
  # # with 'perf kmem stat --time'
  #
  # perf script | tail -15
    usleep 9889 [0] 20119.782088:  kmem:kmem_cache_free: (selinux_file_free_security+0x27) call_site=ffffffffb936aa07 ptr=0xffff888a1df49fc0
      perf 9888 [3] 20119.782088:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782089: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782090:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782090: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    usleep 9889 [0] 20119.782091: kmem:kmem_cache_alloc: (__sigqueue_alloc+0x4a) call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
      perf 9888 [3] 20119.782091:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782093:  kmem:kmem_cache_free: (__sigqueue_free.part.17+0x33) call_site=ffffffffb90ad3f3 ptr=0xffff8889f071f6e0
      perf 9888 [3] 20119.782098: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782098:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782099: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782100: kmem:kmem_cache_alloc: (alloc_buffer_head+0x21) call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782101:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782102: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782103:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
  #
  # # stats for the whole perf.data file, i.e. no interval specified
  #
  # perf kmem stat

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 172,628
  Total bytes allocated: 173,088
  Total bytes freed:     161,280
  Net total bytes allocated: 11,808
  Total bytes wasted on internal fragmentation: 460
  Internal fragmentation: 0.265761%
  Cross CPU allocations: 0/851
  #
  # # stats for an end open interval, after a certain time:
  #
  # perf kmem stat --time 20119.782088,

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 552
  Total bytes allocated: 552
  Total bytes freed:     448
  Net total bytes allocated: 104
  Total bytes wasted on internal fragmentation: 0
  Internal fragmentation: 0.000000%
  Cross CPU allocations: 0/8
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:02 -03:00
David Ahern
853b740711 perf sched timehist: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

  # perf sched record -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.593 MB perf.data (25 samples) ]
  #
  # perf sched timehist | head -18
  Samples do not have callchains.
          time    cpu   task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  --------------- ---------  ---------  --------
   19818.635579 [0002]  <idle>              0.000      0.000     0.000
   19818.635613 [0000]  perf[9116]          0.000      0.000     0.000
   19818.635676 [0000]  <idle>              0.000      0.000     0.063
   19818.635678 [0000]  rcuos/2[29]         0.000      0.002     0.001
   19818.635696 [0002]  perf[9117]          0.000      0.004     0.116
   19818.635702 [0000]  <idle>              0.001      0.000     0.024
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.636263 [0000]  usleep[9117]        0.005      0.000     0.560
   19818.636316 [0000]  <idle>              0.560      0.000     0.053
   19818.636358 [0002]  <idle>              0.129      0.000     0.649
   19818.636358 [0000]  usleep[9117]        0.053      0.002     0.042
  #

  # perf sched timehist --time 19818.635696,
  Samples do not have callchains.
           time    cpu  task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  ---------------  --------  --------- ---------
   19818.635696 [0002]  perf[9117]          0.000      0.120     0.000
   19818.635702 [0000]  <idle>              0.019      0.000     0.006
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.636263 [0000]  usleep[9117]        0.005      0.000     0.560
   19818.636316 [0000]  <idle>              0.560      0.000     0.053
   19818.636358 [0002]  <idle>              0.129      0.000     0.649
   19818.636358 [0000]  usleep[9117]        0.053      0.002     0.042
  #
  # perf sched timehist --time 19818.635696,19818.635709
  Samples do not have callchains.
           time    cpu  task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  --------------- ---------  --------- ---------
   19818.635696 [0002]  perf[9117]          0.000      0.120     0.000
   19818.635702 [0000]  <idle>              0.019      0.000     0.006
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.635709 [0000]  usleep[9117]        0.005      0.000     0.006
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:52 -03:00
David Ahern
a91f4c473f perf script: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for some amount of time and analyze a segment of interest within that
window.

Committer notes:

Testing it:

  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  #
  # perf script --hide-call-graph | head -15
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370048:    126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370049:   2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370051:  58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370059:      1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370062:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370076:      1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370091:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370095:      3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  #
  # perf script --hide-call-graph --time ,9693.370048
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  # perf script --hide-call-graph --time 9693.370064,9693.370076
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:45 -03:00
David Ahern
c284d669a2 perf tools: Move parse_nsec_time to time-utils.c
Code move only; no functional change intended.

Committer notes:

Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this
set of auto-detected features:

  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ OFF ]
  ...                      libaudit: [ OFF ]
  ...                        libbfd: [ OFF ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ OFF ]
  ...        numa_num_possible_cpus: [ OFF ]
  ...                       libperl: [ OFF ]
  ...                     libpython: [ OFF ]
  ...                      libslang: [ OFF ]
  ...                     libcrypto: [ OFF ]
  ...                     libunwind: [ OFF ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ OFF ]
  ...                     get_cpuid: [ OFF ]
  ...                           bpf: [ on  ]

Where it was failing with:

    CC       /tmp/build/perf/util/time-utils.o
  util/time-utils.c: In function 'parse_nsec_time':
  util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration]
    time_sec = strtoul(str, &end, 10);
               ^
  util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs]
    time_sec = strtoul(str, &end, 10);
    ^
  util/time-utils.c: In function 'perf_time__parse_str':
  util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
    free(str);
    ^
  util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror]
  util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free'

Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul()
declarations and fix the build.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:39 -03:00
David Ahern
fdf9dc4b34 perf tools: Add time-based utility functions
Add function to parse a user time string of the form <start>,<stop>
where start and stop are time in sec.nsec format. Both start and stop
times are optional.

Add function to determine if a sample time is within a given time
time window of interest.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:32 -03:00