linux/tools/perf/jvmti/jvmti_agent.c
Stephane Eranian 209045adc2 perf tools: add JVMTI agent library
This is a standalone JVMTI library to help profile Java jitted code with perf
record/perf report. The library is not installed or compiled automatically by
perf Makefile. It is not used directly by perf. It is arch agnostic and has
been tested on X86 and ARM. It needs to be used with a Java runtime, such as
OpenJDK, as follows:

  $ java -agentpath:libjvmti.so .......

See the "Committer Notes" below on how to build it.

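When used this way, java will generate a jitdump binary file,
jit-<pid>.dump, in a $HOME/.debug/jit/java-jit-* directory (the base
directory is taken from $JITDUMPDIR instead of $HOME when that is set).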

This binary dump file contains information to help symbolize and
annotate jitted code.
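
As a rough illustration of where the dump ends up, the sketch below builds
the mkdtemp() template for the per-run jitdump directory, following the
layout used by debug_cache_init() in this file. The helper name
jit_dir_template() is hypothetical and not part of the agent; the caller is
assumed to pick the base directory itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not part of the agent): fill buf with the
 * mkdtemp() template for the per-run jitdump directory, i.e.
 *   <base>/.debug/jit/java-jit-YYYYMMDD.XXXXXXXX
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int jit_dir_template(char *buf, size_t len, const char *base,
			    const struct tm *tm)
{
	char stamp[32];
	int n;

	assert(buf && base && tm);

	/* date stamp, same format string as the agent uses */
	if (!strftime(stamp, sizeof(stamp), "java-jit-%Y%m%d", tm))
		return -1;

	n = snprintf(buf, len, "%s/.debug/jit/%s.XXXXXXXX", base, stamp);

	/* snprintf() returns the untruncated length; reject truncation */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would pass $JITDUMPDIR, $HOME or "." as base, in that order of
preference, and then hand the filled buffer to mkdtemp().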

The jitdump information must be injected into the perf.data file
using:

  $ perf inject --jit -i perf.data -o perf.data.jitted

This injects the MMAP records to cover the jitted code and also generates
one ELF image for each jitted function. The ELF images are created in the
same subdir as the jitdump file. The MMAP records point there too.

Then, to visualize the function or asm profile, simply use the regular
perf commands:

  $ perf report -i perf.data.jitted

or

  $ perf annotate -i perf.data.jitted

JVMTI agent code adapted from OProfile's opagent code.

This version of the JVMTI agent uses CLOCK_MONOTONIC as the time
source to timestamp jit samples. To correlate with perf_events samples,
it needs to run on kernel 4.0.0-rc5 or later, which includes the following
commit from Peter Zijlstra:

  34f439278c ("perf: Add per event clockid support")

With this patch, recording jitted code is done as follows:

   $ perf record -k mono -- java -agentpath:libjvmti.so .......
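
For reference, a minimal sketch of how a CLOCK_MONOTONIC timestamp can be
turned into the nanosecond value stored in jitdump records. It mirrors
timespec_to_ns() and perf_get_timestamp() from this file; the name
monotonic_now_ns() is made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#endif
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* convert a struct timespec to a single nanosecond count */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/*
 * Hypothetical wrapper: read CLOCK_MONOTONIC, the clock that
 * 'perf record -k mono' selects for its samples.
 * Returns 0 if the clock cannot be read.
 */
static uint64_t monotonic_now_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts))
		return 0;

	return timespec_to_ns(&ts);
}
```

Because both sides read the same clock, the jitdump timestamps and the
perf sample timestamps can be compared directly.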

 --------------------------------------------------------------------------

Committer Notes:

Extended testing instructions:

  $ cd tools/perf/jvmti/
  $ dnf install java-devel
  $ make

Then, create some simple java stuff to record some samples:

  $ cat hello.java
  public class hello {
	public static void main(String[] args) {
                 System.out.println("Hello, World");
       	}
  }
  $ javac hello.java
  $ java hello
  Hello, World
  $

And then record it using this jvmti thing:

  $ perf record -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hello
  java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jit-1908.dump
  Hello, World
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (268 samples) ]
  $

Now let's insert the PERF_RECORD_MMAP2 records to point jitted mmaps to
files created by the agent:

  $ perf inject --jit -i perf.data -o perf.data.jitted

And finally see that it did its job:

  $ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | tail -5
  79197149129422 0xfe10 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428bd60(0x80) @ 0x40 fd:02 1840554 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-283.so
  79197149235701 0xfeb0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428ba60(0x180) @ 0x40 fd:02 1840555 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-284.so
  79197149250558 0xff50 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b860(0x180) @ 0x40 fd:02 1840556 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-285.so
  79197149714746 0xfff0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b660(0x180) @ 0x40 fd:02 1840557 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-286.so
  79197149806558 0x10090 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b460(0x180) @ 0x40 fd:02 1840558 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-287.so
  $

So:

  $ perf report -D -i perf.data | grep PERF_RECORD_MMAP2 | wc -l
  Failed to open /tmp/perf-1908.map, continuing without symbols
  21
  $ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | wc -l
  307
  $ echo $((307 - 21))
  286
  $

286 extra PERF_RECORD_MMAP2 records.

All for these tiny ELF files, each with just one function:

  $ file /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so
  /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), corrupted program header size, BuildID[sha1]=ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f, not stripped
  $ readelf -sw /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so

  Symbol table '.symtab' contains 2 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000040     9 FUNC    LOCAL  DEFAULT    1 atomic_cmpxchg_long
  $

Inserted into the build-id cache:

  $ ls -la ~/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f
  lrwxrwxrwx. 1 acme acme 111 Feb  5 11:30 /home/acme/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f -> ../../home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so/ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f

Note: check why 'file' reports that 'corrupted program header size'.

With a stupid java hog to do some profiling:

  $ cat hog.java
  public class hog {
	private static double do_something_else(int i) {
		double total = 0;
		while (i > 0) {
			total += Math.log(i--);
		}
		return total;
	}
	private static double do_something(int i) {
		double total = 0;
		while (i > 0) {
			total += Math.sqrt(i--) + do_something_else(i / 100);
		}
		return total;
	}
	public static void main(String[] args) {
		System.out.println(String.format("%s=%f & %f", args[0],
				   do_something(Integer.parseInt(args[0])),
				   do_something_else(Integer.parseInt(args[1]))));
	}
  }
  $ javac hog.java
  $ perf record -F 10000 -g -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hog 100000 2345000
  java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XX4sqd14/jit-8670.dump
  100000=291561592.669602 & 32050989.778714
  [ perf record: Woken up 6 times to write data ]
  [ perf record: Captured and wrote 1.536 MB perf.data (12538 samples) ]
  $ perf inject --jit -i perf.data -o perf.data.jitted

Looking at the 'perf report' TUI, at one expanded callchain leading
to the jitted code:

  $ perf report --no-children -i perf.data.jitted

Samples: 12K of event 'cycles:pp', Event count (approx.): 3829569932
  Overhead  Comm  Shared Object       Symbol
-   93.38%  java  jitted-8670-291.so  [.] class hog.do_something_else(int)
     class hog.do_something_else(int)
   - Interpreter
      - 75.86% call_stub
           JavaCalls::call_helper
           jni_invoke_static
           jni_CallStaticVoidMethod
           JavaMain
           start_thread
      - 17.52% JavaCalls::call_helper
           jni_invoke_static
           jni_CallStaticVoidMethod
           JavaMain
           start_thread

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-4-git-send-email-eranian@google.com
[ Made it build on fedora23, added some build/usage instructions ]
[ Check if filename != NULL in compiled_method_load_cb, fixing segfault ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 12:26:31 -03:00

466 lines
9.3 KiB
C

/*
* jvmti_agent.c: JVMTI agent interface
*
* Adapted from the Oprofile code in opagent.c:
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
* Copyright 2007 OProfile authors
* Jens Wilke
* Daniel Hansel
* Copyright IBM Corporation 2007
*/
#include <sys/types.h>
#include <sys/stat.h> /* for mkdir() */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <stdint.h>
#include <limits.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <sys/mman.h>
#include <syscall.h> /* for gettid() */
#include <err.h>
#include "jvmti_agent.h"
#include "../util/jitdump.h"
#define JIT_LANG "java"
static char jit_path[PATH_MAX];
static void *marker_addr;
/*
* padding buffer
*/
static const char pad_bytes[7];
static inline pid_t gettid(void)
{
return (pid_t)syscall(__NR_gettid);
}
static int get_e_machine(struct jitheader *hdr)
{
ssize_t sret;
char id[16];
int fd, ret = -1;
int m = -1;
struct {
uint16_t e_type;
uint16_t e_machine;
} info;
fd = open("/proc/self/exe", O_RDONLY);
if (fd == -1)
return -1;
sret = read(fd, id, sizeof(id));
if (sret != sizeof(id))
goto error;
/* check ELF signature */
if (id[0] != 0x7f || id[1] != 'E' || id[2] != 'L' || id[3] != 'F')
goto error;
sret = read(fd, &info, sizeof(info));
if (sret != sizeof(info))
goto error;
m = info.e_machine;
if (m < 0)
m = 0; /* ELF EM_NONE */
hdr->elf_mach = m;
ret = 0;
error:
close(fd);
return ret;
}
#define NSEC_PER_SEC 1000000000
static int perf_clk_id = CLOCK_MONOTONIC;
static inline uint64_t
timespec_to_ns(const struct timespec *ts)
{
return ((uint64_t) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
}
static inline uint64_t
perf_get_timestamp(void)
{
struct timespec ts;
int ret;
ret = clock_gettime(perf_clk_id, &ts);
if (ret)
return 0;
return timespec_to_ns(&ts);
}
static int
debug_cache_init(void)
{
char str[32];
char *base, *p;
struct tm tm;
time_t t;
int ret;
time(&t);
localtime_r(&t, &tm);
base = getenv("JITDUMPDIR");
if (!base)
base = getenv("HOME");
if (!base)
base = ".";
strftime(str, sizeof(str), JIT_LANG"-jit-%Y%m%d", &tm);
snprintf(jit_path, PATH_MAX - 1, "%s/.debug/", base);
ret = mkdir(jit_path, 0755);
if (ret == -1) {
if (errno != EEXIST) {
warn("jvmti: cannot create jit cache dir %s", jit_path);
return -1;
}
}
snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit", base);
ret = mkdir(jit_path, 0755);
if (ret == -1) {
if (errno != EEXIST) {
warn("cannot create jit cache dir %s", jit_path);
return -1;
}
}
snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit/%s.XXXXXXXX", base, str);
p = mkdtemp(jit_path);
if (p != jit_path) {
warn("cannot create jit cache dir %s", jit_path);
return -1;
}
return 0;
}
static int
perf_open_marker_file(int fd)
{
long pgsz;
pgsz = sysconf(_SC_PAGESIZE);
if (pgsz == -1)
return -1;
/*
* we mmap the jitdump to create an MMAP RECORD in perf.data file.
* The mmap is captured either live (perf record running when we mmap)
* or in deferred mode, via /proc/PID/maps
* the MMAP record is used as a marker of a jitdump file for more meta
* data info about the jitted code. Perf report/annotate detect this
* special filename and process the jitdump file.
*
* mapping must be PROT_EXEC to ensure it is captured by perf record
* even when not using -d option
*/
marker_addr = mmap(NULL, pgsz, PROT_READ|PROT_EXEC, MAP_PRIVATE, fd, 0);
if (marker_addr == MAP_FAILED) {
/* keep the NULL check in perf_close_marker_file() valid */
marker_addr = NULL;
return -1;
}
return 0;
}
static void
perf_close_marker_file(void)
{
long pgsz;
if (!marker_addr)
return;
pgsz = sysconf(_SC_PAGESIZE);
if (pgsz == -1)
return;
munmap(marker_addr, pgsz);
}
void *jvmti_open(void)
{
int pad_cnt;
char dump_path[PATH_MAX];
struct jitheader header;
int fd;
FILE *fp;
/*
* check if clockid is supported
*/
if (!perf_get_timestamp())
warnx("jvmti: kernel does not support %d clock id", perf_clk_id);
memset(&header, 0, sizeof(header));
/* without a cache directory there is nowhere to put the dump */
if (debug_cache_init())
return NULL;
/*
* jitdump file name
*/
snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
fd = open(dump_path, O_CREAT|O_TRUNC|O_RDWR, 0666);
if (fd == -1)
return NULL;
/*
* create perf.data marker for the jitdump file
*/
if (perf_open_marker_file(fd)) {
warnx("jvmti: failed to create marker file");
close(fd);
return NULL;
}
fp = fdopen(fd, "w+");
if (!fp) {
warn("jvmti: cannot create %s", dump_path);
/* fp is NULL here, so do not fall through to fclose() */
close(fd);
perf_close_marker_file();
return NULL;
}
warnx("jvmti: jitdump in %s", dump_path);
if (get_e_machine(&header)) {
warn("jvmti: get_e_machine failed");
goto error;
}
header.magic = JITHEADER_MAGIC;
header.version = JITHEADER_VERSION;
header.total_size = sizeof(header);
header.pid = getpid();
/* calculate amount of padding '\0' */
pad_cnt = PADDING_8ALIGNED(header.total_size);
header.total_size += pad_cnt;
header.timestamp = perf_get_timestamp();
if (!fwrite(&header, sizeof(header), 1, fp)) {
warn("jvmti: cannot write dumpfile header");
goto error;
}
/* write padding '\0' if necessary */
if (pad_cnt && !fwrite(pad_bytes, pad_cnt, 1, fp)) {
warn("jvmti: cannot write dumpfile header padding");
goto error;
}
return fp;
error:
fclose(fp);
return NULL;
}
int
jvmti_close(void *agent)
{
struct jr_code_close rec;
FILE *fp = agent;
if (!fp) {
warnx("jvmti: invalid fd in close_agent");
return -1;
}
rec.p.id = JIT_CODE_CLOSE;
rec.p.total_size = sizeof(rec);
rec.p.timestamp = perf_get_timestamp();
if (!fwrite(&rec, sizeof(rec), 1, fp))
return -1;
fclose(fp);
fp = NULL;
perf_close_marker_file();
return 0;
}
int
jvmti_write_code(void *agent, char const *sym,
uint64_t vma, void const *code, unsigned int const size)
{
static int code_generation = 1;
struct jr_code_load rec;
size_t sym_len;
size_t padding_count;
FILE *fp = agent;
int ret = -1;
/* don't care about 0 length function, no samples */
if (size == 0)
return 0;
if (!fp) {
warnx("jvmti: invalid fd in write_native_code");
return -1;
}
sym_len = strlen(sym) + 1;
rec.p.id = JIT_CODE_LOAD;
rec.p.total_size = sizeof(rec) + sym_len;
padding_count = PADDING_8ALIGNED(rec.p.total_size);
rec.p.total_size += padding_count;
rec.p.timestamp = perf_get_timestamp();
rec.code_size = size;
rec.vma = vma;
rec.code_addr = vma;
rec.pid = getpid();
rec.tid = gettid();
if (code)
rec.p.total_size += size;
/*
* If JVM is multi-threaded, multiple concurrent calls to agent
* may be possible, so protect file writes
*/
flockfile(fp);
/*
* get code index inside lock to avoid race condition
*/
rec.code_index = code_generation++;
/* report a short write instead of silently returning success */
if (fwrite_unlocked(&rec, sizeof(rec), 1, fp) != 1)
goto out;
if (fwrite_unlocked(sym, sym_len, 1, fp) != 1)
goto out;
if (padding_count &&
fwrite_unlocked(pad_bytes, padding_count, 1, fp) != 1)
goto out;
if (code && fwrite_unlocked(code, size, 1, fp) != 1)
goto out;
ret = 0;
out:
funlockfile(fp);
return ret;
}
int
jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
jvmtiAddrLocationMap const *map,
jvmtiLineNumberEntry *li, jint num)
{
static const char *prev_str = "\xff";
struct jr_code_debug_info rec;
size_t sret, len, size, flen;
size_t padding_count;
FILE *fp = agent;
int i;
/*
* no entry to write
*/
if (!num)
return 0;
if (!fp) {
warnx("jvmti: invalid fd in write_debug_info");
return -1;
}
flen = strlen(file) + 1;
rec.p.id = JIT_CODE_DEBUG_INFO;
size = sizeof(rec);
rec.p.timestamp = perf_get_timestamp();
rec.code_addr = (uint64_t)(uintptr_t)code;
rec.nr_entry = num;
/*
* on disk source line info layout:
* uint64_t : addr
* int : line number
* file[] : source file name
* padding : pad to multiple of 8 bytes
*/
size += num * (sizeof(uint64_t) + sizeof(int));
size += flen + (num - 1) * 2;
/*
* pad to 8 bytes
*/
padding_count = PADDING_8ALIGNED(size);
rec.p.total_size = size + padding_count;
/*
* If JVM is multi-threaded, multiple concurrent calls to agent
* may be possible, so protect file writes
*/
flockfile(fp);
sret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
if (sret != 1)
goto error;
for (i = 0; i < num; i++) {
uint64_t addr;
addr = (uint64_t)map[i].start_address;
len = sizeof(addr);
sret = fwrite_unlocked(&addr, len, 1, fp);
if (sret != 1)
goto error;
len = sizeof(int);
sret = fwrite_unlocked(&li[i].line_number, len, 1, fp);
if (sret != 1)
goto error;
if (i == 0) {
sret = fwrite_unlocked(file, flen, 1, fp);
} else {
sret = fwrite_unlocked(prev_str, 2, 1, fp);
}
if (sret != 1)
goto error;
}
if (padding_count)
sret = fwrite_unlocked(pad_bytes, padding_count, 1, fp);
if (sret != 1)
goto error;
funlockfile(fp);
return 0;
error:
funlockfile(fp);
return -1;
}