linux/tools
Joseph Schuchart c0268e8d1f perf script python: Fix mem leak due to missing Py_DECREFs on dict entries
We are using the Python scripting interface in perf to extract kernel
events relevant for performance analysis of HPC codes. We noticed that
the "perf script" call allocates a significant amount of memory (in the
order of several 100 MiB) during it's run, e.g. 125 MiB for a 25 MiB
input file:

  $> perf record -o perf.data -a -R -g fp \
       -e power:cpu_frequency -e sched:sched_switch \
       -e sched:sched_migrate_task -e sched:sched_process_exit \
       -e sched:sched_process_fork -e sched:sched_process_exec \
       -e cycles  -m 4096 --freq 4000
  $> /usr/bin/time perf script -i perf.data -s dummy_script.py
  0.84user 0.13system 0:01.92elapsed 51%CPU (0avgtext+0avgdata
  125532maxresident)k
  73072inputs+0outputs (57major+33086minor)pagefaults 0swaps

Upon further investigation using the valgrind massif tool, we noticed
that Python objects that are created in trace-event-python.c via
PyString_FromString*() (and their Integer and Long counterparts) are
never free'd.

The reason for this seem to be missing Py_DECREF calls on the objects
that are returned by these functions and stored in the Python
dictionaries. The Python dictionaries do not steal references (as
opposed to Python tuples and lists) but instead add their own reference.

Hence, the reference that is returned by these object creation functions
is never released and the memory is leaked. (see [1,2])

The attached patch fixes this by wrapping all relevant calls to
PyDict_SetItemString() and decrementing the reference counter
immediately after the Python function call.

This reduces the allocated memory to a reasonable amount:

  $> /usr/bin/time perf script -i perf.data -s dummy_script.py
  0.73user 0.05system 0:00.79elapsed 99%CPU (0avgtext+0avgdata
  49132maxresident)k
  0inputs+0outputs (0major+14045minor)pagefaults 0swaps

For comparison, with a 120 MiB input file the memory consumption
reported by time drops from almost 600 MiB to 146 MiB.

The patch has been tested using Linux 3.8.2 with Python 2.7.4 and Linux
3.11.6 with Python 2.7.5.

Please let me know if you need any further information.

[1] http://docs.python.org/2/c-api/tuple.html#PyTuple_SetItem
[2] http://docs.python.org/2/c-api/dict.html#PyDict_SetItemString

Signed-off-by: Joseph Schuchart <joseph.schuchart@tu-dresden.de>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: http://lkml.kernel.org/r/1381468543-25334-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-24 10:16:54 -03:00
..
cgroup cgroups: fix cgroup_event_listener error handling 2013-01-08 10:00:44 -08:00
firewire tools/firewire: nosy-dump: check for allocation failure 2012-12-02 20:10:18 +01:00
hv Tools: hv: use full nlmsghdr in netlink_send 2013-08-12 15:44:57 -07:00
include/tools tools/include: use stdint types for user-space byteshift headers 2013-07-03 16:02:28 +02:00
lguest tools/lguest: offer VIRTIO_F_ANY_LAYOUT for net device. 2013-07-15 11:18:32 +09:30
lib tools lib lk: Uninclude linux/magic.h in debugfs.c 2013-09-19 15:08:53 -03:00
net filter: add minimal BPF JIT image disassembler 2013-03-21 11:35:41 -04:00
nfsd
perf perf script python: Fix mem leak due to missing Py_DECREFs on dict entries 2013-10-24 10:16:54 -03:00
power cpupower: Add Haswell family 0x45 specific idle monitor to show PC8,9,10 states 2013-07-05 01:52:19 +02:00
scripts tools: Get only verbose output with V=1 2013-07-08 17:31:34 -03:00
testing tools/testing/selftests: fix uninitialized variable 2013-10-16 21:35:53 -07:00
usb tools: usb: ffs-test: Fix build failure 2013-03-07 12:23:17 +08:00
virtio virtio tools: add .gitignore 2013-07-15 11:18:31 +09:30
vm tools/vm: Switch to liblk library 2013-03-15 13:06:01 -03:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-05-01 14:08:52 -07:00