linux/mm
Linus Torvalds 654443e20d Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull user-space probe instrumentation from Ingo Molnar:
 "The uprobes code originates from SystemTap and has been used for years
  in Fedora and RHEL kernels.  This version is much rewritten, reviews
  from PeterZ, Oleg and myself shaped the end result.

  This tree includes uprobes support in 'perf probe' - but SystemTap
  (and other tools) can take advantage of user probe points as well.

  Sample usage of uprobes via perf, for example to profile malloc()
  calls without modifying user-space binaries.

  First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled.

  If you don't know which function you want to probe you can pick one
  from 'perf top' or can get a list all functions that can be probed
  within libc (binaries can be specified as well):

	$ perf probe -F -x /lib/libc.so.6

  To probe libc's malloc():

	$ perf probe -x /lib64/libc.so.6 malloc
	Added new event:
	probe_libc:malloc    (on 0x7eac0)

  You can now use it in all perf tools, such as:

	perf record -e probe_libc:malloc -aR sleep 1

  Make use of it to create a call graph (as the flat profile is going to
  look very boring):

	$ perf record -e probe_libc:malloc -gR make
	[ perf record: Woken up 173 times to write data ]
	[ perf record: Captured and wrote 44.190 MB perf.data (~1930712

	$ perf report | less

	  32.03%            git  libc-2.15.so   [.] malloc
	                    |
	                    --- malloc

	  29.49%            cc1  libc-2.15.so   [.] malloc
	                    |
	                    --- malloc
	                       |
	                       |--0.95%-- 0x208eb1000000000
	                       |
	                       |--0.63%-- htab_traverse_noresize

	  11.04%             as  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |

	   7.15%             ld  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |

	   5.07%             sh  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |
	   4.99%  python-config  libc-2.15.so   [.] malloc
	          |
	          --- malloc
	             |
	   4.54%           make  libc-2.15.so   [.] malloc
	                   |
	                   --- malloc
	                      |
	                      |--7.34%-- glob
	                      |          |
	                      |          |--93.18%-- 0x41588f
	                      |          |
	                      |           --6.82%-- glob
	                      |                     0x41588f

	   ...

  Or:

	$ perf report -g flat | less

	# Overhead        Command  Shared Object      Symbol
	# ........  .............  .............  ..........
	#
	  32.03%            git  libc-2.15.so   [.] malloc
	          27.19%
	              malloc

	  29.49%            cc1  libc-2.15.so   [.] malloc
	          24.77%
	              malloc

	  11.04%             as  libc-2.15.so   [.] malloc
	          11.02%
	              malloc

	   7.15%             ld  libc-2.15.so   [.] malloc
	           6.57%
	              malloc

	 ...

  The core uprobes design is fairly straightforward: uprobes probe
  points register themselves at (inode:offset) addresses of
  libraries/binaries, after which all existing (or new) vmas that map
  that address will have a software breakpoint injected at that address.
  vmas are COW-ed to preserve original content.  The probe points are
  kept in an rbtree.

  If user-space executes the probed inode:offset instruction address
  then an event is generated which can be recovered from the regular
  perf event channels and mmap-ed ring-buffer.

  Multiple probes at the same address are supported, they create a
  dynamic callback list of event consumers.

  The basic model is further complicated by the XOL speedup: the
  original instruction that is probed is copied (in an architecture
  specific fashion) and executed out of line when the probe triggers.
  The XOL area is a single vma per process, with a fixed number of
  entries (which limits probe execution parallelism).

  The API: uprobes are installed/removed via
  /sys/kernel/debug/tracing/uprobe_events, the API is integrated to
  align with the kprobes interface as much as possible, but is separate
  to it.

  Injecting a probe point is privileged operation, which can be relaxed
  by setting perf_paranoid to -1.

  You can use multiple probes as well and mix them with kprobes and
  regular PMU events or tracepoints, when instrumenting a task."

Fix up trivial conflicts in mm/memory.c due to previous cleanup of
unmap_single_vma().

* 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf probe: Detect probe target when m/x options are absent
  perf probe: Provide perf interface for uprobes
  tracing: Fix kconfig warning due to a typo
  tracing: Provide trace events interface for uprobes
  tracing: Extract out common code for kprobes/uprobes trace events
  tracing: Modify is_delete, is_return from int to bool
  uprobes/core: Decrement uprobe count before the pages are unmapped
  uprobes/core: Make background page replacement logic account for rss_stat counters
  uprobes/core: Optimize probe hits with the help of a counter
  uprobes/core: Allocate XOL slots for uprobes use
  uprobes/core: Handle breakpoint and singlestep exceptions
  uprobes/core: Rename bkpt to swbp
  uprobes/core: Make order of function parameters consistent across functions
  uprobes/core: Make macro names consistent
  uprobes: Update copyright notices
  uprobes/core: Move insn to arch specific structure
  uprobes/core: Remove uprobe_opcode_sz
  uprobes/core: Make instruction tables volatile
  uprobes: Move to kernel/events/
  uprobes/core: Clean up, refactor and improve the code
  ...
2012-05-24 11:39:34 -07:00
..
backing-dev.c backing-dev: fix wakeup timer races with bdi_unregister() 2012-02-01 16:52:49 +08:00
bootmem.c bootmem/sparsemem: remove limit constraint in alloc_bootmem_section 2012-03-21 17:54:58 -07:00
bounce.c mm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:27 +08:00
cleancache.c mm: cleancache: Use __read_mostly as appropiate. 2012-01-23 16:08:09 -05:00
compaction.c mm: compaction: make compact_control order signed 2012-03-21 17:54:56 -07:00
debug-pagealloc.c mm, x86: Remove debug_pagealloc_enabled 2011-12-06 09:24:07 +01:00
dmapool.c mm: fix implicit stat.h usage in dmapool.c 2011-10-31 09:20:12 -04:00
fadvise.c fadvise: only initiate writeback for specified range with FADV_DONTNEED 2012-01-10 16:30:43 -08:00
failslab.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
filemap_xip.c mm/filemap_xip.c: fix race condition in xip_file_fault() 2012-02-03 16:16:41 -08:00
filemap.c radix-tree: use iterators in find_get_pages* functions 2012-03-28 17:14:37 -07:00
fremap.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
highmem.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
huge_memory.c thp: optimize away unnecessary page table locking 2012-03-21 17:54:57 -07:00
hugetlb.c hugetlb: prevent BUG_ON in hugetlb_fault() -> hugetlb_cow() 2012-05-10 15:06:44 -07:00
hwpoison-inject.c HWPOISON: Clean up memory_failure() vs. __memory_failure() 2012-01-03 12:06:32 -08:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm: thp: tail page refcounting fix 2011-11-02 16:06:57 -07:00
Kconfig Merge branch 'master' into x86/memblock 2011-11-28 09:46:22 -08:00
Kconfig.debug mm: more intensive memory corruption debugging 2012-01-10 16:30:42 -08:00
kmemcheck.c
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c kmemleak: Disable early logging when kmemleak is off by default 2012-01-20 16:57:05 +00:00
ksm.c ksm: cleanup: introduce find_mergeable_vma() 2012-03-21 17:54:59 -07:00
maccess.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
madvise.c coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP 2012-03-23 16:58:42 -07:00
Makefile Cross Memory Attach 2011-10-31 17:30:44 -07:00
memblock.c memblock: memblock should be able to handle zero length operations 2012-04-20 11:18:46 -07:00
memcontrol.c Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-05-22 17:40:19 -07:00
memory_hotplug.c mm: compaction: introduce sync-light migration for use by compaction 2012-01-12 20:13:09 -08:00
memory-failure.c Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-22 09:42:04 -07:00
memory.c Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-24 11:39:34 -07:00
mempolicy.c mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages 2012-05-23 17:57:33 -07:00
mempool.c mempool: fix first round failure behavior 2012-01-10 16:30:45 -08:00
migrate.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
mincore.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-03-21 17:54:54 -07:00
mlock.c vm: avoid using find_vma_prev() unnecessarily 2012-03-06 18:23:36 -08:00
mm_init.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mmap.c Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-24 11:39:34 -07:00
mmu_context.c mm, counters: remove task argument to sync_mm_rss() and __sync_task_rss_stat() 2012-03-21 17:54:59 -07:00
mmu_notifier.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mmzone.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
mprotect.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-03-22 09:04:48 -07:00
mremap.c mm: collapse security_vm_enough_memory() variants into a single function 2012-02-14 10:45:39 +11:00
msync.c
nobootmem.c mm: nobootmem: fix sign extend problem in __free_pages_memory() 2012-05-10 15:06:44 -07:00
nommu.c kill mm argument of vm_munmap() 2012-04-21 01:58:20 -04:00
oom_kill.c userns: Store uid and gid values in struct cred with kuid_t and kgid_t types 2012-05-03 03:28:38 -07:00
page_alloc.c Driver core pull for 3.5-rc1 2012-05-22 16:02:13 -07:00
page_cgroup.c page_cgroup: fix horrid swap accounting regression 2012-03-06 08:18:23 -08:00
page_io.c block: kill off REQ_UNPLUG 2011-03-10 08:52:27 +01:00
page_isolation.c mm: page_isolation: codeclean fix comment and rm unneeded val init 2010-10-26 16:52:11 -07:00
page-writeback.c Ext4 commits for 3.3 merge window; mostly cleanups and bug fixes 2012-03-28 10:02:55 -07:00
pagewalk.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-03-21 17:54:54 -07:00
percpu-km.c percpu: clear memory allocated with the km allocator 2010-10-02 10:28:42 +03:00
percpu-vm.c percpu: use bitmap_clear 2012-01-20 09:23:16 -08:00
percpu.c kmemleak: Fix the kmemleak tracking of the percpu areas with !SMP 2012-05-09 10:13:29 -07:00
pgtable-generic.c thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE 2012-03-21 17:55:02 -07:00
prio_tree.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
process_vm_access.c Fix race in process_vm_rw_core 2012-02-02 12:55:17 -08:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rmap.c memcg: use new logic for page stat accounting 2012-03-21 17:55:01 -07:00
shmem.c userns: Convert tmpfs to use kuid and kgid where appropriate 2012-05-15 14:59:29 -07:00
slab.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2012-03-28 15:04:26 -07:00
slob.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
slub.c slub: missing test for partial pages flush work in flush_all() 2012-05-17 18:00:51 -07:00
sparse-vmemmap.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
sparse.c bootmem/sparsemem: remove limit constraint in alloc_bootmem_section 2012-03-21 17:54:58 -07:00
swap_state.c mm: fix s390 BUG by __set_page_dirty_no_writeback on swap 2012-04-23 18:19:22 -07:00
swap.c mm: drain percpu lru add/rotate page-vectors on cpu hot-unplug 2012-03-21 17:54:58 -07:00
swapfile.c swapon: check validity of swap_flags 2012-03-28 17:14:35 -07:00
thrash.c mm/thrash.c: quiet sparse noise 2011-10-31 17:30:50 -07:00
truncate.c mm for fs: add truncate_pagecache_range() 2012-03-28 17:14:35 -07:00
util.c procfs: mark thread stack correctly in proc/<pid>/maps 2012-03-21 17:54:58 -07:00
vmalloc.c mm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:27 +08:00
vmscan.c mm: fix up the vmscan stat in vmstat 2012-04-25 21:26:33 -07:00
vmstat.c mm: fix up the vmscan stat in vmstat 2012-04-25 21:26:33 -07:00