linux/Documentation
Daniel Borkmann 74451e66d5 bpf: make jited programs visible in traces
Long standing issue with JITed programs is that stack traces from
function tracing check whether a given address is kernel code
through {__,}kernel_text_address(), which checks for code in core
kernel, modules and dynamically allocated ftrace trampolines. But
what is still missing is BPF JITed programs (interpreted programs
are not an issue as __bpf_prog_run() will be attributed to them),
thus when a stack trace is triggered, the code walking the stack
won't see any of the JITed ones. The same for address correlation
done from user space via reading /proc/kallsyms. This is read by
tools like perf, but the latter is also useful for permanent live
tracing with eBPF itself in combination with stack maps when other
eBPF types are part of the callchain. See offwaketime example on
dumping stack from a map.

This work tries to tackle that issue by making the addresses and
symbols known to the kernel. The lookup from *kernel_text_address()
is implemented through a latched RB tree that can be read under
RCU in fast-path that is also shared for symbol/size/offset lookup
for a specific given address in kallsyms. The slow-path iteration
through all symbols in the seq file done via RCU list, which holds
a tiny fraction of all exported ksyms, usually below 0.1 percent.
Function symbols are exported as bpf_prog_<tag>, in order to aide
debugging and attribution. This facility is currently enabled for
root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
is active in any mode. The rationale behind this is that still a lot
of systems ship with world read permissions on kallsyms thus addresses
should not get suddenly exposed for them. If that situation gets
much better in future, we always have the option to change the
default on this. Likewise, unprivileged programs are not allowed
to add entries there either, but that is less of a concern as most
such programs types relevant in this context are for root-only anyway.
If enabled, call graphs and stack traces will then show a correct
attribution; one example is illustrated below, where the trace is
now visible in tooling such as perf script --kallsyms=/proc/kallsyms
and friends.

Before:

  7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
         f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)

After:

  7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
  [...]
  7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
  7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
         f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17 13:40:05 -05:00
..
ABI Revert "driver core: Add deferred_probe attribute to devices in sysfs" 2017-01-14 14:09:03 +01:00
accounting tools: move accounting tool from Documentation 2016-09-23 13:07:15 -06:00
acpi ACPI material for v4.10-rc1 2016-12-13 11:06:21 -08:00
admin-guide Merge branch 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2017-01-06 10:53:21 -08:00
aoe
arm ARM: SoC platform updates for v4.10 2016-12-15 15:39:02 -08:00
arm64 arm64 updates for 4.9: 2016-10-03 08:58:35 -07:00
auxdisplay samples: move auxdisplay example code from Documentation 2016-09-23 11:52:32 -06:00
backlight
blackfin samples: move blackfin gptimers-example from Documentation 2016-10-10 07:12:02 -06:00
block block: fix up io_poll documentation 2017-01-03 16:47:13 -07:00
blockdev docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
bus-devices
cdrom
cgroup-v1 docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
cma
connector
console
core-api core-api: remove an unexpected unident 2016-12-01 10:46:01 -07:00
cpu-freq Documentation: intel_pstate: Document HWP energy/performance hints 2016-12-08 01:43:05 +01:00
cpuidle
cris
crypto This pull contains one set of changes: a conversion of the crypto DocBook 2016-12-17 16:00:34 -08:00
dev-tools Documentation/sparse: drop __CHECK_ENDIAN__ 2016-12-16 00:13:41 +02:00
device-mapper Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-12-14 11:12:25 -08:00
devicetree net: phy: Add LED mode driver for Microsemi PHYs. 2017-02-08 13:29:04 -05:00
dmaengine dmaengine: Documentation: Fix typo in pxa_dma.txt 2016-11-14 08:14:24 +05:30
doc-guide docs-rst: parse-headers.pl: cleanup the documentation 2016-11-30 17:08:09 -07:00
DocBook docs: Fix build failure 2016-12-27 13:05:36 -07:00
driver-api For 4.11, we seem to have more than in the past few releases: 2017-01-14 12:02:15 -05:00
driver-model devres: add devm_alloc_percpu() 2016-11-15 22:34:25 -05:00
early-userspace
EDID
extcon
fault-injection
fb
features 2nd round of ARC udpates for 4.10rc1 2016-12-23 10:22:47 -08:00
filesystems Documentation/filesystems/proc.txt: add VmPin 2017-01-24 16:26:14 -08:00
firmware_class
fmc
fpga fpga: Clarify how write_init works streaming modes 2016-11-29 15:51:49 -06:00
frv docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
gpio Bulk GPIO changes for the v4.10 kernel cycle: 2016-12-13 07:54:57 -08:00
gpu Main pull request for drm for 4.10 kernel 2016-12-13 09:35:09 -08:00
hid
hwmon hwmon updates for v4.10 2016-12-13 15:43:56 -08:00
i2c Merge branch 'i2c/for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2016-12-15 12:56:35 -08:00
ia64 selftests: move ia64 tests from Documentation/ia64 2016-09-20 09:58:12 -06:00
ide
iio iio: Documentation: Correct the path used to create triggers. 2016-10-01 00:49:58 -06:00
infiniband IB/hfi1: Document new sysfs entries for hfi1 driver 2016-10-02 08:42:19 -04:00
input Input: ALPS - add V8 protocol documentation 2016-10-04 11:47:02 -07:00
ioctl doc: ioctl: Add some clarifications to botching-up-ioctls 2016-09-06 06:00:22 -06:00
isdn docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
kbuild Kconfig: Introduce the "imply" keyword 2016-11-16 09:26:33 +01:00
kdump Documentation: kdump: Add description of enable multi-cpus support 2016-09-20 18:02:54 -06:00
laptops platform/x86: thinkpad_acpi: Add support for X1 Yoga (2016) Tablet Mode 2016-12-13 09:29:06 -08:00
leds leds/leds-lp5523.txt: make documentation match reality 2016-11-22 12:07:02 +01:00
livepatch Documentation/livepatch: Fix stale link to gmame 2016-12-09 13:41:46 +01:00
locking locking/lglock: Remove lglock implementation 2016-09-22 15:25:56 +02:00
m68k docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
media [media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB 2017-02-13 14:33:56 -02:00
memory-devices
metag
mic samples: move mic/mpssd example code from Documentation 2016-09-20 12:38:48 -06:00
mips
misc-devices samples: move misc-devices/mei example code from Documentation 2016-09-23 11:51:43 -06:00
mmc
mn10300
mtd
namespaces
netlabel
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-02-03 16:58:20 -05:00
nfc
nios2
nvdimm
nvmem
parisc
PCI PCI changes for the v4.9 merge window: 2016-10-07 11:46:37 -07:00
pcmcia tools: move pcmcia crc32hash tool from Documentation 2016-09-23 13:07:27 -06:00
perf perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver 2016-09-15 11:20:55 -07:00
phy
platform
power Merge branches 'pm-sleep' and 'pm-cpufreq' 2017-01-27 00:08:59 +01:00
powerpc powerpc updates for 4.9 2016-10-07 20:19:31 -07:00
pps
prctl selftests: move prctl tests from Documentation/prctl 2016-09-20 09:09:09 -06:00
process Doc: Correct typo, "Introdution" => "Introduction" 2016-12-01 10:44:08 -07:00
pti
ptp selftests: move ptp tests from Documentation/ptp 2016-09-20 09:54:38 -06:00
rapidio rapidio/documentation/mport_cdev: add missing parameter description 2016-09-01 17:52:02 -07:00
RCU Documentation/RCU: Fix minor typo 2016-11-14 10:39:48 -08:00
s390
scheduler docs/completion.txt: drop dangling reference to completions-design.txt 2016-11-16 16:27:50 -07:00
scsi Merge branch 'misc' into for-linus 2016-12-22 12:32:33 -08:00
security Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-12-14 13:57:44 -08:00
serial
sh
sound Merge remote-tracking branch 'sound/topic/restize-docs' into sound 2016-11-18 16:19:28 -07:00
sphinx docs: sphinx-extensions: make rstFlatTable work with docutils 0.13 2016-12-18 13:30:29 -07:00
sphinx-static This is the documentation update pull for the 4.9 merge window. 2016-10-04 13:54:07 -07:00
spi Doc: update 00-INDEX files to reflect the runnable code move 2016-10-10 07:12:09 -06:00
sysctl bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
target
thermal thermal: Add support for hardware-tracked trip points 2016-09-27 14:02:16 +08:00
timers Doc: update 00-INDEX files to reflect the runnable code move 2016-10-10 07:12:09 -06:00
trace This release has a few updates: 2016-12-15 13:49:34 -08:00
translations Documentation/sparse: drop __CHECK_ENDIAN__ 2016-12-16 00:13:41 +02:00
usb
virtual KVM: hyperv: fix locking of struct kvm_hv fields 2016-12-16 17:53:38 +01:00
vm mm: add documentation for page fragment APIs 2017-01-10 18:31:55 -08:00
w1
watchdog docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
wimax
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-23 16:54:46 -08:00
xtensa
.gitignore
00-INDEX edac: adjust docs location at MAINTAINERS and 00-INDEX 2016-12-15 08:57:16 -02:00
bcache.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
cgroup-v2.txt
Changes docs: add back 'Documentation/Changes' file (as symlink) 2016-12-14 16:30:12 -08:00
circular-buffers.txt Documentation: circular-buffers: use READ_ONCE() 2016-11-16 16:17:45 -07:00
clk.txt
CodingStyle doc: re-add CodingStyle and SubmittingPatches 2016-10-24 08:12:35 -02:00
conf.py docs-rst: doc-guide: split the kernel-documentation.rst contents 2016-11-19 10:22:04 -07:00
cpu-hotplug.txt Documentation: cpu-hotplug: Fix typos 2016-10-25 17:07:52 -06:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
digsig.txt
DMA-API-HOWTO.txt Documentation: DMA-API-HOWTO: Fix a typo 2016-09-20 17:58:46 -06:00
DMA-API.txt dma-mapping: add dma_{map,unmap}_resource 2016-09-26 22:16:41 +05:30
DMA-attributes.txt dma-mapping: introduce the DMA_ATTR_NO_WARN attribute 2016-10-11 15:06:32 -07:00
dma-buf-sharing.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff Remove last traces of ikconfig.h 2016-12-14 10:54:28 +01:00
efi-stub.txt
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst crypto: doc - convert crypto API documentation to Sphinx 2016-12-13 16:37:54 -07:00
intel_txt.txt
Intel-IOMMU.txt
io_ordering.txt
io-mapping.txt
iostats.txt
IPMI.txt ipmi: Update documentation 2016-11-07 12:16:06 -06:00
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-doc-nano-HOWTO.txt docs-rst: doc-guide: split the kernel-documentation.rst contents 2016-11-19 10:22:04 -07:00
kernel-per-CPU-kthreads.txt docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
kobject.txt
kprobes.txt
kref.txt
kselftest.txt Doc: update kselftest.txt with details on how to run tests after install 2016-11-07 18:04:18 -07:00
ldm.txt
lockup-watchdogs.txt docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
logo.gif
logo.txt
lzo.txt
mailbox.txt
Makefile samples: move blackfin gptimers-example from Documentation 2016-10-10 07:12:02 -06:00
Makefile.sphinx docs-rst: fix media cleandocs target 2016-11-30 17:08:03 -07:00
md-cluster.txt
memory-barriers.txt
memory-hotplug.txt docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
men-chameleon-bus.txt
nommu-mmap.txt
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt
pnp.txt
preempt-locking.txt
printk-formats.txt
pwm.txt
rbtree.txt
remoteproc.txt remoteproc: Split driver and consumer dereferencing 2016-10-02 22:50:21 -07:00
rfkill.txt docs: fix locations of several documents that got moved 2016-10-24 08:12:35 -02:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt siphash: implement HalfSipHash1-3 for hash tables 2017-01-09 13:58:57 -05:00
SM501.txt
smsc_ece1099.txt
static-keys.txt jump_labels: Allow array initialisers 2016-09-07 09:41:11 +01:00
SubmittingPatches doc: re-add CodingStyle and SubmittingPatches 2016-10-24 08:12:35 -02:00
svga.txt
sync_file.txt dma-buf: Rename struct fence to dma_fence 2016-10-25 14:40:39 +02:00
this_cpu_ops.txt
unaligned-memory-access.txt Documentation/unaligned-memory-access.txt: fix incorrect comparison operator 2016-12-27 13:08:42 -07:00
unshare.txt
vfio-mediated-device.txt vfio-mdev: Make mdev_parent private 2016-12-30 08:13:41 -07:00
vfio.txt
video-output.txt
xillybus.txt
xz.txt
zorro.txt