linux/Documentation
Mel Gorman cb2517653f sched/debug: Make schedstats a runtime tunable that is disabled by default
schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, even though it can be
disabled at build time, it is often enabled as the information is useful.

This patch adds a kernel command-line and sysctl tunable to enable or
disable schedstats on demand (when it's built in). It is disabled
by default as someone who knows they need it can also learn to enable
it when necessary.

The benefits are dependent on how scheduler-intensive the workload is.
If it is then the patch reduces the number of cycles spent calculating
the stats with a small benefit from reducing the cache footprint of the
scheduler.

These measurements were taken from a 48-core 2-socket
machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
single socket machine 8-core machine with Intel i7-3770 processors.

netperf-tcp
                           4.5.0-rc1             4.5.0-rc1
                             vanilla          nostats-v3r1
Hmean    64         560.45 (  0.00%)      575.98 (  2.77%)
Hmean    128        766.66 (  0.00%)      795.79 (  3.80%)
Hmean    256        950.51 (  0.00%)      981.50 (  3.26%)
Hmean    1024      1433.25 (  0.00%)     1466.51 (  2.32%)
Hmean    2048      2810.54 (  0.00%)     2879.75 (  2.46%)
Hmean    3312      4618.18 (  0.00%)     4682.09 (  1.38%)
Hmean    4096      5306.42 (  0.00%)     5346.39 (  0.75%)
Hmean    8192     10581.44 (  0.00%)    10698.15 (  1.10%)
Hmean    16384    18857.70 (  0.00%)    18937.61 (  0.42%)

Small gains here, UDP_STREAM showed nothing intresting and neither did
the TCP_RR tests. The gains on the 8-core machine were very similar.

tbench4
                                 4.5.0-rc1             4.5.0-rc1
                                   vanilla          nostats-v3r1
Hmean    mb/sec-1         500.85 (  0.00%)      522.43 (  4.31%)
Hmean    mb/sec-2         984.66 (  0.00%)     1018.19 (  3.41%)
Hmean    mb/sec-4        1827.91 (  0.00%)     1847.78 (  1.09%)
Hmean    mb/sec-8        3561.36 (  0.00%)     3611.28 (  1.40%)
Hmean    mb/sec-16       5824.52 (  0.00%)     5929.03 (  1.79%)
Hmean    mb/sec-32      10943.10 (  0.00%)    10802.83 ( -1.28%)
Hmean    mb/sec-64      15950.81 (  0.00%)    16211.31 (  1.63%)
Hmean    mb/sec-128     15302.17 (  0.00%)    15445.11 (  0.93%)
Hmean    mb/sec-256     14866.18 (  0.00%)    15088.73 (  1.50%)
Hmean    mb/sec-512     15223.31 (  0.00%)    15373.69 (  0.99%)
Hmean    mb/sec-1024    14574.25 (  0.00%)    14598.02 (  0.16%)
Hmean    mb/sec-2048    13569.02 (  0.00%)    13733.86 (  1.21%)
Hmean    mb/sec-3072    12865.98 (  0.00%)    13209.23 (  2.67%)

Small gains of 2-4% at low thread counts and otherwise flat.  The
gains on the 8-core machine were slightly different

tbench4 on 8-core i7-3770 single socket machine
Hmean    mb/sec-1        442.59 (  0.00%)      448.73 (  1.39%)
Hmean    mb/sec-2        796.68 (  0.00%)      794.39 ( -0.29%)
Hmean    mb/sec-4       1322.52 (  0.00%)     1343.66 (  1.60%)
Hmean    mb/sec-8       2611.65 (  0.00%)     2694.86 (  3.19%)
Hmean    mb/sec-16      2537.07 (  0.00%)     2609.34 (  2.85%)
Hmean    mb/sec-32      2506.02 (  0.00%)     2578.18 (  2.88%)
Hmean    mb/sec-64      2511.06 (  0.00%)     2569.16 (  2.31%)
Hmean    mb/sec-128     2313.38 (  0.00%)     2395.50 (  3.55%)
Hmean    mb/sec-256     2110.04 (  0.00%)     2177.45 (  3.19%)
Hmean    mb/sec-512     2072.51 (  0.00%)     2053.97 ( -0.89%)

In constract, this shows a relatively steady 2-3% gain at higher thread
counts. Due to the nature of the patch and the type of workload, it's
not a surprise that the result will depend on the CPU used.

hackbench-pipes
                         4.5.0-rc1             4.5.0-rc1
                           vanilla          nostats-v3r1
Amean    1        0.0637 (  0.00%)      0.0660 ( -3.59%)
Amean    4        0.1229 (  0.00%)      0.1181 (  3.84%)
Amean    7        0.1921 (  0.00%)      0.1911 (  0.52%)
Amean    12       0.3117 (  0.00%)      0.2923 (  6.23%)
Amean    21       0.4050 (  0.00%)      0.3899 (  3.74%)
Amean    30       0.4586 (  0.00%)      0.4433 (  3.33%)
Amean    48       0.5910 (  0.00%)      0.5694 (  3.65%)
Amean    79       0.8663 (  0.00%)      0.8626 (  0.43%)
Amean    110      1.1543 (  0.00%)      1.1517 (  0.22%)
Amean    141      1.4457 (  0.00%)      1.4290 (  1.16%)
Amean    172      1.7090 (  0.00%)      1.6924 (  0.97%)
Amean    192      1.9126 (  0.00%)      1.9089 (  0.19%)

Some small gains and losses and while the variance data is not included,
it's close to the noise. The UMA machine did not show anything particularly
different

pipetest
                             4.5.0-rc1             4.5.0-rc1
                               vanilla          nostats-v2r2
Min         Time        4.13 (  0.00%)        3.99 (  3.39%)
1st-qrtle   Time        4.38 (  0.00%)        4.27 (  2.51%)
2nd-qrtle   Time        4.46 (  0.00%)        4.39 (  1.57%)
3rd-qrtle   Time        4.56 (  0.00%)        4.51 (  1.10%)
Max-90%     Time        4.67 (  0.00%)        4.60 (  1.50%)
Max-93%     Time        4.71 (  0.00%)        4.65 (  1.27%)
Max-95%     Time        4.74 (  0.00%)        4.71 (  0.63%)
Max-99%     Time        4.88 (  0.00%)        4.79 (  1.84%)
Max         Time        4.93 (  0.00%)        4.83 (  2.03%)
Mean        Time        4.48 (  0.00%)        4.39 (  1.91%)
Best99%Mean Time        4.47 (  0.00%)        4.39 (  1.91%)
Best95%Mean Time        4.46 (  0.00%)        4.38 (  1.93%)
Best90%Mean Time        4.45 (  0.00%)        4.36 (  1.98%)
Best50%Mean Time        4.36 (  0.00%)        4.25 (  2.49%)
Best10%Mean Time        4.23 (  0.00%)        4.10 (  3.13%)
Best5%Mean  Time        4.19 (  0.00%)        4.06 (  3.20%)
Best1%Mean  Time        4.13 (  0.00%)        4.00 (  3.39%)

Small improvement and similar gains were seen on the UMA machine.

The gain is small but it stands to reason that doing less work in the
scheduler is a good thing. The downside is that the lack of schedstats and
tracepoints may be surprising to experts doing performance analysis until
they find the existence of the schedstats= parameter or schedstats sysctl.
It will be automatically activated for latencytop and sleep profiling to
alleviate the problem. For tracepoints, there is a simple warning as it's
not safe to activate schedstats in the context when it's known the tracepoint
may be wanted but is unavailable.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09 11:54:23 +01:00
..
ABI Initial roundup of 4.5 merge window patches 2016-01-23 18:45:06 -08:00
accounting Documentation-getdelays: Apply a recommendation from "checkpatch.pl" in main() 2015-12-24 07:22:32 -07:00
acpi mfd: core: redo ACPI matching of the children devices 2015-10-26 15:25:53 +01:00
aoe
arm ARM: SoC multiplatform code changes for v4.5 2016-01-20 18:03:56 -08:00
arm64 arm64: Documentation: add list of software workarounds for errata 2015-12-11 17:33:21 +00:00
auxdisplay
backlight
blackfin
block A relatively boring cycle in the docs tree. There's a few kernel-doc 2016-01-17 11:55:07 -08:00
blockdev zram: update documentation 2015-09-24 15:39:42 -06:00
bus-devices
cdrom
cgroup-v1 cgroup: rename cgroup documentations 2016-01-11 23:14:51 -05:00
cma
connector
console
cpu-freq Documentation: cpufreq: intel_pstate: enhance documentation 2016-01-05 13:47:37 +01:00
cpuidle
cris
crypto KEYS: Merge the type-specific data with the payload data 2015-10-21 15:18:36 +01:00
development-process
device-mapper dm verity: add ignore_zero_blocks feature 2015-12-10 10:39:03 -05:00
devicetree Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-01 15:56:08 -08:00
dmaengine Merge branch 'topic/async' into for-linus 2016-01-06 15:17:47 +05:30
DocBook Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2016-01-17 13:40:25 -08:00
driver-model driver-core: platform: Provide helpers for multi-driver modules 2015-10-05 05:02:40 +01:00
dvb [media] use https://linuxtv.org for LinuxTV URLs 2015-12-04 10:38:59 -02:00
early-userspace
EDID
extcon
fault-injection net: Add support for CHANGEUPPER notifier error injection 2015-12-03 11:49:23 -05:00
fb Documentation/fb: add documentation for sm712fb 2015-08-07 15:05:01 -07:00
features dma-mapping: always provide the dma_map_ops based implementation 2016-01-20 17:09:18 -08:00
filesystems mm: polish virtual memory accounting 2016-02-03 08:28:43 -08:00
firmware_class
fmc
fpga usage documentation for FPGA manager core 2015-10-07 18:07:20 +01:00
frv
gpio Doc: gpio: Fix typos in Documentation/gpio 2015-11-20 16:51:16 -07:00
hid
hwmon hwmon: (pmbus) Add client driver for LTC3815 2015-12-18 08:20:59 -08:00
i2c i2c: i801: add Intel Lewisburg device IDs 2015-11-20 16:22:21 +01:00
ia64
ide
iio iio: Documentation: Add IIO configfs documentation 2015-12-03 18:19:28 +00:00
infiniband IB: remove in-kernel support for memory windows 2015-12-23 14:29:04 -05:00
input Input: add userio module 2015-10-27 18:55:31 -07:00
ioctl Doc: ioctl: Fix typos in Documentation/ioctl 2015-11-20 16:52:50 -07:00
isdn
ja_JP Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
kbuild kbuild: document recursive dependency limitation / resolution 2015-10-08 15:36:16 +02:00
kdump
ko_KR Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
laptops Move freefall program from Documentation/ to tools/ 2015-06-08 16:42:07 -06:00
leds Documentation: leds: Add description of brightness setting API 2016-01-04 09:57:31 +01:00
locking Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 16:10:43 -08:00
m68k
memory-devices
metag
mic misc: mic: Update MIC host daemon with COSM changes 2015-10-04 12:54:54 +01:00
mips
misc-devices Doc:misc-devices: Fix typo in Documentation/misc-devices 2015-09-18 10:04:24 -06:00
mmc mmc: core: Remove MMC_CLKGATE 2015-10-26 16:00:09 +01:00
mn10300
mtd Documentation: mtd: improve nand_ecc.txt for readability and correctness 2015-11-17 17:05:14 -08:00
namespaces
netlabel
networking net: change tcp_syn_retries documentation 2016-01-20 18:55:08 -08:00
nfc NFC: Fix typo in nfc-hci.txt 2015-06-08 23:15:45 +02:00
nios2
nvdimm libnvdimm: documentation clarifications 2015-11-12 09:55:23 -08:00
nvmem Documentation: nvmem: add nvmem api level and how-to doc 2015-08-05 13:43:45 -07:00
parisc
PCI
pcmcia pcmcia: Fix typo in locking documentation 2015-08-07 14:34:58 +02:00
phy
platform
power Merge branches 'pm-pci' and 'pm-core' 2016-01-12 01:10:52 +01:00
powerpc SCSI misc on 20150901 2015-09-02 12:22:54 -07:00
pps Doc: pps: Fix file name in pps.txt 2015-07-14 12:35:42 -06:00
prctl Documentation/prctl: don't build tsc tests when cross compiling 2015-06-22 16:05:04 -06:00
pti
ptp testptp: Silence compiler warnings on ppc64 2015-09-29 21:16:56 -07:00
rapidio
RCU documentation: Update RCU requirements based on expedited changes 2015-12-05 12:34:32 -08:00
s390 s390/zcore: remove /sys/kernel/debug/zcore/mem 2015-11-27 09:24:12 +01:00
scheduler sched/dl/Documentation: Split Section 3 2015-05-19 08:39:21 +02:00
scsi st: allow debug output to be enabled or disabled via sysfs 2015-11-09 17:17:27 -08:00
security keys, trusted: seal with a TPM2 authorization policy 2015-12-20 15:27:13 +02:00
serial Documentation: improve line discipline method descriptions 2015-10-05 04:53:26 +01:00
sh
sound ASoC: img: Add documentation for SPDIF in controls 2015-11-16 10:06:58 +00:00
spi spi: tools: move spidev_test metadata 2015-11-30 12:14:12 +00:00
sysctl sched/debug: Make schedstats a runtime tunable that is disabled by default 2016-02-09 11:54:23 +01:00
target target: use per-attribute show and store methods 2015-10-13 22:17:49 -07:00
thermal thermal: add description for integral_cutoff unit 2016-01-14 13:29:08 -07:00
timers
tpm
trace x86, tracing, perf: Add trace point for MSR accesses 2015-12-06 12:56:10 +01:00
usb The chipidea changes for v4.5-rc1 2015-12-26 16:59:14 -08:00
vDSO Documentation/vDSO: don't build tests when cross compiling 2015-06-22 16:04:57 -06:00
video4linux [media] media framework: rename pads init function to media_entity_pads_init() 2016-01-11 12:19:03 -02:00
virtual KVM doc: Fix KVM_SMI chapter number 2016-01-26 16:29:59 +01:00
vm Merge branch 'akpm' (patches from Andrew) 2016-01-17 12:58:52 -08:00
w1 w1: masters: omap_hdq: add support for 1-wire mode 2015-10-05 04:47:09 +01:00
watchdog watchdog: Drop pointer to watchdog device from struct watchdog_device 2016-01-11 21:53:59 +01:00
wimax
x86 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 10:07:40 -07:00
xtensa
zh_CN [media] media framework: rename pads init function to media_entity_pads_init() 2016-01-11 12:19:03 -02:00
00-INDEX
adding-syscalls.txt Documentation: describe how to add a system call 2015-08-13 17:54:06 -06:00
applying-patches.txt
assoc_array.txt
atomic_ops.txt locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h 2015-09-13 10:35:46 +02:00
bad_memory.txt
basic_profiling.txt
bcache.txt
binfmt_misc.txt
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
BUG-HUNTING
bus-virt-phys-mapping.txt
cachetlb.txt
cgroup-v2.txt Documentation: cgroup-v2: add memory.stat::sock description 2016-02-03 08:28:43 -08:00
Changes There is a nice new document from Neil on how pathname lookups work and 2015-11-05 15:59:24 -08:00
circular-buffers.txt
clk.txt clk: change clk_ops' ->determine_rate() prototype 2015-07-27 18:12:01 -07:00
coccinelle.txt
CodeOfConflict
CodingStyle Documentation: fix typo in CodingStyle 2016-01-11 18:18:16 -07:00
cpu-hotplug.txt Documentation: cpu-hotplug: Fix sysfs mount instructions 2015-12-10 11:35:30 -07:00
cpu-load.txt
cputopology.txt Documentation: Update cputopology.txt 2015-05-27 15:22:15 +02:00
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt Doc: Change wikipedia's URL from http to https 2015-06-22 10:14:05 -06:00
dell_rbu.txt
devices.txt
digsig.txt
DMA-API-HOWTO.txt dma-mapping: always provide the dma_map_ops based implementation 2016-01-20 17:09:18 -08:00
DMA-API.txt DMA-API: fix confusing sentence in Documentation/DMA-API.txt 2016-01-11 18:29:00 -07:00
DMA-attributes.txt
dma-buf-sharing.txt
DMA-ISA-LPC.txt
dontdiff Documentation: dontdiff: remove media from dontdiff 2015-11-11 10:08:07 -07:00
dynamic-debug-howto.txt
edac.txt EDAC: Remove references to bluesmoke.sourceforge.net 2015-11-26 14:46:06 +01:00
efi-stub.txt
eisa.txt
email-clients.txt A few more documentation patches that wandered in and have no reason to 2015-11-13 09:19:05 -08:00
flexible-arrays.txt
futex-requeue-pi.txt
gcov.txt
gdb-kernel-debugging.txt
highuid.txt
HOWTO Documentation: HOWTO: update code cross reference link 2015-12-10 11:19:35 -07:00
hsi.txt
hw_random.txt hwrng: doc - Fix device node name reference /dev/hw_random => /dev/hwrng 2015-09-21 22:00:41 +08:00
hwspinlock.txt
init.txt
initrd.txt
intel_txt.txt
Intel-IOMMU.txt iommu/vt-d: Fix link to Intel IOMMU Specification 2016-01-29 12:32:12 +01:00
io_ordering.txt
io-mapping.txt
iostats.txt
IPMI.txt ipmi watchdog : add panic_wdt_timeout parameter 2015-11-16 06:28:43 -06:00
IRQ-affinity.txt
IRQ-domain.txt irqdomain: Documentation updates 2015-10-13 19:01:25 +02:00
IRQ.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kasan.txt mm, slub, kasan: enable user tracking by default with KASAN=y 2015-11-05 19:34:48 -08:00
kernel-doc-nano-HOWTO.txt Documenation: Update location of docproc.c 2015-07-14 12:36:39 -06:00
kernel-docs.txt Documentation: translations: update linux cross reference link 2016-01-11 18:26:58 -07:00
kernel-parameters.txt sched/debug: Make schedstats a runtime tunable that is disabled by default 2016-02-09 11:54:23 +01:00
kernel-per-CPU-kthreads.txt irq_poll: make blk-iopoll available outside the block layer 2015-12-11 11:52:24 -08:00
kmemcheck.txt
kmemleak.txt Doc: Change wikipedia's URL from http to https 2015-06-22 10:14:05 -06:00
kobject.txt
kprobes.txt
kref.txt
kselftest.txt Documentation: Update kselftest.txt 2015-09-24 15:51:53 -06:00
ldm.txt
local_ops.txt
lockup-watchdogs.txt kernel/watchdog.c: add sysctl knob hardlockup_panic 2015-11-05 19:34:48 -08:00
logo.gif
logo.txt
lzo.txt
magic-number.txt
mailbox.txt Documentation: minor typo fix in mailbox.txt 2015-08-13 18:03:18 -06:00
Makefile spi: Move spi code from Documentation to tools 2015-11-23 14:54:01 +00:00
ManagementStyle
md-cluster.txt md-cluster: update the documentation 2016-01-06 11:39:06 +11:00
md.txt doc:md: fix typo in md.txt. 2015-06-23 06:49:44 -06:00
memory-barriers.txt virtio: barrier rework+fixes 2016-01-18 16:44:24 -08:00
memory-hotplug.txt
men-chameleon-bus.txt Documentation: Minor changes to men-chameleon-bus.txt 2015-07-24 15:15:17 +02:00
module-signing.txt Move certificate handling to its own directory 2015-08-14 16:06:13 +01:00
mono.txt
nommu-mmap.txt
ntb.txt NTB: Rename Intel code names to platform names 2015-07-04 14:09:25 -04:00
numastat.txt
oops-tracing.txt
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt
pnp.txt
preempt-locking.txt x86/fpu: Rename math_state_restore() to fpu__restore() 2015-05-19 15:47:18 +02:00
printk-formats.txt printk-formats.txt: remove unimplemented %pT 2016-01-16 11:17:30 -08:00
pwm.txt
ramoops.txt
rbtree.txt documentation: fix small typo in rbtree.txt 2015-09-13 14:38:50 -06:00
remoteproc.txt remoteproc: introduce rproc_get_by_phandle API 2015-06-16 21:12:52 +03:00
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
SecurityBugs
serial-console.txt
sgi-ioc4.txt
SM501.txt
smsc_ece1099.txt
sparse.txt
stable_api_nonsense.txt
stable_kernel_rules.txt stable_kernel_rules.txt: Remove extra space after Cc: 2015-11-20 16:54:57 -07:00
static-keys.txt locking/static_keys: Fix up the static keys documentation 2015-09-15 07:12:06 +02:00
SubmitChecklist
SubmittingDrivers
SubmittingPatches A few more documentation patches that wandered in and have no reason to 2015-11-13 09:19:05 -08:00
svga.txt
sysfs-rules.txt
sysrq.txt mm, oom: do not panic for oom kills triggered from sysrq 2015-09-08 15:35:28 -07:00
this_cpu_ops.txt
ubsan.txt UBSAN: run-time undefined behavior sanity checker 2016-01-20 17:09:18 -08:00
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt vfio: powerpc/spapr: Support Dynamic DMA windows 2015-06-11 15:16:55 +10:00
VGA-softcursor.txt
vgaarbiter.txt
video-output.txt
vme_api.txt Documentation: mention vme_master_mmap() in VME API 2015-06-12 17:26:56 -07:00
volatile-considered-harmful.txt
workqueue.txt
xillybus.txt
xz.txt
zorro.txt