Counting SMIs is popular, so add a dedicated "-s" option to do it,
and juggle some of the other option letters.
-S is now system summary (was -s)
-c is 32 bit counter (was -d)
-C is 64-bit counter (was -D)
-p is 1st thread in core (was -c)
-P is 1st thread in package (was -p)
bump the minor version number
Signed-off-by: Len Brown <len.brown@intel.com>
# turbostat -d 0x34
is useful for printing the number of SMI's within an interval
on Nehalem and newer processors.
where
# turbostat -m 0x34
will simply print out the total SMI count since reset.
Suggested-by: Andi Kleen
Signed-off-by: Len Brown <len.brown@intel.com>
The -M option dumps the specified 64-bit MSR with every sample.
Previously it was output at the end of each line.
However, with the v2 style of printing, the lines are now staggered,
making MSR output hard to read.
So move the MSR output column to the left where things are aligned.
Signed-off-by: Len Brown <len.brown@intel.com>
The "turbo-limit" is the maximum opportunistic processor
speed, assuming no electrical or thermal constraints.
For a given processor, the turbo-limit varies, depending
on the number of active cores. Generally, there is more
opportunity when fewer cores are active.
Under the "-v" verbose option, turbostat would
print the turbo-limits for the four cases
of 1 to 4 cores active.
Expand that capability to cover the cases of turbo
opportunities with up to 16 cores active.
Note that not all hardware platforms supply this information,
and that sometimes a valid limit may be specified for
a core which is not actually present.
Signed-off-by: Len Brown <len.brown@intel.com>
Under some conditions, c1% was displayed as very large number,
much higher than 100%.
c1% is not measured, it is derived as "that, which is left over"
from other counters. However, the other counters are not collected
atomically, and so it is possible for c1% to be calaculagted as
a small negative number -- displayed as very large positive.
There was a check for mperf vs tsc for this already,
but it needed to also include the other counters
that are used to calculate c1.
Signed-off-by: Len Brown <len.brown@intel.com>
Measuring large profoundly-idle configurations
requires turbostat to be more lightweight.
Otherwise, the operation of turbostat itself
can interfere with the measurements.
This re-write makes turbostat topology aware.
Hardware is accessed in "topology order".
Redundant hardware accesses are deleted.
Redundant output is deleted.
Also, output is buffered and
local RDTSC use replaces remote MSR access for TSC.
From a feature point of view, the output
looks different since redundant figures are absent.
Also, there are now -c and -p options -- to restrict
output to the 1st thread in each core, and the 1st
thread in each package, respectively. This is helpful
to reduce output on big systems, where more detail
than the "-s" system summary is desired.
Finally, periodic mode output is now on stdout, not stderr.
Turbostat v2 is also slightly more robust in
handling run-time CPU online/offline events,
as it now checks the actual map of on-line cpus rather
than just the total number of on-line cpus.
Signed-off-by: Len Brown <len.brown@intel.com>
Initial IVB support went into turbostat in Linux-3.1:
553575f1ae
(tools turbostat: recognize and run properly on IVB)
However, when running on IVB, turbostat would fail
to report the new couters added with SNB, c7, pc2 and pc7.
So in scenarios where these counters are non-zero on IVB,
turbostat would report erroneous residencey results.
In particular c7 time would be added to c1 time,
since c1 time is calculated as "that which is left over".
Also, turbostat reports MHz capabilities when passed
the "-v" option, and it would incorrectly report 133MHz
bclk instead of 100MHz bclk for IVB, which would inflate
GHz reported with that option.
This patch is a backport of a fix already included in turbostat v2.
Signed-off-by: Len Brown <len.brown@intel.com>
Linux 3.4 included a modification to turbostat to
lower cross-call overhead by using scheduler affinity:
15aaa34654
(tools turbostat: reduce measurement overhead due to IPIs)
In the use-case where turbostat forks a child program,
that change had the un-intended side-effect of binding
the child to the last cpu in the system.
This change removed the binding before forking the child.
This is a back-port of a fix already included in turbostat v2.
Signed-off-by: Len Brown <len.brown@intel.com>
Sometimes users have turbostat running in interval mode
when they take processors offline/online.
Previously, turbostat would survive, but not gracefully.
Tighten up the error checking so turbostat notices
changesn sooner, and print just 1 line on change:
turbostat: re-initialized with num_cpus %d
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat uses /dev/cpu/*/msr interface to read MSRs.
For modern systems, it reads 10 MSR/CPU. This can
be observed as 10 "Function Call Interrupts"
per CPU per sample added to /proc/interrupts.
This overhead is measurable on large idle systems,
and as Yoquan Song pointed out, it can even trick
cpuidle into thinking the system is busy.
Here turbostat re-schedules itself in-turn to each
CPU so that its MSR reads will always be local.
This replaces the 10 "Function Call Interrupts"
with a single "Rescheduling interrupt" per sample
per CPU.
On an idle 32-CPU system, this shifts some residency from
the shallow c1 state to the deeper c7 state:
# ./turbostat.old -s
%c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6 %pc7
0.27 1.29 2.29 0.95 0.02 0.00 98.77 20.23 0.00 77.41 0.00
0.25 1.24 2.29 0.98 0.02 0.00 98.75 20.34 0.03 77.74 0.00
0.27 1.22 2.29 0.54 0.00 0.00 99.18 20.64 0.00 77.70 0.00
0.26 1.22 2.29 1.22 0.00 0.00 98.52 20.22 0.00 77.74 0.00
0.26 1.38 2.29 0.78 0.02 0.00 98.95 20.51 0.05 77.56 0.00
^C
i# ./turbostat.new -s
%c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6 %pc7
0.27 1.20 2.29 0.24 0.01 0.00 99.49 20.58 0.00 78.20 0.00
0.27 1.22 2.29 0.25 0.00 0.00 99.48 20.79 0.00 77.85 0.00
0.27 1.20 2.29 0.25 0.02 0.00 99.46 20.71 0.03 77.89 0.00
0.28 1.26 2.29 0.25 0.01 0.00 99.46 20.89 0.02 77.67 0.00
0.27 1.20 2.29 0.24 0.01 0.00 99.48 20.65 0.00 78.04 0.00
cc: Youquan Song <youquan.song@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat -s
cuts down on the amount of output, per user request.
also treak some output whitespace and the man page.
Signed-off-by: Len Brown <len.brown@intel.com>
This includes initial support for the recently published ACPI 5.0 spec.
In particular, support for the "hardware-reduced" bit that eliminates
the dependency on legacy hardware.
APEI has patches resulting from testing on real hardware.
Plus other random fixes.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
intel_idle: Split up and provide per CPU initialization func
ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
ACPI processor: Remove unneeded cpuidle_unregister_driver call
intel idle: Make idle driver more robust
intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
intel_idle: remove redundant local_irq_disable() call
ACPI processor: Fix error path, also remove sysdev link
ACPI: processor: fix acpi_get_cpuid for UP processor
intel_idle: fix API misuse
ACPI APEI: Convert atomicio routines
ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
ACPI: Fix possible alignment issues with GAS 'address' references
ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
ACPI: Store SRAT table revision
ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
ACPI, Record ACPI NVS regions
ACPI, APEI, EINJ, Refine the fix of resource conflict
...
Field names were shortened: "pkg" is now "pk", "core" is now "cr"
Signed-off-by: Arun Thomas <arun.thomas@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Reduce columns for package number to 1.
If you can afford more than 9 packages,
you can also afford a terminal with more than 80 columns:-)
Also shave a column also off the package C-states
Signed-off-by: Len Brown <len.brown@intel.com>
Follow kernel coding style traditions more closely.
Delete typedef, re-name "per cpu counters" to
simply be counters etc.
This patch changes no functionality.
Suggested-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
bug could cause false positive on indicating
presence of invarient TSC or APERF support.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
MSR_IA32_ENERGY_PERF_BIAS first became available on Westmere Xeon.
It is implemented in all Sandy Bridge processors -- mobile, desktop and server.
It is expected to become increasingly important in subsequent generations.
x86_energy_perf_policy is a user-space utility to set the
hardware energy vs performance policy hint in the processor.
Most systems would benefit from "x86_energy_perf_policy normal"
at system startup, as the hardware default is maximum performance
at the expense of energy efficiency.
See x86_energy_perf_policy.8 man page for more information.
Background:
Linux-2.6.36 added "epb" to /proc/cpuinfo to indicate
if an x86 processor supports MSR_IA32_ENERGY_PERF_BIAS,
without actually modifying the MSR.
In March, 2010, Venkatesh Pallipadi proposed a small driver
that programmed MSR_IA32_ENERGY_PERF_BIAS, based on
the cpufreq governor in use. It also offered
a boot-time cmdline option to override.
http://lkml.org/lkml/2010/3/4/457
But hiding the hardware policy behind the
governor choice was deemed "kinda icky".
In June, 2010, I proposed a generic user/kernel API to
generalize the power/performance policy trade-off.
"RFC: /sys/power/policy_preference"
http://lkml.org/lkml/2010/6/16/399
That is my preference for implementing this capability,
but I received no support on the list.
So in September, 2010, I sent x86_energy_perf_policy.c to LKML,
a user-space utility that scribbles directly to the MSR.
http://lkml.org/lkml/2010/9/28/246
Here is that same utility, after responding to some review feedback,
to live in tools/power/, where it is easily found.
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat is a Linux tool to observe proper operation
of Intel(R) Turbo Boost Technology.
turbostat displays the actual processor frequency
on x86 processors that include APERF and MPERF MSRs.
Note that turbostat is of limited utility on Linux
kernels 2.6.29 and older, as acpi_cpufreq cleared
APERF/MPERF up through that release.
On Intel Core i3/i5/i7 (Nehalem) and newer processors,
turbostat also displays residency in idle power saving states,
which are necessary for diagnosing any cpuidle issues
that may have an effect on turbo-mode.
See the turbostat.8 man page for example usage.
Signed-off-by: Len Brown <len.brown@intel.com>