The column "GFX%c6" show the percentage of time the GPU
is in the "render C6" state, rc6. Deep package C-states on several
systems depend on the GPU being in RC6.
This information comes from the counter
/sys/class/drm/card0/power/rc6_residency_ms,
as read before and after the measurement interval.
Signed-off-by: Len Brown <len.brown@intel.com>
Under the column "GFXMHz", show a snapshot of this attribute:
/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
This is an instantaneous snapshot of what sysfs presents
at the end of the measurement interval. turbostat does
not average or otherwise perform any math on this value.
Signed-off-by: Len Brown <len.brown@intel.com>
The new IRQ column shows how many interrupts have occurred on each CPU
during the measurement inteval. This information comes from
the difference between /proc/interrupts shapshots made before
and after the measurement interval.
The first row, the system summary, shows the sum of the IRQS
for all CPUs during that interval.
Signed-off-by: Len Brown <len.brown@intel.com>
skip the open(2)/close(2) on each msr read
by keeping the /dev/cpu/*/msr files open.
The remaining read(2) is generally far fewer cycles
than the removed open(2) system call.
Signed-off-by: Len Brown <len.brown@intel.com>
By default...
Turbostat --debug gconfiguration info goes to stderr.
In FORK mode, turbostat statistics go to stderr.
In PERIODIC mode, turbostat statistics go to stdout.
These defaults do not change, but an option "--out file"
will send all output above only to the specified file.
Signed-off-by: Len Brown <len.brown@intel.com>
some tools processing turbostat output
have difficulty with items that begin with %...
Reported-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Following changes have been made:
- changed MSR_NHM_TURBO_RATIO_LIMIT to MSR_TURBO_RATIO_LIMIT in debug print
for consistency with Developer Manual
- updated definition of bitfields in MSR_TURBO_RATIO_LIMIT and appropriate
parsing code
- added x200 to list of architectures that do not support Nahlem compatible
definition of MSR_TURBO_RATIO_LIMIT register (x200 has the register but
bits definition is custom)
- fixed typo in code that parses MSR_TURBO_RATIO_LIMIT
(logical instead of bitwise operator)
- changed MSR_TURBO_RATIO_LIMIT parsing algorithm so the print out had the
same order as implementations for other platforms
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
x200 does not enable any way to programmatically obtain bus clock
speed. Bclk for the architecture has a fixed value of 100 MHz.
At the same time x200 cannot be included in has_snb_msrs since
it does not support C7 idle state.
prior to this patch, MHz values reported on this chip
were erroneously calculated using bclk of 133MHz,
causing MHz values to be reported 33% higher than actual.
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat -i interval_sec
will sample and display statistics every interval_sec.
interval_sec used to be a whole number of seconds,
but now we accept a decimal, as small as 0.001 sec (1 ms).
Signed-off-by: Len Brown <len.brown@intel.com>
for debugging, dump a few more fields:
CPUID(1): SSE3 MONITOR EIST TM2 TSC MSR ACPI-TM TM
cpu0: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MONITOR)
Signed-off-by: Len Brown <len.brown@intel.com>
MSR_PLATFORM_INFO is the new name for MSR_NHM_PLATFORM_INFO
no functional change
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
MSR_TURBO_ACTIVATION_RATIO: 0x00000016 (MAX_NON_TURBO_RATIO=6 lock=0)
should print all 7 bits of MAX_NON_TURBO_RATIO (in decimal):
MSR_TURBO_ACTIVATION_RATIO: 0x00000016 (MAX_NON_TURBO_RATIO=22 lock=0)
Signed-off-by: Len Brown <len.brown@intel.com>
Bzy_MHz = TSC_delta*tsc_tweak/APERF_delta/MPERF_delta/measurement_interval
becomes
Bzy_MHz = base_mhz/APERF_delta/MPERF_delta
on systems which support MSR_NHM_PLATFORM_INFO.
base_mhz is calculated directly from the base_ratio
reported in MSR_NHM_PLATFORM_INFO * bclk,
and bclk is discovered via MSR or cpuid.
This reduces the dependency of Bzy_MHz calculation on the TSC.
Previously, there were 4 TSC readings required in each caculation,
the raw TSC delta combined with the measurement_interval.
This also removes the "tsc_tweak" correction factor used when
TSC runs on a different base clock from the CPU's bclk.
After this change, tsc_tweak is used only for %Busy.
The end-result should be a Bzy_MHz result slightly less prone to jitter.
Signed-off-by: Len Brown <len.brown@intel.com>
On a Skylake with 1500MHz base frequency,
the TSC runs at 1512MHz.
This is because the TSC is no longer in the n*100 MHz BCLK domain,
but is now in the m*24MHz crystal clock domain. (24 MHz * 63 = 1512 MHz)
This adds error to several calculations in turbostat,
unless the TSC sample sizes are adjusted for this difference.
Note that calculations in the time domain are immune
from this issue, as the timing sub-system has already
calibrated the TSC against a known wall clock.
AVG_MHz = APERF_delta/measurement_interval
need no adjustment. APERF_delta is in the BCLK domain,
and measurement_interval is in the time domain.
TSC_MHz = TSC_delta/measurement_interval
needs no adjustment -- as we really do want to report
the actual measured TSC delta here, and measurement_interval
is in the accurate time domain.
%Busy = MPERF_delta/TSC_delta
needs adjustment to use TSC_BCLK_DOMAIN_delta.
TSC_BCLK_DOMAIN_delta = TSC_delta * base_hz / tsc_hz
Bzy_MHz = TSC_delta/APERF_delta/MPERF_delta/measurement_interval
need adjustment as above.
No other metrics in turbostat need to be adjusted.
Before:
CPU Avg_MHz %Busy Bzy_MHz TSC_MHz
- 550 24.84 2216 1512
0 2191 98.73 2219 1514
2 0 0.01 2130 1512
1 9 0.43 2016 1512
3 2 0.08 2016 1512
After:
CPU Avg_MHz %Busy Bzy_MHz TSC_MHz
- 550 25.05 2198 1512
0 2190 99.62 2199 1512
2 0 0.01 2152 1512
1 9 0.46 2000 1512
3 2 0.10 2000 1512
Note that in this example, the "Before" Bzy_MHz
was reported as exceeding the 2200 max turbo rate.
Also, even a pinned spin loop would not be reported
as over 99% busy.
Signed-off-by: Len Brown <len.brown@intel.com>
KNL increments APERF and MPERF every 1024 clocks.
This is compliant with the architecture specification,
which requires that only the ratio of APERF/MPERF need be valid.
However, turbostat takes advantage of the fact that these
two MSRs increment every un-halted clock
at the actual and base frequency:
AVG_MHz = APERF_delta/measurement_interval
%Busy = MPERF_delta/TSC_delta
This quirk is needed for these calculations to also work on KNL,
which would otherwise show a value 1024x smaller than expected.
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Staring in Linux-4.3-rc1,
commit 6fb3143b56 ("tools/power turbostat: dump CONFIG_TDP")
touches MSR 0x648, which is not supported on IVB-Xeon.
This results in "turbostat --debug" exiting on those systems:
turbostat: /dev/cpu/2/msr offset 0x648 read failed: Input/output error
Remove IVB-Xeon from the list of machines supporting with that MSR.
Signed-off-by: Len Brown <len.brown@intel.com>
Pull turbostat changes for v4.3 from Len Brown.
* 'turbostat' of https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: fix typo on DRAM column in Joules-mode
tools/power turbostat: fix parameter passing for forked command
tools/power turbostat: dump CONFIG_TDP
tools/power turbostat: cpu0 is no longer hard-coded, so update output
tools/power turbostat: update turbostat(8)
turbostat supports forked command when sampling cpu state. However,
the forked command is not allowed to be executed with options, otherwise
turbostat might regard these options as invalid turbostat options.
For example:
./turbostat stress -c 4 -t 10
./turbostat: unrecognized option '-t'
Reported-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Config TDP is a feature that allows parts to be configured
for different thermal limits after they have left the factory.
This can have an effect on the operation of the part,
particularly in determiniing...
Max Non-turbo Ratio
Turbo Activation Ratio
Signed-off-by: Len Brown <len.brown@intel.com>
The --debug option reads a number of per-package MSRs.
Previously we explicitly read them on cpu0, but recently
turbostat changed to read them on the current "base_cpu".
Update the print-out to reflect base_cpu, rather than
the hard-coded cpu0.
Signed-off-by: Len Brown <len.brown@intel.com>
This header containing all MSRs and respective bit definitions
got exported to userspace in conjunction with the big UAPI
shuffle.
But, it doesn't belong in the UAPI headers because userspace can
do its own MSR defines and exporting them from the kernel blocks
us from doing cleanups/renames in that header. Which is
ridiculous - it is not kernel's job to export such a header and
keep MSRs list and their names stable.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433436928-31903-19-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remove reference to the original Nehalem Turbo white paper,
since it has moved, and these mechanisms have now long since
been documented in the Software Developer's Manual.
Reported-by: Jeremie Lagraviere <jeremie@simula.no>
Signed-off-by: Len Brown <len.brown@intel.com>
Linux-3.7 added CONFIG_BOOTPARAM_HOTPLUG_CPU0,
allowing systems to offline cpu0.
But when cpu0 is offline, turbostat will not run:
# turbostat ls
turbostat: no /dev/cpu/0/msr
This patch replaces the hard-coded use of cpu0 in turbostat
with the current cpu, allowing it to run without a cpu0.
Fewer cross-calls may also be needed due to use of current cpu,
though this hard-coding was used only for the --debug preamble.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When EPB is 0xF, turbosat was incorrectly describing it as "custom"
instead of calling it "powersave":
< cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x0000000f (custom)
> cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x0000000f (powersave)
Signed-off-by: Len Brown <len.brown@intel.com>
Changes mainly to account for minor differences in Knights Landing(KNL):
1. KNL supports C1 and C6 core states.
2. KNL supports PC2, PC3 and PC6 package states.
3. KNL has a different encoding of the TURBO_RATIO_LIMIT MSR
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Without this update, turbostat displays only 2 threads per core.
Some processors, such as Xeon Phi, have more.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
HSW expanded MSR_PKG_CST_CONFIG_CONTROL.Package-C-State-Limit,
from bits[2:0] used by previous implementations, to [3:0].
The value 1000b is unlimited, and is used by BDW and SKL too.
Signed-off-by: Len Brown <len.brown@intel.com>
While not yet documented in the Software Developer's Manual,
the data-sheet for modern Xeon states that DRAM RAPL ENERGY units
are fixed at 15.3 uJ, rather than being discovered via MSR.
Before this patch, DRAM energy on these products is over-stated by turbostat
because the RAPL units are 4x larger.
ref: "Xeon E5-2600 v3/E5-1600 v3 Datasheet Volume 2"
http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf
Signed-off-by: Andrey Semin <andrey.semin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Skylake adds some additional residency counters.
Skylake supports a different mix of RAPL registers
from any previous product.
In most other ways, Skylake is like Broadwell.
Signed-off-by: Len Brown <len.brown@intel.com>
Since commit ee0778a301
("tools/power: turbostat: make Makefile a bit more capable")
turbostat's Makefile is using
[...]
BUILD_OUTPUT := $(PWD)
[...]
which obviously causes trouble when building "turbostat" with
make -C /usr/src/linux/tools/power/x86/turbostat ARCH=x86 turbostat
because GNU make does not update nor guarantee that $PWD is set.
This patch changes the Makefile to use $CURDIR instead, which GNU make
guarantees to set and update (i.e. when using "make -C ...") and also
adds support for the O= option (see "make help" in your root of your
kernel source tree for more details).
Link: https://bugs.gentoo.org/show_bug.cgi?id=533918
Fixes: ee0778a301 ("tools/power: turbostat: make Makefile a bit more capable")
Signed-off-by: Thomas D. <whissi@whissi.de>
Cc: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some distros (Ubuntu) ship the msr driver as a module.
If turbosat is run as root on those systems, and discovers
that there is no /dev/cpu/cpu0/msr, it will now "modprobe msr"
for the user.
If not root, the modprobe attempt will fail, and turbostat will exit as before:
turbostat: no /dev/cpu/0/msr, Try "# modprobe msr" : No such file or directory
Signed-off-by: Len Brown <len.brown@intel.com>
s/MSR_NHM_TURBO_RATIO_LIMIT/MSR_TURBO_RATIO_LIMIT/
s/MSR_IVT_TURBO_RATIO_LIMIT/MSR_TURBO_RATIO_LIMIT1/
syntax only -- use the documented strings describing these registers.
Signed-off-by: Len Brown <len.brown@intel.com>
syntax only.
The cool kids are now using the phrase "base frequency",
where in the past we used "max non-turbo frequency" or "TSC frequency".
This distinction becomes important when a processor has a TSC
that runs at a different speed than the "base frequency".
Signed-off-by: Len Brown <len.brown@intel.com>
cosmetic only.
order the decoding of MSR_PERF_LIMIT_REASONS bits
from MSB to LSB -- which you notice when more than 1 bit is set
and you are, say, comparing the output to the documentation...
Signed-off-by: Len Brown <len.brown@intel.com>
Casual turbostat users generally just want to know MHz.
So by default, just print enough information to make sense of MHz.
All the other configuration data and columns for C-states and temperature etc,
are printed with the --debug option.
Signed-off-by: Len Brown <len.brown@intel.com>
Long format options added, though the short ones should still work.
eg. the new "--Counter 0x10" is the same as the old "-C 0x10"
Note this Incompatibility:
Old:
-v displayed verbose debug output
New:
-v and --version simpaly display version
Additional parameters:
-d and --debug display verbose debug output
-h and --help display a help message
Updated turbosat.8 man page accordingly.
Signed-off-by: Len Brown <len.brown@intel.com>
Replaced previously open-coded Package C-state Limit decoding
with table-driven decoding. In doing so, updated to match January 2015
"Intel(R) 64 and IA-23 Architectures Software Developer's Manual".
In the past, turbostat would print package C-state residency columns
for all package states supported by the model's architecture, even though
a particular SKU may not support them, or they may be disabled by the BIOS.
Now turbostat will skip printing colunns if MSRs indicate that they are not enabled.
eg. many SKUs don't support PC7, and so that column will no longer be printed.
Signed-off-by: Len Brown <len.brown@intel.com>
While turbostat is significantly less useful on systems
with no APERF_MSR, it seems more friendly
to run on such systems and report what we can,
rather than refusing to run.
Update man page to reflect recent changes.
Signed-off-by: Len Brown <len.brown@intel.com>
Turbostat can be useful on systems that do not support invariant TSC,
so allow it to run on those systgems.
All arithmetic in turbostat using the TSC value is per-processsor,
so it does not depend on the TSC values being in sync acrosss processors.
Turbostat uses gettimeofday() for the measurement interval
rather than using the TSC directly, so that key metric
is also immune from variable TSC.
Turbostat prints a TSC sanity check column:
TSC_MHz = TSC_delta/interval
If this column is constant and is close to the processor
base frequency, then the TSC is behaving properly.
The other key turbostat columns are calculated this way:
Avg_Mhz = APERF_delta/interval
%Busy = MPERF_delta/TSC_delta
Bzy_MHz = TSC_delta/APERF_delta/MPERF_delta/interval
Tested on Core2 and Core2-Xeon, and so this patch includes
a few other changes to remove the assumption that target
systems are Nehalem and newer.
Signed-off-by: Len Brown <len.brown@intel.com>
The Processor generation code-named Haswell
added MSR_{CORE | GFX | RING}_PERF_LIMIT_REASONS
to explain when and how the processor limits frequency.
turbostat -v
will now decode these bits.
Each MSR has an "Active" set of bits which describe
current conditions, and a "Logged" set of bits,
which describe what has happened since last cleared.
Turbostat currently doesn't clear the log bits.
Signed-off-by: Len Brown <len.brown@intel.com>