mirror of
https://github.com/torvalds/linux.git
synced 2024-10-30 00:32:38 +00:00
tools/power x86_energy_perf_policy: support HWP.EPP
x86_energy_perf_policy(8) was created as an example of how the user, or upper-level OS, can manage MSR_IA32_ENERGY_PERF_BIAS (EPB). Hardware consults EPB when it makes internal decisions balancing energy-saving vs performance. For example, should HW quickly or slowly transition into and out of power-saving idles states? Should HW quickly or slowly ramp frequency up or down in response to demand in the turbo-frequency range? Depending on the processor, EPB may have package, core, or CPU thread scope. As such, the only general policy is to write the same value to EPB on every CPU in the system. Recent platforms add support for Hardware Performance States (HWP). HWP effectively extends hardware frequency control from the opportunistic turbo-frequency range to control the entire range of available processor frequencies. Just as turbo-mode used EPB, HWP can use EPB to help decicde how quickly to ramp frequency and voltage up and down in response to changing demand. Indeed, BDX and BDX-DE, the first processors to support HWP, use EPB for this purpose. Starting in SKL, HWP no longer looks to EPB for influence. Instead, it looks in a new MSR specifically for this purpose: IA32_HWP_REQUEST.Energy_Performance_Preference (HWP.EPP). HWP.EPP is like EPB, except that it is specific to HWP-mode frequency selection. Also, HWP.EPP is defined to have per CPU-thread scope. Starting in SKX, IA32_HWP_REQUEST is augmented by IA32_HWP_REQUEST_PKG -- which has the same function, but is defined to have package-wide scope. A new bit in IA32_HWP_REQUEST determines if it over-rides the IA32_HWP_REQUEST_PKG or not. Note that HWP-mode can be enabled in several ways. The "in-band" method is for HWP to be exposed in CPUID, and for the Linux intel_pstate driver to recognized that, and thus enable HWP. In this case, starting in Linux 4.10, intel_pstate exports cpufreq sysfs attribute "energy_performance_preference" which can be used to manage HWP.EPP. This interface can be used to set HWP.EPP to these values: 0 performance 128 balance_performance (default) 192 balance_power 255 power Here, x86_energy_performance_policy is updated to use idential strings and values as intel_pstate. But HWP-mode may also be enabled by firmware before the OS boots, and the OS may not be aware of HWP. In this case, intel_pstate is not available to provide sysfs attributes, and x86_energy_perf_policy or a similar utility is invaluable for managing HWP.EPP, for this utility works the same, no matter if cpufreq is enabled or not. Signed-off-by: Len Brown <len.brown@intel.com>
This commit is contained in:
parent
2fc49cb0b5
commit
4beec1d751
@ -1,10 +1,27 @@
|
||||
DESTDIR ?=
|
||||
CC = $(CROSS_COMPILE)gcc
|
||||
BUILD_OUTPUT := $(CURDIR)
|
||||
PREFIX := /usr
|
||||
DESTDIR :=
|
||||
|
||||
ifeq ("$(origin O)", "command line")
|
||||
BUILD_OUTPUT := $(O)
|
||||
endif
|
||||
|
||||
x86_energy_perf_policy : x86_energy_perf_policy.c
|
||||
CFLAGS += -Wall
|
||||
CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
|
||||
|
||||
%: %.c
|
||||
@mkdir -p $(BUILD_OUTPUT)
|
||||
$(CC) $(CFLAGS) $< -o $(BUILD_OUTPUT)/$@
|
||||
|
||||
.PHONY : clean
|
||||
clean :
|
||||
rm -f x86_energy_perf_policy
|
||||
@rm -f $(BUILD_OUTPUT)/x86_energy_perf_policy
|
||||
|
||||
install : x86_energy_perf_policy
|
||||
install -d $(DESTDIR)$(PREFIX)/bin
|
||||
install $(BUILD_OUTPUT)/x86_energy_perf_policy $(DESTDIR)$(PREFIX)/bin/x86_energy_perf_policy
|
||||
install -d $(DESTDIR)$(PREFIX)/share/man/man8
|
||||
install x86_energy_perf_policy.8 $(DESTDIR)$(PREFIX)/share/man/man8
|
||||
|
||||
install :
|
||||
install x86_energy_perf_policy ${DESTDIR}/usr/bin/
|
||||
install x86_energy_perf_policy.8 ${DESTDIR}/usr/share/man/man8/
|
||||
|
@ -1,104 +1,213 @@
|
||||
.\" This page Copyright (C) 2010 Len Brown <len.brown@intel.com>
|
||||
.\" This page Copyright (C) 2010 - 2015 Len Brown <len.brown@intel.com>
|
||||
.\" Distributed under the GPL, Copyleft 1994.
|
||||
.TH X86_ENERGY_PERF_POLICY 8
|
||||
.SH NAME
|
||||
x86_energy_perf_policy \- read or write MSR_IA32_ENERGY_PERF_BIAS
|
||||
x86_energy_perf_policy \- Manage Energy vs. Performance Policy via x86 Model Specific Registers
|
||||
.SH SYNOPSIS
|
||||
.ft B
|
||||
.B x86_energy_perf_policy
|
||||
.RB [ "\-c cpu" ]
|
||||
.RB [ "\-v" ]
|
||||
.RB "\-r"
|
||||
.RB "[ options ] [ scope ] [field \ value]"
|
||||
.br
|
||||
.B x86_energy_perf_policy
|
||||
.RB [ "\-c cpu" ]
|
||||
.RB [ "\-v" ]
|
||||
.RB 'performance'
|
||||
.RB "scope: \-\-cpu\ cpu-list | \-\-pkg\ pkg-list"
|
||||
.br
|
||||
.B x86_energy_perf_policy
|
||||
.RB [ "\-c cpu" ]
|
||||
.RB [ "\-v" ]
|
||||
.RB 'normal'
|
||||
.RB "cpu-list, pkg-list: # | #,# | #-# | all"
|
||||
.br
|
||||
.B x86_energy_perf_policy
|
||||
.RB [ "\-c cpu" ]
|
||||
.RB [ "\-v" ]
|
||||
.RB 'powersave'
|
||||
.RB "field: \-\-all | \-\-epb | \-\-hwp-epp | \-\-hwp-min | \-\-hwp-max | \-\-hwp-desired"
|
||||
.br
|
||||
.B x86_energy_perf_policy
|
||||
.RB [ "\-c cpu" ]
|
||||
.RB [ "\-v" ]
|
||||
.RB n
|
||||
.RB "other: (\-\-force | \-\-hwp-enable | \-\-turbo-enable) value)"
|
||||
.br
|
||||
.RB "value: # | default | performance | balance-performance | balance-power | power"
|
||||
.SH DESCRIPTION
|
||||
\fBx86_energy_perf_policy\fP
|
||||
allows software to convey
|
||||
its policy for the relative importance of performance
|
||||
versus energy savings to the processor.
|
||||
displays and updates energy-performance policy settings specific to
|
||||
Intel Architecture Processors. Settings are accessed via Model Specific Register (MSR)
|
||||
updates, no matter if the Linux cpufreq sub-system is enabled or not.
|
||||
|
||||
The processor uses this information in model-specific ways
|
||||
when it must select trade-offs between performance and
|
||||
energy efficiency.
|
||||
Policy in MSR_IA32_ENERGY_PERF_BIAS (EPB)
|
||||
may affect a wide range of hardware decisions,
|
||||
such as how aggressively the hardware enters and exits CPU idle states (C-states)
|
||||
and Processor Performance States (P-states).
|
||||
This policy hint does not replace explicit OS C-state and P-state selection.
|
||||
Rather, it tells the hardware how aggressively to implement those selections.
|
||||
Further, it allows the OS to influence energy/performance trade-offs where there
|
||||
is no software interface, such as in the opportunistic "turbo-mode" P-state range.
|
||||
Note that MSR_IA32_ENERGY_PERF_BIAS is defined per CPU,
|
||||
but some implementations
|
||||
share a single MSR among all CPUs in each processor package.
|
||||
On those systems, a write to EPB on one processor will
|
||||
be visible, and will have an effect, on all CPUs
|
||||
in the same processor package.
|
||||
|
||||
This policy hint does not supersede Processor Performance states
|
||||
(P-states) or CPU Idle power states (C-states), but allows
|
||||
software to have influence where it would otherwise be unable
|
||||
to express a preference.
|
||||
Hardware P-States (HWP) are effectively an expansion of hardware
|
||||
P-state control from the opportunistic turbo-mode P-state range
|
||||
to include the entire range of available P-states.
|
||||
On Broadwell Xeon, the initial HWP implementation, EBP influenced HWP.
|
||||
That influence was removed in subsequent generations,
|
||||
where it was moved to the
|
||||
Energy_Performance_Preference (EPP) field in
|
||||
a pair of dedicated MSRs -- MSR_IA32_HWP_REQUEST and MSR_IA32_HWP_REQUEST_PKG.
|
||||
|
||||
For example, this setting may tell the hardware how
|
||||
aggressively or conservatively to control frequency
|
||||
in the "turbo range" above the explicitly OS-controlled
|
||||
P-state frequency range. It may also tell the hardware
|
||||
how aggressively is should enter the OS requested C-states.
|
||||
EPP is the most commonly managed knob in HWP mode,
|
||||
but MSR_IA32_HWP_REQUEST also allows the user to specify
|
||||
minimum-frequency for Quality-of-Service,
|
||||
and maximum-frequency for power-capping.
|
||||
MSR_IA32_HWP_REQUEST is defined per-CPU.
|
||||
|
||||
Support for this feature is indicated by CPUID.06H.ECX.bit3
|
||||
per the Intel Architectures Software Developer's Manual.
|
||||
MSR_IA32_HWP_REQUEST_PKG has the same capability as MSR_IA32_HWP_REQUEST,
|
||||
but it can simultaneously set the default policy for all CPUs within a package.
|
||||
A bit in per-CPU MSR_IA32_HWP_REQUEST indicates whether it is
|
||||
over-ruled-by or exempt-from MSR_IA32_HWP_REQUEST_PKG.
|
||||
|
||||
.SS Options
|
||||
\fB-c\fP limits operation to a single CPU.
|
||||
The default is to operate on all CPUs.
|
||||
Note that MSR_IA32_ENERGY_PERF_BIAS is defined per
|
||||
logical processor, but that the initial implementations
|
||||
of the MSR were shared among all processors in each package.
|
||||
MSR_HWP_CAPABILITIES shows the default values for the fields
|
||||
in MSR_IA32_HWP_REQUEST. It is displayed when no values
|
||||
are being written.
|
||||
|
||||
.SS SCOPE OPTIONS
|
||||
.PP
|
||||
\fB-v\fP increases verbosity. By default
|
||||
x86_energy_perf_policy is silent.
|
||||
\fB-c, --cpu\fP Operate on the MSR_IA32_HWP_REQUEST for each CPU in a CPU-list.
|
||||
The CPU-list may be comma-separated CPU numbers, with dash for range
|
||||
or the string "all". Eg. '--cpu 1,4,6-8' or '--cpu all'.
|
||||
When --cpu is used, \fB--hwp-use-pkg\fP is available, which specifies whether the per-cpu
|
||||
MSR_IA32_HWP_REQUEST should be over-ruled by MSR_IA32_HWP_REQUEST_PKG (1),
|
||||
or exempt from MSR_IA32_HWP_REQUEST_PKG (0).
|
||||
|
||||
\fB-p, --pkg\fP Operate on the MSR_IA32_HWP_REQUEST_PKG for each package in the package-list.
|
||||
The list is a string of individual package numbers separated
|
||||
by commas, and or ranges of package numbers separated by a dash,
|
||||
or the string "all".
|
||||
For example '--pkg 1,3' or '--pkg all'
|
||||
|
||||
.SS VALUE OPTIONS
|
||||
.PP
|
||||
\fB-r\fP is for "read-only" mode - the unchanged state
|
||||
is read and displayed.
|
||||
.PP
|
||||
.I performance
|
||||
Set a policy where performance is paramount.
|
||||
The processor will be unwilling to sacrifice any performance
|
||||
for the sake of energy saving. This is the hardware default.
|
||||
.PP
|
||||
.I normal
|
||||
.I normal | default
|
||||
Set a policy with a normal balance between performance and energy efficiency.
|
||||
The processor will tolerate minor performance compromise
|
||||
for potentially significant energy savings.
|
||||
This reasonable default for most desktops and servers.
|
||||
This is a reasonable default for most desktops and servers.
|
||||
"default" is a synonym for "normal".
|
||||
.PP
|
||||
.I powersave
|
||||
.I performance
|
||||
Set a policy for maximum performance,
|
||||
accepting no performance sacrifice for the benefit of energy efficiency.
|
||||
.PP
|
||||
.I balance-performance
|
||||
Set a policy with a high priority on performance,
|
||||
but allowing some performance loss to benefit energy efficiency.
|
||||
.PP
|
||||
.I balance-power
|
||||
Set a policy where the performance and power are balanced.
|
||||
This is the default.
|
||||
.PP
|
||||
.I power
|
||||
Set a policy where the processor can accept
|
||||
a measurable performance hit to maximize energy efficiency.
|
||||
.PP
|
||||
.I n
|
||||
Set MSR_IA32_ENERGY_PERF_BIAS to the specified number.
|
||||
The range of valid numbers is 0-15, where 0 is maximum
|
||||
performance and 15 is maximum energy efficiency.
|
||||
a measurable performance impact to maximize energy efficiency.
|
||||
|
||||
.PP
|
||||
The following table shows the mapping from the value strings above to actual MSR values.
|
||||
This mapping is defined in the Linux-kernel header, msr-index.h.
|
||||
|
||||
.nf
|
||||
VALUE STRING EPB EPP
|
||||
performance 0 0
|
||||
balance-performance 4 128
|
||||
normal, default 6 128
|
||||
balance-power 8 192
|
||||
power 15 255
|
||||
.fi
|
||||
.PP
|
||||
For MSR_IA32_HWP_REQUEST performance fields
|
||||
(--hwp-min, --hwp-max, --hwp-desired), the value option
|
||||
is in units of 100 MHz, Eg. 12 signifies 1200 MHz.
|
||||
|
||||
.SS FIELD OPTIONS
|
||||
\fB-a, --all value-string\fP Sets all EPB and EPP and HWP limit fields to the value associated with
|
||||
the value-string. In addition, enables turbo-mode and HWP-mode, if they were previous disabled.
|
||||
Thus "--all normal" will set a system without cpufreq into a well known configuration.
|
||||
.PP
|
||||
\fB-B, --epb\fP set EPB per-core or per-package.
|
||||
See value strings in the table above.
|
||||
.PP
|
||||
\fB-d, --debug\fP debug increases verbosity. By default
|
||||
x86_energy_perf_policy is silent for updates,
|
||||
and verbose for read-only mode.
|
||||
.PP
|
||||
\fB-P, --hwp-epp\fP set HWP.EPP per-core or per-package.
|
||||
See value strings in the table above.
|
||||
.PP
|
||||
\fB-m, --hwp-min\fP request HWP to not go below the specified core/bus ratio.
|
||||
The "default" is the value found in IA32_HWP_CAPABILITIES.min.
|
||||
.PP
|
||||
\fB-M, --hwp-max\fP request HWP not exceed a the specified core/bus ratio.
|
||||
The "default" is the value found in IA32_HWP_CAPABILITIES.max.
|
||||
.PP
|
||||
\fB-D, --hwp-desired\fP request HWP 'desired' frequency.
|
||||
The "normal" setting is 0, which
|
||||
corresponds to 'full autonomous' HWP control.
|
||||
Non-zero performance values request a specific performance
|
||||
level on this processor, specified in multiples of 100 MHz.
|
||||
.PP
|
||||
\fB-w, --hwp-window\fP specify integer number of microsec
|
||||
in the sliding window that HWP uses to maintain average frequency.
|
||||
This parameter is meaningful only when the "desired" field above is non-zero.
|
||||
Default is 0, allowing the HW to choose.
|
||||
.SH OTHER OPTIONS
|
||||
.PP
|
||||
\fB-f, --force\fP writes the specified values without bounds checking.
|
||||
.PP
|
||||
\fB-U, --hwp-use-pkg\fP (0 | 1), when used in conjunction with --cpu,
|
||||
indicates whether the per-CPU MSR_IA32_HWP_REQUEST should be overruled (1)
|
||||
or exempt (0) from per-Package MSR_IA32_HWP_REQUEST_PKG settings.
|
||||
The default is exempt.
|
||||
.PP
|
||||
\fB-H, --hwp-enable\fP enable HardWare-P-state (HWP) mode. Once enabled, system RESET is required to disable HWP mode.
|
||||
.PP
|
||||
\fB-t, --turbo-enable\fP enable (1) or disable (0) turbo mode.
|
||||
.PP
|
||||
\fB-v, --version\fP print version and exit.
|
||||
.PP
|
||||
If no request to change policy is made,
|
||||
the default behavior is to read
|
||||
and display the current system state,
|
||||
including the default capabilities.
|
||||
.SH WARNING
|
||||
.PP
|
||||
This utility writes directly to Model Specific Registers.
|
||||
There is no locking or coordination should this utility
|
||||
be used to modify HWP limit fields at the same time that
|
||||
intel_pstate's sysfs attributes access the same MSRs.
|
||||
.PP
|
||||
Note that --hwp-desired and --hwp-window are considered experimental.
|
||||
Future versions of Linux reserve the right to access these
|
||||
fields internally -- potentially conflicting with user-space access.
|
||||
.SH EXAMPLE
|
||||
.nf
|
||||
# sudo x86_energy_perf_policy
|
||||
cpu0: EPB 6
|
||||
cpu0: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
|
||||
cpu0: HWP_CAP: low 1 eff 8 guar 27 high 35
|
||||
cpu1: EPB 6
|
||||
cpu1: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
|
||||
cpu1: HWP_CAP: low 1 eff 8 guar 27 high 35
|
||||
cpu2: EPB 6
|
||||
cpu2: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
|
||||
cpu2: HWP_CAP: low 1 eff 8 guar 27 high 35
|
||||
cpu3: EPB 6
|
||||
cpu3: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
|
||||
cpu3: HWP_CAP: low 1 eff 8 guar 27 high 35
|
||||
.fi
|
||||
.SH NOTES
|
||||
.B "x86_energy_perf_policy "
|
||||
.B "x86_energy_perf_policy"
|
||||
runs only as root.
|
||||
.SH FILES
|
||||
.ta
|
||||
.nf
|
||||
/dev/cpu/*/msr
|
||||
.fi
|
||||
|
||||
.SH "SEE ALSO"
|
||||
.nf
|
||||
msr(4)
|
||||
Intel(R) 64 and IA-32 Architectures Software Developer's Manual
|
||||
.fi
|
||||
.PP
|
||||
.SH AUTHORS
|
||||
.nf
|
||||
Written by Len Brown <len.brown@intel.com>
|
||||
Len Brown
|
||||
|
File diff suppressed because it is too large
Load Diff
Loading…
Reference in New Issue
Block a user