linux/kernel/power/Kconfig

354 lines
12 KiB
Plaintext
Raw Normal View History

# SPDX-License-Identifier: GPL-2.0-only
config SUSPEND
bool "Suspend to RAM and standby"
depends on ARCH_SUSPEND_POSSIBLE
default y
help
Allow the system to enter sleep states in which main memory is
powered and thus its contents are preserved, such as the
suspend-to-RAM state (e.g. the ACPI S3 state).
config SUSPEND_FREEZER
bool "Enable freezer for suspend to RAM/standby" \
if ARCH_WANTS_FREEZER_CONTROL || BROKEN
depends on SUSPEND
default y
help
This allows you to turn off the freezer for suspend. If this is
done, no tasks are frozen for suspend to RAM/standby.
Turning OFF this setting is NOT recommended! If in doubt, say Y.
config SUSPEND_SKIP_SYNC
bool "Skip kernel's sys_sync() on suspend to RAM/standby"
depends on SUSPEND
depends on EXPERT
help
Skip the kernel sys_sync() before freezing user processes.
Some systems prefer not to pay this cost on every invocation
of suspend, or they are content with invoking sync() from
user-space before invoking suspend. There's a run-time switch
at '/sys/power/sync_on_suspend' to configure this behaviour.
This setting changes the default for the run-tim switch. Say Y
to change the default to disable the kernel sys_sync().
config HIBERNATE_CALLBACKS
bool
config HIBERNATION
bool "Hibernation (aka 'suspend to disk')"
depends on SWAP && ARCH_HIBERNATION_POSSIBLE
select HIBERNATE_CALLBACKS
select LZO_COMPRESS
select LZO_DECOMPRESS
select CRC32
help
Enable the suspend to disk (STD) functionality, which is usually
called "hibernation" in user interfaces. STD checkpoints the
system and powers it off; and restores that checkpoint on reboot.
You can suspend your machine with 'echo disk > /sys/power/state'
after placing resume=/dev/swappartition on the kernel command line
in your bootloader's configuration file.
Alternatively, you can use the additional userland tools available
from <http://suspend.sf.net>.
In principle it does not require ACPI or APM, although for example
ACPI will be used for the final steps when it is available. One
of the reasons to use software suspend is that the firmware hooks
for suspend states like suspend-to-RAM (STR) often don't work very
well with Linux.
It creates an image which is saved in your active swap. Upon the next
boot, pass the 'resume=/dev/swappartition' argument to the kernel to
have it detect the saved image, restore memory state from it, and
continue to run as before. If you do not want the previous state to
be reloaded, then use the 'noresume' kernel command line argument.
Note, however, that fsck will be run on your filesystems and you will
need to run mkswap against the swap partition used for the suspend.
It also works with swap files to a limited extent (for details see
<file:Documentation/power/swsusp-and-swap-files.rst>).
Right now you may boot without resuming and resume later but in the
meantime you cannot use the swap partition(s)/file(s) involved in
suspending. Also in this case you must not use the filesystems
that were mounted before the suspend. In particular, you MUST NOT
MOUNT any journaled filesystems mounted before the suspend or they
will get corrupted in a nasty way.
For more information take a look at <file:Documentation/power/swsusp.rst>.
config HIBERNATION_SNAPSHOT_DEV
bool "Userspace snapshot device"
depends on HIBERNATION
default y
help
Device used by the uswsusp tools.
Say N if no snapshotting from userspace is needed, this also
reduces the attack surface of the kernel.
If in doubt, say Y.
config PM_STD_PARTITION
string "Default resume partition"
depends on HIBERNATION
default ""
help
The default resume partition is the partition that the suspend-
to-disk implementation will look for a suspended disk image.
The partition specified here will be different for almost every user.
It should be a valid swap partition (at least for now) that is turned
on before suspending.
The partition specified can be overridden by specifying:
resume=/dev/<other device>
which will set the resume partition to the device specified.
Note there is currently not a way to specify which device to save the
suspended image to. It will simply pick the first available swap
device.
config PM_SLEEP
def_bool y
depends on SUSPEND || HIBERNATE_CALLBACKS
select PM
PM / sleep: wakeup: Fix build error caused by missing SRCU support Commit ea0212f40c6 (power: auto select CONFIG_SRCU) made the code in drivers/base/power/wakeup.c use SRCU instead of RCU, but it forgot to select CONFIG_SRCU in Kconfig, which leads to the following build error if CONFIG_SRCU is not selected somewhere else: drivers/built-in.o: In function `wakeup_source_remove': (.text+0x3c6fc): undefined reference to `synchronize_srcu' drivers/built-in.o: In function `pm_print_active_wakeup_sources': (.text+0x3c7a8): undefined reference to `__srcu_read_lock' drivers/built-in.o: In function `pm_print_active_wakeup_sources': (.text+0x3c84c): undefined reference to `__srcu_read_unlock' drivers/built-in.o: In function `device_wakeup_arm_wake_irqs': (.text+0x3d1d8): undefined reference to `__srcu_read_lock' drivers/built-in.o: In function `device_wakeup_arm_wake_irqs': (.text+0x3d228): undefined reference to `__srcu_read_unlock' drivers/built-in.o: In function `device_wakeup_disarm_wake_irqs': (.text+0x3d24c): undefined reference to `__srcu_read_lock' drivers/built-in.o: In function `device_wakeup_disarm_wake_irqs': (.text+0x3d29c): undefined reference to `__srcu_read_unlock' drivers/built-in.o:(.data+0x4158): undefined reference to `process_srcu' Fix this error by selecting CONFIG_SRCU when PM_SLEEP is enabled. Fixes: ea0212f40c6 (power: auto select CONFIG_SRCU) Cc: 4.2+ <stable@vger.kernel.org> # 4.2+ Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> [ rjw: Minor subject/changelog fixups ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-14 02:34:42 +00:00
select SRCU
config PM_SLEEP_SMP
def_bool y
depends on SMP
depends on ARCH_SUSPEND_POSSIBLE || ARCH_HIBERNATION_POSSIBLE
depends on PM_SLEEP
select HOTPLUG_CPU
config PM_SLEEP_SMP_NONZERO_CPU
def_bool y
depends on PM_SLEEP_SMP
depends on ARCH_SUSPEND_NONZERO_CPU
help
If an arch can suspend (for suspend, hibernate, kexec, etc) on a
non-zero numbered CPU, it may define ARCH_SUSPEND_NONZERO_CPU. This
will allow nohz_full mask to include CPU0.
config PM_AUTOSLEEP
bool "Opportunistic sleep"
depends on PM_SLEEP
help
Allow the kernel to trigger a system transition into a global sleep
state automatically whenever there are no active wakeup sources.
config PM_USERSPACE_AUTOSLEEP
bool "Userspace opportunistic sleep"
depends on PM_SLEEP
help
Notify kernel of aggressive userspace autosleep power management policy.
This option changes the behavior of various sleep-sensitive code to deal
with frequent userspace-initiated transitions into a global sleep state.
Saying Y here, disables code paths that most users really should keep
enabled. In particular, only enable this if it is very common to be
asleep/awake for very short periods of time (<= 2 seconds).
Only platforms, such as Android, that implement opportunistic sleep from
a userspace power manager service should enable this option; and not
other machines. Therefore, you should say N here, unless you are
extremely certain that this is what you want. The option otherwise has
bad, undesirable effects, and should not be enabled just for fun.
PM / Sleep: Add user space interface for manipulating wakeup sources, v3 Android allows user space to manipulate wakelocks using two sysfs file located in /sys/power/, wake_lock and wake_unlock. Writing a wakelock name and optionally a timeout to the wake_lock file causes the wakelock whose name was written to be acquired (it is created before is necessary), optionally with the given timeout. Writing the name of a wakelock to wake_unlock causes that wakelock to be released. Implement an analogous interface for user space using wakeup sources. Add the /sys/power/wake_lock and /sys/power/wake_unlock files allowing user space to create, activate and deactivate wakeup sources, such that writing a name and optionally a timeout to wake_lock causes the wakeup source of that name to be activated, optionally with the given timeout. If that wakeup source doesn't exist, it will be created and then activated. Writing a name to wake_unlock causes the wakeup source of that name, if there is one, to be deactivated. Wakeup sources created with the help of wake_lock that haven't been used for more than 5 minutes are garbage collected and destroyed. Moreover, there can be only WL_NUMBER_LIMIT wakeup sources created with the help of wake_lock present at a time. The data type used to track wakeup sources created by user space is called "struct wakelock" to indicate the origins of this feature. This version of the patch includes an rbtree manipulation fix from John Stultz. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NeilBrown <neilb@suse.de>
2012-04-29 20:53:42 +00:00
config PM_WAKELOCKS
bool "User space wakeup sources interface"
depends on PM_SLEEP
help
PM / Sleep: Add user space interface for manipulating wakeup sources, v3 Android allows user space to manipulate wakelocks using two sysfs file located in /sys/power/, wake_lock and wake_unlock. Writing a wakelock name and optionally a timeout to the wake_lock file causes the wakelock whose name was written to be acquired (it is created before is necessary), optionally with the given timeout. Writing the name of a wakelock to wake_unlock causes that wakelock to be released. Implement an analogous interface for user space using wakeup sources. Add the /sys/power/wake_lock and /sys/power/wake_unlock files allowing user space to create, activate and deactivate wakeup sources, such that writing a name and optionally a timeout to wake_lock causes the wakeup source of that name to be activated, optionally with the given timeout. If that wakeup source doesn't exist, it will be created and then activated. Writing a name to wake_unlock causes the wakeup source of that name, if there is one, to be deactivated. Wakeup sources created with the help of wake_lock that haven't been used for more than 5 minutes are garbage collected and destroyed. Moreover, there can be only WL_NUMBER_LIMIT wakeup sources created with the help of wake_lock present at a time. The data type used to track wakeup sources created by user space is called "struct wakelock" to indicate the origins of this feature. This version of the patch includes an rbtree manipulation fix from John Stultz. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NeilBrown <neilb@suse.de>
2012-04-29 20:53:42 +00:00
Allow user space to create, activate and deactivate wakeup source
objects with the help of a sysfs-based interface.
config PM_WAKELOCKS_LIMIT
int "Maximum number of user space wakeup sources (0 = no limit)"
range 0 100000
default 100
depends on PM_WAKELOCKS
config PM_WAKELOCKS_GC
bool "Garbage collector for user space wakeup sources"
depends on PM_WAKELOCKS
default y
config PM
bool "Device power management core functionality"
help
Enable functionality allowing I/O devices to be put into energy-saving
(low power) states, for example after a specified period of inactivity
(autosuspended), and woken up in response to a hardware-generated
wake-up event or a driver's request.
Hardware support is generally required for this functionality to work
and the bus type drivers of the buses the devices are on are
responsible for the actual handling of device suspend requests and
wake-up events.
config PM_DEBUG
bool "Power Management Debug Support"
depends on PM
help
This option enables various debugging support in the Power Management
code. This is helpful when debugging and reporting PM bugs, like
suspend support.
config PM_ADVANCED_DEBUG
bool "Extra PM attributes in sysfs for low-level debugging/testing"
depends on PM_DEBUG
help
Add extra sysfs attributes allowing one to access some Power Management
fields of device objects from user space. If you are not a kernel
developer interested in debugging/testing Power Management, say "no".
config PM_TEST_SUSPEND
bool "Test suspend/resume and wakealarm during bootup"
depends on SUSPEND && PM_DEBUG && RTC_CLASS=y
help
This option will let you suspend your machine during bootup, and
make it wake up a few seconds later using an RTC wakeup alarm.
Enable this with a kernel parameter like "test_suspend=mem".
You probably want to have your system's RTC driver statically
linked, ensuring that it's available when this test runs.
config PM_SLEEP_DEBUG
def_bool y
depends on PM_DEBUG && PM_SLEEP
2013-10-17 17:48:46 +00:00
config DPM_WATCHDOG
bool "Device suspend/resume watchdog"
depends on PM_DEBUG && PSTORE && EXPERT
help
2013-10-17 17:48:46 +00:00
Sets up a watchdog timer to capture drivers that are
locked up attempting to suspend/resume a device.
A detected lockup causes system panic with message
captured in pstore device for inspection in subsequent
boot session.
config DPM_WATCHDOG_TIMEOUT
int "Watchdog timeout in seconds"
range 1 120
default 120
2013-10-17 17:48:46 +00:00
depends on DPM_WATCHDOG
config PM_TRACE
bool
help
This enables code to save the last PM event point across
reboot. The architecture needs to support this, x86 for
example does by saving things in the RTC, see below.
The architecture specific code must provide the extern
functions from <linux/resume-trace.h> as well as the
<asm/resume-trace.h> header with a TRACE_RESUME() macro.
The way the information is presented is architecture-
dependent, x86 will print the information during a
late_initcall.
config PM_TRACE_RTC
bool "Suspend/resume event tracing"
depends on PM_SLEEP_DEBUG
depends on X86
select PM_TRACE
help
This enables some cheesy code to save the last PM event point in the
RTC across reboots, so that you can debug a machine that just hangs
during suspend (or more commonly, during resume).
To use this debugging feature you should attempt to suspend the
machine, reboot it and then run
dmesg -s 1000000 | grep 'hash matches'
CAUTION: this option will cause your machine's real-time clock to be
set to an invalid time after a resume.
config APM_EMULATION
tristate "Advanced Power Management Emulation"
depends on SYS_SUPPORTS_APM_EMULATION
help
APM is a BIOS specification for saving power using several different
techniques. This is mostly useful for battery powered laptops with
APM compliant BIOSes. If you say Y here, the system time will be
reset after a RESUME operation, the /proc/apm device will provide
battery status information, and user-space programs will receive
notification of APM "events" (e.g. battery status change).
In order to use APM, you will need supporting software. For location
and more information, read <file:Documentation/power/apm-acpi.rst>
and the Battery Powered Linux mini-HOWTO, available from
<http://www.tldp.org/docs.html#howto>.
This driver does not spin down disk drives (see the hdparm(8)
manpage ("man 8 hdparm") for that), and it doesn't turn off
VESA-compliant "green" monitors.
Generally, if you don't have a battery in your machine, there isn't
much point in using this driver and you should say N. If you get
random kernel OOPSes or reboots that don't seem to be related to
anything, try disabling/enabling this option (or disabling/enabling
APM in your BIOS).
config PM_CLK
def_bool y
depends on PM && HAVE_CLK
config PM_GENERIC_DOMAINS
bool
depends on PM
config WQ_POWER_EFFICIENT_DEFAULT
bool "Enable workqueue power-efficient mode by default"
depends on PM
help
Per-cpu workqueues are generally preferred because they show
better performance thanks to cache locality; unfortunately,
per-cpu workqueues tend to be more power hungry than unbound
workqueues.
Enabling workqueue.power_efficient kernel parameter makes the
per-cpu workqueues which were observed to contribute
significantly to power consumption unbound, leading to measurably
lower power usage at the cost of small performance overhead.
This config option determines whether workqueue.power_efficient
is enabled by default.
If in doubt, say N.
config PM_GENERIC_DOMAINS_SLEEP
def_bool y
depends on PM_SLEEP && PM_GENERIC_DOMAINS
config PM_GENERIC_DOMAINS_OF
def_bool y
depends on PM_GENERIC_DOMAINS && OF
config CPU_PM
bool
PM: Introduce an Energy Model management framework Several subsystems in the kernel (task scheduler and/or thermal at the time of writing) can benefit from knowing about the energy consumed by CPUs. Yet, this information can come from different sources (DT or firmware for example), in different formats, hence making it hard to exploit without a standard API. As an attempt to address this, introduce a centralized Energy Model (EM) management framework which aggregates the power values provided by drivers into a table for each performance domain in the system. The power cost tables are made available to interested clients (e.g. task scheduler or thermal) via platform-agnostic APIs. The overall design is represented by the diagram below (focused on Arm-related drivers as an example, but applicable to any architecture): +---------------+ +-----------------+ +-------------+ | Thermal (IPA) | | Scheduler (EAS) | | Other | +---------------+ +-----------------+ +-------------+ | | em_pd_energy() | | | em_cpu_get() | +-----------+ | +--------+ | | | v v v +---------------------+ | | | Energy Model | | | | Framework | | | +---------------------+ ^ ^ ^ | | | em_register_perf_domain() +----------+ | +---------+ | | | +---------------+ +---------------+ +--------------+ | cpufreq-dt | | arm_scmi | | Other | +---------------+ +---------------+ +--------------+ ^ ^ ^ | | | +--------------+ +---------------+ +--------------+ | Device Tree | | Firmware | | ? | +--------------+ +---------------+ +--------------+ Drivers (typically, but not limited to, CPUFreq drivers) can register data in the EM framework using the em_register_perf_domain() API. The calling driver must provide a callback function with a standardized signature that will be used by the EM framework to build the power cost tables of the performance domain. This design should offer a lot of flexibility to calling drivers which are free of reading information from any location and to use any technique to compute power costs. Moreover, the capacity states registered by drivers in the EM framework are not required to match real performance states of the target. This is particularly important on targets where the performance states are not known by the OS. The power cost coefficients managed by the EM framework are specified in milli-watts. Although the two potential users of those coefficients (IPA and EAS) only need relative correctness, IPA specifically needs to compare the power of CPUs with the power of other components (GPUs, for example), which are still expressed in absolute terms in their respective subsystems. Hence, specifying the power of CPUs in milli-watts should help transitioning IPA to using the EM framework without introducing new problems by keeping units comparable across sub-systems. On the longer term, the EM of other devices than CPUs could also be managed by the EM framework, which would enable to remove the absolute unit. However, this is not absolutely required as a first step, so this extension of the EM framework is left for later. On the client side, the EM framework offers APIs to access the power cost tables of a CPU (em_cpu_get()), and to estimate the energy consumed by the CPUs of a performance domain (em_pd_energy()). Clients such as the task scheduler can then use these APIs to access the shared data structures holding the Energy Model of CPUs. Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adharmap@codeaurora.org Cc: chris.redpath@arm.com Cc: currojerez@riseup.net Cc: dietmar.eggemann@arm.com Cc: edubezval@gmail.com Cc: gregkh@linuxfoundation.org Cc: javi.merino@kernel.org Cc: joel@joelfernandes.org Cc: juri.lelli@redhat.com Cc: morten.rasmussen@arm.com Cc: patrick.bellasi@arm.com Cc: pkondeti@codeaurora.org Cc: skannan@codeaurora.org Cc: smuckle@google.com Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Cc: tkjos@google.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Cc: viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20181203095628.11858-4-quentin.perret@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-03 09:56:16 +00:00
config ENERGY_MODEL
bool "Energy Model for devices with DVFS (CPUs, GPUs, etc)"
PM: Introduce an Energy Model management framework Several subsystems in the kernel (task scheduler and/or thermal at the time of writing) can benefit from knowing about the energy consumed by CPUs. Yet, this information can come from different sources (DT or firmware for example), in different formats, hence making it hard to exploit without a standard API. As an attempt to address this, introduce a centralized Energy Model (EM) management framework which aggregates the power values provided by drivers into a table for each performance domain in the system. The power cost tables are made available to interested clients (e.g. task scheduler or thermal) via platform-agnostic APIs. The overall design is represented by the diagram below (focused on Arm-related drivers as an example, but applicable to any architecture): +---------------+ +-----------------+ +-------------+ | Thermal (IPA) | | Scheduler (EAS) | | Other | +---------------+ +-----------------+ +-------------+ | | em_pd_energy() | | | em_cpu_get() | +-----------+ | +--------+ | | | v v v +---------------------+ | | | Energy Model | | | | Framework | | | +---------------------+ ^ ^ ^ | | | em_register_perf_domain() +----------+ | +---------+ | | | +---------------+ +---------------+ +--------------+ | cpufreq-dt | | arm_scmi | | Other | +---------------+ +---------------+ +--------------+ ^ ^ ^ | | | +--------------+ +---------------+ +--------------+ | Device Tree | | Firmware | | ? | +--------------+ +---------------+ +--------------+ Drivers (typically, but not limited to, CPUFreq drivers) can register data in the EM framework using the em_register_perf_domain() API. The calling driver must provide a callback function with a standardized signature that will be used by the EM framework to build the power cost tables of the performance domain. This design should offer a lot of flexibility to calling drivers which are free of reading information from any location and to use any technique to compute power costs. Moreover, the capacity states registered by drivers in the EM framework are not required to match real performance states of the target. This is particularly important on targets where the performance states are not known by the OS. The power cost coefficients managed by the EM framework are specified in milli-watts. Although the two potential users of those coefficients (IPA and EAS) only need relative correctness, IPA specifically needs to compare the power of CPUs with the power of other components (GPUs, for example), which are still expressed in absolute terms in their respective subsystems. Hence, specifying the power of CPUs in milli-watts should help transitioning IPA to using the EM framework without introducing new problems by keeping units comparable across sub-systems. On the longer term, the EM of other devices than CPUs could also be managed by the EM framework, which would enable to remove the absolute unit. However, this is not absolutely required as a first step, so this extension of the EM framework is left for later. On the client side, the EM framework offers APIs to access the power cost tables of a CPU (em_cpu_get()), and to estimate the energy consumed by the CPUs of a performance domain (em_pd_energy()). Clients such as the task scheduler can then use these APIs to access the shared data structures holding the Energy Model of CPUs. Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adharmap@codeaurora.org Cc: chris.redpath@arm.com Cc: currojerez@riseup.net Cc: dietmar.eggemann@arm.com Cc: edubezval@gmail.com Cc: gregkh@linuxfoundation.org Cc: javi.merino@kernel.org Cc: joel@joelfernandes.org Cc: juri.lelli@redhat.com Cc: morten.rasmussen@arm.com Cc: patrick.bellasi@arm.com Cc: pkondeti@codeaurora.org Cc: skannan@codeaurora.org Cc: smuckle@google.com Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Cc: tkjos@google.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Cc: viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20181203095628.11858-4-quentin.perret@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-03 09:56:16 +00:00
depends on SMP
depends on CPU_FREQ
help
Several subsystems (thermal and/or the task scheduler for example)
can leverage information about the energy consumed by devices to
make smarter decisions. This config option enables the framework
from which subsystems can access the energy models.
PM: Introduce an Energy Model management framework Several subsystems in the kernel (task scheduler and/or thermal at the time of writing) can benefit from knowing about the energy consumed by CPUs. Yet, this information can come from different sources (DT or firmware for example), in different formats, hence making it hard to exploit without a standard API. As an attempt to address this, introduce a centralized Energy Model (EM) management framework which aggregates the power values provided by drivers into a table for each performance domain in the system. The power cost tables are made available to interested clients (e.g. task scheduler or thermal) via platform-agnostic APIs. The overall design is represented by the diagram below (focused on Arm-related drivers as an example, but applicable to any architecture): +---------------+ +-----------------+ +-------------+ | Thermal (IPA) | | Scheduler (EAS) | | Other | +---------------+ +-----------------+ +-------------+ | | em_pd_energy() | | | em_cpu_get() | +-----------+ | +--------+ | | | v v v +---------------------+ | | | Energy Model | | | | Framework | | | +---------------------+ ^ ^ ^ | | | em_register_perf_domain() +----------+ | +---------+ | | | +---------------+ +---------------+ +--------------+ | cpufreq-dt | | arm_scmi | | Other | +---------------+ +---------------+ +--------------+ ^ ^ ^ | | | +--------------+ +---------------+ +--------------+ | Device Tree | | Firmware | | ? | +--------------+ +---------------+ +--------------+ Drivers (typically, but not limited to, CPUFreq drivers) can register data in the EM framework using the em_register_perf_domain() API. The calling driver must provide a callback function with a standardized signature that will be used by the EM framework to build the power cost tables of the performance domain. This design should offer a lot of flexibility to calling drivers which are free of reading information from any location and to use any technique to compute power costs. Moreover, the capacity states registered by drivers in the EM framework are not required to match real performance states of the target. This is particularly important on targets where the performance states are not known by the OS. The power cost coefficients managed by the EM framework are specified in milli-watts. Although the two potential users of those coefficients (IPA and EAS) only need relative correctness, IPA specifically needs to compare the power of CPUs with the power of other components (GPUs, for example), which are still expressed in absolute terms in their respective subsystems. Hence, specifying the power of CPUs in milli-watts should help transitioning IPA to using the EM framework without introducing new problems by keeping units comparable across sub-systems. On the longer term, the EM of other devices than CPUs could also be managed by the EM framework, which would enable to remove the absolute unit. However, this is not absolutely required as a first step, so this extension of the EM framework is left for later. On the client side, the EM framework offers APIs to access the power cost tables of a CPU (em_cpu_get()), and to estimate the energy consumed by the CPUs of a performance domain (em_pd_energy()). Clients such as the task scheduler can then use these APIs to access the shared data structures holding the Energy Model of CPUs. Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adharmap@codeaurora.org Cc: chris.redpath@arm.com Cc: currojerez@riseup.net Cc: dietmar.eggemann@arm.com Cc: edubezval@gmail.com Cc: gregkh@linuxfoundation.org Cc: javi.merino@kernel.org Cc: joel@joelfernandes.org Cc: juri.lelli@redhat.com Cc: morten.rasmussen@arm.com Cc: patrick.bellasi@arm.com Cc: pkondeti@codeaurora.org Cc: skannan@codeaurora.org Cc: smuckle@google.com Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Cc: tkjos@google.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Cc: viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20181203095628.11858-4-quentin.perret@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-03 09:56:16 +00:00
The exact usage of the energy model is subsystem-dependent.
If in doubt, say N.