The cpufreq core now supports suspending and resuming of cpufreq
drivers and governors during systems suspend and resume, so use
the common infrastructure instead of defining special PM notifiers
for the same thing.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpufreq core now supports suspending and resuming of cpufreq
drivers and governors during systems suspend and resume, so use
the common infrastructure instead of defining special PM notifiers
for the same thing.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Multiple platforms need to set CPUs to a particular frequency before
suspending the system, so provide a common infrastructure for them.
Those platforms only need to point their ->suspend callback pointers
to the generic routine.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch adds cpufreq suspend/resume calls to dpm_{suspend|resume}()
for handling suspend/resume of cpufreq governors.
Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found an issue where the
tunables configuration for clusters/sockets with non-boot CPUs was
lost after system suspend/resume, as we were notifying governors with
CPUFREQ_GOV_POLICY_EXIT on removal of the last CPU for that policy
which caused the tunables memory to be freed.
This is fixed by preventing any governor operations from being
carried out between the device suspend and device resume stages of
system suspend and resume, respectively.
We could have added these callbacks at dpm_{suspend|resume}_noirq()
level, but there is an additional problem that the majority of I/O
devices is already suspended at that point and if cpufreq drivers
want to change the frequency before suspending, then that not be
possible on some platforms (which depend on peripherals like i2c,
regulators, etc).
Reported-and-tested-by: Lan Tianyu <tianyu.lan@intel.com>
Reported-by: Jinhyuk Choi <jinchoi@broadcom.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We call __find_governor() during the addition of the first CPU of
each policy from __cpufreq_add_dev() to find the last governor used
for this CPU before it was hot-removed.
After that we call cpufreq_parse_governor() in cpufreq_init_policy(),
either with this governor, or with the default governor. Right after
that policy->governor is set to NULL.
While that code is not functionally problematic, the structure of it
is suboptimal, because some of the code required in cpufreq_init_policy()
is being executed by its caller, __cpufreq_add_dev(). So, it would make
more sense to get all of it together in a single place to make code more
readable.
Accordingly, move the code needed for policy initialization to
cpufreq_init_policy() and initialize policy->governor to NULL at the
beginning.
In order to clean up the code a bit more, some of the #ifdefs for
CONFIG_HOTPLUG_CPU are dropped too.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
policy->rwsem is used to lock access to all parts of code modifying
struct cpufreq_policy, but it's not used on a new policy created by
__cpufreq_add_dev().
Because of that, if cpufreq_update_policy() is called in a tight loop
on one CPU in parallel with offline/online of another CPU, then the
following crash can be triggered:
Unable to handle kernel NULL pointer dereference at virtual address 00000020
pgd = c0003000
[00000020] *pgd=80000000004003, *pmd=00000000
Internal error: Oops: 206 [#1] PREEMPT SMP ARM
PC is at __cpufreq_governor+0x10/0x1ac
LR is at cpufreq_update_policy+0x114/0x150
---[ end trace f23a8defea6cd706 ]---
Kernel panic - not syncing: Fatal exception
CPU0: stopping
CPU: 0 PID: 7136 Comm: mpdecision Tainted: G D W 3.10.0-gd727407-00074-g979ede8 #396
[<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58)
[<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c)
[<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8)
[<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98)
[<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4)
[<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84)
[<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68)
[<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44)
[<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc)
[<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78)
[<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74)
[<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c)
[<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180)
[<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68)
[<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30)
Fix that by taking locks at appropriate places in __cpufreq_add_dev()
as well.
Reported-by: Saravana Kannan <skannan@codeaurora.org>
Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Policy must be fully initialized before it is being made available
for use by others. Otherwise cpufreq_cpu_get() would be able to grab
a half initialized policy structure that might not have affected_cpus
(for example) populated. Then, anybody accessing those fields will get
a wrong value and that will lead to unpredictable results.
In order to fix this, do all the necessary initialization before we
make the policy structure available via cpufreq_cpu_get(). That will
guarantee that any code accessing fields of the policy will get
correct data from them.
Reported-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If a module calls cpufreq_get while cpufreq is initializing, it's
possible for it to be called after cpufreq_driver is set but before
cpufreq_cpu_data is written during subsys_interface_register. This
happens because cpufreq_get doesn't take the cpufreq_driver_lock
around its use of cpufreq_cpu_data.
Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather
than reading it out of cpufreq_cpu_data directly. cpufreq_cpu_get()
takes the appropriate locks to prevent this race from happening.
Since it's possible for policy to be NULL if the caller passes in an
invalid CPU number or calls the function before cpufreq is initialized,
delete the BUG_ON(!policy) and simply return 0. Don't try to return
-ENOENT because that's negative and the function returns an unsigned
integer.
References: https://bbs.archlinux.org/viewtopic.php?id=177934
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq_frequency_get_table() is called from all callers of
__cpufreq_stats_create_table(). So, move it inside.
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove sysfs group if __cpufreq_stats_create_table() fails after creating
one.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
__cpufreq_stats_create_table always gets pass the valid and real policy
struct. So, there's no need to call cpufreq_cpu_get() to get the policy
again.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
commit d253d2a526 (Improve accuracy by not truncating until final
result), changed internal variables of the PID to be fixed point
numbers. Update the pid_reset() to reflect this change.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove unneeded sample buffers, intel_pstate operates on the most
recent sample only. This save some memory and make the code more
readable.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq_update_policy() calls cpufreq_driver->get() to get current
frequency of a CPU and it is not supposed to fail or return zero.
Return error in case that happens.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Enable cpufreq and power kconfig menus on arm64 along with arm cpufreq
drivers. The power menu is needed for OPP support. At least on Calxeda
systems, the same cpufreq driver is used for arm and arm64 based
systems.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Mark function as static in cpufreq.c because it is not
used outside this file.
This eliminates the following warning in cpufreq.c:
drivers/cpufreq/cpufreq.c:355:9: warning: no previous prototype for ‘show_boost’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit fcb6a15c2e (intel_pstate: Take core C0 time into account for
core busy calculation) introduced a regression on some processor SKUs
supported by intel_pstate. This was due to the truncation caused by
using integer math to calculate core busy and C0 percentages.
On a i7-4770K processor operating at 800Mhz going to 100% utilization
the percent busy of the CPU using integer math is 22%, but it actually
is 22.85%. This value scaled to the current frequency returned 97
which the PID interpreted as no error and did not adjust the P state.
Tested on i7-4770K, i7-2600, i5-3230M.
Fixes: fcb6a15c2e (intel_pstate: Take core C0 time into account for core busy calculation)
References: https://lkml.org/lkml/2014/2/19/626
References: https://bugzilla.kernel.org/show_bug.cgi?id=70941
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq_update_policy() is called from two places currently. From a
workqueue handled queued from cpufreq_bp_resume() for boot CPU and
from cpufreq_cpu_callback() whenever a CPU is added.
The first one makes sure that boot CPU is running on the frequency
present in policy->cpu. But we don't really need a call from
cpufreq_cpu_callback(), because we always call cpufreq_driver->init()
(which will set policy->cur correctly) whenever first CPU of any
policy is added back. And so every policy structure is guaranteed to
have the right frequency in policy->cur.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reduce the rampant usage of goto and the indentation level in
cpufreq_set_policy() to improve the readability of that code.
No functional changes should result from that.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
A documentation update exposed the existance of the turbo ratio
register. Update baytrail support to use the turbo range.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
LFM (max efficiency ratio) is the max frequency at minimum voltage
supported by the processor. Using LFM as the minimum P state
increases performmance without affecting power. By not using P states
below LFM we avoid using P states that are less power efficient.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The powernow-k8 driver maintains a per-cpu data-structure called
powernow_data that is used to perform the frequency transitions.
It initializes this data structure only for the policy->cpu. So,
accesses to this data structure by other CPUs results in various
problems because they would have been uninitialized.
Specifically, if a cpu (!= policy->cpu) invokes the drivers' ->get()
function, it returns 0 as the KHz value, since its per-cpu memory
doesn't point to anything valid. This causes problems during
suspend/resume since cpufreq_update_policy() tries to enforce this
(0 KHz) as the current frequency of the CPU, and this madness gets
propagated to adjust_jiffies() as well. Eventually, lots of things
start breaking down, including the r8169 ethernet card, in one
particularly interesting case reported by Pierre Ossman.
Fix this by initializing the per-cpu data-structures of all the CPUs
in the policy appropriately.
References: https://bugzilla.kernel.org/show_bug.cgi?id=70311
Reported-by: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: All applicable <stable@vger.kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 42f921a (cpufreq: remove sysfs files for CPUs which failed to
come back after resume) tried to do this but missed this piece of code
to fix.
Currently we are getting this on suspend/resume:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 877 at fs/sysfs/dir.c:52 sysfs_warn_dup+0x68/0x84()
sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
Modules linked in: brcmfmac brcmutil
CPU: 0 PID: 877 Comm: test-rtc-resume Not tainted 3.14.0-rc2-00259-g9398a10cd964 #12
[<c0015bac>] (unwind_backtrace) from [<c0011850>] (show_stack+0x10/0x14)
[<c0011850>] (show_stack) from [<c056e018>] (dump_stack+0x80/0xcc)
[<c056e018>] (dump_stack) from [<c0025e44>] (warn_slowpath_common+0x64/0x88)
[<c0025e44>] (warn_slowpath_common) from [<c0025efc>] (warn_slowpath_fmt+0x30/0x40)
[<c0025efc>] (warn_slowpath_fmt) from [<c012776c>] (sysfs_warn_dup+0x68/0x84)
[<c012776c>] (sysfs_warn_dup) from [<c0127a54>] (sysfs_do_create_link_sd+0xb0/0xb8)
[<c0127a54>] (sysfs_do_create_link_sd) from [<c038ef64>] (__cpufreq_add_dev.isra.27+0x2a8/0x814)
[<c038ef64>] (__cpufreq_add_dev.isra.27) from [<c038f548>] (cpufreq_cpu_callback+0x70/0x8c)
[<c038f548>] (cpufreq_cpu_callback) from [<c0043864>] (notifier_call_chain+0x44/0x84)
[<c0043864>] (notifier_call_chain) from [<c0025f60>] (__cpu_notify+0x28/0x44)
[<c0025f60>] (__cpu_notify) from [<c00261e8>] (_cpu_up+0xf0/0x140)
[<c00261e8>] (_cpu_up) from [<c0569eb8>] (enable_nonboot_cpus+0x68/0xb0)
[<c0569eb8>] (enable_nonboot_cpus) from [<c006339c>] (suspend_devices_and_enter+0x198/0x2dc)
[<c006339c>] (suspend_devices_and_enter) from [<c0063654>] (pm_suspend+0x174/0x1e8)
[<c0063654>] (pm_suspend) from [<c00624e0>] (state_store+0x6c/0xbc)
[<c00624e0>] (state_store) from [<c01fc200>] (kobj_attr_store+0x14/0x20)
[<c01fc200>] (kobj_attr_store) from [<c0126e50>] (sysfs_kf_write+0x44/0x48)
[<c0126e50>] (sysfs_kf_write) from [<c012a274>] (kernfs_fop_write+0xb4/0x14c)
[<c012a274>] (kernfs_fop_write) from [<c00d4818>] (vfs_write+0xa8/0x180)
[<c00d4818>] (vfs_write) from [<c00d4bb8>] (SyS_write+0x3c/0x70)
[<c00d4bb8>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30)
---[ end trace 76969904b614c18f ]---
Fix this by removing sysfs link for cpufreq directory when cpu removed
isn't policy->cpu.
Revamps: 42f921a (cpufreq: remove sysfs files for CPUs which failed to come back after resume)
Reported-and-tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the reporting of energy since it does not provide any useful
information about the state of the driver and will be a maintainance
headache going forward since the RAPL energy units register is not
architectural and subject to change between micro-architectures
References: https://bugzilla.kernel.org/show_bug.cgi?id=69831
Fixes: b69880f9cc (intel_pstate: Add trace point to report internal state.)
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Take non-idle time into account when calculating core busy time.
This ensures that intel_pstate will notice a decrease in load.
References: https://bugzilla.kernel.org/show_bug.cgi?id=66581
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- ACPI device hotplug fix preventing ACPI drivers from binding to device
objects that acpi_bus_trim() has been called for and the devices
represented by them may not be operational.
- Recent cpufreq changes related to the "boost" (turbo) feature broke
the acpi-cpufreq error code path causing a NULL pointer dereference
to occur on some systems. Fix from Konrad Rzeszutek Wilk.
- The log level of a CPU initialization error message added recently
needs to be reduced, because the particular BIOS issue indicated by
it turns out to be widespread and doesn't really matter for the
majority of systems having it. From Jiang Liu.
- The regulator API needs to be told to stay away from things on systems
with ACPI BIOSes or it may conflict with the BIOS' own handling of
voltage regulators. Fix from Mark Brown that works around a 3.13
regression in lm90 on PCs occuring if the regulator API is enabled.
- Prevent the Exynos4 devfreq driver from being built on multiplatform,
because it depends on things that aren't available during such builds.
From Sachin Kamat.
- Upstream ACPICA doesn't use the bool type as defined in the kernel,
so modify the kernel's ACPICA code to follow the upstream in that
respect (only one variable definition is affected) to reduce
divergences between the two. From Lv Zheng.
- Make the ACPI device PM code use ACPI_COMPANION() instead of its own
routine doing the same thing (and invokng ACPI_COMPANION() in the
process).
- Modify some routines in the ACPI processor driver to follow the
common convention and return negative integers on errors. From
Hanjun Guo.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJS6mQLAAoJEILEb/54YlRxc6sP/0tYmO8qOF0XwHhA9AQKlcVG
DZxui1pEbhXM3lftwqvVnma1hOKY1T014+JFxldCjvV9hyxdxTvcdvhKRxrj/eAk
DY7p2TK3N0eBScpMQLv/nKMjS20Z0bxz3QNxhC0S3eOYk5QPN3aUhsqG2iu6aFxJ
FzOOV5mQGhGT6a3LmXvjNROXjJN4dibMmGvl+bJ/8Y5zJUUvAOWmxphqjaIZm7WC
Lh4a8wUfxfmq8Mj1DDI/qk/aWP8QTBDYiG8C8XjEa1pj9oHjsrCHNogSrfRHRwa1
sDW/6EjSbLYnKuUDeoA61wXyVvlvR22uTTswDwbJ2mt18WTcmbvPP6Znh7m3oiF2
aUDoh/8KvYGRhDBZm+wf6TjD/KdcIQdOptNl6irCMsHkYNDK/n7f++Blkjy/xmea
UsF6l4XkIbqJCWWsTwO2VjB0p0QPRS49s/YtUxjETqCjdhNKnAQr2wfBH5/fcD6c
yJPkCGOlqcPQAPOrWyabAuStDfFUEUoFkZ/kbZov2xUIXyJFv5gfj1boZ3Ekqhmd
klcOmsvVP8QZLF82iFIhDdgWLNCRPESNfJpN0mLpg2sYMWQtK+r1OnSKpbVNW0Jn
6Ea23WSm0ZEd0QX8tYLjf/iUsIUjn48gJuXG+CDlTtkQtLNdr1VTQd5ZeKm2kcOg
+82EVRoeZi+sqx4YrEQi
=VBdN
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes and cleanups from Rafael Wysocki:
- ACPI device hotplug fix preventing ACPI drivers from binding to device
objects that acpi_bus_trim() has been called for and the devices
represented by them may not be operational.
- Recent cpufreq changes related to the "boost" (turbo) feature broke
the acpi-cpufreq error code path causing a NULL pointer dereference
to occur on some systems. Fix from Konrad Rzeszutek Wilk.
- The log level of a CPU initialization error message added recently
needs to be reduced, because the particular BIOS issue indicated by
it turns out to be widespread and doesn't really matter for the
majority of systems having it. From Jiang Liu.
- The regulator API needs to be told to stay away from things on systems
with ACPI BIOSes or it may conflict with the BIOS' own handling of
voltage regulators. Fix from Mark Brown that works around a 3.13
regression in lm90 on PCs occuring if the regulator API is enabled.
- Prevent the Exynos4 devfreq driver from being built on multiplatform,
because it depends on things that aren't available during such builds.
From Sachin Kamat.
- Upstream ACPICA doesn't use the bool type as defined in the kernel,
so modify the kernel's ACPICA code to follow the upstream in that
respect (only one variable definition is affected) to reduce
divergences between the two. From Lv Zheng.
- Make the ACPI device PM code use ACPI_COMPANION() instead of its own
routine doing the same thing (and invokng ACPI_COMPANION() in the
process).
- Modify some routines in the ACPI processor driver to follow the
common convention and return negative integers on errors. From
Hanjun Guo.
* tag 'pm+acpi-3.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Clear match_driver flag in acpi_bus_trim()
ACPI / init: Flag use of ACPI and ACPI idioms for power supplies to regulator API
acpi-cpufreq: De-register CPU notifier and free struct msr on error.
ACPICA: Remove bool usage from ACPICA.
PM / devfreq: Disable Exynos4 driver build on multiplatform
ACPI / PM: Use ACPI_COMPANION() to get ACPI companions of devices
ACPI / scan: reduce log level of "ACPI: \_PR_.CPU4: failed to get CPU APIC ID"
ACPI / processor: Return specific error value when mapping lapic id
If cpufreq_register_driver() fails we would free the acpi driver
related structures but not free the ones allocated
by acpi_cpufreq_boost_init() function. This meant that as
the driver error-ed out and a CPU online/offline event came
we would crash and burn as one of the CPU notifiers would point
to garbage.
Fixes: cfc9c8ed03 (acpi-cpufreq: Adjust the code to use the common boost attribute)
Acked-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull thermal management updates from Zhang Rui:
"This time, the biggest change is the work of representing hardware
thermal properties in device tree infrastructure.
This work includes the introduction of a device tree bindings for
describing the hardware thermal behavior and limits, and also a parser
to read and interpret the data, and build thermal zones and thermal
binding parameters. It also contains three examples on how to use the
new representation on sensor devices, using three different drivers to
accomplish it. One driver is in thermal subsystem, the TI SoC
thermal, and the other two drivers are in hwmon subsystem.
Actually, this would be the first step of the complete work because we
still need to check other potential drivers to be converted and then
validate the proposed API. But the reason why I include it in this
pull request is that, first, this change does not hurt any others
without using this approach, second, the principle and concept of this
change would not break after converting the remaining drivers. BTW,
as you can see, there are several points in this change that do not
belong to thermal subsystem. Because it has been suggested by Guenter
R that in such cases, it is recommended to send the complete series
via one single subsystem.
Specifics:
- representing hardware thermal properties in device tree
infrastructure
- fix a regression that the imx thermal driver breaks system suspend.
- introduce ACPI INT3403 thermal driver to retrieve temperature data
from the INT3403 ACPI device object present on some systems.
- introduce debug statement for thermal core and step_wise governor.
- assorted fixes and cleanups for thermal core, cpu cooling, exynos
thrmal, intel powerclamp and imx thermal driver"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (34 commits)
thermal: remove const flag from .ops of imx thermal
Thermal: update thermal zone device after setting emul_temp
intel_powerclamp: Fix cstate counter detection.
thermal: imx: add necessary clk operation
Thermal cpu cooling: return error if no valid cpu frequency entry
thermal: fix cpu_cooling max_level behavior
thermal: rcar-thermal: Enable driver compilation with COMPILE_TEST
thermal: debug: add debug statement for core and step_wise
thermal: imx_thermal: add module device table
drivers: thermal: Mark function as static in x86_pkg_temp_thermal.c
thermal:samsung: fix compilation warning
thermal: imx: correct suspend/resume flow
thermal: exynos: fix error return code
Thermal: ACPI INT3403 thermal driver
MAINTAINERS: add thermal bindings entry in thermal domain
arm: dts: make OMAP4460 bandgap node to belong to OCP
arm: dts: make OMAP443x bandgap node to belong to OCP
arm: dts: add cooling properties on omap5 cpu node
arm: dts: add omap5 thermal data
arm: dts: add omap5 CORE thermal data
...
- ACPI core changes to make it create a struct acpi_device object for every
device represented in the ACPI tables during all namespace scans regardless
of the current status of that device. In accordance with this, ACPI hotplug
operations will not delete those objects, unless the underlying ACPI tables
go away.
- On top of the above, new sysfs attribute for ACPI device objects allowing
user space to check device status by triggering the execution of _STA for
its ACPI object. From Srinivas Pandruvada.
- ACPI core hotplug changes reducing code duplication, integrating the
PCI root hotplug with the core and reworking container hotplug.
- ACPI core simplifications making it use ACPI_COMPANION() in the code
"glueing" ACPI device objects to "physical" devices.
- ACPICA update to upstream version 20131218. This adds support for the
DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug
facilities. From Bob Moore, Lv Zheng and Betty Dall.
- Init code change to carry out the early ACPI initialization earlier.
That should allow us to use ACPI during the timekeeping initialization
and possibly to simplify the EFI initialization too. From Chun-Yi Lee.
- Clenups of the inclusions of ACPI headers in many places all over from
Lv Zheng and Rashika Kheria (work in progress).
- New helper for ACPI _DSM execution and rework of the code in drivers
that uses _DSM to execute it via the new helper. From Jiang Liu.
- New Win8 OSI blacklist entries from Takashi Iwai.
- Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo,
Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria,
Tang Chen, Zhang Rui.
- intel_pstate driver updates, including proper Baytrail support, from
Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra.
- Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski.
- powernow-k6 cpufreq driver fixes from Mikulas Patocka.
- cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown.
- Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias,
Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar.
- cpuidle cleanups from Bartlomiej Zolnierkiewicz.
- Support for hibernation APM events from Bin Shi.
- Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled
during thaw transitions from Bjørn Mork.
- PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson.
- PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa,
Rashika Kheria.
- New tool for profiling system suspend from Todd E Brandt and a cpupower
tool cleanup from One Thousand Gnomes.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJS3a1eAAoJEILEb/54YlRxnTgP/iGawvgjKWm6Qqp7WSIvd5gQ
zZ6q75C6Pc/W2fq1+OzVGnpCF8WYFy+nFDAXOvUHjIXuoxSwFcuW5l4aMckgl/0a
TXEWe9MJrCHHRfDApfFacCJ44U02bjJAD5vTyL/hKA+IHeinq4WCSojryYC+8jU0
cBrUIV0aNH8r5JR2WJNAyv/U29rXsDUOu0I4qTqZ4YaZT6AignMjtLXn1e9AH1Pn
DPZphTIo/HMnb+kgBOjt4snMk+ahVO9eCOxh/hH8ecnWExw9WynXoU5Nsna0tSZs
ssyHC7BYexD3oYsG8D52cFUpp4FCsJ0nFQNa2kw0LY+0FBNay43LySisKYHZPXEs
2WpESDv+/t7yhtnrvM+TtA7aBheKm2XMWGFSu/aERLE17jIidOkXKH5Y7ryYLNf/
uyRKxNS0NcZWZ0G+/wuY02jQYNkfYz3k/nTr8BAUItRBjdporGIRNEnR9gPzgCUC
uQhjXWMPulqubr8xbyefPWHTEzU2nvbXwTUWGjrBxSy8zkyy5arfqizUj+VG6afT
NsboANoMHa9b+xdzigSFdA3nbVK6xBjtU6Ywntk9TIpODKF5NgfARx0H+oSH+Zrj
32bMzgZtHw/lAbYsnQ9OnTY6AEWQYt6NMuVbTiLXrMHhM3nWwfg/XoN4nZqs6jPo
IYvE6WhQZU6L6fptGHFC
=dRf6
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"As far as the number of commits goes, the top spot belongs to ACPI
this time with cpufreq in the second position and a handful of PM
core, PNP and cpuidle updates. They are fixes and cleanups mostly, as
usual, with a couple of new features in the mix.
The most visible change is probably that we will create struct
acpi_device objects (visible in sysfs) for all devices represented in
the ACPI tables regardless of their status and there will be a new
sysfs attribute under those objects allowing user space to check that
status via _STA.
Consequently, ACPI device eject or generally hot-removal will not
delete those objects, unless the table containing the corresponding
namespace nodes is unloaded, which is extremely rare. Also ACPI
container hotplug will be handled quite a bit differently and cpufreq
will support CPU boost ("turbo") generically and not only in the
acpi-cpufreq driver.
Specifics:
- ACPI core changes to make it create a struct acpi_device object for
every device represented in the ACPI tables during all namespace
scans regardless of the current status of that device. In
accordance with this, ACPI hotplug operations will not delete those
objects, unless the underlying ACPI tables go away.
- On top of the above, new sysfs attribute for ACPI device objects
allowing user space to check device status by triggering the
execution of _STA for its ACPI object. From Srinivas Pandruvada.
- ACPI core hotplug changes reducing code duplication, integrating
the PCI root hotplug with the core and reworking container hotplug.
- ACPI core simplifications making it use ACPI_COMPANION() in the
code "glueing" ACPI device objects to "physical" devices.
- ACPICA update to upstream version 20131218. This adds support for
the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves
debug facilities. From Bob Moore, Lv Zheng and Betty Dall.
- Init code change to carry out the early ACPI initialization
earlier. That should allow us to use ACPI during the timekeeping
initialization and possibly to simplify the EFI initialization too.
From Chun-Yi Lee.
- Clenups of the inclusions of ACPI headers in many places all over
from Lv Zheng and Rashika Kheria (work in progress).
- New helper for ACPI _DSM execution and rework of the code in
drivers that uses _DSM to execute it via the new helper. From
Jiang Liu.
- New Win8 OSI blacklist entries from Takashi Iwai.
- Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun
Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava,
Rashika Kheria, Tang Chen, Zhang Rui.
- intel_pstate driver updates, including proper Baytrail support,
from Dirk Brandewie and intel_pstate documentation from Ramkumar
Ramachandra.
- Generic CPU boost ("turbo") support for cpufreq from Lukasz
Majewski.
- powernow-k6 cpufreq driver fixes from Mikulas Patocka.
- cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark
Brown.
- Assorted cpufreq drivers fixes and cleanups from Anson Huang, John
Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh
Kumar.
- cpuidle cleanups from Bartlomiej Zolnierkiewicz.
- Support for hibernation APM events from Bin Shi.
- Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC
disabled during thaw transitions from Bjørn Mork.
- PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf
Hansson.
- PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente
Kurusa, Rashika Kheria.
- New tool for profiling system suspend from Todd E Brandt and a
cpupower tool cleanup from One Thousand Gnomes"
* tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits)
thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412)
cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ
Documentation: cpufreq / boost: Update BOOST documentation
cpufreq: exynos: Extend Exynos cpufreq driver to support boost
cpufreq / boost: Kconfig: Support for software-managed BOOST
acpi-cpufreq: Adjust the code to use the common boost attribute
cpufreq: Add boost frequency support in core
intel_pstate: Add trace point to report internal state.
cpufreq: introduce cpufreq_generic_get() routine
ARM: SA1100: Create dummy clk_get_rate() to avoid build failures
cpufreq: stats: create sysfs entries when cpufreq_stats is a module
cpufreq: stats: free table and remove sysfs entry in a single routine
cpufreq: stats: remove hotplug notifiers
cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly
cpufreq: speedstep: remove unused speedstep_get_state
platform: introduce OF style 'modalias' support for platform bus
PM / tools: new tool for suspend/resume performance optimization
ACPI: fix module autoloading for ACPI enumerated devices
ACPI: add module autoloading support for ACPI enumerated devices
ACPI: fix create_modalias() return value handling
...
This is the branch where we usually queue up cleanup efforts, moving
drivers out of the architecture directory, header file restructuring,
etc. Sometimes they tangle with new development so it's hard to keep it
strictly to cleanups.
Some of the things included in this branch are:
* Atmel SAMA5 conversion to common clock
* Reset framework conversion for tegra platforms
- Some of this depends on tegra clock driver reworks that are shared with Mike
Turquette's clk tree.
* Tegra DMA refactoring, which are shared branches with the DMA tree.
* Removal of some header files on exynos to prepare for multiplatform
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJS4Vf7AAoJEIwa5zzehBx3f9UP/jwMlbfbSZHfNQ/QG0SqZ9RD
zvddyDMHY/qXnzgF3Dax+JR9BDDVy8AlQe713FCoiHJZggWRAbbavkx8gxITDrZQ
6NYaEkkuVxqyM8APl3PwMqYm8UZ8MUf4lCltlOA4jkesY9vue91AFnfyKh2CvHrn
Leg4XT6mFzf/vYDL6RbvTz/Qr253uv3KvYBxkeiRNa0Y7OXRemEXSOfgxh0YGxUl
LZ2IWQFOh/DH4kaeQI8V4G67X3ceHiFyhCnl0CPwfxaZaNBVaxvIFgIUTdetS6Sb
zcXa029tE/Dfsr55vZAv9LUHEipCSOeE5rn2EJWehTWyM7vJ42Eozqgh+zfCjXS7
Ib6g2npsvIluQit/RdITu44h5yZlrQsLgKTGJ8jjXqbT4HQ/746W8b/TP0YLtbw7
N8oqr7k4vsZyF0dAYZQtfQUZeGISz67UbFcdzl9tmYOR7HFuAYkAQYst77zkVJf8
om59BAYYTG5FNjQ4I9AKUfJzxXYveI6AKpXSCCZiahpFM2D1CJIzp9Wi0GwK1HRR
sFVWhS0dajvz63pVVC2tw5Sq4J7onRRNGIXFPoE5fkmlelm0/q0zzGjw3Z0nTqbZ
8zxuwuy2FfPJK11GbUAIhAgn1sCLYyAhl6IE+FsanGeMOSGIMrH0v5/HphAxoCXt
BvqMDogyLoGPce1Gm3pJ
=3CcT
-----END PGP SIGNATURE-----
Merge tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Olof Johansson:
"This is the branch where we usually queue up cleanup efforts, moving
drivers out of the architecture directory, header file restructuring,
etc. Sometimes they tangle with new development so it's hard to keep
it strictly to cleanups.
Some of the things included in this branch are:
* Atmel SAMA5 conversion to common clock
* Reset framework conversion for tegra platforms
- Some of this depends on tegra clock driver reworks that are shared
with Mike Turquette's clk tree.
* Tegra DMA refactoring, which are shared branches with the DMA tree.
* Removal of some header files on exynos to prepare for
multiplatform"
* tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (169 commits)
ARM: mvebu: move Armada 370/XP specific definitions to armada-370-xp.h
ARM: mvebu: remove prototypes of non-existing functions from common.h
ARM: mvebu: move ARMADA_XP_MAX_CPUS to armada-370-xp.h
serial: sh-sci: Rework baud rate calculation
serial: sh-sci: Compute overrun_bit without using baud rate algo
serial: sh-sci: Remove unused GPIO request code
serial: sh-sci: Move overrun_bit and error_mask fields out of pdata
serial: sh-sci: Support resources passed through platform resources
serial: sh-sci: Don't check IRQ in verify port operation
serial: sh-sci: Set the UPF_FIXED_PORT flag
serial: sh-sci: Remove duplicate interrupt check in verify port op
serial: sh-sci: Simplify baud rate calculation algorithms
serial: sh-sci: Remove baud rate calculation algorithm 5
serial: sh-sci: Sort headers alphabetically
ARM: EXYNOS: Kill exynos_pm_late_initcall()
ARM: EXYNOS: Consolidate selection of PM_GENERIC_DOMAINS for Exynos4
ARM: at91: switch Calao QIL-A9260 board to DT
clk: at91: fix pmc_clk_ids data type attriubte
PM / devfreq: use inclusion <mach/map.h> instead of <plat/map-s5p.h>
ARM: EXYNOS: remove <mach/regs-clock.h> for exynos
...
Add a special driver data flag (CPUFREQ_BOOST_FREQ) to indicate
a frequency that can be only enabled for BOOST mode.
This frequency will be used only for limited time, since running with
it for too long may cause the target device to overheat.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpufreq_driver's boost_supported flag is true only when boost
support is explicitly enabled. Boost related attributes are exported
only under the same condition.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add CONFIG_CPU_FREQ_BOOST_SW Kconfig option such that software-managed
boost is enabled only after selecting "EXYNOS Frequency Overclocking -
Software". It also depends on the thermal subsystem to be compiled in,
which is necessary for disabling boost and cooling down the device when
overheating is detected.
Software-managed boost _MUST_ _NOT_ be enabled without thermal subsystem
with properly defined overheating temperature thresholds.
This option doesn't affect the x86's hardware-driven boost support
in the acpi-cpufreq driver.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Subject and changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Modify acpi-cpufreq's hardware-based boost solution to work with the
common cpufreq boost framework.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Subject and changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This commit adds boost frequency support in cpufreq core (Hardware &
Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency
above its normal operation limits. Such mode shall be only used for a
short time.
Overclocking (boost) support is essentially provided by platform
dependent cpufreq driver.
This commit unifies support for SW and HW (Intel) overclocking solutions
in the core cpufreq driver. Previously the "boost" sysfs attribute was
defined in the ACPI processor driver code. By default boost is disabled.
One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost.
It only shows up when cpufreq driver supports overclocking.
Under the hood frequencies dedicated for boosting are marked with a
special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table.
It is the user's concern to enable/disable overclocking with a proper call
to sysfs.
The cpufreq_boost_trigger_state() function is defined non static on purpose.
It is used later with thermal subsystem to provide automatic enable/disable
of the BOOST feature.
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add perf trace event "power:pstate_sample" to report driver state to
aid in diagnosing issues reported against intel_pstate.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPUFreq drivers that use clock frameworks interface,i.e. clk_get_rate(),
to get CPUs clk rate, have similar sort of code used in most of them.
This patch adds a generic ->get() which will do the same thing for them.
All those drivers are required to now is to set .get to cpufreq_generic_get()
and set their clk pointer in policy->clk during ->init().
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When cpufreq_stats is compiled in as a module, cpufreq driver would
have already been registered. And so the CPUFREQ_CREATE_POLICY
notifiers wouldn't be called for it. Hence no sysfs entries for stats. :(
This patch calls cpufreq_stats_create_table() for each online CPU from
cpufreq_stats_init() and so if policy is already created for CPUx then
we will register sysfs stats for it.
When its not compiled as module, we will return early as policy wouldn't
be found for any of the CPUs.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We don't have code paths now where we need to do these two things
separately, so it is better do them in a single routine. Just as
they are allocated in a single routine.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Either CPUs are hot-unplugged or suspend/resume occurs, cpufreq core
will send notifications to cpufreq-stats and stats structure and sysfs
entries would be correctly handled..
And so we don't actually need hotcpu notifiers in cpufreq-stats anymore.
We were only handling cpu hot-unplug events here and that are already
taken care of by POLICY notifiers.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are several problems with cpufreq stats in the way it handles
cpufreq_unregister_driver() and suspend/resume..
- We must not lose data collected so far when suspend/resume happens
and so stats directories must not be removed/allocated during these
operations, which is done currently.
- cpufreq_stat has registered notifiers with both cpufreq and hotplug.
It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY
and removes this directory with a notifier from hotplug core.
In case cpufreq_unregister_driver() is called (on rmmod cpufreq driver),
stats directories per cpu aren't removed as CPUs are still online. The
only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for
all CPUs except the last of each policy. And pointer to stat information
is stored in the entry for last CPU in the per-cpu cpufreq_stats_table.
But policy structure would be freed inside cpufreq core and so that will
result in memory leak inside cpufreq stats (as we are never freeing
memory for stats).
Now if we again insert the module cpufreq_register_driver() will be
called and we will again allocate stats data and put it on for first
CPU of every policy. In case we only have a single CPU per policy, we
will return with a error from cpufreq_stats_create_table() due to this
code:
if (per_cpu(cpufreq_stats_table, cpu))
return -EBUSY;
And so probably cpufreq stats directory would not show up anymore (as
it was added inside last policies->kobj which doesn't exist anymore).
I haven't tested it, though. Also the values in stats files wouldn't
be refreshed as we are using the earlier stats structure.
- CPUFREQ_NOTIFY is called from cpufreq_set_policy() which is called for
scenarios where we don't really want cpufreq_stat_notifier_policy() to get
called. For example whenever we are changing anything related to a policy:
min/max/current freq, etc. cpufreq_set_policy() is called and so cpufreq
stats is notified. Where we don't do any useful stuff other than simply
returning with -EBUSY from cpufreq_stats_create_table(). And so this
isn't the right notifier that cpufreq stats..
Due to all above reasons this patch does following changes:
- Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY,
which are only called when policy is created/destroyed. They aren't
called for suspend/resume paths..
- Use these notifiers in cpufreq_stat_notifier_policy() to create/destory
stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume
shouldn't be a problem for cpufreq_stats.
- Return early from cpufreq_stat_cpu_callback() for suspend/resume sequence,
so that we don't free stats structure.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The only caller of speedstep_get_state() was removed in commit d4019f0a92
("cpufreq: move freq change notifications to cpufreq core"). So building
speedstep-smi.o now triggers a GCC warning:
drivers/cpufreq/speedstep-smi.c:148:12: warning: 'speedstep_get_state' defined but not used [-Wunused-function]
Remove this unused function.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
KVM environments do not support APERF/MPERF MSRs. intel_pstate cannot
operate without these registers.
The previous validity checks in intel_pstate_msrs_not_valid() are
insufficent in nested KVMs.
References: https://bugzilla.redhat.com/show_bug.cgi?id=1046317
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch reorders reported frequencies from the highest to the lowest,
just like in other frequency drivers.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The powernow-k6 driver used to read the initial multiplier from the
powernow register. However, there is a problem with this:
* If there was a frequency transition before, the multiplier read from the
register corresponds to the current multiplier.
* If there was no frequency transition since reset, the field in the
register always reads as zero, regardless of the current multiplier that
is set using switches on the mainboard and that the CPU is running at.
The zero value corresponds to multiplier 4.5, so as a consequence, the
powernow-k6 driver always assumes multiplier 4.5.
For example, if we have 550MHz CPU with bus frequency 100MHz and
multiplier 5.5, the powernow-k6 driver thinks that the multiplier is 4.5
and bus frequency is 122MHz. The powernow-k6 driver then sets the
multiplier to 4.5, underclocking the CPU to 450MHz, but reports the
current frequency as 550MHz.
There is no reliable way how to read the initial multiplier. I modified
the driver so that it contains a table of known frequencies (based on
parameters of existing CPUs and some common overclocking schemes) and sets
the multiplier according to the frequency. If the frequency is unknown
(because of unusual overclocking or underclocking), the user must supply
the bus speed and maximum multiplier as module parameters.
This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
I found out that a system with k6-3+ processor is unstable during network
server load. The system locks up or the network card stops receiving. The
reason for the instability is the CPU frequency scaling.
During frequency transition the processor is in "EPM Stop Grant" state.
The documentation says that the processor doesn't respond to inquiry
requests in this state. Consequently, coherency of processor caches and
bus master devices is not maintained, causing the system instability.
This patch flushes the cache during frequency transition. It fixes the
instability.
Other minor changes:
* u64 invalue changed to unsigned long because the variable is 32-bit
* move the logic to set the multiplier to a separate function
powernow_k6_set_cpu_multiplier
* preserve lower 5 bits of the powernow port instead of 4 (the voltage
field has 5 bits)
* mask interrupts when reading the multiplier, so that the port is not
open during other activity (running other kernel code with the port open
shouldn't cause any misbehavior, but we should better be safe and keep
the port closed)
This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>