linux

Author	SHA1	Message	Date
Srinivas Pandruvada	9522a2ff9c	cpufreq: intel_pstate: Enforce _PPC limits Use ACPI _PPC notification to limit max P state driver will request. ACPI _PPC change notification is sent by BIOS to limit max P state in several cases: - Reduce impact of platform thermal condition - When Config TDP feature is used, a changed _PPC is sent to follow TDP change - Remote node managers in server want to control platform power via baseboard management controller (BMC) This change registers with ACPI processor performance lib so that _PPC changes are notified to cpufreq core, which in turns will result in call to .setpolicy() callback. Also the way _PSS table identifies a turbo frequency is not compatible to max turbo frequency in intel_pstate, so the very first entry in _PSS needs to be adjusted. This feature can be turned on by using kernel parameters: intel_pstate=support_acpi_ppc Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Minor cleanups ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 01:01:39 +02:00
Akshay Adiga	eaa2c3aeef	cpufreq: powernv: Ramp-down global pstate slower than local-pstate The frequency transition latency from pmin to pmax is observed to be in few millisecond granurality. And it usually happens to take a performance penalty during sudden frequency rampup requests. This patch set solves this problem by using an entity called "global pstates". The global pstate is a Chip-level entity, so the global entitiy (Voltage) is managed across the cores. The local pstate is a Core-level entity, so the local entity (frequency) is managed across threads. This patch brings down global pstate at a slower rate than the local pstate. Hence by holding global pstates higher than local pstate makes the subsequent rampups faster. A per policy structure is maintained to keep track of the global and local pstate changes. The global pstate is brought down using a parabolic equation. The ramp down time to pmin is set to ~5 seconds. To make sure that the global pstates are dropped at regular interval , a timer is queued for every 2 seconds during ramp-down phase, which eventually brings the pstate down to local pstate. Iozone results show fairly consistent performance boost. YCSB on redis shows improved Max latencies in most cases. Iozone write/rewite test were made with filesizes 200704Kb and 401408Kb with different record sizes . The following table shows IOoperations/sec with and without patch. Iozone Results ( in op/sec) ( mean over 3 iterations ) --------------------------------------------------------------------- file size- with without % recordsize-IOtype patch patch change ---------------------------------------------------------------------- 200704-1-SeqWrite 1616532 1615425 0.06 200704-1-Rewrite 2423195 2303130 5.21 200704-2-SeqWrite 1628577 1602620 1.61 200704-2-Rewrite 2428264 2312154 5.02 200704-4-SeqWrite 1617605 1617182 0.02 200704-4-Rewrite 2430524 2351238 3.37 200704-8-SeqWrite 1629478 1600436 1.81 200704-8-Rewrite `2415308` 2298136 5.09 200704-16-SeqWrite 1619632 1618250 0.08 200704-16-Rewrite 2396650 2352591 1.87 200704-32-SeqWrite 1632544 1598083 2.15 200704-32-Rewrite 2425119 2329743 4.09 200704-64-SeqWrite 1617812 1617235 0.03 200704-64-Rewrite 2402021 2321080 3.48 200704-128-SeqWrite 1631998 1600256 1.98 200704-128-Rewrite 2422389 2304954 5.09 200704-256 SeqWrite 1617065 1616962 0.00 200704-256-Rewrite 2432539 2301980 5.67 200704-512-SeqWrite 1632599 1598656 2.12 200704-512-Rewrite 2429270 2323676 4.54 200704-1024-SeqWrite 1618758 1616156 0.16 200704-1024-Rewrite 2431631 2315889 4.99 401408-1-SeqWrite 1631479 1608132 1.45 401408-1-Rewrite 2501550 2459409 1.71 401408-2-SeqWrite 1617095 1626069 -0.55 401408-2-Rewrite 2507557 2443621 2.61 401408-4-SeqWrite 1629601 1611869 1.10 401408-4-Rewrite 2505909 2462098 1.77 401408-8-SeqWrite 1617110 1626968 -0.60 401408-8-Rewrite 2512244 2456827 2.25 401408-16-SeqWrite 1632609 1609603 1.42 401408-16-Rewrite 2500792 2451405 2.01 401408-32-SeqWrite 1619294 1628167 -0.54 401408-32-Rewrite 2510115 2451292 2.39 401408-64-SeqWrite 1632709 1603746 1.80 401408-64-Rewrite 2506692 2433186 3.02 401408-128-SeqWrite 1619284 1627461 -0.50 401408-128-Rewrite 2518698 2453361 2.66 401408-256-SeqWrite 1634022 1610681 1.44 401408-256-Rewrite 2509987 2446328 2.60 401408-512-SeqWrite 1617524 1628016 -0.64 401408-512-Rewrite 2504409 2442899 2.51 401408-1024-SeqWrite 1629812 1611566 1.13 401408-1024-Rewrite 2507620 2442968 2.64 Tested with YCSB workload (50% update + 50% read) over redis for 1 million records and 1 million operation. Each test was carried out with target operations per second and persistence disabled. Max-latency (in us)( mean over 5 iterations ) --------------------------------------------------------------- op/s Operation with patch without patch %change --------------------------------------------------------------- 15000 Read 61480.6 50261.4 22.32 15000 cleanup 215.2 293.6 -26.70 15000 update 25666.2 25163.8 2.00 25000 Read 32626.2 89525.4 -63.56 25000 cleanup 292.2 263.0 11.10 25000 update 32293.4 90255.0 -64.22 35000 Read 34783.0 33119.0 5.02 35000 cleanup 321.2 395.8 -18.8 35000 update 36047.0 38747.8 -6.97 40000 Read 38562.2 42357.4 -8.96 40000 cleanup 371.8 384.6 -3.33 40000 update 27861.4 41547.8 -32.94 45000 Read 42271.0 88120.6 -52.03 45000 cleanup 263.6 383.0 -31.17 45000 update 29755.8 81359.0 -63.43 (test without target op/s) 47659 Read 83061.4 136440.6 -39.12 47659 cleanup 195.8 193.8 1.03 47659 update 73429.4 124971.8 -41.24 Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 23:56:58 +02:00
Shilpasri G Bhat	2920e9ce8f	cpufreq: powernv: Remove flag use-case of policy->driver_data commit `1b0289848d` ("cpufreq: powernv: Add sysfs attributes to show throttle stats") used policy->driver_data as a flag for one-time creation of throttle sysfs files. Instead of this use 'kernfs_find_and_get()' to check if the attribute already exists. This is required as policy->driver_data is used for other purposes in the later patch. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 23:56:58 +02:00
Javier Martinez Canillas	6de0dc4b53	cpufreq: e_powersaver: Use IS_ENABLED() instead of checking for built-in or module The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either built-in or as a module, use that macro instead of open coding the same. Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 22:42:34 +02:00
Rafael J. Wysocki	ba1ca654f3	cpufreq: governor: Fix prev_load initialization in cpufreq_governor_start() The way cpufreq_governor_start() initializes j_cdbs->prev_load is questionable. First off, j_cdbs->prev_cpu_wall used as a denominator in the computation may be zero. The case this happens is when get_cpu_idle_time_us() returns -1 and get_cpu_idle_time_jiffy() used to return that number is called exactly at the jiffies_64 wrap time. It is rather hard to trigger that error, but it is not impossible and it will just crash the kernel then. Second, j_cdbs->prev_load is computed as the average load during the entire time since the system started and it may not reflect the load in the previous sampling period (as it is expected to). That doesn't play well with the way dbs_update() uses that value. Namely, if the update time delta (wall_time) happens do be greater than twice the sampling rate on the first invocation of it, the initial value of j_cdbs->prev_load (which may be completely off) will be returned to the caller as the current load (unless it is equal to zero and unless another CPU sharing the same policy object has a greater load value). For this reason, notice that the prev_load field of struct cpu_dbs_info is only used by dbs_update() and only in that one place, so if cpufreq_governor_start() is modified to always initialize it to 0, it will make dbs_update() always compute the actual load first time it checks the update time delta against the doubled sampling rate (after initialization) and there won't be any side effects of it. Consequently, modify cpufreq_governor_start() as described. Fixes: `18b46abd00` (cpufreq: governor: Be friendly towards latency-sensitive bursty workloads) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-25 16:21:34 +02:00
Viresh Kumar	3920be471c	cpufreq: hisilicon: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:24 +02:00
Viresh Kumar	5e4249c6d9	cpufreq: zynq: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:24 +02:00
Viresh Kumar	117d4f59af	cpufreq: sunxi: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:24 +02:00
Viresh Kumar	a399dc9fc5	cpufreq: shmobile: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Finley Xiao	014400c127	cpufreq: rockchip: Use generic platdev driver This patch add rockchip's compatible string to the compat list and remove similar code from platform code for supporting generic platdev driver. Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	7694ca6e1d	cpufreq: omap: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	7ead83f6df	cpufreq: imx: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Note that the complete routine imx27_dt_init() is removed as of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); has same effect as a NULL .init_machine machine callback pointer. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	a59511d1da	cpufreq: berlin: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Antoine Tenart <antoine.tenart@free-electrons.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:23 +02:00
Viresh Kumar	e92bb16674	cpufreq: dt: Mark platdev machines array as __initconst The machines array in cpufreq-dt-platdev is used only once at boot time and so should be marked with __initconst, so that kernel can free up memory used for it, if required. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:18:22 +02:00
Jia Hongtao	394cb8316b	cpufreq: qoriq: Fix cooling device registration issue during suspend Cooling device is registered by ready callback. It's also invoked while system resuming from sleep (Enabling non-boot cpus). Thus cooling device may be multiple registered. Matchable unregistration is added to exit callback to fix this issue. Signed-off-by: Jia Hongtao <hongtao.jia@nxp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:07:02 +02:00
Jia Hongtao	495c716f17	cpufreq: qoriq: Remove __exit macro from .exit callback .exit callback (qoriq_cpufreq_cpu_exit()) is also used during suspend. So __exit macro should be removed or the function will be discarded. Signed-off-by: Jia Hongtao <hongtao.jia@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:07:02 +02:00
Jia Hongtao	27b8fe8daf	cpufreq: qoriq: Don't show cooling device messages if THERMAL_OF undefined When THERMAL_OF is undefined the cooling device messages should not be shown. -ENOSYS is returned from of_cpufreq_cooling_register() when THERMAL_OF is undefined. Signed-off-by: Jia Hongtao <hongtao.jia@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:00:42 +02:00
Ashwin Chaugule	a29a1e7678	cpufreq: ACPI / CPPC: Add module support for cppc_cpufreq driver Add a function to cleanup at module exit and export appropriate GPL string to enable moduler support for the cppc_cpufreq driver. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 15:59:35 +02:00
Philippe Longepe	bdcaa23fb8	cpufreq: intel_pstate: Use average P-State instead of current P-State The result returned by pid_calc() is subtracted from current_pstate (which is the P-State requested during the last period) in order to obtain the target P-State for the current iteration. However, current_pstate may not reflect the real current P-State of the CPU. In particular, that P-State may be higher because of the frequency sharing per module. The theory is: - The load is the percentage of time spent in C0 and is related to the average P-State during the same period. - The last requested P-State can be completely different than the average P-State (because of frequency sharing or throttling). - The P-State shift computed by the pid_calc is based on the load computed at average P-State, so the shift must be relative to this average P-State. Using the average P-State instead of current P-State improves power without significant performance penalty in cases when a task migrates from one core to other core sharing frequency and voltage. Performance and power comparison with this patch on Cherry Trail platform using Android: Benchmark ?Perf ?Power FishTank 10.45% 3.1% SmartBench-Gaming -0.1% -10.4% SmartBench-Productivity -0.8% -10.4% CandyCrush n/a -17.4% AngryBirds n/a -5.9% videoPlayback n/a -13.9% audioPlayback n/a -4.9% IcyRocks-20-50 0.0% -38.4% iozone RR -0.16% -1.3% iozone RW 0.74% -1.3% Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 15:45:11 +02:00
Rafael J. Wysocki	1cbc99dfe5	Merge back cpufreq changes for v4.7.	2016-04-25 15:44:01 +02:00
Rafael J. Wysocki	94862a62df	Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC" Revert commit `0df35026c6` (cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC) that introduced a regression by causing the ondemand cpufreq governor to misbehave for CONFIG_TICK_CPU_ACCOUNTING unset (the frequency goes up to the max at one point and stays there indefinitely). The revert takes subsequent modifications of the code in question into account. Fixes: `0df35026c6` (cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC) Link: https://bugzilla.kernel.org/show_bug.cgi?id=115261 Reported-and-tested-by: Timo Valtoaho <timo.valtoaho@gmail.com> Cc: 4.5+ <stable@vger.kernel.org> # 4.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-25 02:39:20 +02:00
Rafael J. Wysocki	c9d9c929e6	cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set Since governor operations are generally skipped if cpufreq_suspended is set, cpufreq_start_governor() should do nothing in that case. That function is called in the cpufreq_online() path, and may also be called from cpufreq_offline() in some cases, which are invoked by the nonboot CPUs disabing/enabling code during system suspend to RAM and resume. That happens when all devices have been suspended, so if the cpufreq driver relies on things like I2C to get the current frequency, it may not be ready to do that then. To prevent problems from happening for this reason, make cpufreq_update_current_freq(), which is the only function invoked by cpufreq_start_governor() that doesn't check cpufreq_suspended already, return 0 upfront if cpufreq_suspended is set. Fixes: `3bbf8fe3ae` (cpufreq: Always update current frequency before startig governor) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-18 23:47:42 +02:00
Rafael J. Wysocki	ffb810563c	intel_pstate: Avoid getting stuck in high P-states when idle Jörg Otte reports that commit `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) caused the CPUs in his Haswell-based system to stay in the very high frequency region even if the system is completely idle. That turns out to be an existing problem in the intel_pstate driver's P-state selection algorithm for Core processors. Namely, all decisions made by that algorithm are based on the average frequency of the CPU between sampling events and on the P-state requested on the last invocation, so it may get stuck at a very hight frequency even if the utilization of the CPU is very low (in fact, it may get stuck in a inadequate P-state regardless of the CPU utilization). The only way to kick it out of that limbo is a sufficiently long idle period (3 times longer than the prescribed sampling interval), but if that doesn't happen often enough (eg. due to a timing change like after the above commit), the P-state of the CPU may be inadequate pretty much all the time. To address the most egregious manifestations of that issue, reset the core_busy value used to determine the next P-state to request if the utilization of the CPU, determined with the help of the MPERF feedback register and the TSC, is below 1%. Link: https://bugzilla.kernel.org/show_bug.cgi?id=115771 Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-10 05:59:10 +02:00
Viresh Kumar	8cee1eed8e	cpufreq: ACPI: Remove freq_table from acpi_cpufreq_data The freq-table is stored in struct cpufreq_policy also and there is absolutely no need of keeping a copy of its reference in struct acpi_cpufreq_data. Drop it. Also policy->freq_table can't be NULL in the target() callback, remove the useless check as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:54:31 +02:00
Viresh Kumar	9b55f55af8	cpufreq: ACPI: policy->driver_data can't be NULL in ->exit() Its always set by ->init() and so it will always be there in ->exit(). There is no need to have a special check for just that. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:41:50 +02:00
Rafael J. Wysocki	a794d6138c	cpufreq: Rearrange cpufreq_add_dev() Reorganize the code in cpufreq_add_dev() to avoid using the ret variable and reduce the indentation level in it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-09 01:37:07 +02:00
Rafael J. Wysocki	cd73e9b01f	cpufreq: Simplify switch () in cpufreq_cpu_callback() Merge two switch entries that do the same thing in cpufreq_cpu_callback(). No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-09 01:37:07 +02:00
Joe Perches	1c5864e26c	cpufreq: Use consistent prefixing via pr_fmt Use the more common kernel style adding a define for pr_fmt. Miscellanea: o Remove now unused PFX defines Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:35:18 +02:00
Joe Perches	b49c22a6ca	cpufreq: Convert printk(KERN_<LEVEL> to pr_<level> Use the more common logging style. Miscellanea: o Coalesce formats o Realign arguments o Add a missing space between a coalesced format Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:35:18 +02:00
Joe Perches	4836df173a	intel_pstate: Use pr_fmt Prefix the output using the more common kernel style. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:33:42 +02:00
Geliang Tang	d2499d05f0	cpufreq: mt8173: use list_for_each_entry() Use list_for_each_entry() instead of list_for_each*() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@163.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:27:54 +02:00
Rafael J. Wysocki	22590efb98	intel_pstate: Avoid pointless FRAC_BITS shifts under div_fp() There are multiple places in intel_pstate where int_tofp() is applied to both arguments of div_fp(), but this is pointless, because int_tofp() simply shifts its argument to the left by FRAC_BITS which mathematically is equivalent to multuplication by 2^FRAC_BITS, so if this is done to both arguments of a division, the extra factors will cancel each other during that operation anyway. Drop the pointless int_tofp() applied to div_fp() arguments throughout the driver. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:25:58 +02:00
Viresh Kumar	2249c00a0b	cpufreq: exynos: Use generic platdev driver The cpufreq-dt-platdev driver supports creation of cpufreq-dt platform device now, reuse that and remove similar code from platform code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:18:42 +02:00
Viresh Kumar	ea3b05e62f	ARM: exynos: exynos-cpufreq platform device isn't supported anymore The driver is removed long back and we don't support this device anymore. Stop adding it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:18:42 +02:00
Viresh Kumar	f56aad1d98	cpufreq: dt: Add generic platform-device creation support Multiple platforms are using the generic cpufreq-dt driver now, and all of them are required to create a platform device with name "cpufreq-dt", in order to get the cpufreq-dt probed. Many of them do it from platform code, others have special drivers just to do that. It would be more sensible to do this at a generic place, where all such platform can mark their entries. This patch adds a separate file to get this device created. Currently the compat list of platforms that we support is empty, and will be filled in as and when we move platforms to use it. It always compiles as part of the kernel and so doesn't need a module-exit operation. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:18:42 +02:00
Viresh Kumar	7e67e239a4	cpufreq: dt: Include types.h from cpufreq-dt.h cpufreq-dt.h uses 'bool' data type but doesn't include types.h. It works fine for now as the files that include cpufreq-dt.h, also include types.h directly or indirectly. But, when a file includes cpufreq-dt.h without including types.h, we get a build error. Avoid such errors by including types.h in cpufreq-dt itself. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:18:42 +02:00
Viresh Kumar	f3f24dea2c	cpufreq: tegra124: No need of setting platform-data All CPUs on Tegra platform share clock/voltage lines and there is absolutely no need of setting platform data for 'cpufreq-dt' platform device, as that's the default case. Stop setting platform data for cpufreq-dt device. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Thierry Reding <treding@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:12:09 +02:00
Paul Gortmaker	dbbe972c11	cpufreq: ppc_cbe_cpufreq_pmi: make the driver explicitly non-modular The Kconfig for this driver is currently: config CPU_FREQ_CBE_PMI bool "CBE frequency scaling using PMI interface" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular and unused code here, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-09 01:11:04 +02:00
Rafael J. Wysocki	4b42fafc1c	Merge branch 'pm-cpufreq-sched' into pm-cpufreq	2016-04-09 01:08:02 +02:00
Rafael J. Wysocki	6c9d9c8192	cpufreq: Call cpufreq_disable_fast_switch() in sugov_exit() Due to differences in the cpufreq core's handling of runtime CPU offline and nonboot CPUs disabling during system suspend-to-RAM, fast frequency switching gets disabled after a suspend-to-RAM and resume cycle on all of the nonboot CPUs. To prevent that from happening, move the invocation of cpufreq_disable_fast_switch() from cpufreq_exit_governor() to sugov_exit(), as the schedutil governor is the only user of fast frequency switching today anyway. That simply prevents cpufreq_disable_fast_switch() from being called without invoking the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT event (which happens during system suspend now). Fixes: `b7898fda5b` (cpufreq: Support for fast frequency switching) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-08 22:41:36 +02:00
Viresh Kumar	b318556479	cpufreq: dt: Drop stale comment The comment in file header doesn't hold true anymore, drop it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-05 03:40:44 +02:00
Srinivas Pandruvada	13ad7701f9	cpufreq: intel_pstate: Documenation for structures No code change. Only added kernel doc style comments for structures. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-05 03:39:05 +02:00
Srinivas Pandruvada	30a3915385	cpufreq: intel_pstate: fix inconsistency in setting policy limits When user sets performance policy using cpufreq interface, it is possible that because of policy->max limits, the actual performance is still limited. But the current implementation will silently switch the policy to powersave and start using powersave limits. If user modifies any limits using intel_pstate sysfs, this is actually changing powersave limits. The current implementation tracks limits under powersave and performance policy using two different variables. When policy->max is less than policy->cpuinfo.max_freq, only powersave limit variable is used. This fix causes the performance limits variable to be used always when the policy is performance. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-05 03:37:13 +02:00
Rafael J. Wysocki	9bdcb44e39	cpufreq: schedutil: New governor based on scheduler utilization data Add a new cpufreq scaling governor, called "schedutil", that uses scheduler-provided CPU utilization information as input for making its decisions. Doing that is possible after commit `34e2c555f3` (cpufreq: Add mechanism for registering utilization update callbacks) that introduced cpufreq_update_util() called by the scheduler on utilization changes (from CFS) and RT/DL task status updates. In particular, CPU frequency scaling decisions may be based on the the utilization data passed to cpufreq_update_util() by CFS. The new governor is relatively simple. The frequency selection formula used by it depends on whether or not the utilization is frequency-invariant. In the frequency-invariant case the new CPU frequency is given by next_freq = 1.25 * max_freq * util / max where util and max are the last two arguments of cpufreq_update_util(). In turn, if util is not frequency-invariant, the maximum frequency in the above formula is replaced with the current frequency of the CPU: next_freq = 1.25 * curr_freq * util / max The coefficient 1.25 corresponds to the frequency tipping point at (util / max) = 0.8. All of the computations are carried out in the utilization update handlers provided by the new governor. One of those handlers is used for cpufreq policies shared between multiple CPUs and the other one is for policies with one CPU only (and therefore it doesn't need to use any extra synchronization means). The governor supports fast frequency switching if that is supported by the cpufreq driver in use and possible for the given policy. In the fast switching case, all operations of the governor take place in its utilization update handlers. If fast switching cannot be used, the frequency switch operations are carried out with the help of a work item which only calls __cpufreq_driver_target() (under a mutex) to trigger a frequency update (to a value already computed beforehand in one of the utilization update handlers). Currently, the governor treats all of the RT and DL tasks as "unknown utilization" and sets the frequency to the allowed maximum when updated from the RT or DL sched classes. That heavy-handed approach should be replaced with something more subtle and specifically targeted at RT and DL tasks. The governor shares some tunables management code with the "ondemand" and "conservative" governors and uses some common definitions from cpufreq_governor.h, but apart from that it is stand-alone. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-04-02 01:09:12 +02:00
Rafael J. Wysocki	b7898fda5b	cpufreq: Support for fast frequency switching Modify the ACPI cpufreq driver to provide a method for switching CPU frequencies from interrupt context and update the cpufreq core to support that method if available. Introduce a new cpufreq driver callback, ->fast_switch, to be invoked for frequency switching from interrupt context by (future) governors supporting that feature via (new) helper function cpufreq_driver_fast_switch(). Add two new policy flags, fast_switch_possible, to be set by the cpufreq driver if fast frequency switching can be used for the given policy and fast_switch_enabled, to be set by the governor if it is going to use fast frequency switching for the given policy. Also add a helper for setting the latter. Since fast frequency switching is inherently incompatible with cpufreq transition notifiers, make it possible to set the fast_switch_enabled only if there are no transition notifiers already registered and make the registration of new transition notifiers fail if fast_switch_enabled is set for at least one policy. Implement the ->fast_switch callback in the ACPI cpufreq driver and make it set fast_switch_possible during policy initialization as appropriate. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:03 +02:00
Rafael J. Wysocki	379480d825	cpufreq: Move governor symbols to cpufreq.h Move definitions of symbols related to transition latency and sampling rate to include/linux/cpufreq.h so they can be used by (future) goverernors located outside of drivers/cpufreq/. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:02 +02:00
Rafael J. Wysocki	66893b6ac9	cpufreq: Move governor attribute set headers to cpufreq.h Move definitions and function headers related to struct gov_attr_set to include/linux/cpufreq.h so they can be used by (future) goverernors located outside of drivers/cpufreq/. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:02 +02:00
Rafael J. Wysocki	2d0c58ad60	cpufreq: governor: Move abstract gov_attr_set code to seperate file Move abstract code related to struct gov_attr_set to a separate (new) file so it can be shared with (future) goverernors that won't share more code with "ondemand" and "conservative". No intentional functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:01 +02:00
Rafael J. Wysocki	0dd3c1d678	cpufreq: governor: New data type for management part of dbs_data In addition to fields representing governor tunables, struct dbs_data contains some fields needed for the management of objects of that type. As it turns out, that part of struct dbs_data may be shared with (future) governors that won't use the common code used by "ondemand" and "conservative", so move it to a separate struct type and modify the code using struct dbs_data to follow. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-04-02 01:09:00 +02:00
Rafael J. Wysocki	0bed612be6	cpufreq: sched: Helpers to add and remove update_util hooks Replace the single helper for adding and removing cpufreq utilization update hooks, cpufreq_set_update_util_data(), with a pair of helpers, cpufreq_add_update_util_hook() and cpufreq_remove_update_util_hook(), and modify the users of cpufreq_set_update_util_data() accordingly. With the new helpers, the code using them doesn't need to worry about the internals of struct update_util_data and in particular it doesn't need to worry about populating the func field in it properly upfront. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-04-02 01:08:43 +02:00

1 2 3 4 5 ...

588530 Commits