linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-28 22:02:28 +00:00

Author	SHA1	Message	Date
Viresh Kumar	955ef48335	cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT With the rwsem lock around __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT), we get circular dependency when we call sysfs_remove_group(). ====================================================== [ INFO: possible circular locking dependency detected ] 3.9.0-rc7+ #15 Not tainted ------------------------------------------------------- cat/2387 is trying to acquire lock: (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<c02f6179>] lock_policy_rwsem_read+0x25/0x34 but task is already holding lock: (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (s_active#41){++++.+}: [<c0055a79>] lock_acquire+0x61/0xbc [<c00fabf1>] sysfs_addrm_finish+0xc1/0x128 [<c00f9819>] sysfs_hash_and_remove+0x35/0x64 [<c00fbe6f>] remove_files.isra.0+0x1b/0x24 [<c00fbea5>] sysfs_remove_group+0x2d/0xa8 [<c02f9a0b>] cpufreq_governor_interactive+0x13b/0x35c [<c02f61df>] __cpufreq_governor+0x2b/0x8c [<c02f6579>] __cpufreq_set_policy+0xa9/0xf8 [<c02f6b75>] store_scaling_governor+0x61/0x100 [<c02f6f4d>] store+0x39/0x60 [<c00f9b81>] sysfs_write_file+0xed/0x114 [<c00b3fd1>] vfs_write+0x65/0xd8 [<c00b424b>] sys_write+0x2f/0x50 [<c000cdc1>] ret_fast_syscall+0x1/0x52 -> #0 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}: [<c0055253>] __lock_acquire+0xef3/0x13dc [<c0055a79>] lock_acquire+0x61/0xbc [<c03ee1f5>] down_read+0x25/0x30 [<c02f6179>] lock_policy_rwsem_read+0x25/0x34 [<c02f6edd>] show+0x21/0x58 [<c00f9c0f>] sysfs_read_file+0x67/0xcc [<c00b40a7>] vfs_read+0x63/0xd8 [<c00b41fb>] sys_read+0x2f/0x50 [<c000cdc1>] ret_fast_syscall+0x1/0x52 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(s_active#41); lock(&per_cpu(cpu_policy_rwsem, cpu)); lock(s_active#41); lock(&per_cpu(cpu_policy_rwsem, cpu)); * DEADLOCK * 2 locks held by cat/2387: #0: (&buffer->mutex){+.+.+.}, at: [<c00f9bcd>] sysfs_read_file+0x25/0xcc #1: (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc stack backtrace: [<c0011d55>] (unwind_backtrace+0x1/0x9c) from [<c03e9a09>] (print_circular_bug+0x19d/0x1e8) [<c03e9a09>] (print_circular_bug+0x19d/0x1e8) from [<c0055253>] (__lock_acquire+0xef3/0x13dc) [<c0055253>] (__lock_acquire+0xef3/0x13dc) from [<c0055a79>] (lock_acquire+0x61/0xbc) [<c0055a79>] (lock_acquire+0x61/0xbc) from [<c03ee1f5>] (down_read+0x25/0x30) [<c03ee1f5>] (down_read+0x25/0x30) from [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34) [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34) from [<c02f6edd>] (show+0x21/0x58) [<c02f6edd>] (show+0x21/0x58) from [<c00f9c0f>] (sysfs_read_file+0x67/0xcc) [<c00f9c0f>] (sysfs_read_file+0x67/0xcc) from [<c00b40a7>] (vfs_read+0x63/0xd8) [<c00b40a7>] (vfs_read+0x63/0xd8) from [<c00b41fb>] (sys_read+0x2f/0x50) [<c00b41fb>] (sys_read+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52) This lock isn't required while calling __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT). Remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-22 00:23:54 +02:00
Srivatsa S. Bhat	a66b2e503f	cpufreq: Preserve sysfs files across suspend/resume The file permissions of cpufreq per-cpu sysfs files are not preserved across suspend/resume because we internally go through the CPU Hotplug path which reinitializes the file permissions on CPU online. But the user is not supposed to know that we are using CPU hotplug internally within suspend/resume (IOW, the kernel should not silently wreck the user-set file permissions across a suspend cycle). Therefore, we need to preserve the file permissions as they are across suspend/resume. The simplest way to achieve that is to just not touch the sysfs files at all - ie., just ignore the CPU hotplug notifications in the suspend/resume path (_FROZEN) in the cpufreq hotplug callback. Reported-by: Robert Jarzmik <robert.jarzmik@intel.com> Reported-by: Durgadoss R <durgadoss.r@intel.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-15 21:47:17 +02:00
Viresh Kumar	d96038e0fa	cpufreq: Issue CPUFREQ_GOV_POLICY_EXIT notifier before dropping policy refcount We must call __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT) before calling cpufreq_cpu_put(data), so that policy kobject have valid fields. Otherwise, removing last online cpu of policy->cpus causes this crash for ondemand/conservative governor. [<c00fb076>] (sysfs_find_dirent+0xe/0xa8) from [<c00fb1bd>] (sysfs_get_dirent+0x21/0x58) [<c00fb1bd>] (sysfs_get_dirent+0x21/0x58) from [<c00fc259>] (sysfs_remove_group+0x85/0xbc) [<c00fc259>] (sysfs_remove_group+0x85/0xbc) from [<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0) [<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0) from [<c02f66d7>] (__cpufreq_governor+0x2b/0x8c) [<c02f66d7>] (__cpufreq_governor+0x2b/0x8c) from [<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250) [<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250) from [<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c) [<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c) from [<c0036fe1>] (notifier_call_chain+0x45/0x54) [<c0036fe1>] (notifier_call_chain+0x45/0x54) from [<c001e611>] (__cpu_notify+0x1d/0x34) [<c001e611>] (__cpu_notify+0x1d/0x34) from [<c03e5833>] (_cpu_down+0x63/0x1ac) [<c03e5833>] (_cpu_down+0x63/0x1ac) from [<c03e5997>] (cpu_down+0x1b/0x30) [<c03e5997>] (cpu_down+0x1b/0x30) from [<c03e60eb>] (store_online+0x27/0x54) [<c03e60eb>] (store_online+0x27/0x54) from [<c0295629>] (dev_attr_store+0x11/0x18) [<c0295629>] (dev_attr_store+0x11/0x18) from [<c00f9edd>] (sysfs_write_file+0xed/0x114) [<c00f9edd>] (sysfs_write_file+0xed/0x114) from [<c00b42a9>] (vfs_write+0x65/0xd8) [<c00b42a9>] (vfs_write+0x65/0xd8) from [<c00b4523>] (sys_write+0x2f/0x50) [<c00b4523>] (sys_write+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52) Of course this only impacted drivers which have have_governor_per_policy set to true. i.e. big LITTLE cpufreq driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-12 14:04:16 +02:00
Rafael J. Wysocki	1c3d85dd4e	cpufreq: Revert incorrect commit `5800043` Commit `5800043` (cpufreq: convert cpufreq_driver to using RCU) causes the following call trace to be spit on boot: BUG: sleeping function called from invalid context at /scratch/rafael/work/linux-pm/mm/slab.c:3179 in_atomic(): 0, irqs_disabled(): 0, pid: 292, name: systemd-udevd 2 locks held by systemd-udevd/292: #0: (subsys mutex){+.+.+.}, at: [<ffffffff8146851a>] subsys_interface_register+0x4a/0xe0 #1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81538210>] cpufreq_add_dev_interface+0x60/0x5e0 Pid: 292, comm: systemd-udevd Not tainted 3.9.0-rc8+ #323 Call Trace: [<ffffffff81072c90>] __might_sleep+0x140/0x1f0 [<ffffffff811581c2>] kmem_cache_alloc+0x42/0x2b0 [<ffffffff811e7179>] sysfs_new_dirent+0x59/0x130 [<ffffffff811e63cb>] sysfs_add_file_mode+0x6b/0x110 [<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0 [<ffffffff810a3254>] ? __lock_is_held+0x54/0x80 [<ffffffff811e647d>] sysfs_add_file+0xd/0x10 [<ffffffff811e6541>] sysfs_create_file+0x21/0x30 [<ffffffff81538280>] cpufreq_add_dev_interface+0xd0/0x5e0 [<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0 [<ffffffffa000337f>] ? acpi_processor_get_platform_limit+0x32/0xbb [processor] [<ffffffffa022f540>] ? do_drv_write+0x70/0x70 [acpi_cpufreq] [<ffffffff810a3254>] ? __lock_is_held+0x54/0x80 [<ffffffff8106c97e>] ? up_read+0x1e/0x40 [<ffffffff8106e632>] ? __blocking_notifier_call_chain+0x72/0xc0 [<ffffffff81538dbd>] cpufreq_add_dev+0x62d/0xae0 [<ffffffff815389b8>] ? cpufreq_add_dev+0x228/0xae0 [<ffffffff81468569>] subsys_interface_register+0x99/0xe0 [<ffffffffa014d000>] ? 0xffffffffa014cfff [<ffffffff81535d5d>] cpufreq_register_driver+0x9d/0x200 [<ffffffffa014d000>] ? 0xffffffffa014cfff [<ffffffffa014d0e9>] acpi_cpufreq_init+0xe9/0x1000 [acpi_cpufreq] [<ffffffff810002fa>] do_one_initcall+0x11a/0x170 [<ffffffff810b4b87>] load_module+0x1cf7/0x2920 [<ffffffff81322580>] ? ddebug_proc_open+0xb0/0xb0 [<ffffffff816baee0>] ? retint_restore_args+0xe/0xe [<ffffffff810b5887>] sys_init_module+0xd7/0x120 [<ffffffff816bb6d2>] system_call_fastpath+0x16/0x1b which is quite obvious, because that commit put (multiple instances of) sysfs_create_file() under rcu_read_lock()/rcu_read_unlock(), although sysfs_create_file() may cause memory to be allocated with GFP_KERNEL and that may sleep, which is not permitted in RCU read critical section. Revert the buggy commit altogether along with some changes on top of it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-29 00:08:16 +02:00
Viresh Kumar	820c6ca293	cpufreq: Don't call __cpufreq_governor() for drivers without target() Some cpufreq drivers implement their own governor and so don't need us to call generic governors interface via __cpufreq_governor(). Few recent commits haven't obeyed this law well and we saw some regressions. This patch is an attempt to fix the above issue. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-22 00:48:03 +02:00
Viresh Kumar	e4969ebac8	cpufreq: Call __cpufreq_governor() with correct policy->cpus mask __cpufreq_governor() must be called with a correct policy->cpus mask. In __cpufreq_remove_dev() we initially clear policy->cpus with cpumask_clear_cpu() and then call __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT). If the governor is doing some per-cpu stuff in EXIT callback, this can create uncertain behavior. Generic governors in drivers/cpufreq/ doesn't do any per-cpu stuff in EXIT callback and so we don't face any issues currently. But its better to keep the code clean, so we don't face any issues in future. Now, we call cpumask_clear_cpu() only when multiple cpus are managed by policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-11 22:50:09 +02:00
Nathan Zimmer	5800043b24	cpufreq: convert cpufreq_driver to using RCU We eventually would like to remove the rwlock cpufreq_driver_lock or convert it back to a spinlock and protect the read sections with RCU. The first step in that direction is to make cpufreq_driver use RCU. I don't see an easy wasy to protect the cpufreq_cpu_data structure with RCU, so I am leaving it with the rwlock for now since under certain configs __cpufreq_cpu_get is a hot spot with 256+ cores. [rjw: Subject, changelog, white space] Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-10 13:19:26 +02:00
Viresh Kumar	b43a7ffbf3	cpufreq: Notify all policy->cpus in cpufreq_notify_transition() policy->cpus contains all online cpus that have single shared clock line. And their frequencies are always updated together. Many SMP system's cpufreq drivers take care of this in individual drivers but the best place for this code is in cpufreq core. This patch modifies cpufreq_notify_transition() to notify frequency change for all cpus in policy->cpus and hence updates all users of this API. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Stephen Warren <swarren@nvidia.com> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-02 15:24:00 +02:00
Viresh Kumar	4d5dcc4211	cpufreq: governor: Implement per policy instances of governors Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch uses the infrastructure provided by earlier patch and implements init/exit routines for ondemand and conservative governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-01 01:11:34 +02:00
Viresh Kumar	7bd353a995	cpufreq: Add per policy governor-init/exit infrastructure Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch is inclined towards providing this infrastructure. Because we are required to allocate governor's resources dynamically now, we must do it at policy creation and end. And so got CPUFREQ_GOV_POLICY_INIT/EXIT. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-01 01:11:34 +02:00
Nathan Zimmer	0d1857a1b9	cpufreq: Convert the cpufreq_driver_lock to a rwlock This eliminates the contention I am seeing in __cpufreq_cpu_get. It also nicely stages the lock to be replaced by the rcu. Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-01 01:11:34 +02:00
Rafael J. Wysocki	4419fbd4b4	Merge branch 'pm-cpufreq' * pm-cpufreq: (55 commits) cpufreq / intel_pstate: Fix 32 bit build cpufreq: conservative: Fix typos in comments cpufreq: ondemand: Fix typos in comments cpufreq: exynos: simplify .init() for setting policy->cpus cpufreq: kirkwood: Add a cpufreq driver for Marvell Kirkwood SoCs cpufreq/x86: Add P-state driver for sandy bridge. cpufreq_stats: do not remove sysfs files if frequency table is not present cpufreq: Do not track governor name for scaling drivers with internal governors. cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target() cpufreq: Retrieve current frequency from scaling drivers with internal governors cpufreq: Fix locking issues cpufreq: Create a macro for unlock_policy_rwsem{read,write} cpufreq: Remove unused HOTPLUG_CPU code cpufreq: governors: Fix WARN_ON() for multi-policy platforms cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold cpufreq / stats: Get rid of CPUFREQ_STATDEVICE_ATTR cpufreq: Don't check cpu_online(policy->cpu) cpufreq: add imx6q-cpufreq driver cpufreq: Don't remove sysfs link for policy->cpu cpufreq: Remove unnecessary use of policy->shared_type ...	2013-02-15 13:59:07 +01:00
Dirk Brandewie	fa69e33f7d	cpufreq: Do not track governor name for scaling drivers with internal governors. Scaling drivers that implement internal governors do not have governor structures assocaited with them. Only track the name of the governor associated with the CPU if the driver does not implement cpufreq_driver.setpolicy() Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 12:55:53 +01:00
Dirk Brandewie	f6b0515b07	cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target() Scaling drivers that implement cpufreq_driver.setpolicy() have internal governors that do not signal changes via cpufreq_notify_transition() so the frequncy in the policy will almost certainly be different than the current frequncy. Only call cpufreq_out_of_sync() when the underlying driver implements cpufreq_driver.target() Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 12:55:47 +01:00
Dirk Brandewie	9e21ba8bd8	cpufreq: Retrieve current frequency from scaling drivers with internal governors Scaling drivers that implement the cpufreq_driver.setpolicy() versus the cpufreq_driver.target() interface do not set policy->cur. Normally policy->cur is set during the call to cpufreq_driver.target() when the frequnecy request is made by the governor. If the scaling driver implements cpufreq_driver.setpolicy() and cpufreq_driver.get() interfaces use cpufreq_driver.get() to retrieve the current frequency. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 12:55:03 +01:00
Viresh Kumar	2eaa3e2df1	cpufreq: Fix locking issues cpufreq core uses two locks: - cpufreq_driver_lock: General lock for driver and cpufreq_cpu_data array. - cpu_policy_rwsemfix locking: per CPU reader-writer semaphore designed to cure all cpufreq/hotplug/workqueue/etc related lock issues. These locks were not used properly and are placed against their principle (present before their definition) at various places. This patch is an attempt to fix their use. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:22:57 +01:00
Viresh Kumar	fa1d8af47f	cpufreq: Create a macro for unlock_policy_rwsem{read,write} On the lines of macro: lock_policy_rwsem, we can create another macro for unlock_policy_rwsem. Lets do it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:22:06 +01:00
Viresh Kumar	65922465b5	cpufreq: Remove unused HOTPLUG_CPU code Because the sibling cpu of any online cpu is identified very early in cpufreq_add_dev(), below code is never executed. And so can be removed. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:21:37 +01:00
Viresh Kumar	8e53695f7f	cpufreq: governors: Fix WARN_ON() for multi-policy platforms On multi-policy systems there is a single instance of governor for both the policies (if same governor is chosen for both policies). With the code update from following patches: `8eeed09` cpufreq: governors: Get rid of dbs_data->enable field `b394058` cpufreq: governors: Reset tunables only for cpufreq_unregister_governor() We are creating/removing sysfs directory of governor for for every call to GOV_START and STOP. This would fail for multi-policy system as there is a per-policy call to START/STOP. This patch reuses the governor->initialized variable to detect total users of governor. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:21:13 +01:00
Viresh Kumar	3361b7b173	cpufreq: Don't check cpu_online(policy->cpu) policy->cpu or cpus in policy->cpus can't be offline anymore. And so we don't need to check if they are online or not. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:18:34 +01:00
Viresh Kumar	73bf0fc2b0	cpufreq: Don't remove sysfs link for policy->cpu "cpufreq" directory in policy->cpu is never created using sysfs_create_link(), but using kobject_init_and_add(). And so we shouldn't call sysfs_remove_link() for policy->cpu(). sysfs stuff for policy->cpu is automatically removed when we call kobject_put() for dying policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-05 22:21:14 +01:00
Viresh Kumar	b394058f06	cpufreq: governors: Reset tunables only for cpufreq_unregister_governor() Currently, whenever governor->governor() is called for CPUFRREQ_GOV_START event we reset few tunables of governor. Which isn't correct, as this routine is called for every cpu hot-[un]plugging event. We should actually be resetting these only when the governor module is removed and re-installed. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 01:29:31 +01:00
Viresh Kumar	fcf8058296	cpufreq: Simplify cpufreq_add_dev() Currently cpufreq_add_dev() firsts allocates policy, calls driver->init() and then checks if this CPU is already managed or not. And if it is already managed, its policy is freed. We can save all this if we somehow know that CPU is managed or not in advance. policy->related_cpus contains the list of all valid sibling CPUs of policy->cpu. We can check this to see if the current CPU is already managed. From now on, platforms don't really need to set related_cpus from their init() routines, as the same work is done by core too. If a platform driver needs to set the related_cpus mask with some additional CPUs, other than CPUs present in policy->cpus, they are free to do it, though, as we don't override anything. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:16 +01:00
Viresh Kumar	b26f72042e	cpufreq: Revert "cpufreq: Don't use cpu removed during cpufreq_driver_unregister" This reverts commit 956f339 "cpufreq: Don't use cpu removed during cpufreq_driver_unregister". With the addition of the following commit, this change/variable is not required any more: commit b9ba2725343ae57add3f324dfa5074167f48de96 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Mon Jan 14 13:23:03 2013 +0000 cpufreq: Simplify __cpufreq_remove_dev() [rjw: Subject and changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:16 +01:00
Borislav Petkov	9d95046e5d	cpufreq: Add a get_current_driver helper Add a helper function to return cpufreq_driver->name. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:15 +01:00
Dirk Brandewie	d5aaffa9dd	cpufreq: handle cpufreq being disabled for all exported function. When disable_cpufreq() is called some exported functions are still being used that do not have a check for cpufreq being disabled. Add a disabled check into cpufreq_cpu_get() to return NULL if cpufreq is disabled this covers most of the exported functions. For the exported functions that do not call cpufreq_cpu_get() add an explicit check. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	b8eed8af94	cpufreq: Simplify __cpufreq_remove_dev() __cpufreq_remove_dev() is called on multiple occasions: cpufreq_driver unregister and cpu removals. Current implementation of this routine is overly complex without much need. If the cpu to be removed is the policy->cpu, we remove the policy first and add all other cpus again from policy->cpus and then finally call __cpufreq_remove_dev() again to remove the cpu to be deleted. Haahhhh.. There exist a simple solution to removal of a cpu: - Simply use the old policy structure - update its fields like: policy->cpu, etc. - notify any users of cpufreq, which depend on changing policy->cpu Hence this patch, which tries to implement the above theory. It is tested well by myself on ARM big.LITTLE TC2 SoC, which has 5 cores (2 A15 and 3 A7). Both A15's share same struct policy and all A7's share same policy structure. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	6954ca9c8b	cpufreq: Don't use cpu removed during cpufreq_driver_unregister This is how the core works: cpufreq_driver_unregister() - subsys_interface_unregister() - for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we unregister. cpufreq_remove_dev(): - Remove policy node - Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu. i.e. When cpu 0 is removed, we call it for cpu 1. And when called for cpu 2, we call it for cpu 3. - cpufreq_add_dev() would call cpufreq_driver->init() - init would return mask as AND of 2, 3 and 4 for cluster A7. - cpufreq core would do online_cpu && policy->cpus Here is the BUG(). Because cpu hasn't died but we have just unregistered the cpufreq driver, online cpu would still have cpu 2 in it. And so thing go bad again. Solution: Keep cpumask of cpus that are registered with cpufreq core and clear cpus when we get a call from subsys_interface_unregister() via cpufreq_remove_dev(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	f6a7409cab	cpufreq: Notify governors when cpus are hot-[un]plugged Because cpufreq core and governors worry only about the online cpus, if a cpu is hot [un]plugged, we must notify governors about it, otherwise be ready to expect something unexpected. We already have notifiers in the form of CPUFREQ_GOV_START/CPUFREQ_GOV_STOP, we just need to call them now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	643ae6e81d	cpufreq: Manage only online cpus cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors. We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Paul Gortmaker	43720bd601	PM / tracing: remove deprecated power trace API The text in Documentation said it would be removed in 2.6.41; the text in the Kconfig said removal in the 3.1 release. Either way you look at it, we are well past both, so push it off a cliff. Note that the POWER_CSTATE and the POWER_PSTATE are part of the legacy tracing API. Remove all tracepoints which use these flags. As can be seen from context, most already have a trace entry via trace_cpu_idle anyways. Also, the cpufreq/cpufreq.c PSTATE one is actually unpaired, as compared to the CSTATE ones which all have a clear start/stop. As part of this, the trace_power_frequency also becomes orphaned, so it too is deleted. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-01-26 00:39:12 +01:00
Jingoo Han	f55c9c2627	cpufreq: Remove unnecessary initialization of a local variable Remove an unnecessary initializer for the 'ret' variable in __cpufreq_set_policy(). [rjw: Modified the subject and changelog slightly.] Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:09 +01:00
Viresh Kumar	7249924e53	cpufreq: Make sure target freq is within limits __cpufreq_driver_target() must not pass target frequency beyond the limits of current policy. Today most of cpufreq platform drivers are doing this check in their target routines. Why not move it to __cpufreq_driver_target()? Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:09 +01:00
Viresh Kumar	5a1c022850	cpufreq: Avoid calling cpufreq driver's target() routine if target_freq == policy->cur Avoid calling cpufreq driver's target() routine if new frequency is same as policies current frequency. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:09 +01:00
Viresh Kumar	da58445570	cpufreq: Fix sparse warning by making local function static cpufreq_disabled() is a local function, so should be marked static. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:08 +01:00
Viresh Kumar	0676f7f2e7	cpufreq: return early from __cpufreq_driver_getavg() There is no need to do cpufreq_get_cpu() and cpufreq_put_cpu() for drivers that don't support getavg() routine. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:07 +01:00
Viresh Kumar	db7011516c	cpufreq: Improve debug prints With debug options on, it is difficult to locate cpufreq core's debug prints. Fix this by prefixing debug prints with KBUILD_MODNAME. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:06 +01:00
viresh kumar	4b972f0b04	cpufreq / core: Fix printing of governor and driver name Arrays for governer and driver name are of size CPUFREQ_NAME_LEN or 16. i.e. 15 bytes for name and 1 for trailing '\0'. When cpufreq driver print these names (for sysfs), it includes '\n' or ' ' in the fmt string and still passes length as CPUFREQ_NAME_LEN. If the driver or governor names are using all 15 fields allocated to them, then the trailing '\n' or ' ' will never be printed. And so commands like: root@linaro-developer# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver will print something like: cpufreq_foodrvroot@linaro-developer# Fix this by increasing print length by one character. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:06 +01:00
viresh kumar	8bf1ac7236	cpufreq / core: Fix typo in comment describing show_bios_limit() show_bios_limit is mistakenly written as show_scaling_driver in a comment describing purpose of show_bios_limit() routine. Fix it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:05 +01:00
Stephen Boyd	a914443627	cpufreq: Fix sysfs deadlock with concurrent hotplug/frequency switch Running one program that continuously hotplugs and replugs a cpu concurrently with another program that continuously writes to the scaling_setspeed node eventually deadlocks with: ============================================= [ INFO: possible recursive locking detected ] 3.4.0 #37 Tainted: G W --------------------------------------------- filemonkey/122 is trying to acquire lock: (s_active#13){++++.+}, at: [<c01a3d28>] sysfs_remove_dir+0x9c/0xb4 but task is already holding lock: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(s_active#13); lock(s_active#13); * DEADLOCK * May be due to missing lock nesting notation 2 locks held by filemonkey/122: #0: (&buffer->mutex){+.+.+.}, at: [<c01a2230>] sysfs_write_file+0x28/0x140 #1: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 stack backtrace: [<c0014fcc>] (unwind_backtrace+0x0/0x120) from [<c00ca600>] (validate_chain+0x6f8/0x1054) [<c00ca600>] (validate_chain+0x6f8/0x1054) from [<c00cb778>] (__lock_acquire+0x81c/0x8d8) [<c00cb778>] (__lock_acquire+0x81c/0x8d8) from [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) from [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) from [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) from [<c02d0e5c>] (kobject_del+0x10/0x38) [<c02d0e5c>] (kobject_del+0x10/0x38) from [<c02d0f74>] (kobject_release+0xf0/0x194) [<c02d0f74>] (kobject_release+0xf0/0x194) from [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) from [<c05683f0>] (store+0x6c/0x74) [<c05683f0>] (store+0x6c/0x74) from [<c01a2314>] (sysfs_write_file+0x10c/0x140) [<c01a2314>] (sysfs_write_file+0x10c/0x140) from [<c014af44>] (vfs_write+0xb0/0x128) [<c014af44>] (vfs_write+0xb0/0x128) from [<c014b06c>] (sys_write+0x3c/0x68) [<c014b06c>] (sys_write+0x3c/0x68) from [<c000e0e0>] (ret_fast_syscall+0x0/0x3c) This is because store() in cpufreq.c indirectly calls kobject_get() via cpufreq_cpu_get() and is the last one to call kobject_put() via cpufreq_cpu_put(). Sysfs code should not call kobject_get() or kobject_put() directly (see the comment around sysfs_schedule_callback() for more information). Fix this deadlock by introducing two new functions: struct cpufreq_policy cpufreq_cpu_get_sysfs(unsigned int cpu) void cpufreq_cpu_put_sysfs(struct cpufreq_policy data) which do the same thing as cpufreq_cpu_{get,put}() but don't call kobject functions. To easily trigger this deadlock you can insert an msleep() with a reasonably large value right after the fail label at the bottom of the store() function in cpufreq.c and then write scaling_setspeed in one task and offline the cpu in another. The first task will hang and be detected by the hung task detector. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2012-07-20 21:39:25 +02:00
Konrad Rzeszutek Wilk	a7b422cda5	provide disable_cpufreq() function to disable the API. useful for disabling cpufreq altogether. The cpu frequency scaling drivers and cpu frequency governors will fail to register. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>	2012-03-14 14:45:03 -04:00
Linus Torvalds	02d929502c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits) [CPUFREQ] EXYNOS: Removed useless headers and codes [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information [CPUFREQ] powernow-k8: Fix indexing issue [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB [CPUFREQ] update lpj only if frequency has changed [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init() [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages cpufreq: OMAP: fixup for omap_device changes, include <linux/module.h> cpufreq: OMAP: fix freq_table leak cpufreq: OMAP: put clk if cpu_init failed cpufreq: OMAP: only supports OPP library cpufreq: OMAP: dont support !freq_table cpufreq: OMAP: deny initialization if no mpudev cpufreq: OMAP: move clk name decision to init cpufreq: OMAP: notify even with bad boot frequency ...	2012-01-11 18:53:33 -08:00
Afzal Mohammed	d08de0c19c	[CPUFREQ] update lpj only if frequency has changed During scaling up of cpu frequency, loops_per_jiffy is updated upon invoking PRECHANGE notifier. If setting to new frequency fails in cpufreq driver, lpj is left at incorrect value. Hence update lpj only if cpu frequency is changed, i.e. upon invoking POSTCHANGE notifier. Penalty would be that during time period between changing cpu frequency & invocation of POSTCHANGE notifier, udelay(x) may not gurantee minimal delay of 'x' us for frequency scaling up operation. Perhaps a better solution would be to define CPUFREQ_ABORTCHANGE & handle accordingly, but then it would be more intrusive (using ABORTCHANGE may help drivers also; if any has registered notifier and expect POST for a PRECHANGE, their needs can be taken care using ABORT) Signed-off-by: Afzal Mohammed <afzal@ti.com> Signed-off-by: Dave Jones <davej@redhat.com>	2012-01-06 10:10:53 -05:00
Kay Sievers	8a25a2fd12	cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are made available with this conversion. Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Len Brown <lenb@kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 14:29:42 -08:00
Jesse Barnes	3d73710880	cpufreq: expose a cpufreq_quick_get_max routine This allows drivers and other code to get the max reported CPU frequency. Initial use is for scaling ring frequency with GPU frequency in the i915 driver. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-28 13:54:26 -07:00
Kees Cook	1a8e1463a4	[CPUFREQ] remove redundant sprintf from request_module call. Since format string handling is part of request_module, there is no need to construct the module name. As such, drop the redundant sprintf and heap usage. Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Dave Jones <davej@redhat.com>	2011-05-04 11:50:58 -04:00
Dominik Brodowski	2d06d8c49a	[CPUFREQ] use dynamic debug instead of custom infrastructure With dynamic debug having gained the capability to report debug messages also during the boot process, it offers a far superior interface for debug messages than the custom cpufreq infrastructure. As a first step, remove the old cpufreq_debug_printk() function and replace it with a call to the generic pr_debug() function. How can dynamic debug be used on cpufreq? You need a kernel which has CONFIG_DYNAMIC_DEBUG enabled. To enabled debugging during runtime, mount debugfs and $ echo -n 'module cpufreq +p' > /sys/kernel/debug/dynamic_debug/control for debugging the complete "cpufreq" module. To achieve the same goal during boot, append ddebug_query="module cpufreq +p" as a boot parameter to the kernel of your choice. For more detailled instructions, please see Documentation/dynamic-debug-howto.txt Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>	2011-05-04 11:50:57 -04:00
Jacob Shin	27ecddc2a9	[CPUFREQ] CPU hotplug, re-create sysfs directory and symlinks When we discover CPUs that are affected by each other's frequency/voltage transitions, the first CPU gets a sysfs directory created, and rest of the siblings get symlinks. Currently, when we hotplug off only the first CPU, all of the symlinks and the sysfs directory gets removed. Even though rest of the siblings are still online and functional, they are orphaned, and no longer governed by cpufreq. This patch, given the above scenario, creates a sysfs directory for the first sibling and symlinks for the rest of the siblings. Please note the recursive call, it was rather too ugly to roll it out. And the removal of redundant NULL setting (it is already taken care of near the top of the function). Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com> Cc: stable@kernel.org	2011-05-04 11:50:57 -04:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Rafael J. Wysocki	e00e56dfd3	cpufreq: Use syscore_ops for boot CPU suspend/resume (v2) The cpufreq subsystem uses sysdev suspend and resume for executing cpufreq_suspend() and cpufreq_resume(), respectively, during system suspend, after interrupts have been switched off on the boot CPU, and during system resume, while interrupts are still off on the boot CPU. In both cases the other CPUs are off-line at the relevant point (either they have been switched off via CPU hotplug during suspend, or they haven't been switched on yet during resume). For this reason, although it may seem that cpufreq_suspend() and cpufreq_resume() are executed for all CPUs in the system, they are only called for the boot CPU in fact, which is quite confusing. To remove the confusion and to prepare for elimiating sysdev suspend and resume operations from the kernel enirely, convernt cpufreq to using a struct syscore_ops object for the boot CPU suspend and resume and rename the callbacks so that their names reflect their purpose. In addition, put some explanatory remarks into their kerneldoc comments. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-03-23 22:16:32 +01:00

1 2 3 4

193 Commits