Pull sched/idle changes from Ingo Molnar:
"More idle code reorganization, to prepare for more integration.
(Sent separately because it depended on pending timer work, which is
now upstream)"
* 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/idle: Add more comments to the code
sched/idle: Move idle conditions in cpuidle_idle main function
sched/idle: Reorganize the idle loop
cpuidle/idle: Move the cpuidle_idle_call function to idle.c
idle/cpuidle: Split cpuidle_idle_call main function into smaller functions
Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt:
"This is the branch I mentioned in my other pull request which contains
our improved cpuidle support for the "powernv" platform
(non-virtualized).
It adds support for the "fast sleep" feature of the processor which
provides higher power savings than our usual "nap" mode but at the
cost of losing the timers while asleep, and thus exploits the new
timer broadcast framework to work around that limitation.
It's based on a tip timer tree that you seem to have already merged"
* 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
cpuidle/powernv: Parse device tree to setup idle states
cpuidle/powernv: Add "Fast-Sleep" CPU idle state
powerpc/powernv: Add OPAL call to resync timebase on wakeup
powerpc/powernv: Add context management for Fast Sleep
powerpc: Split timer_interrupt() into timer handling and interrupt handling routines
powerpc: Implement tick broadcast IPI as a fixed IPI message
powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message
Pull core block layer updates from Jens Axboe:
"This is the pull request for the core block IO bits for the 3.15
kernel. It's a smaller round this time, it contains:
- Various little blk-mq fixes and additions from Christoph and
myself.
- Cleanup of the IPI usage from the block layer, and associated
helper code. From Frederic Weisbecker and Jan Kara.
- Duplicate code cleanup in bio-integrity from Gu Zheng. This will
give you a merge conflict, but that should be easy to resolve.
- blk-mq notify spinlock fix for RT from Mike Galbraith.
- A blktrace partial accounting bug fix from Roman Pen.
- Missing REQ_SYNC detection fix for blk-mq from Shaohua Li"
* 'for-3.15/core' of git://git.kernel.dk/linux-block: (25 commits)
blk-mq: add REQ_SYNC early
rt,blk,mq: Make blk_mq_cpu_notify_lock a raw spinlock
blk-mq: support partial I/O completions
blk-mq: merge blk_mq_insert_request and blk_mq_run_request
blk-mq: remove blk_mq_alloc_rq
blk-mq: don't dump CPU -> hw queue map on driver load
blk-mq: fix wrong usage of hctx->state vs hctx->flags
blk-mq: allow blk_mq_init_commands() to return failure
block: remove old blk_iopoll_enabled variable
blktrace: fix accounting of partially completed requests
smp: Rename __smp_call_function_single() to smp_call_function_single_async()
smp: Remove wait argument from __smp_call_function_single()
watchdog: Simplify a little the IPI call
smp: Move __smp_call_function_single() below its safe version
smp: Consolidate the various smp_call_function_single() declensions
smp: Teach __smp_call_function_single() to check for offline cpus
smp: Remove unused list_head from csd
smp: Iterate functions through llist_for_each_entry_safe()
block: Stop abusing rq->csd.list in blk-softirq
block: Remove useless IPI struct initialization
...
- Device PM QoS support for latency tolerance constraints on systems with
hardware interfaces allowing such constraints to be specified. That is
necessary to prevent hardware-driven power management from becoming
overly aggressive on some systems and to prevent power management
features leading to excessive latencies from being used in some cases.
- Consolidation of the handling of ACPI hotplug notifications for device
objects. This causes all device hotplug notifications to go through
the root notify handler (that was executed for all of them anyway
before) that propagates them to individual subsystems, if necessary,
by executing callbacks provided by those subsystems (those callbacks
are associated with struct acpi_device objects during device
enumeration). As a result, the code in question becomes both smaller
in size and more straightforward and all of those changes should not
affect users.
- ACPICA update, including fixes related to the handling of _PRT in cases
when it is broken and the addition of "Windows 2013" to the list of
supported "features" for _OSI (which is necessary to support systems
that work incorrectly or don't even boot without it). Changes from
Bob Moore and Lv Zheng.
- Consolidation of ACPI _OST handling from Jiang Liu.
- ACPI battery and AC fixes allowing unusual system configurations to
be handled by that code from Alexander Mezin.
- New device IDs for the ACPI LPSS driver from Chiau Ee Chew.
- ACPI fan and thermal optimizations related to system suspend and resume
from Aaron Lu.
- Cleanups related to ACPI video from Jean Delvare.
- Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan Tianyu,
Paul Bolle, Tomasz Nowicki.
- Intel RAPL (Running Average Power Limits) driver cleanups from Jacob Pan.
- intel_pstate fixes and cleanups from Dirk Brandewie.
- cpufreq fixes related to system suspend/resume handling from Viresh Kumar.
- cpufreq core fixes and cleanups from Viresh Kumar, Stratos Karafotis,
Saravana Kannan, Rashika Kheria, Joe Perches.
- cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob Herring.
- cpuidle fixes related to the menu governor from Tuukka Tikkanen.
- cpuidle fix related to coupled CPUs handling from Paul Burton.
- Asynchronous execution of all device suspend and resume callbacks,
except for ->prepare and ->complete, during system suspend and resume
from Chuansheng Liu.
- Delayed resuming of runtime-suspended devices during system suspend for
the PCI bus type and ACPI PM domain.
- New set of PM helper routines to allow device runtime PM callbacks to
be used during system suspend and resume more easily from Ulf Hansson.
- Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.
- devfreq fix from Saravana Kannan.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTLgB1AAoJEILEb/54YlRxfs4P/35fIu9h8ClNWUPXqi3nlGIt
yMyumKvF1VdsOKLbjTtFq6B3UOlhqDijYTCQd7Xt7X8ONTk/ND9ec2t/5xGkSdUI
q46fa0qZXeqUn0Kt2t+kl6tgVQOkDj94aNlEh+7Ya3Uu6WYDDfmZtOBOFAMk6D8l
ND4rHJpX+eUsRLBrcxaUxxdD8AW5guGcPKyeyzsXv1bY1BZnpLFrZ3PhuI5dn2CL
L/zmk3A+wG6+ZlQxnwDdrKa3E6uhRSIDeF0vI4Byspa1wi5zXknJG2J7MoQ9JEE9
VQpBXlqach5wgXqJ8PAqAeaB6Ie26/F7PYG8r446zKw/5UUtdNUx+0dkjQ7Mz8Tu
ajuVxfwrrPhZeQqmVBxlH5Gg7Ez2KBKEfDxTdRnzI7FoA7PE5XDcg3kO64bhj8LJ
yugnV/ToU9wMztZnPC7CoGPwUgxMJvr9LwmxS4aeKcVUBES05eg0vS3lwdZMgqkV
iO0QkWTmhZ952qZCqZxbh0JqaaX8Wgx2kpX2tf1G2GJqLMZco289bLh6njNT+8CH
EzdQKYYyn6G6+Qg2M0f/6So3qU17x9XtE4ZBWQdGDpqYOGZhjZAOs/VnB1Ysw/K3
cDBzswlJd0CyyUps9B+qbf49OpbWVwl5kKeuHUuPxugEVryhpSp9AuG+tNil74Sj
JuGTGR4fyFjDBX5cvAPm
=ywR6
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"The majority of this material spent some time in linux-next, some of
it even several weeks. There are a few relatively fresh commits in
it, but they are mostly fixes and simple cleanups.
ACPI took the lead this time, both in terms of the number of commits
and the number of modified lines of code, cpufreq follows and there
are a few changes in the PM core and in cpuidle too.
A new feature that already got some LWN.net's attention is the device
PM QoS extension allowing latency tolerance requirements to be
propagated from leaf devices to their ancestors with hardware
interfaces for specifying latency tolerance. That should help systems
with hardware-driven power management to avoid going too far with it
in cases when there are latency tolerance constraints.
There also are some significant changes in the ACPI core related to
the way in which hotplug notifications are handled. They affect PCI
hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line
is that all those notification now go through the root notify handler
and are propagated to the interested subsystems by means of callbacks
instead of having to install a notify handler for each device object
that we can potentially get hotplug notifications for.
In addition to that ACPICA will now advertise "Windows 2013"
compatibility for _OSI, because some systems out there don't work
correctly if that is not done (some of them don't even boot).
On the system suspend side of things, all of the device suspend and
resume callbacks, except for ->prepare() and ->complete(), are now
going to be executed asynchronously as that turns out to speed up
system suspend and resume on some platforms quite significantly and we
have a few more optimizations in that area.
Apart from that, there are some new device IDs and fixes and cleanups
all over. In particular, the system suspend and resume handling by
cpufreq should be improved and the cpuidle menu governor should be a
bit more robust now.
Specifics:
- Device PM QoS support for latency tolerance constraints on systems
with hardware interfaces allowing such constraints to be specified.
That is necessary to prevent hardware-driven power management from
becoming overly aggressive on some systems and to prevent power
management features leading to excessive latencies from being used
in some cases.
- Consolidation of the handling of ACPI hotplug notifications for
device objects. This causes all device hotplug notifications to go
through the root notify handler (that was executed for all of them
anyway before) that propagates them to individual subsystems, if
necessary, by executing callbacks provided by those subsystems
(those callbacks are associated with struct acpi_device objects
during device enumeration). As a result, the code in question
becomes both smaller in size and more straightforward and all of
those changes should not affect users.
- ACPICA update, including fixes related to the handling of _PRT in
cases when it is broken and the addition of "Windows 2013" to the
list of supported "features" for _OSI (which is necessary to
support systems that work incorrectly or don't even boot without
it). Changes from Bob Moore and Lv Zheng.
- Consolidation of ACPI _OST handling from Jiang Liu.
- ACPI battery and AC fixes allowing unusual system configurations to
be handled by that code from Alexander Mezin.
- New device IDs for the ACPI LPSS driver from Chiau Ee Chew.
- ACPI fan and thermal optimizations related to system suspend and
resume from Aaron Lu.
- Cleanups related to ACPI video from Jean Delvare.
- Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan
Tianyu, Paul Bolle, Tomasz Nowicki.
- Intel RAPL (Running Average Power Limits) driver cleanups from
Jacob Pan.
- intel_pstate fixes and cleanups from Dirk Brandewie.
- cpufreq fixes related to system suspend/resume handling from Viresh
Kumar.
- cpufreq core fixes and cleanups from Viresh Kumar, Stratos
Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches.
- cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob
Herring.
- cpuidle fixes related to the menu governor from Tuukka Tikkanen.
- cpuidle fix related to coupled CPUs handling from Paul Burton.
- Asynchronous execution of all device suspend and resume callbacks,
except for ->prepare and ->complete, during system suspend and
resume from Chuansheng Liu.
- Delayed resuming of runtime-suspended devices during system suspend
for the PCI bus type and ACPI PM domain.
- New set of PM helper routines to allow device runtime PM callbacks
to be used during system suspend and resume more easily from Ulf
Hansson.
- Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.
- devfreq fix from Saravana Kannan"
* tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs
PM / sleep: Correct whitespace errors in <linux/pm.h>
intel_pstate: Set core to min P state during core offline
cpufreq: Add stop CPU callback to cpufreq_driver interface
cpufreq: Remove unnecessary braces
cpufreq: Fix checkpatch errors and warnings
cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs
MAINTAINERS: Reorder maintainer addresses for PM and ACPI
PM / Runtime: Update runtime_idle() documentation for return value meaning
video / output: Drop display output class support
fujitsu-laptop: Drop unneeded include
acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL
ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL
ACPI / video: fix ACPI_VIDEO dependencies
cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE}
cpufreq: Do not allow ->setpolicy drivers to provide ->target
cpufreq: arm_big_little: set 'physical_cluster' for each CPU
cpufreq: arm_big_little: make vexpress driver depend on bL core driver
ACPI / button: Add ACPI Button event via netlink routine
ACPI: Remove duplicate definitions of PREFIX
...
Pull timer changes from Thomas Gleixner:
"This assorted collection provides:
- A new timer based timer broadcast feature for systems which do not
provide a global accessible timer device. That allows those
systems to put CPUs into deep idle states where the per cpu timer
device stops.
- A few NOHZ_FULL related improvements to the timer wheel
- The usual updates to timer devices found in ARM SoCs
- Small improvements and updates all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
tick: Remove code duplication in tick_handle_periodic()
tick: Fix spelling mistake in tick_handle_periodic()
x86: hpet: Use proper destructor for delayed work
workqueue: Provide destroy_delayed_work_on_stack()
clocksource: CMT, MTU2, TMU and STI should depend on GENERIC_CLOCKEVENTS
timer: Remove code redundancy while calling get_nohz_timer_target()
hrtimer: Rearrange comments in the order struct members are declared
timer: Use variable head instead of &work_list in __run_timers()
clocksource: exynos_mct: silence a static checker warning
arm: zynq: Add support for cpufreq
arm: zynq: Don't use arm_global_timer with cpufreq
clocksource/cadence_ttc: Overhaul clocksource frequency adjustment
clocksource/cadence_ttc: Call clockevents_update_freq() with IRQs enabled
clocksource: Add Kconfig entries for CMT, MTU2, TMU and STI
sh: Remove Kconfig entries for TMU, CMT and MTU2
ARM: shmobile: Remove CMT, TMU and STI Kconfig entries
clocksource: armada-370-xp: Use atomic access for shared registers
clocksource: orion: Use atomic access for shared registers
clocksource: timer-keystone: Delete unnecessary variable
clocksource: timer-keystone: introduce clocksource driver for Keystone
...
As described by a comment at the end of cpuidle_enter_state_coupled it
can be inefficient for coupled idle states to return with IRQs enabled
since they may proceed to service an interrupt instead of clearing the
coupled idle state. Until they have finished & cleared the idle state
all CPUs coupled with them will spin rather than being able to enter a
safe idle state.
Commits e1689795a7 "cpuidle: Add common time keeping and irq
enabling" and 554c06ba3e "cpuidle: remove en_core_tk_irqen flag" led
to the cpuidle_enter_state enabling interrupts for all idle states,
including coupled ones, making this inefficiency unavoidable by drivers
& the local_irq_enable near the end of cpuidle_enter_state_coupled
redundant. This patch avoids enabling interrupts in cpuidle_enter_state
after a coupled state has been entered, allowing them to remain disabled
until all coupled CPUs have exited the idle state and
cpuidle_enter_state_coupled re-enables them.
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpuidle_idle_call does nothing more than calling the three individuals
function and is no longer used by any arch specific code but only in the
cpuidle framework code.
We can move this function into the idle task code to ensure better
proximity to the scheduler code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-2-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to allow better integration between the cpuidle framework and the
scheduler, reducing the distance between these two sub-components will
facilitate this integration by moving part of the cpuidle code in the idle
task file and, because idle.c is in the sched directory, we have access to
the scheduler's private structures.
This patch splits the cpuidle_idle_call main entry function into 3 calls
to a newly added API:
1. select the idle state
2. enter the idle state
3. reflect the idle state
The cpuidle_idle_call calls these three functions to implement the main
idle entry function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-1-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For some platforms, a poll state is inserted in the cpuidle driver states.
The flags for the state do not indicate that timekeeping is not affected.
As the state does not do anything apart from calling cpu_relax(), the
times returned by ktime_get should remain valid. Add the missing flag.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The menu governor performance multiplier defines a minimum predicted
idle duration to latency ratio. Instead of checking this separately
in every iteration of the state selection loop, adjust the overall
latency restriction for the whole loop if this restriction is tighter
than what is set by the QoS subsystem.
The original test
s->exit_latency * multiplier > data->predicted_us
becomes
s->exit_latency > data->predicted_us / multiplier
by dividing both sides of the comparison by "multiplier".
While division is likely to be several times slower than multiplication,
the minor performance hit allows making a generic sleep state selection
function based on (sleep duration, maximum latency) tuple.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The menu governor statistics update function tries to determine the
amount of time between entry to low power state and the occurrence
of the wakeup event. However, the time measured by the framework
includes exit latency on top of the desired value. This exit latency
is substracted from the measured value to obtain the desired value.
When measured value is not available, the menu governor assumes
the wakeup was caused by the timer and the time is equal to remaining
timer length. No exit latency should be substracted from this value.
This patch prevents the erroneous substraction and clarifies the
associated comment. It also removes one intermediate variable that
serves no purpose.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The menu governor uses coefficients as one method of actual idle
period length estimation. The coefficients are, as detailed below,
multipliers giving expected idle period length from time until next
timer expiry. The multipliers are supposed to have domain of (0..1].
The coefficients are fractions where only the numerators are stored
and denominators are a shared constant RESOLUTION*DECAY. Since the
value of the coefficient should always be greater than 0 and less
than or equal to 1, the numerator must have a value greater than
0 and less than or equal to RESOLUTION*DECAY.
If the coefficients are updated with measured idle durations exceeding
timer length, the multiplier may reach values exceeding unity (i.e.
the stored numerator exceeds RESOLUTION*DECAY). This patch ensures that
the multipliers are updated with durations capped to timer length.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently menu governor records the exit latency of the state it has
chosen for the idle period. The stored latency value is then later
used to calculate the actual length of the idle period. This value
may however be incorrect, as the entered state may not be the one
chosen by the governor. The entered state information is available,
so we can use that to obtain the real exit latency.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The field expected_us is used to store the time remaining until next
timer expiry. The name is inaccurate, as we really do not expect all
wakeups to be caused by timers. In addition, another field with a very
similar name (predicted_us) is used to store the predicted time
remaining until any wakeup source being active.
This patch renames expected_us to next_timer_us in order to better
reflect the contained information.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add deep idle states such as nap and fast sleep to the cpuidle state table
only if they are discovered from the device tree during cpuidle initialization.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fast sleep is one of the deep idle states on Power8 in which local timers of
CPUs stop. On PowerPC we do not have an external clock device which can
handle wakeup of such CPUs. Now that we have the support in the tick broadcast
framework for archs that do not sport such a device and the low level support
for fast sleep, enable it in the cpuidle framework on PowerNV.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Avoid heavy conflicts caused by WIP patches in drivers/cpuidle/cpuidle.c,
by merging these into a single base.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The name __smp_call_function_single() doesn't tell much about the
properties of this function, especially when compared to
smp_call_function_single().
The comments above the implementation are also misleading. The main
point of this function is actually not to be able to embed the csd
in an object. This is actually a requirement that result from the
purpose of this function which is to raise an IPI asynchronously.
As such it can be called with interrupts disabled. And this feature
comes at the cost of the caller who then needs to serialize the
IPIs on this csd.
Lets rename the function and enhance the comments so that they reflect
these properties.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The main point of calling __smp_call_function_single() is to send
an IPI in a pure asynchronous way. By embedding a csd in an object,
a caller can send the IPI without waiting for a previous one to complete
as is required by smp_call_function_single() for example. As such,
sending this kind of IPI can be safe even when irqs are disabled.
This flexibility comes at the expense of the caller who then needs to
synchronize the csd lifecycle by himself and make sure that IPIs on a
single csd are serialized.
This is how __smp_call_function_single() works when wait = 0 and this
usecase is relevant.
Now there don't seem to be any usecase with wait = 1 that can't be
covered by smp_call_function_single() instead, which is safer. Lets look
at the two possible scenario:
1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
in an object. It looks like a nice and convenient pattern at the first
sight because we can then retrieve the object from the IPI handler easily.
But actually it is a waste of memory space in the object since the csd
can be allocated from the stack by smp_call_function_single(wait = 1)
and the object can be passed an the IPI argument.
Besides that, embedding the csd in an object is more error prone
because the caller must take care of the serialization of the IPIs
for this csd.
2) The user calls __smp_call_function_single(wait = 1) on a csd that
is allocated on the stack. It's ok but smp_call_function_single()
can do it as well and it already takes care of the allocation on the
stack. Again it's more simple and less error prone.
Therefore, using the underscore prepend API version with wait = 1
is a bad pattern and a sign that the caller can do safer and more
simple.
There was a single user of that which has just been converted.
So lets remove this option to discourage further users.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
With the move of kirkwood into mach-mvebu, drivers Kconfig need
tweeking to allow the kirkwood specific drivers to be built.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mark Brown <broonie@linaro.org>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <cooloney@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
The core idle loop now takes care of it. We need to add the runlatch
function calls to the idle routines which was earlier taken care of by
the arch specific idle routine.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Reviewed-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/n/tip-nr4mtbkkzf2oomaj85m24o7c@git.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit d8c6ad3184 ("sched/idle, PPC: Remove redundant
cpuidle_idle_call()") reintroduced ppc64_runlatch_off/on() in the
pseries cpuidle backend driver. Hence the cleanup caused by the
commit "c0c4301c54adde05:pseries/cpuidle: Remove redundant call
to ppc64_runlatch_off() in cpu idle routines" in conjuction
with the commit d8c6ad3184 causes a build failure.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/52FAFD2D.2090306@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The core idle loop now takes care of it. However a few things need
checking:
- Invocation of cpuidle_idle_call() in pseries_lpar_idle() happened
through arch_cpu_idle() and was therefore always preceded by a call
to ppc64_runlatch_off(). To preserve this property now that
cpuidle_idle_call() is invoked directly from core code, a call to
ppc64_runlatch_off() has been added to idle_loop_prolog() in
platforms/pseries/processor_idle.c.
- Similarly, cpuidle_idle_call() was followed by ppc64_runlatch_off()
so a call to the later has been added to idle_loop_epilog().
- And since arch_cpu_idle() always made sure to re-enable IRQs if they
were not enabled, this is now
done in idle_loop_epilog() as well.
The above was made in order to keep the execution flow close to the
original. I don't know if that was strictly necessary. Someone well
aquainted with the platform details might find some room for possible
optimizations.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linaro-kernel@lists.linaro.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-47o4m03citrfg9y1vxic5asb@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some archs set the CPUIDLE_FLAG_TIMER_STOP flag for idle states in which the
local timers stop. The cpuidle_idle_call() currently handles such idle states
by calling into the broadcast framework so as to wakeup CPUs at their next
wakeup event. With the hrtimer mode of broadcast, the BROADCAST_ENTER call
into the broadcast frameowork can fail for archs that do not have an external
clock device to handle wakeups and the CPU in question has thus to be made
the stand by CPU. This patch handles such cases by failing the call into
cpuidle so that the arch can take some default action. The arch will certainly
not enter a similar idle state because a failed cpuidle call will also implicitly
indicate that the broadcast framework has not registered this CPU to be woken up.
Hence we are safe if we fail the cpuidle call.
In the process move the functions that trace idle statistics just before and
after the entry and exit into idle states respectively. In other
scenarios where the call to cpuidle fails, we end up not tracing idle
entry and exit since a decision on an idle state could not be taken. Similarly
when the call to broadcast framework fails, we skip tracing idle statistics
because we are in no further position to take a decision on an alternative
idle state to enter into.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: deepthi@linux.vnet.ibm.com
Cc: paulmck@linux.vnet.ibm.com
Cc: fweisbec@gmail.com
Cc: paulus@samba.org
Cc: srivatsa.bhat@linux.vnet.ibm.com
Cc: svaidy@linux.vnet.ibm.com
Cc: peterz@infradead.org
Cc: benh@kernel.crashing.org
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20140207080652.17187.66344.stgit@preeti.in.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Following patch ports the cpuidle framework for powernv
platform and also implements a cpuidle back-end powernv
idle driver calling on to power7_nap and snooze idle states.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
smt-snooze-delay was designed to disable NAP state or delay the entry
to the NAP state prior to adoption of cpuidle framework. This
is per-cpu variable. With the coming of CPUIDLE framework,
states can be disabled on per-cpu basis using the cpuidle/enable
sysfs entry.
Also, with the coming of cpuidle driver each state's target residency
is per-driver unlike earlier which was per-device. Therefore,
the per-cpu sysfs smt-snooze-delay which decides the target residency
of the idle state on a particular cpu causes more confusion to the user
as we cannot have different smt-snooze-delay (target residency)
values for each cpu.
In the current code, smt-snooze-delay functionality is completely broken.
It makes sense to remove smt-snooze-delay from idle driver with the
coming of cpuidle framework.
However, sysfs files are retained as ppc64_util currently
utilises it. Once we fix ppc64_util, propose to clean
up the kernel code.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch removes the usage of MAX_IDLE_STATE macro
and dead code around it. The number of states
are determined at run time based on the cpuidle
state table selected on a given platform
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently cpuidle-pseries backend driver cannot be
built as a module due to dependencies wrt cpuidle framework.
This patch removes all the module related code in the driver.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch replaces the cpuidle driver and devices initialisation
calls with a single generic cpuidle_register() call
and also includes minor refactoring of the code around it.
Remove the cpu online check in snooze loop, as this code can
only locally run on a cpu only if it is online. Therefore,
this check is not required.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move the file from arch specific pseries/processor_idle.c
to drivers/cpuidle/cpuidle-pseries.c
Make the relevant Makefile and Kconfig changes.
Also, introduce Kconfig.powerpc in drivers/cpuidle
for all powerpc cpuidle drivers.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 60a66e3700 changed the Calxeda
cpuidle driver to a platform driver, copying the __init tag from the
_init() to the newly used _probe() function. However, "probe should
not be __init." (Rob said ;-)
Remove the __init tag to fix a section mismatch in the Calxeda
cpuidle driver.
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
If not, we could end up in the unfortunate situation where
we dereference a NULL pointer b/c we have cpuidle disabled.
This is the case when booting under Xen (which uses the
ACPI P/C states but disables the CPU idle driver) - and can
be easily reproduced when booting with cpuidle.off=1.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8156db4a>] cpuidle_unregister_device+0x2a/0x90
.. snip..
Call Trace:
[<ffffffff813b15b4>] acpi_processor_power_exit+0x3c/0x5c
[<ffffffff813af0a9>] acpi_processor_stop+0x61/0xb6
[<ffffffff814215bf>] __device_release_driver+0fffff81421653>] device_release_driver+0x23/0x30
[<ffffffff81420ed8>] bus_remove_device+0x108/0x180
[<ffffffff8141d9d9>] device_del+0x129/0x1c0
[<ffffffff813cb4b0>] ? unregister_xenbus_watch+0x1f0/0x1f0
[<ffffffff8141da8e>] device_unregister+0x1e/0x60
[<ffffffff814243e9>] unregister_cpu+0x39/0x60
[<ffffffff81019e03>] arch_unregister_cpu+0x23/0x30
[<ffffffff813c3c51>] handle_vcpu_hotplug_event+0xc1/0xe0
[<ffffffff813cb4f5>] xenwatch_thread+0x45/0x120
[<ffffffff810af010>] ? abort_exclusive_wait+0xb0/0xb0
[<ffffffff8108ec42>] kthread+0xd2/0xf0
[<ffffffff8108eb70>] ? kthread_create_on_node+0x180/0x180
[<ffffffff816ce17c>] ret_from_fork+0x7c/0xb0
[<ffffffff8108eb70>] ? kthread_create_on_node+0x180/0x180
This problem also appears in 3.12 and could be a candidate for backport.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- New power capping framework and the the Intel Running Average Power
Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.
- Addition of the in-kernel switching feature to the arm_big_little
cpufreq driver from Viresh Kumar and Nicolas Pitre.
- cpufreq support for iMac G5 from Aaro Koskinen.
- Baytrail processors support for intel_pstate from Dirk Brandewie.
- cpufreq support for Midway/ECX-2000 from Mark Langsdorf.
- ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.
- ACPI power management support for the I2C and SPI bus types from
Mika Westerberg and Lv Zheng.
- cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.
- cpufreq drivers updates (mostly fixes and cleanups) from Viresh Kumar,
Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz Majewski,
Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.
- intel_pstate updates from Dirk Brandewie and Adrian Huang.
- ACPICA update to version 20130927 includig fixes and cleanups and
some reduction of divergences between the ACPICA code in the kernel
and ACPICA upstream in order to improve the automatic ACPICA patch
generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki,
Naresh Bhat, Bjorn Helgaas, David E Box.
- ACPI IPMI driver fixes and cleanups from Lv Zheng.
- ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani,
Zhang Yanfei, Rafael J Wysocki.
- Conversion of the ACPI AC driver to the platform bus type and
multiple driver fixes and cleanups related to ACPI from Zhang Rui.
- ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.
- Fixes and cleanups and new blacklist entries related to the ACPI
video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
Kirill Tkhai.
- cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.
- cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
Bartlomiej Zolnierkiewicz, Prarit Bhargava.
- devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.
- Operation Performance Points (OPP) core updates from Nishanth Menon.
- Runtime power management core fix from Rafael J Wysocki and update
from Ulf Hansson.
- Hibernation fixes from Aaron Lu and Rafael J Wysocki.
- Device suspend/resume lockup detection mechanism from Benoit Goby.
- Removal of unused proc directories created for various ACPI drivers
from Lan Tianyu.
- ACPI LPSS driver fix and new device IDs for the ACPI platform scan
handler from Heikki Krogerus and Jarkko Nikula.
- New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.
- Assorted fixes and cleanups related to ACPI from Andy Shevchenko,
Al Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
Liu Chuansheng.
- Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
Jean-Christophe Plagniol-Villard.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJSfPKLAAoJEILEb/54YlRxH6YQAJwDKi25RCZziFSIenXuqzC/
c6JxoH/tSnDHJHhcTgqh7H7Raa+zmatMDf0m2oEv2Wjfx4Lt4BQK4iefhe/zY4lX
yJ8uXDg+U8DYhDX2XwbwnFpd1M1k/A+s2gIHDTHHGnE0kDngXdd8RAFFktBmooTZ
l5LBQvOrTlgX/ZfqI/MNmQ6lfY6kbCABGSHV1tUUsDA6Kkvk/LAUTOMSmptv1q22
hcs6k55vR34qADPkUX5GghjmcYJv+gNtvbDEJUjcmCwVoPWouF415m7R5lJ8w3/M
49Q8Tbu5HELWLwca64OorS8qh/P7sgUOf1BX5IDzHnJT+TGeDfvcYbMv2Z275/WZ
/bqhuLuKBpsHQ2wvEeT+lYV3FlifKeTf1FBxER3ApjzI3GfpmVVQ+dpEu8e9hcTh
ZTPGzziGtoIsHQ0unxb+zQOyt1PmIk+cU4IsKazs5U20zsVDMcKzPrb19Od49vMX
gCHvRzNyOTqKWpE83Ss4NGOVPAG02AXiXi/BpuYBHKDy6fTH/liKiCw5xlCDEtmt
lQrEbupKpc/dhCLo5ws6w7MZzjWJs2eSEQcNR4DlR++pxIpYOOeoPTXXrghgZt2X
mmxZI2qsJ7GAvPzII8OBeF3CRO3fabZ6Nez+M+oEZjGe05ZtpB3ccw410HwieqBn
dYpJFt/BHK189odhV9CM
=JCxk
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael J Wysocki:
- New power capping framework and the the Intel Running Average Power
Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.
- Addition of the in-kernel switching feature to the arm_big_little
cpufreq driver from Viresh Kumar and Nicolas Pitre.
- cpufreq support for iMac G5 from Aaro Koskinen.
- Baytrail processors support for intel_pstate from Dirk Brandewie.
- cpufreq support for Midway/ECX-2000 from Mark Langsdorf.
- ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.
- ACPI power management support for the I2C and SPI bus types from Mika
Westerberg and Lv Zheng.
- cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.
- cpufreq drivers updates (mostly fixes and cleanups) from Viresh
Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz
Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.
- intel_pstate updates from Dirk Brandewie and Adrian Huang.
- ACPICA update to version 20130927 includig fixes and cleanups and
some reduction of divergences between the ACPICA code in the kernel
and ACPICA upstream in order to improve the automatic ACPICA patch
generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh
Bhat, Bjorn Helgaas, David E Box.
- ACPI IPMI driver fixes and cleanups from Lv Zheng.
- ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang
Yanfei, Rafael J Wysocki.
- Conversion of the ACPI AC driver to the platform bus type and
multiple driver fixes and cleanups related to ACPI from Zhang Rui.
- ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.
- Fixes and cleanups and new blacklist entries related to the ACPI
video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
Kirill Tkhai.
- cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.
- cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
Bartlomiej Zolnierkiewicz, Prarit Bhargava.
- devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.
- Operation Performance Points (OPP) core updates from Nishanth Menon.
- Runtime power management core fix from Rafael J Wysocki and update
from Ulf Hansson.
- Hibernation fixes from Aaron Lu and Rafael J Wysocki.
- Device suspend/resume lockup detection mechanism from Benoit Goby.
- Removal of unused proc directories created for various ACPI drivers
from Lan Tianyu.
- ACPI LPSS driver fix and new device IDs for the ACPI platform scan
handler from Heikki Krogerus and Jarkko Nikula.
- New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.
- Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al
Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
Liu Chuansheng.
- Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
Jean-Christophe Plagniol-Villard.
* tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits)
cpufreq: conservative: fix requested_freq reduction issue
ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines
PM / runtime: Use pm_runtime_put_sync() in __device_release_driver()
ACPI / event: remove unneeded NULL pointer check
Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1"
ACPI / video: Quirk initial backlight level 0
ACPI / video: Fix initial level validity test
intel_pstate: skip the driver if ACPI has power mgmt option
PM / hibernate: Avoid overflow in hibernate_preallocate_memory()
ACPI / hotplug: Do not execute "insert in progress" _OST
ACPI / hotplug: Carry out PCI root eject directly
ACPI / hotplug: Merge device hot-removal routines
ACPI / hotplug: Make acpi_bus_hot_remove_device() internal
ACPI / hotplug: Simplify device ejection routines
ACPI / hotplug: Fix handle_root_bridge_removal()
ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug
ACPI / scan: Start matching drivers after trying scan handlers
ACPI: Remove acpi_pci_slot_init() headers from internal.h
ACPI / blacklist: fix name of ThinkPad Edge E530
PowerCap: Fix build error with option -Werror=format-security
...
Conflicts:
arch/arm/mach-omap2/opp.c
drivers/Kconfig
drivers/spi/spi.c
cpuidle_unregister_governor() and cpuidle_replace_governor() aren't
used anymore and can be removed. They were used by cpufreq governors
earlier, but since the governors can't be compiled as modules any
more, these two functions aren't necessary.
Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
poll_idle_init() just initializes drv->states[0] and so that is
required to be done only once for each driver. Currently, it is
called from cpuidle_enable_device() which is called for every CPU
that the driver supports. That is not required, so move it to a
better place and call it from __cpuidle_register_driver() so that
the initialization is carried out only once.
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Instances of "struct cpuidle_driver *" are consistently named as "drv"
in the cpuidle core except in show_current_driver().
Make that function use variable naming consistent with the rest of the
code.
[rjw: Changelog]
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are a few cpuidle_get_driver() calls that aren't made under
cpuidle_driver_lock which is incorrect.
Fix them by calling cpuidle_get_driver() after taking cpuidle_driver_lock.
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Few statements in cpuidle_idle_call() are broken into multiple lines,
although that isn't really necessary. Convert those to single line.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We are doing this twice in cpuidle_idle_call() routine:
drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP
Would be better if we actually store this in a local variable and
use that. That reduces code duplication and likely makes this piece
of code run faster (in case the compiler wasn't able to optimize it
earlier)
[rjw: Cast the result of bitwise AND to bool explicitly using !!]
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Two checks cpuidle_idle_call() cause the same error code to be
returned if they fail, so merge them for clarity.
[rjw: Changelog]
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch rearranges __cpuidle_register_device() a bit in order to
reduce the number of exit points in that function.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This is trivial patch that just reorders a few statements in
__cpuidle_driver_init() routine so that we don't need both 'continue'
and 'break' in the for loop. Functionally it shouldn't change anything.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The only value returned by __cpuidle_driver_init() is 0, so it
very well may be a void function.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The only value returned by __cpuidle_device_init() is 0, so it very
well may be a void function. Make that happen.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some comments in cpuidle core files contain trivial mistakes.
This patch fixes them.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As the cpuidle driver code has no more the dependency with the pm code, the
'standby' callback being passed as a parameter to the device's platform data,
we can move the cpuidle driver in the drivers/cpuidle directory.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Conflicts:
drivers/cpuidle/Kconfig.arm
drivers/cpuidle/Makefile
The dbx500_cpuidle_probe is tagged as an __init section but the variable
dbx500_cpuidle_plat_driver is not.
The dbx500_cpuidle_probe could not be declared as __init because of macro
module_platform_driver builds the exit function, tags as __exit and this one
refers to the dbx500_cpuidle_plat_driver which is an __initdata.
That leads to a section mismatch.
Fix it by removing the __init tag for the probe function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
As the ux500 and the kirkwood driver, make the zynq driver a platform driver
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Tested-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
All zynq platforms have this compatibility string and there is no any other
clone.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Following the reorganization of CPU idle drivers configurations into an ARM
specific Kconfig, the existing idle drivers Kconfig entries were renamed and
moved to the Kconfig.arm file. Makefile entries were updated accordingly.
This patch renames the entries in Kconfig.arm and makefile to make the newly
added big.LITTLE CPUidle driver compliant with the new naming convention.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
This updates the Calxeda cpuidle driver to use PSCI calls to powergate
cores. This also enables cpuidle for the ECX-2000.
This could possibly become a generic PSCI driver, but there are no other
PSCI users in the kernel other than mach-virt.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
As the ux500 and the kirkwood driver, make the calxeda driver a platform driver
[Compiled only]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Wnen powergating the core, we need to call cpu pm notifiers to save VFP
state (!SMP only) and resetting the breakpoint h/w.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
1) ACPI-based PCI hotplug (ACPIPHP) fixes related to spurious events
After the recent ACPIPHP changes we've seen some interesting breakage
on a system that triggers device check notifications during boot for
non-existing devices. Although those notifications are really
spurious, we should be able to deal with them nevertheless and that
shouldn't introduce too much overhead. Four commits to make that
work properly.
2) Memory hotplug and hibernation mutual exclusion rework
This was maent to be a cleanup, but it happens to fix a classical
ABBA deadlock between system suspend/hibernation and ACPI memory
hotplug which is possible if they are started roughly at the same
time. Three commits rework memory hotplug so that it doesn't
acquire pm_mutex and make hibernation use device_hotplug_lock
which prevents it from racing with memory hotplug.
3) ACPI Intel LPSS (Low-Power Subsystem) driver crash fix
The ACPI LPSS driver crashes during boot on Apple Macbook Air with
Haswell that has slightly unusual BIOS configuration in which one
of the LPSS device's _CRS method doesn't return all of the information
expected by the driver. Fix from Mika Westerberg, for stable.
4) ACPICA fix related to Store->ArgX operation
AML interpreter fix for obscure breakage that causes AML to be
executed incorrectly on some machines (observed in practice). From
Bob Moore.
5) ACPI core fix for PCI ACPI device objects lookup
There still are cases in which there is more than one ACPI device
object matching a given PCI device and we don't choose the one that
the BIOS expects us to choose, so this makes the lookup take more
criteria into account in those cases.
6) Fix to prevent cpuidle from crashing in some rare cases
If the result of cpuidle_get_driver() is NULL, which can happen on
some systems, cpuidle_driver_ref() will crash trying to use that
pointer and the Daniel Fu's fix prevents that from happening.
7) cpufreq fixes related to CPU hotplug
Stephen Boyd reported a number of concurrency problems with cpufreq
related to CPU hotplug which are addressed by a series of fixes
from Srivatsa S Bhat and Viresh Kumar.
8) cpufreq fix for time conversion in time_in_state attribute
Time conversion carried out by cpufreq when user space attempts to
read /sys/devices/system/cpu/cpu*/cpufreq/stats/time_in_state won't
work correcty if cputime_t doesn't map directly to jiffies. Fix
from Andreas Schwab.
9) Revert of a troublesome cpufreq commit
Commit 7c30ed5 (cpufreq: make sure frequency transitions are
serialized) was intended to address some known concurrency problems
in cpufreq related to the ordering of transitions, but unfortunately
it introduced several problems of its own, so I decided to revert it
now and address the original problems later in a more robust way.
10) Intel Haswell CPU models for intel_pstate from Nell Hardcastle.
11) cpufreq fixes related to system suspend/resume
The recent cpufreq changes that made it preserve CPU sysfs attributes
over suspend/resume cycles introduced a possible NULL pointer
dereference that caused it to crash during the second attempt to
suspend. Three commits from Srivatsa S Bhat fix that problem and a
couple of related issues.
12) cpufreq locking fix
cpufreq_policy_restore() should acquire the lock for reading, but
it acquires it for writing. Fix from Lan Tianyu.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJSMbdRAAoJEKhOf7ml8uNsiFkQAKSh1iBXuiUCxBApEGZgoQio
8lmnuyWdhNQWdjZTnh7ptjpDxdrWhxcoxvoaGABU++reDObjef1QnyrQtdO3r8dl
oy0C/YGh5kq5SIffIDEwPIb/ipDe/47cgRMW8iBlnViDa1MJBqICuLyefcTRIrKp
QGvv0owUM2o7TXpA10+qm8zXjv6m5mu1DTtxYI+2Eodhwi54neAqb+aKMspa2thy
V9KFcVv3Td4rJrNvw6BhXNM81QbaYpRxaK3DRr1T6SM++EKvbqYFA1jgW24YvqTL
nrCZlDMb6KRww5DCxA/ns9Kx5H+ZyicoRwdtAM3PBYA6MGqsLqPozC/8VKV1fSvZ
sgUdbUSuLqKRAkOqM1bjKAhi9PdCGBvkQAg2AqbRK6IBl4HJC8xhdb5E6eZ/J42G
GyNBpKef7wVJwYKXE2hSChZ5dYjqMizNHWxFHf8Xy1dveExbQ2nmSJmaWMy2A3kx
YOXFkcTV5F6GOIZB8WCRruzUalff9xal4G+iVhGF+AZIOCm7bC+FDXfwIS82uVor
ej2l+uQLLZCB499IRmM6942ZIAXshmtN7eRfGtKBc6jsbSCEdQDqf1Z7oRwqAD6h
WkD/k/zz30CyM8y4snOkAXkZgqAQsZodtqfowE3e9OHd51tfcNiqdht+obwCx+eD
MWXc2xATMAX6NcZTXSZS
=U/Jw
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-fixes-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"All of these commits are fixes that have emerged recently and some of
them fix bugs introduced during this merge window.
Specifics:
1) ACPI-based PCI hotplug (ACPIPHP) fixes related to spurious events
After the recent ACPIPHP changes we've seen some interesting
breakage on a system that triggers device check notifications
during boot for non-existing devices. Although those
notifications are really spurious, we should be able to deal with
them nevertheless and that shouldn't introduce too much overhead.
Four commits to make that work properly.
2) Memory hotplug and hibernation mutual exclusion rework
This was maent to be a cleanup, but it happens to fix a classical
ABBA deadlock between system suspend/hibernation and ACPI memory
hotplug which is possible if they are started roughly at the same
time. Three commits rework memory hotplug so that it doesn't
acquire pm_mutex and make hibernation use device_hotplug_lock
which prevents it from racing with memory hotplug.
3) ACPI Intel LPSS (Low-Power Subsystem) driver crash fix
The ACPI LPSS driver crashes during boot on Apple Macbook Air with
Haswell that has slightly unusual BIOS configuration in which one
of the LPSS device's _CRS method doesn't return all of the
information expected by the driver. Fix from Mika Westerberg, for
stable.
4) ACPICA fix related to Store->ArgX operation
AML interpreter fix for obscure breakage that causes AML to be
executed incorrectly on some machines (observed in practice).
From Bob Moore.
5) ACPI core fix for PCI ACPI device objects lookup
There still are cases in which there is more than one ACPI device
object matching a given PCI device and we don't choose the one
that the BIOS expects us to choose, so this makes the lookup take
more criteria into account in those cases.
6) Fix to prevent cpuidle from crashing in some rare cases
If the result of cpuidle_get_driver() is NULL, which can happen on
some systems, cpuidle_driver_ref() will crash trying to use that
pointer and the Daniel Fu's fix prevents that from happening.
7) cpufreq fixes related to CPU hotplug
Stephen Boyd reported a number of concurrency problems with
cpufreq related to CPU hotplug which are addressed by a series of
fixes from Srivatsa S Bhat and Viresh Kumar.
8) cpufreq fix for time conversion in time_in_state attribute
Time conversion carried out by cpufreq when user space attempts to
read /sys/devices/system/cpu/cpu*/cpufreq/stats/time_in_state
won't work correcty if cputime_t doesn't map directly to jiffies.
Fix from Andreas Schwab.
9) Revert of a troublesome cpufreq commit
Commit 7c30ed5 (cpufreq: make sure frequency transitions are
serialized) was intended to address some known concurrency
problems in cpufreq related to the ordering of transitions, but
unfortunately it introduced several problems of its own, so I
decided to revert it now and address the original problems later
in a more robust way.
10) Intel Haswell CPU models for intel_pstate from Nell Hardcastle.
11) cpufreq fixes related to system suspend/resume
The recent cpufreq changes that made it preserve CPU sysfs
attributes over suspend/resume cycles introduced a possible NULL
pointer dereference that caused it to crash during the second
attempt to suspend. Three commits from Srivatsa S Bhat fix that
problem and a couple of related issues.
12) cpufreq locking fix
cpufreq_policy_restore() should acquire the lock for reading, but
it acquires it for writing. Fix from Lan Tianyu"
* tag 'pm+acpi-fixes-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (25 commits)
cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
cpufreq: Restructure if/else block to avoid unintended behavior
cpufreq: Fix crash in cpufreq-stats during suspend/resume
intel_pstate: Add Haswell CPU models
Revert "cpufreq: make sure frequency transitions are serialized"
cpufreq: Use signed type for 'ret' variable, to store negative error values
cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
cpufreq: Split __cpufreq_remove_dev() into two parts
cpufreq: Fix wrong time unit conversion
cpufreq: serialize calls to __cpufreq_governor()
cpufreq: don't allow governor limits to be changed when it is disabled
ACPI / bind: Prefer device objects with _STA to those without it
ACPI / hotplug / PCI: Avoid parent bus rescans on spurious device checks
ACPI / hotplug / PCI: Use _OST to notify firmware about notify status
ACPI / hotplug / PCI: Avoid doing too much for spurious notifies
ACPICA: Fix for a Store->ArgX when ArgX contains a reference to a field.
ACPI / hotplug / PCI: Don't trim devices before scanning the namespace
...
This branch contains ARM SoC related driver updates for v3.12. The
only thing this cycle are core PM updates and CPUidle support for
ARM's TC2 big.LITTLE development platform.
Conflicts:
One cleanup/reorg conflict with a new entry in
drivers/cpuidle/Makefile. Append the new entry after the existing
ones. A follow up patch for v3.12-rc will make the new entry conform
to the cleanup/reorg.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSLjatAAoJEFk3GJrT+8Zl32sP/Aw2iEXd/5DUvcp6y/qZoAjO
oLhCPviEnQCpz4smFFySBLvvKyVyA7oOMet8nelIJhwHCTNMBpJZHIfcvpIP5uBY
6LLpFUw4m7TqOISwpVXlwc/3CuG76QCrITLJmButq6tHF4udHeAur+pAnNHoaoys
O5arRMLvl5C4rREeiZctTv5JARICCxIcHpweQdtt+MZ03yG78fEfSB9XxvyOlhh0
OJnGcqU07fIXw9kT/9KAnR3Ql7JJsdzlXqLq6/wFWPe5a1KtgxHNXPbtWaxl8JWW
cPSQci+n9iWgxKzoQTGyQO6sfkDHcol3izMeCScMwlx05SMPwofXpYitaPHLF1cy
PtJosSMVQvJPrHyGlY4vhD9mtCIcyOmlwSlZ6dOf7oqXMhT9CPJe2UD/8JZWgXBi
imY/vpU8mgZT315rQmc/Khg721VNKcSuIvP6xUS9PuaSMUrPSCJFbbkckHGnzdC7
XVFCui9gFxa7vMN+CzrZRqfZnjJ7ujuiFDauMzltu0iBiPNXkAfyoqbxMqUP1HJ5
pdU84vuEVjsUdWt9ivJs6I6cqIwroeji9HZzZnWkWyoDgtAjxhDFVXydqlhrZsuJ
O3uErP8fjRtloFa2iLDZfawPpHDFsY4F+Nm09rZLO7RE4ELlYlQGfYEwuIh+kZ16
nLPE/V5DYrBVyNGDouKx
=FvQD
-----END PGP SIGNATURE-----
Merge tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver update from Kevin Hilman:
"This contains the ARM SoC related driver updates for v3.12. The only
thing this cycle are core PM updates and CPUidle support for ARM's TC2
big.LITTLE development platform"
* tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
cpuidle: big.LITTLE: vexpress-TC2 CPU idle driver
ARM: vexpress: tc2: disable GIC CPU IF in tc2_pm_suspend
drivers: irq-chip: irq-gic: introduce gic_cpu_if_down()
If the current CPU has no cpuidle driver, drv will be NULL in
cpuidle_driver_ref(). Check if that is the case before trying
to bump up the driver's refcount to prevent the kernel from
crashing.
[rjw: Subject and changelog]
Signed-off-by: Daniel Fu <danifu@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The coupled cpuidle waiting loop clears pending pokes before
entering the safe state. If a poke arrives just before the
pokes are cleared, but after the while loop condition checks,
the poke will be lost and the cpu will stay in the safe state
until another interrupt arrives. This may cause the cpu that
sent the poke to spin in the ready loop with interrupts off
until another cpu receives an interrupt, and if no other cpus
have interrupts routed to them it can spin forever.
Change the return value of cpuidle_coupled_clear_pokes to
return if a poke was cleared, and move the need_resched()
checks into the callers. In the waiting loop, if
a poke was cleared restart the loop to repeat the while
condition checks.
Reported-by: Neil Zhang <zhangwm@marvell.com>
Signed-off-by: Colin Cross <ccross@android.com>
Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Joseph Lo <josephl@nvidia.com> reported a lockup on Tegra20 caused
by a race condition in coupled cpuidle. When two or more cpus
enter idle at the same time, the first cpus to arrive may go to the
ready loop without processing pending pokes from the last cpu to
arrive.
This patch adds a check for pending pokes once all cpus have been
synchronized in the ready loop and resets the coupled state and
retries if any cpus failed to handle their pending poke.
Retrying on all cpus may trigger the same issue again, so this patch
also adds a check to ensure that each cpu has received at least one
poke between when it enters the waiting loop and when it moves on to
the ready loop.
Reported-and-tested-by: Joseph Lo <josephl@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Colin Cross <ccross@android.com>
Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Calling cpuidle_enter_state is expected to return with interrupts
enabled, but interrupts must be disabled before starting the
ready loop synchronization stage. Call local_irq_disable after
each call to cpuidle_enter_state for the safe state.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
From Lorenzo Pieralisi:
This patch series contains:
- GIC driver update to add a method to disable the GIC CPU IF
- TC2 MCPM update to add GIC CPU disabling to suspend method
- TC2 CPU idle big.LITTLE driver
* cpuidle/biglittle:
cpuidle: big.LITTLE: vexpress-TC2 CPU idle driver
ARM: vexpress: tc2: disable GIC CPU IF in tc2_pm_suspend
drivers: irq-chip: irq-gic: introduce gic_cpu_if_down()
ARM: vexpress/TC2: implement PM suspend method
ARM: vexpress/TC2: basic PM support
ARM: vexpress: Add SCC to V2P-CA15_A7's device tree
ARM: vexpress/TC2: add Serial Power Controller (SPC) support
ARM: vexpress/dcscb: fix cache disabling sequences
Signed-off-by: Olof Johansson <olof@lixom.net>
The big.LITTLE architecture is composed of two clusters of cpus. One cluster
contains less powerful but more energy efficient processors and the other
cluster groups the powerful but energy-intensive cpus.
The TC2 testchip implements two clusters of CPUs (A7 and A15 clusters in
a big.LITTLE configuration) connected through a CCI interconnect that manages
coherency of their respective L2 caches and intercluster distributed
virtual memory messages (DVM).
TC2 testchip integrates a power controller that manages cores resets, wake-up
IRQs and cluster low-power states. Power states are managed at cluster
level, which means that voltage is removed from a cluster iff all cores
in a cluster are in a wfi state. Single cores can enter a reset state
which is identical to wfi in terms of power consumption but simplifies the
way cluster states are entered.
This patch provides a multiple driver CPU idle implementation for TC2
which paves the way for a generic big.LITTLE idle driver for all
upcoming big.LITTLE based systems on chip.
The driver relies on the MCPM infrastructure to coordinate and manage
core power states; in particular MCPM allows to suspend specific cores
and hides the CPUs coordination required to shut-down clusters of CPUs.
Power down sequences for the respective clusters are implemented in the
MCPM TC2 backend, with all code needed to clean caches and exit coherency.
The multiple driver CPU idle infrastructure allows to define different
C-states for big and little cores, determined at boot by checking the
part id of the possible CPUs and initializing the respective logical
masks in the big and little drivers.
Current big.little systems are composed of A7 and A15 clusters, as
implemented in TC2, but in the future that may change and the driver
will have evolve to retrieve what is a 'big' cpu and what is a 'little'
cpu in order to build the correct topology.
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Amit Kucheria <amit.kucheria@linaro.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Field predicted_us value can never exceed expected_us value, but it has
a potentially larger type. As there is no need for additional 32 bits of
zeroes on 32 bit plaforms, change the type of predicted_us to match the
type of expected_us.
Field correction_factor is used to store a value that cannot exceed the
product of RESOLUTION and DECAY (default 1024*8 = 8192). The constants
cannot in practice be incremented to such values, that they'd overflow
unsigned int even on 32 bit systems, so the type is changed to avoid
unnecessary 64 bit arithmetic on 32 bit systems.
One multiplication of (now) 32 bit values needs an added cast to avoid
truncation of the result and has been added.
In order to avoid another multiplication from 32 bit domain to 64 bit
domain, the new correction_factor calculation has been changed from
new = old * (DECAY-1) / DECAY
to
new = old - old / DECAY,
which with infinite precision would yeild exactly the same result, but
now changes the direction of rounding. The impact is not significant as
the maximum accumulated difference cannot exceed the value of DECAY,
which is relatively small compared to product of RESOLUTION and DECAY
(8 / 8192).
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The menu governor has a number of tunable constants that may be changed
in the source. If certain combination of values are chosen, an overflow
is possible when the correction_factor is being recalculated.
This patch adds a warning regarding this possibility and describes the
change needed for fixing the issue. The change should not be permanently
enabled, as it will hurt performance when it is not needed.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The menu governor uses a static function get_typical_interval() to
try to detect a repeating pattern of wakeups. The previous interval
durations are stored as an array of unsigned ints, but the arithmetic
in the function is performed exclusively as 64 bit values, even when
the value stored in a variable is known not to exceed unsigned int,
which may be smaller and more efficient on some platforms.
This patch changes the types of varibles used to store some
intermediates, the maximum and and the cutoff threshold to unsigned
ints. Average and standard deviation are still treated as 64 bit values,
even when the values are known to be within the domain of unsigned int,
to avoid casts to ensure correct integer promotion for arithmetic
operations.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Struct menu_device member intervals is declared as u32, but the value
stored is (unsigned) int. The type is changed to match the value being
stored.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The function get_typical_interval() initializes a number of variables
that are immediately after declarations assigned constant values.
In addition, there are multiple assignments on a single line, which
is explicitly forbidden by Documentation/CodingStyle.
This patch removes redundant initial values for the variables and
breaks up the multiple assignment line.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
get_typical_interval() uses int_sqrt() in calculation of standard
deviation. The formal parameter of int_sqrt() is unsigned long, which
may on some platforms be smaller than the 64 bit unsigned integer used
as the actual parameter. The overflow can occur frequently when actual
idle period lengths are in hundreds of milliseconds.
This patch adds a check for such overflow and rejects the candidate
average when an overflow would occur.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch rearranges a if-return-elsif-goto-fi-return sequence into
if-return-fi-if-return-fi-goto sequence. The functionality remains the
same. Also, a lengthy comment that did not describe the functionality
in the order it occurs is split into half and top half is moved closer
to actual implementation it describes.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch prevents cpuidle menu governor from using repeating interval
prediction result if the idle period predicted is longer than the one
allowed by shortest running timer.
Signed-off-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to
devm_ioremap_resource().
A simplified version of the semantic patch that makes this change is
as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull ARM cpuidle updates from Daniel Lezcano.
* 'cpuidle/arm-next' of git://git.linaro.org/people/dlezcano/linux:
cpuidle: kirkwood: Make kirkwood_cpuidle_remove function static
cpuidle: calxeda: Add missing __iomem annotation
SH: cpuidle: Add missing parameter for cpuidle_register()
This local symbol is used only in this file.
Fix the following sparse warnings:
drivers/cpuidle/cpuidle-kirkwood.c:73:5: warning: symbol 'kirkwood_cpuidle_remove' was not declared. Should it be static ?
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Revert commit e11538d1 (cpuidle: Quickly notice prediction failure in
general case), since it depends on commit 69a37be (cpuidle: Quickly
notice prediction failure for repeat mode) that has been identified
as the source of a significant performance regression in v3.8 and
later.
Requested-by: Jeremy Eder <jeder@redhat.com>
Tested-by: Len Brown <len.brown@intel.com>
Cc: 3.8+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no more dependency with arch/arm headers, so we can safely move the
driver to the drivers/cpuidle directory.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Add Kconfig.arm for ARM cpuidle drivers and moves calxeda, kirkwood
and zynq to Kconfig.arm. Like in the cpufreq menu, "CPU Idle" menu
is added to drivers/cpuidle/Kconfig.
Signed-off-by: Sahara <keun-o.park@windriver.com>
Make __cpuidle_register_device() check whether or not the device has
been registered already and return -EBUSY immediately if that's the
case.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add __cpuidle_device_init() for initializing the cpuidle_device
structure.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
To reduce code duplication related to the unregistration of cpuidle
devices, introduce __cpuidle_unregister_device() and move all of the
unregistration code to that function.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpuidle sysfs code is designed to have a single instance of per
CPU cpuidle directory. It is not possible to remove the sysfs entry
and create it again. This is not a problem with the current code but
future changes will add CPU hotplug support to enable/disable the
device, so it will need to remove the sysfs entry like other
subsystems do. That won't be possible without this change, because
the kobj is a static object which can't be reused for
kobj_init_and_add().
Add cpuidle_device_kobj to be allocated dynamically when
adding/removing a sysfs entry which is consistent with the other
cpuidle's sysfs entries.
An added benefit is that the sysfs code is now more self-contained
and the includes needed for sysfs can be moved from cpuidle.h
directly into sysfs.c so as to reduce the total number of headers
dragged along with cpuidle.h.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix white space in the cpuidle code to follow the rules described in
CodingStyle.
No changes in behavior should result from this.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We previously changed the ordering of the cpuidle framework
initialization so that the governors are registered before the
drivers which can register their devices right from the start.
Now, we can safely remove the __cpuidle_register_device() call hack
in cpuidle_enable_device() and check if the driver has been
registered before enabling it. Then, cpuidle_register_device() can
consistently check the cpuidle_enable_device() return value when
enabling the device.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq governors are defined as modules in the code, but the Kconfig
options do not allow them to be built as modules. This is not really
a problem, but the cpuidle init ordering is: the cpuidle init
functions (framework and driver) and then the governors. That leads
to some weirdness in the cpuidle framework.
Namely, cpuidle_register_device() calls cpuidle_enable_device() which
fails at the first attempt, because governors have not been registered
yet. When a governor is registered, the framework calls
cpuidle_enable_device() again which runs __cpuidle_register_device()
only then. Of course, for that to work, the cpuidle_enable_device()
return value has to be ignored by cpuidle_register_device().
Instead of having this cyclic call graph and relying on a positive
side effects of the hackish back and forth cpuidle_enable_device()
calls it is better to fix the cpuidle init ordering.
To that end, replace the module init code with postcore_initcall()
so we have:
* cpuidle framework : core_initcall
* cpuidle governors : postcore_initcall
* cpuidle drivers : device_initcall
and remove the corresponding module exit code as it is dead anyway
(governors can't be built as modules).
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Hotplug changes allowing device hot-removal operations to fail
gracefully (instead of crashing the kernel) if they cannot be
carried out completely. From Rafael J Wysocki and Toshi Kani.
- Freezer update from Colin Cross and Mandeep Singh Baines targeted
at making the freezing of tasks a bit less heavy weight operation.
- cpufreq resume fix from Srivatsa S Bhat for a regression introduced
during the 3.10 cycle causing some cpufreq sysfs attributes to
return wrong values to user space after resume.
- New freqdomain_cpus sysfs attribute for the acpi-cpufreq driver to
provide information previously available via related_cpus from
Lan Tianyu.
- cpufreq fixes and cleanups from Viresh Kumar, Jacob Shin,
Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia, Arnd Bergmann, and
Tang Yuantian.
- Fix for an ACPICA regression causing suspend/resume issues to
appear on some systems introduced during the 3.4 development cycle
from Lv Zheng.
- ACPICA fixes and cleanups from Bob Moore, Tomasz Nowicki, Lv Zheng,
Chao Guan, and Zhang Rui.
- New cupidle driver for Xilinx Zynq processors from Michal Simek.
- cpuidle fixes and cleanups from Daniel Lezcano.
- Changes to make suspend/resume work correctly in Xen guests from
Konrad Rzeszutek Wilk.
- ACPI device power management fixes and cleanups from Fengguang Wu
and Rafael J Wysocki.
- ACPI documentation updates from Lv Zheng, Aaron Lu and Hanjun Guo.
- Fix for the IA-64 issue that was the reason for reverting commit
9f29ab1 and updates of the ACPI scan code from Rafael J Wysocki.
- Mechanism for adding CMOS RTC address space handlers from Lan Tianyu
(to allow some EC-related breakage to be fixed on some systems).
- Spec-compliant implementation of acpi_os_get_timer() from
Mika Westerberg.
- Modification of do_acpi_find_child() to execute _STA in order to
to avoid situations in which a pointer to a disabled device object
is returned instead of an enabled one with the same _ADR value.
From Jeff Wu.
- Intel BayTrail PCH (Platform Controller Hub) support for the ACPI
Intel Low-Power Subsystems (LPSS) driver and modificaions of that
driver to work around a couple of known BIOS issues from
Mika Westerberg and Heikki Krogerus.
- EC driver fix from Vasiliy Kulikov to make it use get_user() and
put_user() instead of dereferencing user space pointers blindly.
- Assorted ACPI code cleanups from Bjorn Helgaas, Nicholas Mazzuca and
Toshi Kani.
- Modification of the "runtime idle" helper routine to take the return
values of the callbacks executed by it into account and to call
rpm_suspend() if they return 0, which allows some code bloat
reduction to be done, from Rafael J Wysocki and Alan Stern.
- New trace points for PM QoS from Sahara <keun-o.park@windriver.com>.
- PM QoS documentation update from Lan Tianyu.
- Assorted core PM code cleanups and changes from Bernie Thompson,
Bjorn Helgaas, Julius Werner, and Shuah Khan.
- New devfreq driver for the Exynos5-bus device from Abhilash Kesavan.
- Minor devfreq cleanups, fixes and MAINTAINERS update from
MyungJoo Ham, Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and
Wei Yongjun.
- OMAP Adaptive Voltage Scaling (AVS) SmartReflex voltage control
driver updates from Andrii Tseglytskyi and Nishanth Menon.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJR0ZNOAAoJEKhOf7ml8uNsDLYP/0EU4rmvw0TWTITfp6RS1KDE
9GwBn96ZR4Q5bJd9gBCTPSqhHOYMqxWEUp99sn/M2wehG1pk/jw5LO56+2IhM3UZ
g1HDcJ7te2nVT/iXsKiAGTVhU9Rk0aYwoVSknwk27qpIBGxW9w/s5tLX8pY3Q3Zq
wL/7aTPjyL+PFFFEaxgH7qLqsl3DhbtYW5AriUBTkXout/tJ4eO1b7MNBncLDh8X
VQ/0DNCKE95VEJfkO4rk9RKUyVp9GDn0i+HXCD/FS4IA5oYzePdVdNDmXf7g+swe
CGlTZq8pB+oBpDiHl4lxzbNrKQjRNbGnDUkoRcWqn0nAw56xK+vmYnWJhW99gQ/I
fKnvxeLca5po1aiqmC4VSJxZIatFZqLrZAI4dzoCLWY+bGeTnCKmj0/F8ytFnZA2
8IuLLs7/dFOaHXV/pKmpg6FAlFa9CPxoqRFoyqb4M0GjEarADyalXUWsPtG+6xCp
R/p0CISpwk+guKZR/qPhL7M654S7SHrPwd2DPF0KgGsvk+G2GhoB8EzvD8BVp98Z
9siCGCdgKQfJQVI6R0k9aFmn/4gRQIAgyPhkhv9tqULUUkiaXki+/t8kPfnb8O/d
zep+CA57E2G8MYLkDJfpFeKS7GpPD6TIdgFdGmOUC0Y6sl9iTdiw4yTx8O2JM37z
rHBZfYGkJBrbGRu+Q1gs
=VBBq
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the total number of ACPI commits is slightly greater than
the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
remains the most active patch submitter.
To me, the most significant change is the addition of offline/online
device operations to the driver core (with the Greg's blessing) and
the related modifications of the ACPI core hotplug code. Next are the
freezer updates from Colin Cross that should make the freezing of
tasks a bit less heavy weight.
We also have a couple of regression fixes, a number of fixes for
issues that have not been identified as regressions, two new drivers
and a bunch of cleanups all over.
Highlights:
- Hotplug changes to support graceful hot-removal failures.
It sometimes is necessary to fail device hot-removal operations
gracefully if they cannot be carried out completely. For example,
if memory from a memory module being hot-removed has been allocated
for the kernel's own use and cannot be moved elsewhere, it's
desirable to fail the hot-removal operation in a graceful way
rather than to crash the kernel, but currenty a success or a kernel
crash are the only possible outcomes of an attempted memory
hot-removal. Needless to say, that is not a very attractive
alternative and it had to be addressed.
However, in order to make it work for memory, I first had to make
it work for CPUs and for this purpose I needed to modify the ACPI
processor driver. It's been split into two parts, a resident one
handling the low-level initialization/cleanup and a modular one
playing the actual driver's role (but it binds to the CPU system
device objects rather than to the ACPI device objects representing
processors). That's been sort of like a live brain surgery on a
patient who's riding a bike.
So this is a little scary, but since we found and fixed a couple of
regressions it caused to happen during the early linux-next testing
(a month ago), nobody has complained.
As a bonus we remove some duplicated ACPI hotplug code, because the
ACPI-based CPU hotplug is now going to use the common ACPI hotplug
code.
- Lighter weight freezing of tasks.
These changes from Colin Cross and Mandeep Singh Baines are
targeted at making the freezing of tasks a bit less heavy weight
operation. They reduce the number of tasks woken up every time
during the freezing, by using the observation that the freezer
simply doesn't need to wake up some of them and wait for them all
to call refrigerator(). The time needed for the freezer to decide
to report a failure is reduced too.
Also reintroduced is the check causing a lockdep warining to
trigger when try_to_freeze() is called with locks held (which is
generally unsafe and shouldn't happen).
- cpufreq updates
First off, a commit from Srivatsa S Bhat fixes a resume regression
introduced during the 3.10 cycle causing some cpufreq sysfs
attributes to return wrong values to user space after resume. The
fix is kind of fresh, but also it's pretty obvious once Srivatsa
has identified the root cause.
Second, we have a new freqdomain_cpus sysfs attribute for the
acpi-cpufreq driver to provide information previously available via
related_cpus. From Lan Tianyu.
Finally, we fix a number of issues, mostly related to the
CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
up some code. The majority of changes from Viresh Kumar with bits
from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
Arnd Bergmann, and Tang Yuantian.
- ACPICA update
A usual bunch of updates from the ACPICA upstream.
During the 3.4 cycle we introduced support for ACPI 5 extended
sleep registers, but they are only supposed to be used if the
HW-reduced mode bit is set in the FADT flags and the code attempted
to use them without checking that bit. That caused suspend/resume
regressions to happen on some systems. Fix from Lv Zheng causes
those registers to be used only if the HW-reduced mode bit is set.
Apart from this some other ACPICA bugs are fixed and code cleanups
are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
Zhang Rui.
- cpuidle updates
New driver for Xilinx Zynq processors is added by Michal Simek.
Multidriver support simplification, addition of some missing
kerneldoc comments and Kconfig-related fixes come from Daniel
Lezcano.
- ACPI power management updates
Changes to make suspend/resume work correctly in Xen guests from
Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
cleanups and fixes of the ACPI device power state selection
routine.
- ACPI documentation updates
Some previously missing pieces of ACPI documentation are added by
Lv Zheng and Aaron Lu (hopefully, that will help people to
uderstand how the ACPI subsystem works) and one outdated doc is
updated by Hanjun Guo.
- Assorted ACPI updates
We finally nailed down the IA-64 issue that was the reason for
reverting commit 9f29ab11dd ("ACPI / scan: do not match drivers
against objects having scan handlers"), so we can fix it and move
the ACPI scan handler check added to the ACPI video driver back to
the core.
A mechanism for adding CMOS RTC address space handlers is
introduced by Lan Tianyu to allow some EC-related breakage to be
fixed on some systems.
A spec-compliant implementation of acpi_os_get_timer() is added by
Mika Westerberg.
The evaluation of _STA is added to do_acpi_find_child() to avoid
situations in which a pointer to a disabled device object is
returned instead of an enabled one with the same _ADR value. From
Jeff Wu.
Intel BayTrail PCH (Platform Controller Hub) support is added to
the ACPI driver for Intel Low-Power Subsystems (LPSS) and that
driver is modified to work around a couple of known BIOS issues.
Changes from Mika Westerberg and Heikki Krogerus.
The EC driver is fixed by Vasiliy Kulikov to use get_user() and
put_user() instead of dereferencing user space pointers blindly.
Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi
Kani.
- Assorted power management updates
The "runtime idle" helper routine is changed to take the return
values of the callbacks executed by it into account and to call
rpm_suspend() if they return 0, which allows us to reduce the
overall code bloat a bit (by dropping some code that's not
necessary any more after that modification).
The runtime PM documentation is updated by Alan Stern (to reflect
the "runtime idle" behavior change).
New trace points for PM QoS are added by Sahara
(<keun-o.park@windriver.com>).
PM QoS documentation is updated by Lan Tianyu.
Code cleanups are made and minor issues are addressed by Bernie
Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan.
- devfreq updates
New driver for the Exynos5-bus device from Abhilash Kesavan.
Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham,
Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun.
- OMAP power management updates
Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver
updates from Andrii Tseglytskyi and Nishanth Menon."
* tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
cpufreq: Fix cpufreq regression after suspend/resume
ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state()
PM / Sleep: Warn about system time after resume with pm_trace
cpufreq: don't leave stale policy pointer in cdbs->cur_policy
acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
cpufreq: make sure frequency transitions are serialized
ACPI: implement acpi_os_get_timer() according the spec
ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
ACPI: Add CMOS RTC Operation Region handler support
ACPI / processor: Drop unused variable from processor_perflib.c
cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
...
These changes are all to SoC-specific code, a total of 33 branches on
17 platforms were pulled into this. Like last time, Renesas sh-mobile
is now the platform with the most changes, followed by OMAP and EXYNOS.
Two new platforms, TI Keystone and Rockchips RK3xxx are added in
this branch, both containing almost no platform specific code at all,
since they are using generic subsystem interfaces for clocks, pinctrl,
interrupts etc. The device drivers are getting merged through the
respective subsystem maintainer trees.
One more SoC (u300) is now multiplatform capable and several others
(shmobile, exynos, msm, integrator, kirkwood, clps711x) are moving
towards that goal with this series but need more work.
Also noteworthy is the work on PCI here, which is traditionally part of
the SoC specific code. With the changes done by Thomas Petazzoni, we can
now more easily have PCI host controller drivers as loadable modules and
keep them separate from the platform code in drivers/pci/host. This has
already led to the discovery that three platforms (exynos, spear and imx)
are actually using an identical PCIe host controller and will be able
to share a driver once support for spear and imx is added.
Conflicts:
* asm/glue-proc.h has one CPU type getting added that conflicts
with another addition in 3.10-rc7
* Simple context changes in arch/arm/Makefile and arch/arm/Kconfig
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUdLnpmCrR//JCVInAQLoFRAAyatR+MhVFwc91cO7yDw/mz81RO1V9jEd
QMufoWi0BRfBsubqxnGlb510EEMTz7gxdrlYPILYNr8TqR+lNGhjKt2FQAjN3q2O
IBvu4x8C+xcxnMNbkCnTQRxP/ziK6yCI6e7enQhwuMuJwvsnJtGbsqKi5ODMw6x0
o5EQmIdj5NhhSJqJZPCmWsKbx100TH1UwaEnhNl0DSaFj51n3bVRrK6Nxce10GWZ
HsS1/a63lq/YZLkwfUEvgin/PU9Jx5jMmqhlp3bZjG+f1ItdzJF+9IgS248vCIi2
ystzWCH88Kh69UFcYFfCjeZe8H45XcP+Zykd8WC0DvF/a7Hwk5KTKE/ciT6RPRxb
rkWW5EwjqZL9w9cU3rUHWtSVenayQMMEmCfksadr1AExyCrhPqfs9RINyBs2lK5a
q2bdSFbXZsNzSyL+3yQAfChvRo1/2FdlFVQy+oVUCActV7L77Y7y6jl+b2qzFsSu
xMKwvC/1vDXTvOnGk6A/qJu7yrHpqJrvw1eI+wnMswNBl7lCTgyyHnr5y8S092jI
KU4hmSxsYP+y13HmKy4ewPy9DYJYBTSdReKfEFo79Dx8eqySAWjHFL/OPRqhCUYS
kBq0eZpVZO7tJnHRaRz8n93wIYzb1UOhhgVwxdjPZF9L4d/jzh1BCv0OBWv8IXCu
uWLAi92lL24=
=0r9S
-----END PGP SIGNATURE-----
Merge tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC specific changes from Arnd Bergmann:
"These changes are all to SoC-specific code, a total of 33 branches on
17 platforms were pulled into this. Like last time, Renesas sh-mobile
is now the platform with the most changes, followed by OMAP and
EXYNOS.
Two new platforms, TI Keystone and Rockchips RK3xxx are added in this
branch, both containing almost no platform specific code at all, since
they are using generic subsystem interfaces for clocks, pinctrl,
interrupts etc. The device drivers are getting merged through the
respective subsystem maintainer trees.
One more SoC (u300) is now multiplatform capable and several others
(shmobile, exynos, msm, integrator, kirkwood, clps711x) are moving
towards that goal with this series but need more work.
Also noteworthy is the work on PCI here, which is traditionally part
of the SoC specific code. With the changes done by Thomas Petazzoni,
we can now more easily have PCI host controller drivers as loadable
modules and keep them separate from the platform code in
drivers/pci/host. This has already led to the discovery that three
platforms (exynos, spear and imx) are actually using an identical PCIe
host controller and will be able to share a driver once support for
spear and imx is added."
* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (480 commits)
ARM: integrator: let pciv3 use mem/premem from device tree
ARM: integrator: set local side PCI addresses right
ARM: dts: Add pcie controller node for exynos5440-ssdk5440
ARM: dts: Add pcie controller node for Samsung EXYNOS5440 SoC
ARM: EXYNOS: Enable PCIe support for Exynos5440
pci: Add PCIe driver for Samsung Exynos
ARM: OMAP5: voltagedomain data: remove temporary OMAP4 voltage data
ARM: keystone: Move CPU bringup code to dedicated asm file
ARM: multiplatform: always pick one CPU type
ARM: imx: select syscon for IMX6SL
ARM: keystone: select ARM_ERRATA_798181 only for SMP
ARM: imx: Synertronixx scb9328 needs to select SOC_IMX1
ARM: OMAP2+: AM43x: resolve SMP related build error
dmaengine: edma: enable build for AM33XX
ARM: edma: Add EDMA crossbar event mux support
ARM: edma: Add DT and runtime PM support to the private EDMA API
dmaengine: edma: Add TI EDMA device tree binding
arm: add basic support for Rockchip RK3066a boards
arm: add debug uarts for rockchip rk29xx and rk3xxx series
arm: Add basic clocks for Rockchip rk3066a SoCs
...
Like other ARM specific drivers, this one requires ARM_CPU_SUSPEND,
as shown by this linker error:
drivers/built-in.o: In function `calxeda_pwrdown_idle':
drivers/cpuidle/cpuidle-calxeda.c:84: undefined reference to `cpu_suspend'
drivers/cpuidle/cpuidle-calxeda.c:86: undefined reference to `cpu_resume'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Before commit d6f346f (cpuidle: improve governor Kconfig options),
the CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED option didn't depend on
CONFIG_CPU_IDLE but now it has been moved under the CPU_IDLE
menuconfig.
That raises the following warnings:
warning: (ARCH_OMAP4 && ARCH_TEGRA_2x_SOC) selects ARCH_NEEDS_CPU_IDLE_COUPLED
which has unmet direct dependencies (CPU_IDLE)
warning: (ARCH_OMAP4 && ARCH_TEGRA_2x_SOC) selects ARCH_NEEDS_CPU_IDLE_COUPLED
which has unmet direct dependencies (CPU_IDLE)
because the tegra2 and omap4 Kconfig files select this option
without checking if CPU_IDLE is set.
Fix that by moving ARCH_NEEDS_CPU_IDLE_COUPLED outside of CPU_IDLE.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add kerneldoc (and other) comments to the cpuidle driver's framework
code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit bf4d1b5 (cpuidle: support multiple drivers) introduced support
for using multiple cpuidle drivers at the same time. It added a
couple of new APIs to register the driver per CPU, but that led to
some unnecessary code complexity related to the kernel config options
deciding whether or not the multiple driver support is enabled. The
code has to work as it did before when the multiple driver support is
not enabled and the multiple driver support has to be compatible with
the previously existing API.
Remove the new API, not used by any driver in the tree yet (but
needed for the HMP cpuidle drivers that will be submitted soon), and
add a new cpumask pointer to the cpuidle driver structure that will
point to the mask of CPUs handled by the given driver. That will
allow the cpuidle_[un]register_driver() API to be used for the
multiple driver support along with the cpuidle_[un]register()
functions added recently.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add cpuidle support for Xilinx Zynq.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Each governor is suitable for different kernel configurations: the menu
governor suits better for a tickless system, while the ladder governor fits
better for a periodic timer tick system.
The Kconfig does not allow to [un]select a governor, thus both are compiled in
the kernel but the init order makes the menu governor to be the last one to be
registered, so becoming the default. The only way to switch back to the ladder
governor is to enable the sysfs governor switch in the kernel command line.
Because it seems nobody complained about this, the menu governor is used by
default most of the time on the system, having both governors is not really
necessary on a tickless system but there isn't a config option to disable one
or another governor.
Create a submenu for cpuidle and add a label for each governor, so we can see
the option in the menu config and enable/disable it.
The governors will be enabled depending on the CONFIG_NO_HZ option:
- If CONFIG_NO_HZ is set, then the menu governor is selected and the ladder
governor is optional, defaulting to 'yes'
- If CONFIG_NO_HZ is not set, then the ladder governor is selected and the
menu governor is optional, defaulting to 'yes'
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Move the private set_auxcr/get_auxcr functions from
drivers/cpuidle/cpuidle-calxeda.c so they can be used across platforms.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Currently cpuidle drivers are spread across different archs.
As a result, there are several different paths for cpuidle patch
submissions: cpuidle core changes go through linux-pm, ARM driver
changes go to the arm-soc or SoC-specific trees, sh changes go
through the sh arch tree, pseries changes go through the PowerPC tree
and finally intel changes go through the Len's tree while ACPI idle
changes go through linux-pm.
That makes it difficult to consolidate code and to propagate
modifications from the cpuidle core to the different drivers.
Hopefully, a movement has started to put the majority of cpuidle
drivers under drivers/cpuidle like cpuidle-calxeda.c and
cpuidle-kirkwood.c.
Add a maintainer entry for cpuidle to MAINTAINERS to clarify the
situation and to indicate to new cpuidle driver authors that those
drivers should not go into arch-specific directories.
The upstreaming process is unchanged: Rafael takes patches for
merging into his tree, but with an Acked-by: tag from the driver's
maintainer, so indicate in the drivers' headers who maintains them.
The arrangement will be the same as for cpufreq.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Andrew Lunn <andrew@lunn.ch> #for kirkwood
Acked-by: Jason Cooper <jason@lakedaemon.net> #for kirkwood
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix comment format for the kernel doc script.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The usual scheme to initialize a cpuidle driver on a SMP is:
cpuidle_register_driver(drv);
for_each_possible_cpu(cpu) {
device = &per_cpu(cpuidle_dev, cpu);
cpuidle_register_device(device);
}
This code is duplicated in each cpuidle driver.
On UP systems, it is done this way:
cpuidle_register_driver(drv);
device = &per_cpu(cpuidle_dev, cpu);
cpuidle_register_device(device);
On UP, the macro 'for_each_cpu' does one iteration:
#define for_each_cpu(cpu, mask) \
for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
Hence, the initialization loop is the same for UP than SMP.
Beside, we saw different bugs / mis-initialization / return code unchecked in
the different drivers, the code is duplicated including bugs. After fixing all
these ones, it appears the initialization pattern is the same for everyone.
Please note, some drivers are doing dev->state_count = drv->state_count. This is
not necessary because it is done by the cpuidle_enable_device function in the
cpuidle framework. This is true, until you have the same states for all your
devices. Otherwise, the 'low level' API should be used instead with the specific
initialization for the driver.
Let's add a wrapper function doing this initialization with a cpumask parameter
for the coupled idle states and use it for all the drivers.
That will save a lot of LOC, consolidate the code, and the modifications in the
future could be done in a single place. Another benefit is the consolidation of
the cpuidle_device variable which is now in the cpuidle framework and no longer
spread accross the different arch specific drivers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>