processor.bm_check_disable=1" prevents Linux from checking BM_STS
before entering C3-type cpu power states.
This may be useful for a system running acpi_idle
where the BIOS exports FADT C-states, _CST IO C-states,
or _CST FFH C-states with the BM_STS bit set;
while configuring the chipset to set BM_STS
more frequently than perhaps is optimal.
Note that such systems may have been developed
using a tickful OS that would quickly clear BM_STS,
rather than a tickless OS that may go for some time
between checking and clearing BM_STS.
Note also that an alternative for newer systems
is to use the intel_idle driver, which always
ignores BM_STS, relying Linux device drivers
to register constraints explicitly via PM_QOS.
https://bugzilla.kernel.org/show_bug.cgi?id=15886
Signed-off-by: Len Brown <len.brown@intel.com>
It turns out that there is a bit in the _CST for Intel FFH C3
that tells the OS if we should be checking BM_STS or not.
Linux has been unconditionally checking BM_STS.
If the chip-set is configured to enable BM_STS,
it can retard or completely prevent entry into
deep C-states -- as illustrated by turbostat:
http://userweb.kernel.org/~lenb/acpi/utils/pmtools/turbostat/
ref: Intel Processor Vendor-Specific ACPI Interface Specification
table 4 "_CST FFH GAS Field Encoding"
Bit 1: Set to 1 if OSPM should use Bus Master avoidance for this C-state
https://bugzilla.kernel.org/show_bug.cgi?id=15886
Signed-off-by: Len Brown <len.brown@intel.com>
When booting 2.6.35-rc3 on a x86 system without an ERST ACPI table with
the 'quiet' option, we still observe an "ERST: Table is not found!"
warning.
Quiesce it to the same info log level as the other 'table not found'
warnings.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implicit slab.h inclusion via percpu.h is about to go away. Make sure
gfp.h or slab.h is included as necessary.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
After commit 9630bdd9b1
(ACPI: Use GPE reference counting to support shared GPEs) the wakeup
enable mask bits of GPEs are set as soon as the GPEs are enabled to
wake up the system. Unfortunately, this leads to a regression
reported by Michal Hocko, where a system is woken up from ACPI S5 by
a device that is not supposed to do that, because the wakeup enable
mask bit of this device's GPE is always set when
acpi_enter_sleep_state() calls acpi_hw_enable_all_wakeup_gpes(),
although it should only be set if the device is supposed to wake up
the system from the target state.
To work around this issue, rework the ACPI power management code so
that GPEs are not enabled to wake up the system upfront, but only
during a system state transition when the target state of the system
is known. [Of course, this means that the reference counting of
"wakeup" GPEs doesn't really make sense and it is sufficient to
set/unset the wakeup mask bits for them during system sleep
transitions. This will allow us to simplify the GPE handling code
quite a bit, but that change is too intrusive for 2.6.35.]
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=15951
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
This feature is optional and is enabled if the BIOS requests any
Windows OSI strings. It can also be enabled by the host OS.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
To prevent accidental deep sleeps, limit the maximum time that
Sleep() will sleep. Configurable, default maximum is two seconds.
ACPICA bugzilla 854.
http://www.acpica.org/bugzilla/show_bug.cgi?id=854
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The sysfs interface allowing user space to disable/enable GPEs
doesn't work correctly, because a GPE disabled this way will be
re-enabled shortly by acpi_ev_asynch_enable_gpe() if it was
previosuly enabled by acpi_enable_gpe() (in which case the
corresponding bit in its enable register's enable_for_run mask is
set).
To address this issue make the sysfs GPE interface use
acpi_enable_gpe() and acpi_disable_gpe() instead of acpi_set_gpe()
so that GPE reference counters are modified by it along with the
values of GPE enable registers.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
While developing the GPE reference counting code we overlooked the
fact that acpi_ev_update_gpes() could have enabled GPEs before
acpi_ev_initialize_gpe_block() was called. As a result, some GPEs
are enabled twice during the initialization.
To fix this issue avoid calling acpi_enable_gpe() from
acpi_ev_initialize_gpe_block() for the GPEs that have nonzero
runtime reference counters.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
ACPICA uses acpi_hw_write_gpe_enable_reg() to re-enable a GPE after
an event signaled by it has been handled. However, this function
writes the entire GPE enable mask to the GPE's enable register which
may not be correct. Namely, if one of the other GPEs in the same
register was previously enabled by acpi_enable_gpe() and subsequently
disabled using acpi_set_gpe(), acpi_hw_write_gpe_enable_reg() will
re-enable it along with the target GPE.
To fix this issue rework acpi_hw_write_gpe_enable_reg() so that it
calls acpi_hw_low_set_gpe() with a special action value,
ACPI_GPE_COND_ENABLE, that will make it only enable the GPE if the
corresponding bit in its register's enable_for_run mask is set.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
ACPICA uses acpi_ev_enable_gpe() for enabling GPEs at the low level,
which is incorrect, because this function only enables the GPE if the
corresponding bit in its enable register's enable_for_run mask is set.
This causes acpi_set_gpe() to work incorrectly if used for enabling
GPEs that were not previously enabled with acpi_enable_gpe(). As a
result, among other things, wakeup-only GPEs are never enabled by
acpi_enable_wakeup_device(), so the devices that use them are unable
to wake up the system.
To fix this issue remove acpi_ev_enable_gpe() and its counterpart
acpi_ev_disable_gpe() and replace acpi_hw_low_disable_gpe() with
acpi_hw_low_set_gpe() that will be used instead to manipulate GPE
enable bits at the low level. Make the users of acpi_ev_enable_gpe()
and acpi_ev_disable_gpe() call acpi_hw_low_set_gpe() instead and
make sure that GPE enable masks are only updated by acpi_enable_gpe()
and acpi_disable_gpe() when GPE reference counters change from 0
to 1 and from 1 to 0, respectively.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
In quite a few places ACPICA needs to compute a GPE enable mask with
only one bit, corresponding to a given GPE, set. Currently, that
computation is always open coded which leads to unnecessary code
duplication. Fix this by introducing a helper function for computing
one-bit GPE enable masks and using it where appropriate.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Commit 0f849d2cc6 (ACPICA: Minimize
the differences between linux GPE code and ACPICA code base)
introduced a change attempting to disable a GPE before installing
a handler for it in acpi_install_gpe_handler() which was incorrect.
First, the GPE disabled by it is never enabled again (except during
resume) which leads to battery insert/remove events not being
reported on the Maxim Levitsky's machine. Second, the disabled
GPE is still reported as enabled by the sysfs interface that only
checks its enable register's enable_for_run mask.
Revert this change for now, because it causes more damage to happen
than the bug it was supposed to fix.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Disable Vista compatibility for Sony VGN-NS50B_L.
https://bugzilla.kernel.org/show_bug.cgi?id=12904#c46
Note that this change is a workaround, not a permanent fix.
For the permanent fix is to figure out what compatibility
means and to actually be compatible...
Tested-by: Voldemar <harestomper@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
https://bugzilla.kernel.org/show_bug.cgi?id=13931 describes a bug where
a system fails to successfully resume after the second suspend. Maxim
Levitsky discovered that this could be rectified by forcibly saving
and restoring the ACPI non-volatile state. The spec indicates that this
is only required for S4, but testing the behaviour of Windows by adding
an ACPI NVS region to qemu's e820 map and registering a custom memory
read/write handler reveals that it's saved and restored even over suspend
to RAM. We should mimic that behaviour to avoid other broken platforms.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Saving platform non-volatile state may be required for suspend to RAM as
well as hibernation. Move it to more generic code.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Patch is against latest Linus master branch and is expected to be
safe bug fix.
You get:
ACPI: HARDWARE addr space,NOT supported yet
for each ACPI defined CPU which status is active, but exceeds
maxcpus= count.
As these "not booted" CPUs do not run an idle routine
and echo X >/proc/acpi/processor/*/throttling did not work
I couldn't find a way to really access not onlined/booted
machines. Still this should get fixed and
/proc/acpi/processor/X dirs of cores exceeding maxcpus
should not show up.
I wonder whether this could get cleaned up by truncating possible cpu mask
and nr_cpu_ids to setup_max_cpus early some day
(and not exporting setup_max_cpus anymore then).
But this needs touching of a lot other places...
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: travis@sgi.com
CC: linux-acpi@vger.kernel.org
CC: lenb@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_enter_[simple|bm] routines does us to pm tick conversion on every
idle wakeup and the value is only used in /proc/acpi display. We can
store the time in us and convert it into pm ticks before printing instead and
avoid the conversion in the common path.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The C-state idle time is not calculated correctly, which will return the wrong
residency time in C-state. It will have the following effects:
1. The system can't choose the deeper C-state when it is idle next time.
Of course the system power is increased. E.g. On one server machine about 40W
idle power is increased.
2. The powertop shows that it will stay in C0 running state about 95% time
although the system is idle at most time.
2.6.35-rc1 regression caused-by: 2da513f582
(ACPI: Minor cleanup eliminating redundant PMTIMER_TICKS to NS conversion)
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Reported-by: Yu Zhidong <zhidong.yu@intel.com>
Tested-by: Yu Zhidong <zhidong.yu@intel.com>
Acked-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
As suggested in Venki's suggestion in the commit 0dc698b,
add LAPIC unstable detection in the acpi_pad drvier too.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The names of the functions used for blocking/unblocking EC
transactions during suspend/hibernation suggest that the transactions
are suspended and resumed by them, while in fact they are disabled
and enabled. Rename the functions (and the flag used by them) to
better reflect what they really do.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
There still is a race that may result in suspending the system in
the middle of an EC transaction in progress, which leads to problems
(like the kernel thinking that the ACPI global lock is held during
resume while in fact it's not).
To remove the race condition, modify the ACPI platform suspend and
hibernate callbacks so that EC transactions are blocked right after
executing the _PTS global control method and are allowed to happen
again right after the low-level wakeup.
Introduce acpi_pm_freeze() that will disable GPEs, wait until the
event queues are empty and block EC transactions. Use it wherever
GPEs are disabled in preparation for switching local interrupts off.
Introduce acpi_pm_thaw() that will allow EC transactions to happen
again and enable runtime GPEs. Use it to balance acpi_pm_freeze()
wherever necessary.
In addition to that use acpi_ec_resume_transactions_early() to
unblock EC transactions as early as reasonably possible during
resume. Also unblock EC transactions in acpi_hibernation_finish()
and in the analogous suspend routine to make sure that the EC
transactions are enabled in all error paths.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=14668
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
intel_idle: native hardware cpuidle driver for latest Intel processors
ACPI: acpi_idle: touch TS_POLLING only in the non-MWAIT case
acpi_pad: uses MONITOR/MWAIT, so it doesn't need to clear TS_POLLING
sched: clarify commment for TS_POLLING
ACPI: allow a native cpuidle driver to displace ACPI
cpuidle: make cpuidle_curr_driver static
cpuidle: add cpuidle_unregister_driver() error check
cpuidle: fail to register if !CONFIG_CPU_IDLE
acpi pad driver kind of aggressively marks TSC as unstable at init
time, on mwait capable and non X86_FEATURE_NONSTOP_TSC systems. This is
irrespective of whether pad driver is ever going to be used on the
system or deep C-states are supported/used. This will affect every user
who just happens to compile in (or get a kernel version which
compiles in) acpi pad driver.
Move mark_tsc_unstable() out of init to the actual idle invocation path
of the pad driver.
There is also another bug/missing_feature in the code that it does not
support 'always running apic timer' and switches to broadcast mode
unconditionally. Shaohua, can you take a look at that please.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
drivers/acpi/sleep.h:3: WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_enter_[simple,bm] does
idle timing in ns, convert it to timeval, then to us, then to
pmtimer_ticks and then back to ns.
This patch changes things to
idle timing in ns, convert it to us, and then to pmtimer_ticks.
Just saves an imul along this path, but makes the code cleaner.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This EXPERIMENTAL driver supersedes acpi_idle on
Intel Atom Processors, Intel Core i3/i5/i7 Processors
and associated Intel Xeon processors.
It does not support the Intel Core2 processor or earlier.
For kernels configured with ACPI, CONFIG_INTEL_IDLE=y
allows intel_idle to probe before the ACPI processor driver.
Booting with "intel_idle.max_cstate=0" disables intel_idle
and the system will fall back on ACPI's "acpi_idle".
Typical Linux distributions load ACPI processor module early,
making CONFIG_INTEL_IDLE=m not easily useful on ACPI platforms.
intel_idle probes all processors at module_init time.
Processors that are hot-added later will be limited
to using C1 in idle.
Signed-off-by: Len Brown <len.brown@intel.com>
commit d306ebc286
(ACPI: Be in TS_POLLING state during mwait based C-state entry)
fixed an important power & performance issue where ACPI c2 and c3 C-states
were clearing TS_POLLING even when using MWAIT (ACPI_STATE_FFH).
That bug had been causing us to receive redundant scheduling interrups
when we had already been woken up by MONITOR/MWAIT.
Following up on that...
In the MWAIT case, we don't have to subsequently
check need_resched(), as that c heck was there
for the TS_POLLING-clearing case.
Note that not only does the cpuidle calling function
already check need_resched() before calling us, the
low-level entry into monitor/mwait calls it twice --
guaranteeing that a write to the trigger address
can not go un-noticed.
Also, in this case, we don't have to set TS_POLLING
when we wake, because we never cleared it.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Venkatesh Pallipadi <venki@google.com>
api_pad exclusively uses MONITOR/MWAIT to sleep in idle,
so it does not need the wakeup IPI during idle sleep
that is provoked by clearing TS_POLLING.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Shaohua Li <shaohua.li@intel.com>
The ACPI driver would fail probe when it found that
another driver had previously registered with cpuidle.
But this is a natural situation, as a native hardware
cpuidle driver should be able to bind instead of ACPI,
and the ACPI processor driver should be able to handle
yielding control of C-states while still handling
P-states and T-states.
Add a KERN_DEBUG line showing when acpi_idle
does successfully register.
Signed-off-by: Len Brown <len.brown@intel.com>
When the user passes the kernel parameter acpi_enforce_resources=lax,
the ACPI resources are no longer protected, so a native driver can
make use of them. In that case, we do not want the asus_atk0110 to be
loaded. Unfortunately, this driver loads automatically due to its
MODULE_DEVICE_TABLE, so the user ends up with two drivers loaded for
the same device - this is bad.
So I suggest that we prevent the asus_atk0110 driver from loading if
acpi_enforce_resources=lax.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Luca Tettamanti <kronos.it@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Remove own implementation of hex_to_bin().
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These were used before cpuidle by the native ACPI idle driver,
which tracked promotion and demotion between states.
The code was referenced by CONFIG_ACPI_PROCFS
for /proc/acpi/processor/*/power,
but as we no longer do promotion/demotion, that
reference has been a NOP since the transition.
Signed-off-by: Len Brown <len.brown@intel.com>
This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>