IRQ_WORK is used by GHES, but it is selected by PERF_EVENT.
For now PERF_EVENT is selected by x86 by default, but
in concept, IRQ_WORK should be selected by GHES, not by others.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Bit 0 of the support parameter to the OSC call should be set in order to
indicate that the OS supports the WHEA mechanism. Stuart Hayes tracked
an APEI issue on some Dell platforms down to this.
Reported-by: Stuart Hayes <Stuart_Hayes@Dell.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'apei-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI, APEI, EINJ Param support is disabled by default
APEI GHES: 32-bit buildfix
ACPI: APEI build fix
ACPI, APEI, GHES: Add hardware memory error recovery support
HWPoison: add memory_failure_queue()
ACPI, APEI, GHES, Error records content based throttle
ACPI, APEI, GHES, printk support for recoverable error via NMI
lib, Make gen_pool memory allocator lockless
lib, Add lock-less NULL terminated single list
Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHG
ACPI, APEI, Add WHEA _OSC support
ACPI, APEI, Add APEI bit support in generic _OSC call
ACPI, APEI, GHES, Support disable GHES at boot time
ACPI, APEI, GHES, Prevent GHES to be built as module
ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST
ACPI, APEI, Add apei_exec_run_optional
ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic
ACPI, APEI, ERST, Fix erst-dbg long record reading issue
ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabled
Some trivial conflicts due to other various merges
adding to the end of common lists sooner than this one.
arch/ia64/Kconfig
arch/powerpc/Kconfig
arch/x86/Kconfig
lib/Kconfig
lib/Makefile
Signed-off-by: Len Brown <len.brown@intel.com>
EINJ parameter support is only usable for some specific BIOS.
Originally, it is expected to have no harm for BIOS does not support
it. But now, we found it will cause issue (memory overwriting) for
some BIOS. So param support is disabled by default and only enabled
when newly added module parameter named "param_extension" is
explicitly specified.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
drivers/acpi/apei/ghes.c:542: warning: integer overflow in expression
drivers/acpi/apei/ghes.c:619: warning: integer overflow in expression
ghes.c:(.text+0x46289): undefined reference to `__udivdi3'
in function ghes_estatus_cache_add().
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Len Brown <len.brown@intel.com>
as GHES is optional...
When # CONFIG_ACPI_APEI_GHES is not set:
(.init.text+0x4c22): undefined reference to `ghes_disable'
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Len Brown <len.brown@intel.com>
memory_failure_queue() is called when recoverable memory errors are
notified by firmware to do the recovery work.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
printk is used by GHES to report hardware errors. Ratelimit is
enforced on the printk to avoid too many hardware error reports in
kernel log. Because there may be thousands or even millions of
corrected hardware errors during system running.
Currently, a simple scheme is used. That is, the total number of
hardware error reporting is ratelimited. This may cause some issues
in practice.
For example, there are two kinds of hardware errors occurred in
system. One is corrected memory error, because the fault memory
address is accessed frequently, there may be hundreds error report
per-second. The other is corrected PCIe AER error, it will be
reported once per-second. Because they share one ratelimit control
structure, it is highly possible that only memory error is reported.
To avoid the above issue, an error record content based throttle
algorithm is implemented in the patch. Where after the first
successful reporting, all error records that are same are throttled for
some time, to let other kinds of error records have the opportunity to
be reported.
In above example, the memory errors will be throttled for some time,
after being printked. Then the PCIe AER error will be printked
successfully.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some APEI GHES recoverable errors are reported via NMI, but printk is
not safe in NMI context.
To solve the issue, a lock-less memory allocator is used to allocate
memory in NMI handler, save the error record into the allocated
memory, put the error record into a lock-less list. On the other
hand, an irq_work is used to delay the operation from NMI context to
IRQ context. The irq_work IRQ handler will remove nodes from
lock-less list, printk the error record and do some further processing
include recovery operation, then free the memory.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (28 commits)
ACPI: delete stale reference in kernel-parameters.txt
ACPI: add missing _OSI strings
ACPI: remove NID_INVAL
thermal: make THERMAL_HWMON implementation fully internal
thermal: split hwmon lookup to a separate function
thermal: hide CONFIG_THERMAL_HWMON
ACPI print OSI(Linux) warning only once
ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX
ACPI / Battery: propagate sysfs error in acpi_battery_add()
ACPI / Battery: avoid acpi_battery_add() use-after-free
ACPI: introduce "acpi_rsdp=" parameter for kdump
ACPI: constify ops structs
ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
ACPI: fix 80 char overflow
ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
ACPI / Battery: Add the hibernation process in the battery_notify()
ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
ACPI / Battery: Change 16-bit signed negative battery current into correct value
ACPI / Battery: Add the power unit macro
...
Linux supports some optional features, but it should notify the BIOS about
them via the _OSI method. Currently Linux doesn't notify any, which might
make such features not work because the BIOS doesn't know about them.
Jarosz has a system which needs this to make ACPI processor aggregator
device work.
Reported-by: "Jarosz, Sebastian" <sebastian.jarosz@intel.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'pstore-efi' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
efivars: Introduce PSTORE_EFI_ATTRIBUTES
efivars: Use string functions in pstore_write
efivars: introduce utf16_strncmp
efivars: String functions
efi: Add support for using efivars as a pstore backend
pstore: Allow the user to explicitly choose a backend
pstore: Make "part" unsigned
pstore: Add extra context for writes and erases
pstore: Extend API for more flexibility in new backends
DMI workaround for A8N-SLI Premium and A8N-SLI DELUXE
to enable the s3 suspend old ordering.
http://bugzilla.kernel.org/show_bug.cgi?id=9528
Tested-by: Heiko Ettelbrück <hbruckynews@gmx.de>
Tested-by: Brian Beardall <brian@rapsure.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (135 commits)
drm/radeon/kms: fix DP training for DPEncoderService revision bigger than 1.1
drm/radeon/kms: add missing vddci setting on NI+
drm/radeon: Add a rmb() in IH processing
drm/radeon: ATOM Endian fix for atombios_crtc_program_pll()
drm/radeon: Fix the definition of RADEON_BUF_SWAP_32BIT
drm/radeon: Do an MMIO read on interrupts when not uisng MSIs
drm/radeon: Writeback endian fixes
drm/radeon: Remove a bunch of useless _iomem casts
drm/gem: add support for private objects
DRM: clean up and document parsing of video= parameter
DRM: Radeon: Fix section mismatch.
drm: really make debug levels match in edid failure code
drm/radeon/kms: fix i2c map for rv250/280
drm/nouveau/gr: disable fifo access and idle before suspend ctx unload
drm/nouveau: pass flag to engine fini() method on suspend
drm/nouveau: replace nv04_graph_fifo_access() use with direct reg bashing
drm/nv40/gr: rewrite/split context takedown functions
drm/nouveau: detect disabled device in irq handler and return IRQ_NONE
drm/nouveau: ignore connector type when deciding digital/analog on DVI-I
drm/nouveau: Add a quirk for Gigabyte NX86T
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
fs: Merge split strings
treewide: fix potentially dangerous trailing ';' in #defined values/expressions
uwb: Fix misspelling of neighbourhood in comment
net, netfilter: Remove redundant goto in ebt_ulog_packet
trivial: don't touch files that are removed in the staging tree
lib/vsprintf: replace link to Draft by final RFC number
doc: Kconfig: `to be' -> `be'
doc: Kconfig: Typo: square -> squared
doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
drivers/net: static should be at beginning of declaration
drivers/media: static should be at beginning of declaration
drivers/i2c: static should be at beginning of declaration
XTENSA: static should be at beginning of declaration
SH: static should be at beginning of declaration
MIPS: static should be at beginning of declaration
ARM: static should be at beginning of declaration
rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Update my e-mail address
PCIe ASPM: forcedly -> forcibly
gma500: push through device driver tree
...
Fix up trivial conflicts:
- arch/arm/mach-ep93xx/dma-m2p.c (deleted)
- drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
- drivers/net/r8169.c (just context changes)
We'll never have a negative part, so just make this an unsigned int.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
EFI only provides small amounts of individual storage, and conventionally
puts metadata in the storage variable name. Rather than add a metadata
header to the (already limited) variable storage, it's easier for us to
modify pstore to pass all the information we need to construct a unique
variable name to the appropriate functions.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Some pstore implementations may not have a static context, so extend the
API to pass the pstore_info struct to all calls and allow for a context
pointer.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
All these are instances of
#define NAME value;
or
#define NAME(params_opt) value;
These of course fail to build when used in contexts like
if(foo $OP NAME)
while(bar $OP NAME)
and may silently generate the wrong code in contexts such as
foo = NAME + 1; /* foo = value; + 1; */
bar = NAME - 1; /* bar = value; - 1; */
baz = NAME & quux; /* baz = value; & quux; */
Reported on comp.lang.c,
Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com>
Initial analysis of the dangers provided by Keith Thompson in that thread.
There are many more instances of more complicated macros having unnecessary
trailing semicolons, but this pile seems to be all of the cases of simple
values suffering from the problem. (Thus things that are likely to be found
in one of the contexts above, more complicated ones aren't.)
Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Make sure the error return from sysfs_add_battery() is checked and
propagated out from acpi_battery_add().
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When acpi_battery_add_fs() fails the error handling code does not clean
up completely. Moreover, it does not return resulting in a
use-after-free.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
There is a problem with putting the first kernel in EFI virtual mode,
it is that when the second kernel comes up it tries to initialize the
EFI again and once we have put EFI in virtual mode we can not really
do that.
Actually, EFI is not necessary for kdump, we can boot the second kernel
with "noefi" parameter, but the boot will mostly fail because 2nd kernel
cannot find RSDP.
In this situation, we introduced "acpi_rsdp=" kernel parameter, so that
kexec-tools can pass the "noefi acpi_rsdp=X" to the second kernel to
make kdump works. The physical address of the RSDP can be got from
sysfs(/sys/firmware/efi/systab).
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Structs battery_file, acpi_dock_ops, file_operations,
thermal_cooling_device_ops, thermal_zone_device_ops, kernel_param_ops
are not changed in runtime. It is safe to make them const.
register_hotplug_dock_device() was altered to take const "ops" argument
to respect acpi_dock_ops' const notion.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The following was observed by Steve Rostedt on 3.0.0-rc5
Backtrace:
irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 65, comm: irq/16-uhci_hcd Not tainted 3.0.0-rc5-test+ #94
Call Trace:
[<ffffffff810aa643>] __report_bad_irq+0x37/0xc1
[<ffffffff810aaa2d>] note_interrupt+0x14e/0x1c9
[<ffffffff810a9a05>] ? irq_thread_fn+0x3c/0x3c
[<ffffffff810a990e>] irq_thread+0xf6/0x1b1
[<ffffffff810a9818>] ? irq_finalize_oneshot+0xb3/0xb3
[<ffffffff8106b4d6>] kthread+0x9f/0xa7
[<ffffffff814f1f04>] kernel_thread_helper+0x4/0x10
[<ffffffff8103ca09>] ? finish_task_switch+0x7b/0xc0
[<ffffffff814eac78>] ? retint_restore_args+0x13/0x13
[<ffffffff8106b437>] ? __init_kthread_worker+0x5a/0x5a
[<ffffffff814f1f00>] ? gs_change+0x13/0x13
handlers:
[<ffffffff810a912d>] irq_default_primary_handler threaded [<ffffffff8135eaa6>] usb_hcd_irq
[<ffffffff810a912d>] irq_default_primary_handler threaded [<ffffffff8135eaa6>] usb_hcd_irq
Disabling IRQ #16
The problem being that a device triggers boot interrupts (due to threaded
interrupt handling and masking of the IO-APIC), which are forwarded
to the PIRQ line of the device. These interrupts are not handled on the PIRQ
line because the interrupt handler is not present there.
This should have already been fixed by CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS.
However some parts of the quirk got lost in the ACPI merge. This is a resent of
the patch proposed in 2009.
See http://lkml.org/lkml/2009/9/7/192
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Trivial fix for 80 char line overflow in drivers/acpi/pci_root.c
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Len Brown <len.brown@intel.com>
Use battery->lock in sysfs_remove_battery() to make
checking, removing, and clearing bat.dev atomic.
This is necessary because sysfs_remove_battery() may
be invoked concurrently from different paths.
https://bugzilla.kernel.org/show_bug.cgi?id=35642
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
In the commit 25be5821, add the refresh sysfs when system resumes
from suspending. But it didn't check that the battery exists. This
will cause battery sysfs files added when the battery doesn't exist.
This patch add the check before refreshing.
https://bugzilla.kernel.org/show_bug.cgi?id=35642
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The Commit 25be58215 has added a PM notifier to refresh the sys in order
to deal with the unit change of the Battery Present Rate. But it just
consided the suspend situation. The problem also will happen during the
hibernation according the bug 28192.
https://bugzilla.kernel.org/show_bug.cgi?id=28192
This patch adds the hibernation process and fix the bug.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch is cosmetic only, and makes no functional change.
Since the acpi_battery_quirks has been deleted, rename
acpi_battery_quirks2 with acpi_battery_quirks to clean the code.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch is for some machines which report the battery current
as a 16-bit signed negative when it is charging. This is caused
by DSDT bug. The commit bc76f90b8a
has resolved the problem for Acer laptops. But some other machines
also have such problem.
https://bugzilla.kernel.org/show_bug.cgi?id=33722
Since it is improper that the current is above 32A on laptops
whether on AC or on battery, this patch is to check the current and
take its absolute value as current and producing a message when it
is negative in s16.
Remove Acer quirk, as this workaround handles Acer too.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch is cosmetic only, and makes no functional change.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
We can only sort the _TSS return package if there is no _PSS
in the same scope. This is because if _PSS is present, the ACPI
specification dictates that the _TSS Power Dissipation field is
to be ignored, and therefore some BIOSs leave garbage values in
the _TSS Power field(s). In this case, it is best to just return
the _TSS package as-is.
Reported-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Runtime option can be used to disable return value repair if this
is causing a problem on a particular machine.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Affects both iASL and core ACPICA.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Now, only allow "SSDT" "OEM", and a null signature.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
APEI firmware first mode must be turned on explicitly on some
machines, otherwise there may be no GHES hardware error record for
hardware error notification. APEI bit in generic _OSC call can be
used to do that, but on some machine, a special WHEA _OSC call must be
used. This patch adds the support to that WHEA _OSC call.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
In APEI firmware first mode, hardware error is reported by hardware to
firmware firstly, then firmware reports the error to Linux in a GHES
error record via POLL/SCI/IRQ/NMI etc.
This may result in some issues if OS has no full APEI support. So
some firmware implementation will work in a back-compatible mode by
default. Where firmware will only notify OS in old-fashion, without
GHES record. For example, for a fatal hardware error, only NMI is
signaled, no GHES record.
To gain full APEI power on these machines, APEI bit in generic _OSC
call can be specified to tell firmware that Linux has full APEI
support. This patch adds the APEI bit support in generic _OSC call.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some machine may have broken firmware so that GHES and firmware first
mode should be disabled. This patch adds support to that.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
GHES (Generic Hardware Error Source) is used to process hardware error
notification in firmware first mode. But because firmware first mode
can be turned on but can not be turned off, it is unreasonable to
unload the GHES module with firmware first mode turned on. To avoid
confusion, this patch makes GHES can be enabled/disabled in
configuration time, but not built as module and unloaded at run time.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch changes APEI EINJ and ERST to use apei_exec_run for
mandatory actions, and apei_exec_run_optional for optional actions.
Cc: Thomas Renninger <trenn@novell.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>