While the original code is valid, it is not the obvious choice for the
sizeof() call and in preparation to limit the scope of the list iterator
variable the sizeof should be changed to the size of the variable
being allocated.
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Platforms with large BERT table data can trigger soft lockup errors
while attempting to print the entire BERT table data to the console at
boot:
watchdog: BUG: soft lockup - CPU#160 stuck for 23s! [swapper/0:1]
Observed on Ampere Altra systems with a single BERT record of ~250KB.
The original bert driver appears to have assumed relatively small table
data. Since it is impractical to reassemble large table data from
interwoven console messages, and the table data is available in
/sys/firmware/acpi/tables/data/BERT
limit the size for tables printed to the console to 1024 (for no reason
other than it seemed like a good place to kick off the discussion, would
appreciate feedback from existing users in terms of what size would
maintain their current usage model).
Alternatively, we could make printing a CONFIG option, use the
bert_disable boot arg (or something similar), or use a debug log level.
However, all those solutions require extra steps or change the existing
behavior for small table data. Limiting the size preserves existing
behavior on existing platforms with small table data, and eliminates the
soft lockups for platforms with large table data, while still making it
available.
Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
__setup() handlers should return 1 to indicate that the boot option
has been handled. Returning 0 causes a boot option to be listed in
the Unknown kernel command line parameters and also added to init's
arg list (if no '=' sign) or environment list (if of the form 'a=b').
Unknown kernel command line parameters "erst_disable
bert_disable hest_disable BOOT_IMAGE=/boot/bzImage-517rc6", will be
passed to user space.
Run /sbin/init as init process
with arguments:
/sbin/init
erst_disable
bert_disable
hest_disable
with environment:
HOME=/
TERM=linux
BOOT_IMAGE=/boot/bzImage-517rc6
Fixes: a3e2acc5e3 ("ACPI / APEI: Add Boot Error Record Table (BERT) support")
Fixes: a08f82d080 ("ACPI, APEI, Error Record Serialization Table (ERST) support")
Fixes: 9dc9666416 ("ACPI, APEI, HEST table parsing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ghes_init() sticks out in acpi_init() because it is the only functions
without an "acpi_" prefix.
Rename ghes_init with an "acpi_" prefix, then all looks fine.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
From commit e147133a42 ("ACPI / APEI: Make hest.c manage the estatus
memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage
the estatus memory pool. On the other hand, ghes_init() relies on
sdei_init() to detect the SDEI version and (un)register events. The
dependencies are as follows:
ghes_init() => acpi_hest_init() => acpi_bus_init() => acpi_init()
ghes_init() => sdei_init()
HEST is not PCI-specific and initcall ordering is implicit and not
well-defined within a level.
Based on above, remove acpi_hest_init() from acpi_pci_root_init() and
convert ghes_init() and sdei_init() from initcalls to explicit calls in the
following order:
acpi_hest_init()
ghes_init()
sdei_init()
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
SGX EPC pages do not have a "struct page" associated with them so the
pfn_valid() sanity check fails and results in a warning message to
the console.
Add an additional check to skip the warning if the address of the error
is in an SGX EPC page.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20211026220050.697075-8-tony.luck@intel.com
SGX reserved memory does not appear in the standard address maps.
Add hook to call into the SGX code to check if an address is located
in SGX memory.
There are other challenges in injecting errors into SGX. Update the
documentation with a sequence of operations to inject.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20211026220050.697075-7-tony.luck@intel.com
apei_hest_parse() is only used in hest.c, so mark it static.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ rjw: Minor subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When injecting an error into the platform, the OSPM executes an
EXECUTE_OPERATION action to instruct the platform to begin the injection
operation. And then, the OSPM busy waits for a while by continually
executing CHECK_BUSY_STATUS action until the platform indicates that the
operation is complete. More specifically, the platform is limited to
respond within 1 millisecond right now. This is too strict for some
platforms.
For example, in Arm platform, when injecting a Processor Correctable error,
the OSPM will warn:
Firmware does not respond in time.
And a message is printed on the console:
echo: write error: Input/output error
We observe that the waiting time for DDR error injection is about 10 ms and
that for PCIe error injection is about 500 ms in Arm platform.
In this patch, we relax the response timeout to 1 second.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Before commit 8fcc4ae6fa ("arm64: acpi: Make apei_claim_sea()
synchronise with APEI's irq work"), do_sea() would unconditionally
signal the affected task from the arch code. Since that change,
the GHES driver sends the signals.
This exposes a problem as errors the GHES driver doesn't understand
or doesn't handle effectively are silently ignored. It will cause
the errors get taken again, and circulate endlessly. User-space task
get stuck in this loop.
Existing firmware on Kunpeng9xx systems reports cache errors with the
'ARM Processor Error' CPER records.
Do memory failure handling for ARM Processor Error Section just like
for Memory Error Section.
Fixes: 8fcc4ae6fa ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work")
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
[ rjw: Subject edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If ACPI is not enabled but support for ACPI and APEI is enabled in the
kernel, then the following warning is seen on boot ...
WARNING KERN EINJ: ACPI disabled.
For ARM64 platforms, the 'acpi_disabled' variable is true by default
and hence, the above is often seen on ARM64. Given that it can be
normal for ACPI to be disabled, make this an informational print rather
that a warning.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* acpi-misc:
ACPI: dock: fix some coding style issues
ACPI: sysfs: fix some coding style issues
ACPI: PM: add a missed blank line after declarations
ACPI: custom_method: fix a coding style issue
ACPI: CPPC: fix some coding style issues
ACPI: button: fix some coding style issues
ACPI: battery: fix some coding style issues
ACPI: acpi_pad: add a missed blank line after declarations
ACPI: LPSS: add a missed blank line after declarations
ACPI: ipmi: remove useless return statement for void function
ACPI: processor: fix some coding style issues
ACPI: APD: fix a block comment align issue
ACPI: AC: fix some coding style issues
ACPI: fix various typos in comments
The variable rc is being assigned a value that is never read,
the assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Eliminate the following coccicheck warning:
./drivers/acpi/apei/erst.c:691:2-3: Unneeded semicolon
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
From commit 6915564dc5 ("ACPI: OSL: Change the type of
acpi_os_map_generic_address() return value"),
acpi_os_map_generic_address() will return logical address or NULL
for error, but for ACPI_ADR_SPACE_SYSTEM_IO case, it should be also
return 0 as it's a normal case, but now it will return -ENXIO.
So check it out for such case to avoid einj module initialization
fail.
Fixes: 6915564dc5 ("ACPI: OSL: Change the type of acpi_os_map_generic_address() return value")
Cc: <stable@vger.kernel.org>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+SOXIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgptrcD/93VUDmRAn73ChKNd0TtXUicJlAlNLVjvfs
VFTXWBDnlJnGkZT7ElkDD9b8dsz8l4xGf/QZ5dzhC/th2OsfObQkSTfe0lv5cCQO
mX7CRSrDpjaHtW+WGPDa0oQsGgIfpqUz2IOg9NKbZZ1LJ2uzYfdOcf3oyRgwZJ9B
I3sh1vP6OzjZVVCMmtMTM+sYZEsDoNwhZwpkpiwMmj8tYtOPgKCYKpqCiXrGU0x2
ML5FtDIwiwU+O3zYYdCBWqvCb2Db0iA9Aov2whEBz/V2jnmrN5RMA/90UOh1E2zG
br4wM1Wt3hNrtj5qSxZGlF/HEMYJVB8Z2SgMjYu4vQz09qRVVqpGdT/dNvLAHQWg
w4xNCj071kVZDQdfwnqeWSKYUau9Xskvi8xhTT+WX8a5CsbVrM9vGslnS5XNeZ6p
h2D3Q+TAYTvT756icTl0qsYVP7PrPY7DdmQYu0q+Lc3jdGI+jyxO2h9OFBRLZ3p6
zFX2N8wkvvCCzP2DwVnnhIi/GovpSh7ksHnb039F36Y/IhZPqV1bGqdNQVdanv6I
8fcIDM6ltRQ7dO2Br5f1tKUZE9Pm6x60b/uRVjhfVh65uTEKyGRhcm5j9ztzvQfI
cCBg4rbVRNKolxuDEkjsAFXVoiiEEsb7pLf4pMO+Dr62wxFG589tQNySySneUIVZ
J9ILnGAAeQ==
=aVWo
-----END PGP SIGNATURE-----
Merge tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block
Pull arch task_work cleanups from Jens Axboe:
"Two cleanups that don't fit other categories:
- Finally get the task_work_add() cleanup done properly, so we don't
have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates
all callers, and also fixes up the documentation for
task_work_add().
- While working on some TIF related changes for 5.11, this
TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch
duplication for how that is handled"
* tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block:
task_work: cleanup notification modes
tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
A previous commit changed the notification mode from true/false to an
int, allowing notify-no, notify-yes, or signal-notify. This was
backwards compatible in the sense that any existing true/false user
would translate to either 0 (on notification sent) or 1, the latter
which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2.
Clean this up properly, and define a proper enum for the notification
mode. Now we have:
- TWA_NONE. This is 0, same as before the original change, meaning no
notification requested.
- TWA_RESUME. This is 1, same as before the original change, meaning
that we use TIF_NOTIFY_RESUME.
- TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the
notification.
Clean up all the callers, switching their 0/1/false/true to using the
appropriate TWA_* mode for notifications.
Fixes: e91b481623 ("task_work: teach task_work_add() to do signal_wake_up()")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This eliminates the following sparse warning:
drivers/acpi/apei/apei-base.c:290:23: warning: symbol
'apei_resources_all' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPER records describing a firmware-first error are identified by GUID.
The ghes driver currently logs, but ignores any unknown CPER records.
This prevents describing errors that can't be represented by a standard
entry, that would otherwise allow a driver to recover from an error.
The UEFI spec calls these 'Non-standard Section Body' (N.2.3 of
version 2.8).
Add a notifier chain for these non-standard/vendor-records. Callers
must identify their type of records by GUID.
Record data is copied to memory from the ghes_estatus_pool to allow
us to keep it until after the notifier has run.
Co-developed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200903123456.1823-2-shiju.jose@huawei.com
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Modify acpi_os_map_generic_address() to return the pointer returned
by acpi_os_map_iomem() which represents the logical address
corresponding to the struct acpi_generic_address argument passed to
it or NULL if that address cannot be obtained (for example, the
argument does not represent an address in system memory or it could
not be mapped by the OS).
Among other things, that will allow the ACPI OS layer to pass the
logical addresses of the FADT GPE blocks 0 and 1 to ACPICA going
forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The variable rc is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Update the ACPICA code in the kernel to upstream revision
20200430:
* Move acpi_gbl_next_cmd_num definition (Erik Kaneda).
* Ignore AE_ALREADY_EXISTS status in the disassembler when parsing
create operators (Erik Kaneda).
* Add status checks to the dispatcher (Erik Kaneda).
* Fix required parameters for _NIG and _NIH (Erik Kaneda).
* Make acpi_protocol_lengths static (Yue Haibing).
- Fix ACPI table reference counting errors in several places, mostly
in error code paths (Hanjun Guo).
- Extend the Generic Event Device (GED) driver to support _Exx and
_Lxx handler methods (Ard Biesheuvel).
- Add new acpi_evaluate_reg() helper and modify the ACPI PCI hotplug
code to use it (Hans de Goede).
- Add new DPTF battery participant driver and make the DPFT power
participant driver create more sysfs device attributes (Srinivas
Pandruvada).
- Improve the handling of memory failures in APEI (James Morse).
- Add new blacklist entry for Acer TravelMate 5735Z to the backlight
driver (Paul Menzel).
- Add i2c address for thermal control to the PMIC driver (Mauro
Carvalho Chehab).
- Allow the ACPI processor idle driver to work on platforms with
only one ACPI C-state present (Zhang Rui).
- Fix kobject reference count leaks in error code paths in two
places (Qiushi Wu).
- Delete unused proc filename macros and make some symbols static
(Pascal Terjan, Zheng Zengkai, Zou Wei).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl7VHb8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxVboQAIjYda2RhQANIlIvoEa+Qd2/FBd3HXgU
Mv0LZ6y1xxxEZYeKne7zja1hzt5WetuZ1hZHGfg8YkXyrLqZGxfCIFbbhSA90BGG
PGzFerGmOBNzB3I9SN6iQY7vSqoFHvQEV1PVh24d+aHWZqj2lnaRRq+GT54qbRLX
/U3Hy5glFl8A/DCBP4cpoEjDr4IJHY68DathkDK2Ep2ybXV6B401uuqx8Su/OBd/
MQmJTYI1UK/RYBXfdzS9TIZahnkxBbU1cnLFy08Ve2mawl5YsHPEbvm77a0yX2M6
sOAerpgyzYNivAuOLpNIwhUZjpOY66nQuKAQaEl2cfRUkqt4nbmq7yDoH3d2MJLC
/Ccz955rV2YyD1DtyV+PyT+HB+/EVwH/+UCZ+gsSbdHvOiwdFU6VaTc2eI1qq8K9
4m5eEZFrAMPlvTzj/xVxr2Hfw1lbm23J5B5n7sM5HzYbT6MUWRQpvfV4zM3jTGz0
rQd8JmcHVvZk/MV1mGrYHrN5TnGTLWpbS4Yv1lAQa6FP0N0NxzVud7KRfLKnCnJ1
vh5yzW2fCYmVulJpuqxJDfXSqNV7n40CFrIewSp6nJRQXnWpImqHwwiA8fl51+hC
fBL72Ey08EHGFnnNQqbebvNglsodRWJddBy43ppnMHtuLBA/2GVKYf2GihPbpEBq
NHtX+Rd3vlWW
=xH3i
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to upstream revision
20200430, fix several reference counting errors related to ACPI
tables, add _Exx / _Lxx support to the GED driver, add a new
acpi_evaluate_reg() helper, add new DPTF battery participant driver
and extend the DPFT power participant driver, improve the handling of
memory failures in the APEI code, add a blacklist entry to the
backlight driver, update the PMIC driver and the processor idle
driver, fix two kobject reference count leaks, and make a few janitory
changes.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20200430:
- Move acpi_gbl_next_cmd_num definition (Erik Kaneda).
- Ignore AE_ALREADY_EXISTS status in the disassembler when parsing
create operators (Erik Kaneda).
- Add status checks to the dispatcher (Erik Kaneda).
- Fix required parameters for _NIG and _NIH (Erik Kaneda).
- Make acpi_protocol_lengths static (Yue Haibing).
- Fix ACPI table reference counting errors in several places, mostly
in error code paths (Hanjun Guo).
- Extend the Generic Event Device (GED) driver to support _Exx and
_Lxx handler methods (Ard Biesheuvel).
- Add new acpi_evaluate_reg() helper and modify the ACPI PCI hotplug
code to use it (Hans de Goede).
- Add new DPTF battery participant driver and make the DPFT power
participant driver create more sysfs device attributes (Srinivas
Pandruvada).
- Improve the handling of memory failures in APEI (James Morse).
- Add new blacklist entry for Acer TravelMate 5735Z to the backlight
driver (Paul Menzel).
- Add i2c address for thermal control to the PMIC driver (Mauro
Carvalho Chehab).
- Allow the ACPI processor idle driver to work on platforms with only
one ACPI C-state present (Zhang Rui).
- Fix kobject reference count leaks in error code paths in two places
(Qiushi Wu).
- Delete unused proc filename macros and make some symbols static
(Pascal Terjan, Zheng Zengkai, Zou Wei)"
* tag 'acpi-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
ACPI: CPPC: Fix reference count leak in acpi_cppc_processor_probe()
ACPI: sysfs: Fix reference count leak in acpi_sysfs_add_hotplug_profile()
ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
ACPI: DPTF: Add battery participant driver
ACPI: DPTF: Additional sysfs attributes for power participant driver
ACPI: video: Use native backlight on Acer TravelMate 5735Z
arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
ACPI: APEI: Kick the memory_failure() queue for synchronous errors
mm/memory-failure: Add memory_failure_queue_kick()
ACPI / PMIC: Add i2c address for thermal control
ACPI: GED: add support for _Exx / _Lxx handler methods
ACPI: Delete unused proc filename macros
ACPI: hotplug: PCI: Use the new acpi_evaluate_reg() helper
ACPI: utils: Add acpi_evaluate_reg() helper
ACPI: debug: Make two functions static
ACPI: sleep: Put the FACS table after using it
ACPI: scan: Put SPCR and STAO table after using it
ACPI: EC: Put the ACPI table after using it
ACPI: APEI: Put the HEST table for error path
ACPI: APEI: Put the error record serialization table for error path
...
These functions are not needed anymore because the vmalloc and ioremap
mappings are now synchronized when they are created or torn down.
Remove all callers and function definitions.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/20200515140023.25469-7-joro@8bytes.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memory_failure() offlines or repairs pages of memory that have been
discovered to be corrupt. These may be detected by an external
component, (e.g. the memory controller), and notified via an IRQ.
In this case the work is queued as not all of memory_failure()s work
can happen in IRQ context.
If the error was detected as a result of user-space accessing a
corrupt memory location the CPU may take an abort instead. On arm64
this is a 'synchronous external abort', and on a firmware first
system it is replayed using NOTIFY_SEA.
This notification has NMI like properties, (it can interrupt
IRQ-masked code), so the memory_failure() work is queued. If we
return to user-space before the queued memory_failure() work is
processed, we will take the fault again. This loop may cause platform
firmware to exceed some threshold and reboot when Linux could have
recovered from this error.
For NMIlike notifications keep track of whether memory_failure() work
was queued, and make task_work pending to flush out the queue.
To save memory allocations, the task_work is allocated as part of
the ghes_estatus_node, and free()ing it back to the pool is deferred.
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Tyler Baicar <baicar@os.amperecomputing.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
hest_tab will be used after hest_init(), but we need to release
it for error path.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The mapped error record serialization table needs to be
released for error path of erst_init().
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The mapped error injection table will be used after einj_init()
for debugfs, but it should be released for module exit and error
path of einj_init().
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The mapped boot error record table is not used after
bert_init(), release it.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 3f8fd02b1b ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path. While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.
Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.
To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:
* vmalloc_sync_mappings(), and
* vmalloc_sync_unmappings()
Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized. The only exception is the new call-site added in the
above mentioned commit.
Shile Zhang directed us to a report of an 80% regression in reaim
throughput.
Fixes: 3f8fd02b1b ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [GHES]
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org
Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the ghes_poll_func() timer callback is registered with the
TIMER_DEFERRABLE flag. Thus, it is run when the CPU eventually wakes
up together with a subsequent non-deferrable timer and not at the precisely
configured polling interval.
For polling mode, the polling interval configured by firmware should not
be exceeded according to the ACPI spec 6.3, Table 18-394. The definition
of the polling interval is:
"Indicates the poll interval in milliseconds OSPM should use to
periodically check the error source for the presence of an error
condition."
If this interval is extended due to the timer callback deferring, error
records can get lost. Which we are observing on our ThunderX2 platforms.
Therefore, remove the TIMER_DEFERRABLE flag so that the timer callback
executes at the precise interval.
Signed-off-by: Bhaskar Upadhaya <bupadhaya@marvell.com>
[ bp: Subject & changelog ]
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl3bpjoACgkQUqAMR0iA
lPJJDA/+IJT4YCRp2TwV2jvIs0QzvXZrzEsxgCLibLE85mYTJgoQBD3W1bH2eyjp
T/9U0Zh5PGr/84cHd4qiMxzo+5Olz930weG59NcO4RJBSr671aRYs5tJqwaQAZDR
wlwaob5S28vUmjPxKulvxv6V3FdI79ZE9xrCOCSTQvz4iCLsGOu+Dn/qtF64pImX
M/EXzPMBrByiQ8RTM4Ege8JoBqiCZPDG9GR3KPXIXQwEeQgIoeYxwRYakxSmSzz8
W8NduFCbWavg/yHhghHikMiyOZeQzAt+V9k9WjOBTle3TGJegRhvjgI7508q3tXe
jQTMGATBOPkIgFaZz7eEn/iBa3jZUIIOzDY93RYBmd26aBvwKLOma/Vkg5oGYl0u
ZK+CMe+/xXl7brQxQ6JNsQhbSTjT+746LvLJlCvPbbPK9R0HeKNhsdKpGY3ugnmz
VAnOFIAvWUHO7qx+J+EnOo5iiPpcwXZj4AjrwVrs/x5zVhzwQ+4DSU6rbNn0O1Ak
ELrBqCQkQzh5kqK93jgMHeWQ9EOUp1Lj6PJhTeVnOx2x8tCOi6iTQFFrfdUPlZ6K
2DajgrFhti4LvwVsohZlzZuKRm5EuwReLRSOn7PU5qoSm5rcouqMkdlYG/viwyhf
mTVzEfrfemrIQOqWmzPrWEXlMj2mq8oJm4JkC+jJ/+HsfK4UU8I=
=QCEy
-----END PGP SIGNATURE-----
Merge tag 'printk-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Allow to print symbolic error names via new %pe modifier.
- Use pr_warn() instead of the remaining pr_warning() calls. Fix
formatting of the related lines.
- Add VSPRINTF entry to MAINTAINERS.
* tag 'printk-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (32 commits)
checkpatch: don't warn about new vsprintf pointer extension '%pe'
MAINTAINERS: Add VSPRINTF
tools lib api: Renaming pr_warning to pr_warn
ASoC: samsung: Use pr_warn instead of pr_warning
lib: cpu_rmap: Use pr_warn instead of pr_warning
trace: Use pr_warn instead of pr_warning
dma-debug: Use pr_warn instead of pr_warning
vgacon: Use pr_warn instead of pr_warning
fs: afs: Use pr_warn instead of pr_warning
sh/intc: Use pr_warn instead of pr_warning
scsi: Use pr_warn instead of pr_warning
platform/x86: intel_oaktrail: Use pr_warn instead of pr_warning
platform/x86: asus-laptop: Use pr_warn instead of pr_warning
platform/x86: eeepc-laptop: Use pr_warn instead of pr_warning
oprofile: Use pr_warn instead of pr_warning
of: Use pr_warn instead of pr_warning
macintosh: Use pr_warn instead of pr_warning
idsn: Use pr_warn instead of pr_warning
ide: Use pr_warn instead of pr_warning
crypto: n2: Use pr_warn instead of pr_warning
...
As said in commit f2c2cbcc35 ("powerpc: Use pr_warn instead of
pr_warning"), removing pr_warning so all logging messages use a
consistent <prefix>_warn style. Let's do it.
Link: http://lkml.kernel.org/r/20191018031850.48498-8-wangkefeng.wang@huawei.com
To: linux-kernel@vger.kernel.org
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: two more indentation fixes]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Destroy ghes_estatus_pool and release memory allocated via vmalloc() on
errors in ghes_estatus_pool_init() in order to avoid memory leaks.
[ bp: do the labels properly and with descriptive names and massage. ]
Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1563173924-47479-1-git-send-email-zhangliguang@linux.alibaba.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This is a missed part of the commit 5b53696a30
("ACPI / APEI: Switch to use new generic UUID API"), i.e.
replacing old definition with a global constant variable.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Function __ghes_check_estatus() is always called after
__ghes_peek_estatus(), but it is already called in __ghes_peek_estatus().
So we should remove some needless __ghes_check_estatus() calls.
Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Based on 1 normalized pattern(s):
this file is licensed under gplv2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 22 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.129548190@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 655 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070034.575739538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* acpi-apei: (29 commits)
efi: cper: Fix possible out-of-bounds access
ACPI: APEI: Fix possible out-of-bounds access to BERT region
MAINTAINERS: Add James Morse to the list of APEI reviewers
ACPI / APEI: Add support for the SDEI GHES Notification type
firmware: arm_sdei: Add ACPI GHES registration helper
ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
ACPI / APEI: Only use queued estatus entry during in_nmi_queue_one_entry()
ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length
ACPI / APEI: Make GHES estatus header validation more user friendly
ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
ACPI / APEI: Let the notification helper specify the fixmap slot
ACPI / APEI: Move locking to the notification helper
arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
ACPI / APEI: Don't allow ghes_ack_error() to mask earlier errors
ACPI / APEI: Generalise the estatus queue's notify code
ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
ACPI / APEI: Remove spurious GHES_TO_CLEAR check
...
Check that the length recorded in the generic error status block is
within the region before checking the contents of the region itself.
Otherwise it may result in an out-of-bounds access if the system
firmware has generated a status block with an invalid length (larger
than the mapped region). Also move the block_status check so that it
only happens after the block has been verified to be within the mapped
region.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Tyler Baicar <baicar.tyler@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the GHES notification type is SDEI, register the provided event
using the SDEI-GHES helper.
SDEI may be one of two types of event, normal and critical. Critical
events can interrupt normal events, so these must have separate
fixmap slots and locks in case both event types are in use.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that ghes notification helpers provide the fixmap slots and
take the lock themselves, multiple NMI-like notifications can
be used on arm64.
These should be named after their notification method as they can't
all be called 'NMI'. x86's NOTIFY_NMI already is, change the SEA
fixmap entry to be called FIX_APEI_GHES_SEA.
Future patches can add support for FIX_APEI_GHES_SEI and
FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}.
Because all of ghes.c builds on both architectures, provide a
constant for each fixmap entry that the architecture will never
use.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Each struct ghes has an worst-case sized buffer for storing the
estatus. If an error is being processed by ghes_proc() in process
context this buffer will be in use. If the error source then triggers
an NMI-like notification, the same buffer will be used by
in_nmi_queue_one_entry() to stage the estatus data, before
__process_error() copys it into a queued estatus entry.
Merge __process_error()s work into in_nmi_queue_one_entry() so that
the queued estatus entry is used from the beginning. Use the new
ghes_peek_estatus() to know how much memory to allocate from
the ghes_estatus_pool before reading the records.
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Change since v6:
* Added a comment explaining the 'ack-error, then goto no_work'.
* Added missing esatus-clearing, which is necessary after reading the GAS,
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ghes_read_estatus() reads the record address, then the record's
header, then performs some sanity checks before reading the
records into the provided estatus buffer.
To provide this estatus buffer the caller must know the size of the
records in advance, or always provide a worst-case sized buffer as
happens today for the non-NMI notifications.
Add a function to peek at the record's header to find the size. This
will let the NMI path allocate the right amount of memory before reading
the records, instead of using the worst-case size, and having to copy
the records.
Split ghes_read_estatus() to create __ghes_peek_estatus() which
returns the address and size of the CPER records.
Signed-off-by: James Morse <james.morse@arm.com>
Changes since v7:
* Grammar
* concistent argument ordering
Changes since v6:
* Additional buf_addr = 0 error handling
* Moved checking out of peek-estatus
* Reworded an error message so we can tell them apart
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ghes_read_estatus() checks various lengths in the top-level header to
ensure the CPER records to be read aren't obviously corrupt.
Take the opportunity to make this more user-friendly, printing a
(ratelimited) message about the nature of the header format error.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: James Morse <james.morse@arm.com>
[ rjw: Add missing 'static' ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The NMI-like notifications scribble over ghes->estatus, before
copying it somewhere else. If this interrupts the ghes_probe() code
calling ghes_proc() on each struct ghes, the data is corrupted.
All the NMI-like notifications should use a queued estatus entry
from the beginning, instead of the ghes version, then copying it.
To do this, break up any use of "ghes->estatus" so that all
functions take the estatus as an argument.
This patch just moves these ghes->estatus dereferences into separate
arguments, no change in behaviour. struct ghes becomes unused in
ghes_clear_estatus() as it only wanted ghes->estatus, which we now
pass directly. This is removed.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>