Pull RAS updates from Ingo Molnar:
"Various RAS updates:
- AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable
MCA) error decoding support (Aravind Gopalakrishnan)
- x86 memcpy_mcsafe() support, to enable smart(er) hardware error
recovery in NVDIMM drivers, based on an extension of the x86
exception handling code. (Tony Luck)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
EDAC/sb_edac: Fix computation of channel address
x86/mm, x86/mce: Add memcpy_mcsafe()
x86/mce/AMD: Document some functionality
x86/mce: Clarify comments regarding deferred error
x86/mce/AMD: Fix logic to obtain block address
x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors
x86/mce: Move MCx_CONFIG MSR definitions
x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries
x86/mm: Expand the exception table logic to allow new handling options
x86/mce/AMD: Set MCAX Enable bit
x86/mce/AMD: Carve out threshold block preparation
x86/mce/AMD: Fix LVT offset configuration for thresholding
x86/mce/AMD: Reduce number of blocks scanned per bank
x86/mce/AMD: Do not perform shared bank check for future processors
x86/mce: Fix order of AMD MCE init function call
Large memory Haswell-EX systems with multiple DIMMs per channel were
sometimes reporting the wrong DIMM.
Found three problems:
1) Debug printouts for socket and channel interleave were not interpreting
the register fields correctly. The socket interleave field is a 2^X
value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2,
2=3. 3=4).
2) Actual use of the socket interleave value didn't interpret as 2^X
3) Conversion of address to channel address was complicated, and wrong.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
They're both running only when ->edac_check is initialized so remove
that check from the workqueue function itself. Synchronize/generalize
the ->op_state check between the two.
Kill useless comments, while at it.
Signed-off-by: Borislav Petkov <bp@suse.de>
We use the ->edac_check function pointers to determine whether we need
to setup a polling workqueue. However, the destroy path is not balanced
and we might try to teardown an unitialized workqueue.
Balance init and destroy paths by looking at ->edac_check in both cases.
Set op_state to OP_OFFLINE *before* destroying anything.
Reported-by: Zhiqiang Hou <Zhiqiang.Hou@freescale.com>
Cc: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Hide the EDAC workqueue pointer in a separate compilation unit and add
accessors for the workqueue manipulations needed.
Remove edac_pci_reset_delay_period() which wasn't used by anything. It
seems it got added without a user with
91b99041c1 ("drivers/edac: updated PCI monitoring")
Signed-off-by: Borislav Petkov <bp@suse.de>
It cannot fail now. We either load EDAC core after having successfully
initialized edac_subsys or we don't.
Signed-off-by: Borislav Petkov <bp@suse.de>
This was really dumb - reference counting for the main EDAC sysfs
object. While we could've simply registered it as the first thing in the
module init path and then hand it around to what needs it.
Do that and rip out all the code around it, thus simplifying the whole
handling significantly.
Move the edac_subsys node back to edac_module.c.
Signed-off-by: Borislav Petkov <bp@suse.de>
EDAC workqueue destruction is really fragile. We cancel delayed work
but if it is still running and requeues itself, we still go ahead and
destroy the workqueue and the queued work explodes when workqueue core
attempts to run it.
Make the destruction more robust by switching op_state to offline so
that requeuing stops. Cancel any pending work *synchronously* too.
EDAC i7core: Driver loaded.
general protection fault: 0000 [#1] SMP
CPU 12
Modules linked in:
Supported: Yes
Pid: 0, comm: kworker/0:1 Tainted: G IE 3.0.101-0-default #1 HP ProLiant DL380 G7
RIP: 0010:[<ffffffff8107dcd7>] [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0
< ... regs ...>
Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600)
Stack:
...
Call Trace:
call_timer_fn
run_timer_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
smp_apic_timer_interrupt
apic_timer_interrupt
intel_idle
cpuidle_idle_call
cpu_idle
Code: ...
RIP __queue_work
RSP <...>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Knights Landing is the next generation architecture for HPC market.
KNL introduces concept of a tile and CHA - Cache/Home Agent for memory
accesses.
Some things are fixed in KNL:
() There's single DIMM slot per channel
() There's 2 memory controllers with 3 channels each, however,
from EDAC standpoint, it is presented as single memory controller
with 6 channels. In order to represent 2 MCs w/ 3 CH, it would
require major redesign of EDAC core driver.
Basically, two functionalities are added/extended:
() during driver initialization KNL topology is being recognized, i.e.
which channels are populated with what DIMM sizes
(knl_get_dimm_capacity function)
() handle MCE errors - channel swizzling
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lukasz.anaczkowski@intel.com
Link: http://lkml.kernel.org/r/1449136134-23706-5-git-send-email-hubert.chrzaniuk@intel.com
[ Rebase to 4.4-rc3. ]
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Pull asm-generic cleanups from Arnd Bergmann:
"The asm-generic changes for 4.4 are mostly a series from Christoph
Hellwig to clean up various abuses of headers in there. The patch to
rename the io-64-nonatomic-*.h headers caused some conflicts with new
users, so I added a workaround that we can remove in the next merge
window.
The only other patch is a warning fix from Marek Vasut"
* tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: temporarily add back asm-generic/io-64-nonatomic*.h
asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
gpio-mxc: stop including <asm-generic/bug>
n_tracesink: stop including <asm-generic/bug>
n_tracerouter: stop including <asm-generic/bug>
mlx5: stop including <asm-generic/kmap_types.h>
hifn_795x: stop including <asm-generic/kmap_types.h>
drbd: stop including <asm-generic/kmap_types.h>
move count_zeroes.h out of asm-generic
move io-64-nonatomic*.h out of asm-generic
Pull RAS changes from Ingo Molnar:
"The main system reliability related changes were from x86, but also
some generic RAS changes:
- AMD MCE error injection subsystem enhancements. (Aravind
Gopalakrishnan)
- Fix MCE and CPU hotplug interaction bug. (Ashok Raj)
- kcrash bootup robustness fix. (Baoquan He)
- kcrash cleanups. (Borislav Petkov)
- x86 microcode driver rework: simplify it by unmodularizing it and
other cleanups. (Borislav Petkov)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init()
x86/mce: Add a Scalable MCA vendor flags bit
MAINTAINERS: Unify the microcode driver section
x86/microcode/intel: Move #ifdef DEBUG inside the function
x86/microcode/amd: Remove maintainers from comments
x86/microcode: Remove modularization leftovers
x86/microcode: Merge the early microcode loader
x86/microcode: Unmodularize the microcode driver
x86/mce: Fix thermal throttling reporting after kexec
kexec/crash: Say which char is the unrecognized
x86/setup/crash: Check memblock_reserve() retval
x86/setup/crash: Cleanup some more
x86/setup/crash: Remove alignment variable
x86/setup: Cleanup crashkernel reservation functions
x86/amd_nb, EDAC: Rename amd_get_node_id()
x86/setup: Do not reserve crashkernel high memory if low reservation failed
x86/microcode/amd: Do not overwrite final patch levels
x86/microcode/amd: Extract current patch level read to a function
x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC
x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts
...
These are not implementations of default architecture code but helpers
for drivers. Move them to the place they belong to.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The scrub rate control register has moved to function 2 in PCI config
space and is at a different offset on family 0x15, models 0x60 and
later. The minimum recommended scrub rate has also changed. (Refer to
D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG).
Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate
this.
Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Cleanup conditionals. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Sysfs "dimm_label" and "chX_dimm_label" nodes have the following issues
in their store operation:
1) A newline-terminated input string causes redundant newlines:
# echo "test" > /sys/bus/mc0/devices/dimm0/dimm_label
# cat /sys/bus/mc0/devices/dimm0/dimm_label
test
# od -bc /sys/bus/mc0/devices/dimm0/dimm_label
0000000 164 145 163 164 012 012
t e s t \n \n
0000006
2) The original label string (31 characters) cannot be stored due to
an improper size check:
# echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0" > /sys/bus/mc0/devices/dimm0/dimm_label
# cat /sys/bus/mc0/devices/dimm0/dimm_label
# od -bc /sys/bus/mc0/devices/dimm0/dimm_label
0000000 012 012
\n \n
0000002
3) An input string longer than the buffer size results a wrong label
info as it allows a retry with the remaining string:
# echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0_TEST" > /sys/bus/mc0/devices/dimm0/dimm_label
# cat /sys/bus/mc0/devices/dimm0/dimm_label
_TEST
Fix these issues by making the following changes:
1) Replace a newline character at the end by setting a null. It also
assures that the string is null-terminated in the label buffer.
2) Check the label buffer size with 'sizeof(dimm->label)'.
3) Fail a request if its string exceeds the label buffer size.
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Robert Elliott <elliott@hpe.com>
Link: http://lkml.kernel.org/r/1443121564.25474.160.camel@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
After
7d375bffa5 ("sb_edac: Fix support for systems with two home agents per socket")
sysfs "dimm_label" and "chX_dimm_label" show their label string without a
newline "\n" at the end.
[root@orange ~]# cat /sys/bus/mc0/devices/dimm0/dimm_label
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]#
[root@orange ~]# cat /sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]#
The label strings now have 31 characters, which are the same as
EDAC_MC_LABEL_LEN. Since the snprintf()s in channel_dimm_label_show()
and dimmdev_label_show() limit the whole length by EDAC_MC_LABEL_LEN,
the newline in the format "%s\n" is ignored.
[root@orange ~]# od -bc /sys/bus/mc0/devices/dimm0/dimm_label
0000000 103 120 125 137 123 162 143 111 104 043 060 137 110 141 043 060
C P U _ S r c I D # 0 _ H a # 0
0000020 137 103 150 141 156 043 060 137 104 111 115 115 043 060 000
_ C h a n # 0 _ D I M M # 0 \0
0000037
Fix it by using 'sizeof(dimm->label) + 1' as the whole length in the
snprintf()s in channel_dimm_label_show() and dimmdev_label_show().
Reported-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1442933883-21587-2-git-send-email-toshi.kani@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
In commit
7d375bffa5 ("sb_edac: Fix support for systems with two home agents per socket")
NUM_CHANNELS was changed to 8 and the channel space was renumerated to
handle EN, EP, and EX configurations.
The *_mci_bind_devs() functions - except for sbridge_mci_bind_devs() -
got a new device presence check in the form of saw_chan_mask. However,
sbridge_mci_bind_devs() still uses the NUM_CHANNELS for loop.
With the increase in NUM_CHANNELS, this loop fails at index 4 since
SB only has 4 TADs. This results in the following error on SB machines:
EDAC sbridge: Some needed devices are missing
EDAC sbridge: Couldn't find mci handler
EDAC sbridge: Couldn't find mci handle
This patch adapts the saw_chan_mask logic for sbridge_mci_bind_devs() as
well.
After this patch:
EDAC MC0: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#0: DEV 0000:3f:0e.0 (POLLED)
EDAC MC1: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#1: DEV 0000:7f:0e.0 (POLLED)
Signed-off-by: Seth Jennings <sjenning@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.2
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1438798561-10180-1-git-send-email-sjenning@redhat.com
Signed-off-by: Borislav Petkov <bp@suse.de>
This driver creates its debugfs hierarchy under the toplevel debugfs dir
- see i5100_init() - so make it use edac_debugfs_create_dir_at( , NULL)
because we're not breaking userspace. Oh well.
Signed-off-by: Borislav Petkov <bp@suse.de>