Commit Graph

1779 Commits

Author SHA1 Message Date
Shubhrajyoti Datta
e2932d1f6f EDAC/synopsys: Read the error count from the correct register
Currently, the error count is read wrongly from the status register. Read
the count from the proper error count register (ERRCNT).

  [ bp: Massage. ]

Fixes: b500b4a029 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220414102813.4468-1-shubhrajyoti.datta@xilinx.com
2022-04-14 14:44:49 +02:00
Borislav Petkov
1422df58e5 Merge branch 'edac-amd64' into edac-updates-for-v5.18
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-21 10:34:57 +01:00
Rabara Niravkumar L
e1bca853dd EDAC/altera: Add SDRAM ECC check for U-Boot
A bug in legacy U-Boot causes a crash during SDRAM boot if ECC is not
enabled in the bitstream but enabled in the Linux config.

Memory mapped read of the ECC Enabled bit was only enabled if U-Boot
determined ECC was enabled in the bitstream.

The Linux driver checks the ECC enable bit using a memory map read.
In the ECC disabled bitstream case, U-Boot didn't enable ECC register
memory map reads and since they are not allowed this results in a crash.

Always read the ECC Enable register through a SMC call which is always
allowed and it works with legacy and current U-Boot.

  [ bp: Massage commit message. ]

Signed-off-by: Rabara Niravkumar L <niravkumar.l.rabara@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/r/20220305014118.4794-1-niravkumar.l.rabara@intel.com
2022-03-16 09:56:39 +01:00
Yazen Ghannam
2151c84ece EDAC/amd64: Add new register offset support and related changes
Introduce a "family flags" bitmask that can be used to indicate any
special behavior needed on a per-family basis.

Add a flag to indicate a system uses the new register offsets introduced
with Family 19h Model 10h.

Use this flag to account for register offset changes, a new bitfield
indicating DDR5 use on a memory controller, and to set the proper number
of chip select masks.

Rework f17_addr_mask_to_cs_size() to properly handle the change in chip
select masks. And update code comments to reflect the updated Chip
Select, DIMM, and Mask relationships.

[uninitialized variable warning]
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20220202144307.2678405-3-yazen.ghannam@amd.com
2022-02-23 22:01:33 +01:00
Yazen Ghannam
75aeaaf23d EDAC/amd64: Set memory type per DIMM
Current AMD systems allow mixing of DIMM types within a system. However,
DIMMs within a channel, i.e. managed by a single Unified Memory
Controller (UMC), must be of the same type.

Handle this possible configuration by checking and setting the memory
type for each individual "UMC" structure.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: William Roche <william.roche@oracle.com>
Link: https://lore.kernel.org/r/20220202144307.2678405-2-yazen.ghannam@amd.com
2022-02-23 21:53:45 +01:00
Eliav Farber
f8efca92ae EDAC: Fix calculation of returned address and next offset in edac_align_ptr()
Do alignment logic properly and use the "ptr" local variable for
calculating the remainder of the alignment.

This became an issue because struct edac_mc_layer has a size that is not
zero modulo eight, and the next offset that was prepared for the private
data was unaligned, causing an alignment exception.

The patch in Fixes: which broke this actually wanted to "what we
actually care about is the alignment of the actual pointer that's about
to be returned." But it didn't check that alignment.

Use the correct variable "ptr" for that.

  [ bp: Massage commit message. ]

Fixes: 8447c4d15e ("edac: Do alignment logic properly in edac_align_ptr()")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220113100622.12783-2-farbere@amazon.com
2022-02-15 15:54:46 +01:00
Sergey Shtylyov
dfd0dfb9a7 EDAC/xgene: Fix deferred probing
The driver overrides error codes returned by platform_get_irq_optional()
to -EINVAL for some strange reason, so if it returns -EPROBE_DEFER, the
driver will fail the probe permanently instead of the deferred probing.
Switch to propagating the proper error codes to platform driver code
upwards.

  [ bp: Massage commit message. ]

Fixes: 0d4429301c ("EDAC: Add APM X-Gene SoC EDAC driver")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220124185503.6720-3-s.shtylyov@omp.ru
2022-01-30 01:06:35 +01:00
Sergey Shtylyov
279eb8575f EDAC/altera: Fix deferred probing
The driver overrides the error codes returned by platform_get_irq() to
-ENODEV for some strange reason, so if it returns -EPROBE_DEFER, the
driver will fail the probe permanently instead of the deferred probing.
Switch to propagating the proper error codes to platform driver code
upwards.

  [ bp: Massage commit message. ]

Fixes: 71bcada88b ("edac: altera: Add Altera SDRAM EDAC support")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220124185503.6720-2-s.shtylyov@omp.ru
2022-01-28 21:38:46 +01:00
Eliav Farber
b0596da1a0 EDAC/mc: Remove unnecessary cast to char * in edac_align_ptr()
Remove the forgotten (char *) casts as that function returns void *.

  [ bp: Rewrite commit message. ]

Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220113100622.12783-3-farbere@amazon.com
2022-01-26 19:42:09 +01:00
Greg Kroah-Hartman
625c6b5569 EDAC: Use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field. Move the edac sysfs code to use default_groups field which has
been the preferred way since

  aa30f47cf6 ("kobject: Add support for default attribute groups to kobj_type")

so that the obsolete default_attrs field can be removed soon.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220104112401.1067148-2-gregkh@linuxfoundation.org
2022-01-23 20:01:29 +01:00
Greg Kroah-Hartman
11413893a0 EDAC: Use proper list of struct attribute for attributes
The EDAC sysfs code is doing some crazy casting of the list of
attributes that is not necessary at all. Instead, properly point to the
correct attribute structure in the lists, which removes the need to
cast anything and the code is now properly typesafe (as much as sysfs
attribute logic is typesafe...)

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220104112401.1067148-1-gregkh@linuxfoundation.org
2022-01-23 20:01:25 +01:00
Linus Torvalds
ff8be96420 - Add support for version 3 of the Synopsys DDR controller to synopsys_edac
- Add support for DRR5 and new models 0x10-0x1f and 0x50-0x5f of AMD
   family 0x19 CPUs to amd64_edac
 
 - The usual set of fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHb/PwACgkQEsHwGGHe
 VUqhbhAAo0mRNnBF3CJn1zlXRgmqrvV1IPnJQNp+z5iaXY1vr0qRMgO4OcgsJrxF
 nxrx/fdAYlQQO4vz1iq4t4j+eazOQyM/JZ0DKi4e+Dw2mC0axdCx8a0pyl1g2de6
 oQ5GkplRKUFn+3bTJpHIE5QnCOD7S85Mrp1F3Soa6jD9i+HwQIqAoltNMcCP7Yei
 ibhWUBX2H/oYcHARecIkP/YEyzSEHhcX6LRjNILW5haZQ6GziQUFzKUUwpUS3hsz
 9i6hXnHXEPhOq8JyoyWWhvVDywFK9z8lh57G7DFfZIhAk1FjuLDP2iI270D/LkYF
 shq6+M8ST9yqwOMV3Iaoa8VZFf/fjTyV0E0L2p2+faxaJ66rqdzbagLIZQv6hDNe
 N1/LD72/Io4et1kEbbaHm5jpxzSJ0jQwu1o+rY1/NmKsWhzE6V4X0GnDTZzZwP9b
 CbFJAWdCD+fi3WQjzv8HLVepjIsV+R3VVTOLq2oodn/mtoK0DRU/vTeCCwRS9ntF
 IyF55L/jSqy9CtP119KBnItGo4b84UrJDozXizGtc6Zt3chz7ljSpa2gJrKF+fCR
 Yhyr9Pt+vYQBpnIMDu1BPcoE58pwZYKoOSO00COUHHsLn+u8qhetGmQzYwWHCq7J
 hz8HHZnlTdFseZTp5tavk3B4md5HiPu+A7GevK3YH3lBv/vEmDM=
 =85ZE
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Add support for version 3 of the Synopsys DDR controller to
   synopsys_edac

 - Add support for DRR5 and new models 0x10-0x1f and 0x50-0x5f of AMD
   family 0x19 CPUs to amd64_edac

 - The usual set of fixes and cleanups

* tag 'edac_updates_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/amd64: Add support for family 19h, models 50h-5fh
  EDAC/sb_edac: Remove redundant initialization of variable rc
  RAS/CEC: Remove a repeated 'an' in a comment
  EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
  EDAC: Add RDDR5 and LRDDR5 memory types
  EDAC/sifive: Fix non-kernel-doc comment
  dt-bindings: memory: Add entry for version 3.80a
  EDAC/synopsys: Enable the driver on Intel's N5X platform
  EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR
  EDAC/synopsys: Use the quirk for version instead of ddr version
2022-01-10 11:45:23 -08:00
Linus Torvalds
7e740ae635 - First part of a series to move the AMD address translation code from
arch/x86/ to amd64_edac as that is its only user anyway
 
 - Some MCE error injection improvements to the AMD side
 
 - Reorganization of the #MC handler code and the facilities it calls to
 make it noinstr-safe
 
 - Add support for new AMD MCA bank types and non-uniform banks layout
 
 - The usual set of cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcGZ4ACgkQEsHwGGHe
 VUr6Zw//WBvNvfV/akQGsvVo94G0DaF+buYB+Tl1p0goMd7QfKA5iHxjB1alEJC2
 dTchIr7pjiiE3nr4svuWLLQZamx8kMwQNqipioBHXg3YThj0wD4PbUOC9TlIceBR
 3yxVbvwlD7Y7sb2PII6IMlagzTiIeW0/ps29DHFr5vqDBvEanNdAHoV/h2vQi+76
 Ma96psIxzTMSk11yGB6l9k66EASCdDGBU7sODjup7wuQmuRaQ/1oJAWY0wIJvJez
 frjpaz/YKmlTwTf9bxoJbky2FkeBsD4yXXUGwjDgMq0EyUUaeSbvaQkm8gSHX9Yr
 VDDv1WvT6QIw6x7Wc4skS8lWmZghNBbAHOoNS31BPJ2IDmFWkF5Q2bNEuHrtU4EC
 0mkNeyN6x48L/F8j/1aE/tm+SjiGexZX4zhi6MNWReTV140I1zqQq/r7CCu5+MEa
 PAB1YH/96k2dMPT6mbFrRIFJmkDuBuZOAkuwYWEjO/XjPl2SGBGj1jKolWW3qjRR
 Po7vBJnDt7wgigWFh6+R4rJv+fh87XfB7B2wEOt4Yn37jUkK6dNRIy0zFmDaC1J2
 bHgsJbWC+Sgs1G57gnYABJYzLj7RRdDyCu1/UUVyBBP7/WfZJw0kjABE7p3AaYTd
 15JV1L0c/Ypuv05LJf40LkyF2F5w2fnP5QM2Rr8U4xW/GumEyWs=
 =8Hu7
 -----END PGP SIGNATURE-----

Merge tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:
 "A relatively big amount of movements in RAS-land this time around:

   - First part of a series to move the AMD address translation code
     from arch/x86/ to amd64_edac as that is its only user anyway

   - Some MCE error injection improvements to the AMD side

   - Reorganization of the #MC handler code and the facilities it calls
     to make it noinstr-safe

   - Add support for new AMD MCA bank types and non-uniform banks layout

   - The usual set of cleanups and fixes"

* tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/mce: Reduce number of machine checks taken during recovery
  x86/mce/inject: Avoid out-of-bounds write when setting flags
  x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
  x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
  x86/mce: Check regs before accessing it
  x86/mce: Mark mce_start() noinstr
  x86/mce: Mark mce_timed_out() noinstr
  x86/mce: Move the tainting outside of the noinstr region
  x86/mce: Mark mce_read_aux() noinstr
  x86/mce: Mark mce_end() noinstr
  x86/mce: Mark mce_panic() noinstr
  x86/mce: Prevent severity computation from being instrumented
  x86/mce: Allow instrumentation during task work queueing
  x86/mce: Remove noinstr annotation from mce_setup()
  x86/mce: Use mce_rdmsrl() in severity checking code
  x86/mce: Remove function-local cpus variables
  x86/mce: Do not use memset to clear the banks bitmaps
  x86/mce/inject: Set the valid bit in MCA_STATUS before error injection
  x86/mce/inject: Check if a bank is populated before injecting
  x86/mce: Get rid of cpu_missing
  ...
2022-01-10 11:43:09 -08:00
Borislav Petkov
da0119a912 Merge branches 'edac-misc' and 'edac-amd64' into edac-updates-for-v5.17
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-01-10 10:07:00 +01:00
Qiuxu Zhuo
c370baa328 EDAC/i10nm: Release mdev/mbase when failing to detect HBM
On systems without HBM (High Bandwidth Memory) mdev/mbase are not
released/unmapped.

Add the code to release mdev/mbase when failing to detect HBM.

[Tony: re-word commit message]

Cc: <stable@vger.kernel.org>
Fixes: c945088384 ("EDAC/i10nm: Add support for high bandwidth memory")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20211224091126.1246-1-qiuxu.zhuo@intel.com
2022-01-04 09:08:00 -08:00
Marc Bevand
0b8bf9cb14 EDAC/amd64: Add support for family 19h, models 50h-5fh
Add the new family 19h models 50h-5fh PCI IDs (device 18h functions 0
and 6) to support Ryzen 5000 APUs ("Cezanne").

Signed-off-by: Marc Bevand <m@zorinaq.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20211221233112.556927-1-m@zorinaq.com
2021-12-24 10:53:41 +01:00
Yazen Ghannam
91f75eb481 x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank
number "i" is either the same across all CPUs or is Reserved/Read-as-Zero.

For example:

  Bank # | CPUx | CPUy
    0      LS     LS
    1      RAZ    UMC
    2      CS     CS
    3      SMU    RAZ

Future AMD systems will lay out MCA bank types such that the type of
bank number "i" may be different across CPUs.

For example:

  Bank # | CPUx | CPUy
    0      LS     LS
    1      RAZ    UMC
    2      CS     NBIO
    3      SMU    RAZ

Change the structures that cache MCA bank types to be per-CPU and update
smca_get_bank_type() to handle this change.

Move some SMCA-specific structures to amd.c from mce.h, since they no
longer need to be global.

Break out the "count" for bank types from struct smca_hwid, since this
should provide a per-CPU count rather than a system-wide count.

Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The
values in this array should not change at runtime.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com
2021-12-22 17:22:09 +01:00
Yazen Ghannam
5176a93ab2 x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add HWID and McaType values for new SMCA bank types, and add their error
descriptions to edac_mce_amd.

The "PHY" bank types all have the same error descriptions, and the NBIF
and SHUB bank types have the same error descriptions. So reuse the same
arrays where appropriate.

  [ bp: Remove useless comments over hwid types. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211216162905.4132657-2-yazen.ghannam@amd.com
2021-12-22 17:19:18 +01:00
Colin Ian King
567617baac EDAC/sb_edac: Remove redundant initialization of variable rc
The variable rc is being initialized with a value that is never read, it
is being updated later on. The assignment is redundant and thus remove
it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211126221848.1125321-1-colin.i.king@gmail.com
2021-12-21 12:02:11 +01:00
Yazen Ghannam
e2be5955a8 EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
Add a new family type for AMD Family 19h Models 10h to 1Fh. Use this new
family type for Models A0h to AFh also.

Increase the maximum number of controllers from 8 to 12.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208174356.1997855-3-yazen.ghannam@amd.com
2021-12-10 12:54:33 +01:00
Yazen Ghannam
f957112423 EDAC: Add RDDR5 and LRDDR5 memory types
Include Registered-DDR5 and Load-Reduced DDR5 in the list of memory
types.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211208174356.1997855-2-yazen.ghannam@amd.com
2021-12-10 12:51:28 +01:00
Randy Dunlap
ad2c302bc6 EDAC/sifive: Fix non-kernel-doc comment
scripts/kernel-doc complains about a comment that begins with "/**"
but is not in kernel-doc format, so correct it.

Prevents this warning:

  drivers/edac/sifive_edac.c:23: warning: This comment starts with '/**', \
  but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
    * EDAC error callback

Fixes: 91abaeaaff ("EDAC/sifive: Add EDAC platform driver for SiFive SoCs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211201030913.10283-1-rdunlap@infradead.org
2021-12-05 19:54:46 +01:00
Dinh Nguyen
f6bc0d8bc2 EDAC/synopsys: Enable the driver on Intel's N5X platform
Intel's N5X platform is also using the Synopsys EDAC controller.

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-3-dinguyen@kernel.org
2021-11-20 19:41:48 +01:00
Dinh Nguyen
f7824ded41 EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR
Add support for version 3.80a of the Synopsys DDR controller. This
version of the controller has the following differences:

- UE/CE are auto cleared
- Interrupts are supported by default

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-2-dinguyen@kernel.org
2021-11-20 19:23:49 +01:00
Dinh Nguyen
bd1d6da17c EDAC/synopsys: Use the quirk for version instead of ddr version
Version 2.40a supports DDR_ECC_INTR_SUPPORT for a quirk, so use that
quirk to determine a call to setup_address_map().

Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-1-dinguyen@kernel.org
2021-11-20 18:13:06 +01:00
Yazen Ghannam
70aeb807cf EDAC/amd64: Add context struct
Define an address translation context struct. This will hold values that
will be passed between multiple functions.

Save return address, Node ID, and the Instance ID number to start.
Currently, the UMC number is used as the Instance ID, but future DF
versions may use another value.

Also include a "tmp" field to use when reading registers. This is to
avoid having to define a temporary variable in multiple functions.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-5-yazen.ghannam@amd.com
2021-11-15 12:54:16 +01:00
Yazen Ghannam
448c3d6085 EDAC/amd64: Allow for DF Indirect Broadcast reads
The DF Indirect Access method allows for "Broadcast" accesses in which
case no specific instance is targeted. Add support using a reserved
instance ID of 0xFF to indicate a broadcast access. Set the FICAA
register appropriately.

Define helpers functions for instance and broadcast reads and use them
where appropriate.

Drop the "amd_" prefix since these functions are all static.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-4-yazen.ghannam@amd.com
2021-11-15 12:48:55 +01:00
Yazen Ghannam
b3218ae477 x86/amd_nb, EDAC/amd64: Move DF Indirect Read to AMD64 EDAC
df_indirect_read() is used only for address translation. Move it to EDAC
along with the translation code.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-3-yazen.ghannam@amd.com
2021-11-15 12:44:47 +01:00
Yazen Ghannam
0b746e8c1e x86/MCE/AMD, EDAC/amd64: Move address translation to AMD64 EDAC
The address translation code used for current AMD systems is
non-architectural. So move it to EDAC.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211028175728.121452-2-yazen.ghannam@amd.com
2021-11-15 12:36:32 +01:00
Linus Torvalds
fe354159ca - amd64_edac: Add support for three-rank interleaving mode which is
present on AMD zen2 servers
 
 - The usual fixes and cleanups all over EDAC land
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/rQAACgkQEsHwGGHe
 VUqTsg//QSI/akiJnnZ5Ea3CMsFAWH/LIAZzjWa3zcit6OcxrjRMsym1JGs4llUY
 7Uis0lGTUH2je3kFrYdPVGXyoWPNdsQCQi/89Anabl5Jrf2u7N2MmUIKcCnzQWjS
 PHb4BSSWW2zR8wU8UXOo2I1QKltenKlaO97uE9Te9YZCYZ6PzS0z2qevd0SCPUtF
 tPQ+AqYNKE0JNNAqlPietSHs7MisyCuxTHqLiwC0QNuKJ3GhCtVhmYGKqaLOMcfr
 oKdBFaAZOhmtUFT0z8cLka8sKO5WDfqli9Qp6Zphv9E6iz6gJq7NCBaKRz8gOkPi
 boOjmSOhCi/xJq63DUtfKOogVRM+qhms1Kxni3EgJxyGR/oLwnVX4n0AopKBstb6
 AsHPUB8AxrtnEtoBPFM/bO18Eto3HkCZD2yCzztOc5NmtUnVlIh2/qN4oYjBG7pc
 20iXwZ49OAYVQwrLdoNvyXQ5BdKGC5tyMgvPWVyOO8Iad2YYfM5iUWDGSycEWSo3
 /xZgVljbFRBwjxaTqgoUDxOlBEgQsDg4ENTathCpxuR+meac/kqGZDSLc0zc7hV1
 zV2kHB7y6gWGAQiuS3Lq+j0BTNtgbUi43R61J00ecQEIOomsBaWLi6uewNJmvOZ9
 Ef7c0t/ree/PMTaXVer0IA8lU3vcHBMTrbNtQsPb4TCRsZOgFhI=
 =O0zA
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:
 "A small pile of EDAC updates which the autumn wind blew my way. :)

   - amd64_edac: Add support for three-rank interleaving mode which is
     present on AMD zen2 servers

   - The usual fixes and cleanups all over EDAC land"

* tag 'edac_updates_for_v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
  EDAC/ti: Remove redundant error messages
  EDAC/amd64: Handle three rank interleaving mode
  EDAC/mc_sysfs: Print MC-scope sysfs counters unsigned
  EDAC/al_mc: Make use of the helper function devm_add_action_or_reset()
  EDAC/mc: Replace strcpy(), sprintf() and snprintf() with strscpy() or scnprintf()
2021-11-01 15:02:49 -07:00
Hans Potsch
d9b7748ffc EDAC/armada-xp: Fix output of uncorrectable error counter
The number of correctable errors is displayed as uncorrectable
errors because the "SBE" error count is passed to both calls of
edac_mc_handle_error().

Pass the correct uncorrectable error count to the second
edac_mc_handle_error() call when logging uncorrectable errors.

 [ bp: Massage commit message. ]

Fixes: 7f6998a412 ("ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC")
Signed-off-by: Hans Potsch <hans.potsch@nokia.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211006121332.58788-1-hans.potsch@nokia.com
2021-10-14 11:46:03 +02:00
Eric Badger
537bddd069 EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
The computation of TOHM is off by one bit. This missed bit results in
too low a value for TOHM, which can cause errors in regular memory to
incorrectly report:

  EDAC MC0: 1 CE Error at MMIOH area, on addr 0x000000207fffa680 on any memory

Fixes: 50d1bb9367 ("sb_edac: add support for Haswell based systems")
Cc: stable@vger.kernel.org
Reported-by: Meeta Saggi <msaggi@purestorage.com>
Signed-off-by: Eric Badger <ebadger@purestorage.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20211010170127.848113-1-ebadger@purestorage.com
2021-10-11 08:28:46 -07:00
Tang Bin
0b6d4ab216 EDAC/ti: Remove redundant error messages
In the function ti_edac_probe(), devm_ioremap_resource() and
platform_get_irq() already issue error messages when they fail so remove
the redundant error messages in the EDAC driver.

Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210811112626.27848-1-tangbin@cmss.chinamobile.com
2021-10-07 19:16:01 +02:00
Yazen Ghannam
9f4873fb6a EDAC/amd64: Handle three rank interleaving mode
AMD Rome systems and later support interleaving between three identical
ranks within a channel.

Check for this mode by counting the number of enabled chip selects and
comparing their masks. If there are exactly three enabled chip selects
and their masks are identical, then three rank interleaving is enabled.

The size of a rank is determined from its mask value. However, three
rank interleaving doesn't follow the method of swapping an interleave
bit with the most significant bit. Rather, the interleave bit is flipped
and the most significant bit remains the same. There is only a single
interleave bit in this case.

Account for this when determining the chip select size by keeping the
most significant bit at its original value and ignoring any zero bits.
This will return a full bitmask in [MSB:1].

Fixes: e53a3b267f ("EDAC/amd64: Find Chip Select memory size using Address Mask")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211005154419.2060504-1-yazen.ghannam@amd.com
2021-10-07 12:30:53 +02:00
Eric Badger
34417f27b9 EDAC/mc_sysfs: Print MC-scope sysfs counters unsigned
This is cosmetically nicer for counts > INT32_MAX, and aligns the
MC-scope format with that of the lower layer sysfs counter files.

Signed-off-by: Eric Badger <ebadger@purestorage.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20211003181653.GA685515@ebadger-ThinkPad-T590
2021-10-07 12:06:20 +02:00
Cai Huoqing
470b52564c EDAC/al_mc: Make use of the helper function devm_add_action_or_reset()
The helper function devm_add_action_or_reset() will internally call
devm_add_action(), and if devm_add_action() fails then it will
execute the action mentioned and return the error code. So use
devm_add_action_or_reset() instead of devm_add_action() to simplify the
error handling, reduce the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Talel Shenhar <talel@amazon.com>
Link: https://lkml.kernel.org/r/20210922125924.321-1-caihuoqing@baidu.com
2021-09-28 18:35:11 +02:00
Borislav Petkov
54607282fa EDAC/dmc520: Assign the proper type to dimm->edac_mode
dimm->edac_mode contains values of type enum edac_type - not the
corresponding capability flags. Fix that.

Fixes: 1088750d78 ("EDAC: Add EDAC driver for DMC520")
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20210916085258.7544-1-bp@alien8.de
2021-09-16 11:00:12 +02:00
Sai Krishna Potthuri
5297cfa6bd EDAC/synopsys: Fix wrong value type assignment for edac_mode
dimm->edac_mode contains values of type enum edac_type - not the
corresponding capability flags. Fix that.

Issue caught by Coverity check "enumerated type mixed with another
type."

 [ bp: Rewrite commit message, add tags. ]

Fixes: ae9b56e399 ("EDAC, synps: Add EDAC support for zynq ddr ecc controller")
Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com>
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20210818072315.15149-1-shubhrajyoti.datta@xilinx.com
2021-09-16 10:21:35 +02:00
Len Baker
fca6116564 EDAC/mc: Replace strcpy(), sprintf() and snprintf() with strscpy() or scnprintf()
strcpy() performs no bounds checking on the destination buffer. This
could result in linear overflows beyond the end of the buffer, leading
to all kinds of misbehavior. The safe replacement is strscpy().
[1][2]

However, to simplify and clarify the code, to concatenate labels use the
scnprintf() function. This way it is not necessary to check the return
value of strscpy() (-E2BIG if the parameter count is 0 or the src was
truncated) since scnprintf() always returns the number of chars written
into the buffer. This function always returns a nul-terminated string
even if it needs to be truncated.

While at it, fix all other broken string generation code that wrongly
interprets snprintf()'s return code or just uses sprintf(), implement
that using scnprintf() here too. Drop breaks in loops around
scnprintf() as it is safe now to loop.

Moreover, the check is not needed: for the case when the buffer is
exhausted, len never gets zero because scnprintf() takes the full buffer
length as input parameter, but excludes the trailing '\0' in its return
code and thus, 1 is the minimum len.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
[2] https://github.com/KSPP/linux/issues/88

 [ rric: Replace snprintf() with scnprintf(), rework sprintf() user,
   drop breaks in loops around scnprintf(), introduce 'end' pointer to
   reduce pointer arithmetic, use prefix pattern for e->location,
   adjust subject and description ]

Co-developed-by: Joe Perches <joe@perches.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Len Baker <len.baker@gmx.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210903150539.7282-1-len.baker@gmx.com
2021-09-15 13:52:58 +02:00
Linus Torvalds
7d6e3fa87e Updates to the interrupt core and driver subsystems:
Core changes:
 
    - The usual set of small fixes and improvements all over the place, but nothing
      outstanding
 
 MSI changes:
 
    - Further consolidation of the PCI/MSI interrupt chip code
 
    - Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts
      of platform devices in the same way as PCI exposes them.
 
 Driver changes:
 
    - Support for ARM GICv3 EPPI partitions
 
    - Treewide conversion to generic_handle_domain_irq() for all chained
      interrupt controllers
 
    - Conversion to bitmap_zalloc() throughout the irq chip drivers
 
    - The usual set of small fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmEsnpsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoS+/EACQdpRkzl3IDIYqThxVZ8KQzp2rKKVn
 qisAQiWg/6koNJx/yYy62KNAUyKjCIObNtRnWi7OAOx6OvNtQTD2WOLAwkh3Pgw1
 8ePYYl55k+yCs8VoITsZM9jYeO+Tk878pU2A6R943zR+g6G7bskGJrxEyZ9TbzIe
 qKfusNKnRY9/jMQaRALUAAtA9VIVR867GqORX5X8hKz8yE2rqlpb4y+1CFba5BTV
 Vlxw7cIXvXBn7BKAom5diRqEGDNJEbX+56jJ7yDZshgLo7m11D7QLw72kmb6TNVC
 g7PchvFi4afpc1ifEAAp0tk4RiSIAQ91nS3n0+jLcLbodOjIkl14eY02ZCJGAP29
 uslyzUbmy1wgejG6CA63JtZ4MYdrf/OSMGuoN78qnOKYcIsWFzOvlJmBWWNW34qW
 LCaUF9QdJ/slXu6B4vIx30GfN9q4myml8bFUobE5q9mBRrEk4R0B7iyBvPu1xKYr
 ZEan67prI5VEu+afJGpp4r294m4HNVkMLfl3nYmE5+y4MoLeMNKDY3IPTvI9iP4G
 kaFgoPvQo23WnuclNYpJ+CaA4aRASlB2nTY+oAXIYfehbey9EW5vq4/EK864ek6w
 oyUTepxxNhE81tG2jpQbf2tR4COsEHy986clxqPP4AvsZXcbypCw8O2FcflpQbHO
 5DLEAfTmp7cziQ==
 =qyll
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates to the interrupt core and driver subsystems:

  Core changes:

   - The usual set of small fixes and improvements all over the place,
     but nothing stands out

  MSI changes:

   - Further consolidation of the PCI/MSI interrupt chip code

   - Make MSI sysfs code independent of PCI/MSI and expose the MSI
     interrupts of platform devices in the same way as PCI exposes them.

  Driver changes:

   - Support for ARM GICv3 EPPI partitions

   - Treewide conversion to generic_handle_domain_irq() for all chained
     interrupt controllers

   - Conversion to bitmap_zalloc() throughout the irq chip drivers

   - The usual set of small fixes and improvements"

* tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  platform-msi: Add ABI to show msi_irqs of platform devices
  genirq/msi: Move MSI sysfs handling from PCI to MSI core
  genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
  irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
  irqdomain: Export irq_domain_disconnect_hierarchy()
  irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
  irqchip/apple-aic: Fix irq_disable from within irq handlers
  pinctrl/rockchip: drop the gpio related codes
  gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
  gpio/rockchip: support next version gpio controller
  gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
  gpio/rockchip: add driver for rockchip gpio
  dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
  pinctrl/rockchip: add pinctrl device to gpio bank struct
  pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
  pinctrl/rockchip: always enable clock for gpio controller
  genirq: Fix kernel doc indentation
  EDAC/altera: Convert to generic_handle_domain_irq()
  powerpc: Bulk conversion to generic_handle_domain_irq()
  nios2: Bulk conversion to generic_handle_domain_irq()
  ...
2021-08-30 14:38:37 -07:00
Linus Torvalds
05b5fdb2a8 - Add new HBM2 (High Bandwidth Memory Gen 2) type and add support for it
to the Intel SKx drivers
 
 - Print additional useful per-channel error information on i10nm, like on SKL
 
 - Check whether the AMD decoder is loaded on a guest and if so, don't
 
 - The usual round of fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmEso2QACgkQEsHwGGHe
 VUqZRRAAqraSdImfO4a7dV4lHErI43GDx9WZIKhrfNyvEv32pDc7E6dwUqJ+NHa1
 ON469MjmV4N3X5Deq+Ecaij6giieBKazfqbbZA9pSbNtX4+ta+TEDTCmO7IX38BC
 pESwHlWinJbpy8SfopXQh9wgx0pWOHfPysStHS0pmG5mH+o9mIDqsn/20Q0o/kIY
 Cn+PZ1mJ2f7iJXvjQv7XsGMM/HotLFkJ5RNUoVjZqteOOdTcHRWytuL/r7QY0SpX
 iWAcmkkW3gcW1qypvo/n53ghxvSydZpqWJhI25QBJ79CGOCVocYfitkIdIYAYOtt
 xH82GFuC5EtDBTsQXQa2gzbraTwnxPCr6BXKY819a7+xsw/WX2x7MctZc6vCH4R+
 UoDRQoUAqgNG2D89lgtQHspy3h3gSSWbGRysLdFCohJlDv3mCU4hk7yyvoCtatqR
 ki725nkJn0XT1HZAQc+o3p6E+PCmVYd2Km8TtNJ2ZS8z+97rjsZQAV5eeZKLnUC7
 lVX57lZ9MciSlrBNe1k9b1er1Ir3/rLs1X33K+yqE8IwwCX1hEw1yPNEf6jSXQwK
 qce95INCamyl1Z8y0uhFZfgl8aEN5ZbaL/lBYz1Gk5R7IQCSVd9Amq5fqxOUdPb/
 PcJk24Ng+h0oPKiBBVHrsDhFiDnZKAv7JGqEPFNmOi3dWZPrNrI=
 =JYRz
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:
 "The usual EDAC stuff which managed to trickle in for 5.15:

   - Add new HBM2 (High Bandwidth Memory Gen 2) type and add support for
     it to the Intel SKx drivers

   - Print additional useful per-channel error information on i10nm,
     like on SKL

   - Don't load the AMD EDAC decoder in virtual images

   - The usual round of fixes and cleanups"

* tag 'edac_updates_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/i10nm: Retrieve and print retry_rd_err_log registers
  EDAC/i10nm: Fix NVDIMM detection
  EDAC/skx_common: Set the memory type correctly for HBM memory
  EDAC/altera: Skip defining unused structures for specific configs
  EDAC/mce_amd: Do not load edac_mce_amd module on guests
  EDAC/mc: Add new HBM2 memory type
  EDAC/amd64: Use DEVICE_ATTR helper macros
2021-08-30 13:17:29 -07:00
Youquan Song
cf4e6d52f5 EDAC/i10nm: Retrieve and print retry_rd_err_log registers
Retrieve and print retry_rd_err_log registers like the earlier change:
commit e80634a75a ("EDAC, skx: Retrieve and print retry_rd_err_log registers")

This is a little trickier than on Skylake because of potential
interference with BIOS use of the same registers. The default
behavior is to ignore these registers.

A module parameter retry_rd_err_log(default=0) controls the mode of operation:
- 0=off  : Default.
- 1=bios : Linux doesn't reset any control bits, but just reports values.
           This is "no harm" mode, but it may miss reporting some data.
- 2=linux: Linux tries to take control and resets mode bits,
           clears valid/UC bits after reading. This should be
           more reliable (especially if BIOS interference is reduced
           by disabling eMCA reporting mode in BIOS setup).

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210818175701.1611513-3-tony.luck@intel.com
2021-08-23 10:35:36 -07:00
Qiuxu Zhuo
2294a7299f EDAC/i10nm: Fix NVDIMM detection
MCDDRCFG is a per-channel register and uses bit{0,1} to indicate
the NVDIMM presence on DIMM slot{0,1}. Current i10nm_edac driver
wrongly uses MCDDRCFG as per-DIMM register and fails to detect
the NVDIMM.

Fix it by reading MCDDRCFG as per-channel register and using its
bit{0,1} to check whether the NVDIMM is populated on DIMM slot{0,1}.

Fixes: d4dc89d069 ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Reported-by: Fan Du <fan.du@intel.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210818175701.1611513-2-tony.luck@intel.com
2021-08-23 10:35:36 -07:00
Qiuxu Zhuo
fd07a4a0d3 EDAC/skx_common: Set the memory type correctly for HBM memory
Set the memory type to MEM_HBM2 if it's managed by the HBM2
memory controller.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210720163009.GA1417532@agluck-desk2.amr.corp.intel.com
2021-08-23 10:32:32 -07:00
Krzysztof Kozlowski
7d07deb3b8 EDAC/altera: Skip defining unused structures for specific configs
The Altera EDAC driver has several features conditionally built
depending on Kconfig options. The edac_device_prv_data structures
are conditionally used in of_device_id tables. They reference other
functions and structures which can be defined as __maybe_unused.

Silence build warnings like:

  drivers/edac/altera_edac.c:643:37: warning:
      ‘altr_edac_device_inject_fops’ defined but not used [-Wunused-const-variable=]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lkml.kernel.org/r/20210601092704.203555-1-krzysztof.kozlowski@canonical.com
2021-08-16 20:21:46 +02:00
Marc Zyngier
eecb06813d EDAC/altera: Convert to generic_handle_domain_irq()
Replace generic_handle_irq(irq_linear_revmap()) with a single call to
generic_handle_domain_irq().

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-12 11:39:41 +01:00
Smita Koralahalli
767f4b620e EDAC/mce_amd: Do not load edac_mce_amd module on guests
Hypervisors likely do not expose the SMCA feature to the guest and
loading this module leads to false warnings. This module should not be
loaded in guests to begin with, but people tend to do so, especially
when testing kernels in VMs. And then they complain about those false
warnings.

Do the practical thing and do not load this module when running as a
guest to avoid all that complaining.

 [ bp: Rewrite commit message. ]

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lkml.kernel.org/r/20210628172740.245689-1-Smita.KoralahalliChannabasappa@amd.com
2021-08-09 12:35:43 +02:00
Naveen Krishna Chatradhi
e1ca90b7cc EDAC/mc: Add new HBM2 memory type
Add a new entry to 'enum mem_type' and a new string to 'edac_mem_types[]'
for HBM2 (High Bandwidth Memory Gen 2) new memory type.

Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Muralidhara M K <muralimk@amd.com>
Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210630152828.162659-4-nchatrad@amd.com
2021-07-20 09:20:49 -07:00
Randy Dunlap
a1c9ca5f65 EDAC/igen6: fix core dependency AGAIN
My previous patch had a typo/thinko which prevents this driver
from being enabled: change X64_64 to X86_64.

Fixes: 0a9ece9ba1 ("EDAC/igen6: fix core dependency")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac@vger.kernel.org
Cc: bowsingbetee <bowsingbetee@protonmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-15 11:59:59 -07:00
Dwaipayan Ray
d19faf0e49 EDAC/amd64: Use DEVICE_ATTR helper macros
Instead of "open coding" DEVICE_ATTR, use the corresponding
helper macros DEVICE_ATTR_{RW,RO,WO} in amd64_edac.c

Some function names needed to be changed to match the device
conventions <foo>_show and <foo>_store, but the functionality
itself is unchanged.

The devices using EDAC_DCT_ATTR_SHOW() are left unchanged.

Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210713065130.2151-1-dwaipayanray1@gmail.com
2021-07-13 09:59:57 -07:00