linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-02 10:11:36 +00:00

Author	SHA1	Message	Date
Linus Torvalds	128283a47e	Merge branch 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, MCE: Fix NB error formatting EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit EDAC, MCE: Enable MCE decoding on F15h EDAC, MCE: Allow F15h bank 6 MCE injection EDAC, MCE: Shorten error report formatting EDAC, MCE: Overhaul error fields extraction macros EDAC, MCE: Add F15h FP MCE decoder EDAC, MCE: Add F15 EX MCE decoder EDAC, MCE: Add an F15h NB MCE decoder EDAC, MCE: No F15h LS MCE decoder EDAC, MCE: Add F15h CU MCE decoder EDAC, MCE: Add F15h IC MCE decoder EDAC, MCE: Add F15h DC MCE decoder EDAC, MCE: Select extended error code mask	2011-01-07 14:54:03 -08:00
Borislav Petkov	6245288232	EDAC, MCE: Overhaul error fields extraction macros Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:21 +01:00
Borislav Petkov	a135cef79a	amd64_edac: Disable DRAM ECC injection on K8 K8 does not allow for an atomic RMW to a cacheline as F10h does so disable the error injection interface for it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:38:46 +01:00
Borislav Petkov	390944439f	EDAC: Fixup scrubrate manipulation Make the ->{get|set}_sdram_scrub_rate return the actual scrub rate bandwidth it succeeded setting and remove superfluous arg pointer used for that. A negative value returned still means that an error occurred while setting the scrubrate. Document this for future reference. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:38:31 +01:00
Borislav Petkov	360b7f3c60	amd64_edac: Remove two-stage initialization Now that all prerequisites are in place, drop the two-stage driver instances initialization in favor of the following simple init sequence: 1. Probe PCI device: we only test ECC capabilities here and if none exit early. 2. If the hw supports ECC and it is/can be enabled, we init the per-node instance. Remove "amd64_" prefix from static functions touched, while at it. There actually should be no visible functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:03 +01:00
Borislav Petkov	2299ef7114	amd64_edac: Check ECC capabilities initially Rework the code to check the hardware ECC capabilities at PCI probing time. We do all further initialization only if we actually can/have ECC enabled. While at it: 0. Fix function naming. 1. Simplify/clarify debug output. 2. Remove amd64_ prefix from the static functions 3. Reorganize code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:02 +01:00
Borislav Petkov	ae7bb7c679	amd64_edac: Carve out ECC-related hw settings This is in preparation for the init path reorganization where we want only to 1) test whether a particular node supports ECC 2) can it be enabled and only then do the necessary allocation/initialization. For that, we need to decouple the ECC settings of the node from the instance's descriptor. The should be no functional change introduced by this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:00 +01:00
Borislav Petkov	f1db274e1b	amd64_edac: Remove PCI ECS enabling functions PCI ECS is being enabled by default since 2.6.26 on AMD so this code is just superfluous now, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:59 +01:00
Borislav Petkov	cc4d8860fc	amd64_edac: Allocate driver instances dynamically Remove static allocation in favor of dynamically allocating space for as many driver instances as northbridges present on the system. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:57 +01:00
Borislav Petkov	24f9a7fe3f	amd64_edac: Rework printk macros Add a macro per printk level, shorten up error messages. Add relevant information to KERN_INFO level. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:56 +01:00
Borislav Petkov	8d5b5d9c7b	amd64_edac: Rename CPU PCI devices Rename variables representing PCI devices to their BKDG names for faster search and shorter, clearer code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:54 +01:00
Borislav Petkov	b8cfa02f83	amd64_edac: Concentrate per-family init even more Move the remaining per-family init code into the proper place and simplify the rest of the initialization. Reorganize error handling in amd64_init_one_instance(). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:53 +01:00
Borislav Petkov	bbd0c1f675	amd64_edac: Cleanup the CPU PCI device reservation Shorten code and clarify comments, return proper -E* values on error. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:52 +01:00
Borislav Petkov	0092b20d4c	amd64_edac: Simplify CPU family detection Concentrate CPU family detection in the per-family init function. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:51 +01:00
Borislav Petkov	395ae783b3	amd64_edac: Add per-family init function Run a per-family init function which does all the settings based on the family this driver instance is running on. Move the scrubrate calculation in it and simplify code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:50 +01:00
Borislav Petkov	9f56da0e3c	amd64_edac: Use cached extended CPU model ... instead of computing it needlessly again. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:49 +01:00
Borislav Petkov	3ab0e7dc2e	amd64_edac: Remove F11h support F11h doesn't support DRAM ECC so whack it away. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:47 +01:00
Linus Torvalds	42cbd8efb0	Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Cleanup L3 cache index disable support x86, amd-nb: Cleanup AMD northbridge caching code x86, amd-nb: Complete the rename of AMD NB and related code	2011-01-06 10:50:28 -08:00
Borislav Petkov	e726f3c368	amd64_edac: Fix interleaving check When matching error address to the range contained by one memory node, we're in valid range when node interleaving 1. is disabled, or 2. enabled and when the address bits we interleave on match the interleave selector on this node (see the "Node Interleaving" section in the BKDG for an enlightening example). Thus, when we early-exit, we need to reverse the compound logic statement properly. Cc: <stable@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-12-08 19:52:54 +01:00
Hans Rosenfeld	9653a5c76c	x86, amd-nb: Cleanup AMD northbridge caching code Support more than just the "Misc Control" part of the northbridges. Support more flags by turning "gart_supported" into a single bit flag that is stored in a flags member. Clean up related code by using a set of functions (amd_nb_num(), amd_nb_has_feature() and node_to_amd_nb()) instead of accessing the NB data structures directly. Reorder the initialization code and put the GART flush words caching in a separate function. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:05 +01:00
Hans Rosenfeld	eec1d4fa00	x86, amd-nb: Complete the rename of AMD NB and related code Not only the naming of the files was confusing, it was even more so for the function and variable names. Renamed the K8 NB and NUMA stuff that is also used on other AMD platforms. This also renames the CONFIG_K8_NUMA option to CONFIG_AMD_NUMA and the related file k8topology_64.c to amdtopology_64.c. No functional changes intended. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:04 +01:00
Linus Torvalds	c029e405bd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (21 commits) EDAC, MCE: Fix shift warning on 32-bit EDAC, MCE: Add a BIT_64() macro EDAC, MCE: Enable MCE decoding on F12h EDAC, MCE: Add F12h NB MCE decoder EDAC, MCE: Add F12h IC MCE decoder EDAC, MCE: Add F12h DC MCE decoder EDAC, MCE: Add support for F11h MCEs EDAC, MCE: Enable MCE decoding on F14h EDAC, MCE: Fix FR MCEs decoding EDAC, MCE: Complete NB MCE decoders EDAC, MCE: Warn about LS MCEs on F14h EDAC, MCE: Adjust IC decoders to F14h EDAC, MCE: Adjust DC decoders to F14h EDAC, MCE: Rename files EDAC, MCE: Rework MCE injection EDAC: Export edac sysfs class to users. EDAC, MCE: Pass complete MCE info to decoders EDAC, MCE: Sanitize error codes EDAC, MCE: Remove unused function parameter EDAC, MCE: Add HW_ERR prefix ...	2010-10-21 14:04:58 -07:00
Borislav Petkov	7cfd4a8744	EDAC, MCE: Pass complete MCE info to decoders ... instead of the MCi_STATUS info only for improved handling of certain types of errors later. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:58 +02:00
Andreas Herrmann	23ac4ae827	x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB The file names are somehow misleading as the code is not specific to AMD K8 CPUs anymore. The files accomodate code for other AMD CPU northbridges as well. Same is true for the config option which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-20 14:22:58 -07:00
Andreas Herrmann	900f9ac9f1	x86, k8-gart: Decouple handling of garts and northbridges So far we only provide num_k8_northbridges. This is required in different areas (e.g. L3 cache index disable, GART). But not all AMD CPUs provide a GART. Thus it is useful to split off the GART handling from the generic caching of AMD northbridge misc devices. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160254.GC4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-17 13:26:21 -07:00
Borislav Petkov	37b7370a8d	amd64_edac: Do not report error overflow as a separate error When the Overflow MCi_STATUS bit is set, EDAC reports the lost error with a "no information available" message which often puzzles users parsing the dmesg. This doesn't make much sense since this error has been lost anyway so no need for reporting it separately. Thus, report the overflow bit setting in the MCE dump instead. While at it, remove reporting of MiscV and ErrorEnable (en) which are superfluous. Now it looks like this: [ 1501.650024] MC4_STATUS: Corrected error, other errors lost: yes, CPU context corrupt: no, CECC Error [ 1501.666887] Northbridge Error, node 2 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-26 12:46:03 +02:00
Borislav Petkov	c4799c7570	amd64_edac: Minor formatting fix EDAC MC3: CE page 0xc32281, offset 0x8a0, grain 0, syndrome 0x1, row 2, channel 1, label "": amd64_edac EDAC MC3: CE - no information available: amd64_edacError Overflow Add the missing space before "Error Overflow" on the second line. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-04 11:16:01 +02:00
Borislav Petkov	962b70a1eb	amd64_edac: Fix operator precendence error The bitwise AND is of higher precedence, make that explicit. Cc: <stable@kernel.org> # 34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-04 11:15:09 +02:00
Borislav Petkov	eba042a81e	edac, mc: Improve scrub rate handling Fortify the interface to not accept negative values, remove memctrl_int_store() as a result. Also, sanitize bandwidth setting by making the argument a simple u32 instead of strange u32 pointer being passed around for no obvious reason. Then, fix error handling and teach it to return proper error values. Finally, make code more readable, simplify debug messages. Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Arthur Jones <ajones@riverbed.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:06 +02:00
Borislav Petkov	bc57117856	amd64_edac: Correct scrub rate setting Exit early when setting scrub rate on unknown/unsupported families. Cc: <stable@kernel.org> # 32.x 33.x 34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:05 +02:00
Borislav Petkov	9975a5f22a	amd64_edac: Fix DCT base address selector The correct check is to verify whether in high range we're below 4GB and not to extract the DctSelBaseAddr again. See "2.8.5 Routing DRAM Requests" in the F10h BKDG. Cc: <stable@kernel.org> # .32.x .33.x .34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:04 +02:00
Borislav Petkov	f4347553b3	amd64_edac: Remove polling mechanism Switch to reusing the mcheck core's machine check polling mechanism instead of duplicating functionality by using the EDAC polling routine. Correct formatting while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:03 +02:00
Borislav Petkov	ad6a32e969	amd64_edac: Sanitize syndrome extraction Remove the two syndrome extraction macros and add a single function which does the same thing but with proper typechecking. While at it, make sure to cache ECC syndrome size and dump it in debug output. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-03 16:13:31 +02:00
Borislav Petkov	41c310447f	amd64_edac: Fix syndrome calculation on K8 When calculating the DCT channel from the syndrome we need to know the syndrome type (x4 vs x8). On F10h, this is read out from extended PCI cfg space register F3x180 while on K8 we only support x4 syndromes and don't have extended PCI config space anyway. Make the code accessing F3x180 F10h only and fall back to x4 syndromes on everything else. Cc: <stable@kernel.org> # .33.x .34.x Reported-by: Jeffrey Merkey <jeffmerkey@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-07-02 17:32:34 +02:00
Linus Torvalds	eaa5eec739	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Simplify ECC override handling	2010-03-03 09:25:37 -08:00
Linus Torvalds	0a135ba14d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.	2010-03-03 07:34:18 -08:00
Borislav Petkov	d95cf4de6a	amd64_edac: Simplify ECC override handling No need for clearing ecc_enable_override and checking it in two places. Instead, simply check it during probing and act accordingly. Also, rename the flag bitfields according to the functionality they actually represent. What is more, make sure original BIOS ECC settings are restored when the module is unloaded. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-03-01 19:25:12 +01:00
Tejun Heo	a29d8b8e2d	percpu: add __percpu sparse annotations to what's left Add __percpu sparse annotations to places which didn't make it in one of the previous patches. All converions are trivial. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Neil Brown <neilb@suse.de>	2010-02-17 11:17:38 +09:00
Borislav Petkov	cab4d27764	amd64_edac: Do not falsely trigger kerneloops An unfortunate "WARNING" in the message amd64_edac dumps when the system doesn't support DRAM ECC or ECC checking is not enabled in the BIOS used to trigger kerneloops which qualified the message as an OOPS thus misleading the users. See, e.g. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/422536 http://bugzilla.kernel.org/show_bug.cgi?id=15238 Downgrade the message level to KERN_NOTICE and fix the formulation. Cc: stable@kernel.org # .32.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-02-11 20:32:14 +01:00
Roel Kluin	926311fd7d	amd64_edac: Ensure index stays within bounds in amd64_get_scrub_rate Add a missing iterator variable thus fixing the conditional of the for-loop in amd64_get_scrub_rate(). Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-01-15 10:45:58 +01:00
Borislav Petkov	92389102b6	amd64_edac: restrict PCI config space access Do not access F2x19[0,4] on K8 since they're undefined there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:08 +01:00
Borislav Petkov	43f5e68733	amd64_edac: fix forcing module load/unload Clear the override flag after force-loading the module. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:08 +01:00
Borislav Petkov	56b34b91e2	amd64_edac: make driver loading more robust Currently, the module does not initialize fully when the DIMMs aren't ECC but remains still loaded. Propagate the error when no instance of the driver is properly initialized and prevent further loading. Reorganize and polish error handling in amd64_edac_init() while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	8f68ed9728	amd64_edac: fix driver instance freeing Fix use-after-free errors by pushing all memory-freeing calls to the end of amd64_remove_one_instance(). Reported-by: Darren Jenkins <darrenrjenkins@gmail.com> LKML-Reference: <1261370306.11354.52.camel@ICE-BOX> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	603adaf6b3	amd64_edac: fix K8 chip select reporting Fix the case when amd64_debug_display_dimm_sizes() reports only half the amount of DRAM on it because it doesn't account for when the single DCT operates in 128-bit mode and merges chip selects from different DIMMs. Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> LKML-Reference: <200912112202.48173.johannes.hirte@fem.tu-ilmenau.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	505422517d	x86, msr: Add support for non-contiguous cpumasks The current rd/wrmsr_on_cpus helpers assume that the supplied cpumasks are contiguous. However, there are machines out there like some K8 multinode Opterons which have a non-contiguous core enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see http://www.gossamer-threads.com/lists/linux/kernel/1160268. This patch fixes out-of-bounds writes (see URL above) by adding per-CPU msr structs which are used on the respective cores. Additionally, two helpers, msrs_{alloc,free}, are provided for use by the callers of the MSR accessors. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091211171440.GD31998@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-12-11 10:59:21 -08:00
Andrew Morton	18ba54ac12	amd64_edac: fix use-uninitialised bug drivers/edac/amd64_edac.c: In function 'amd64_edac_init': drivers/edac/amd64_edac.c:2840: warning: 'ret' may be used uninitialized in this function Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:13 +01:00
Borislav Petkov	bdc30a0c8c	amd64_edac: correct sys address to chip select mapping The routine does the reverse mapping of the error address of a CECC back to the node id, DRAM controller and chip select of the DIMM which caused the error. We should lookup the channel using the syndromes _only_ when the DCTs are ganged so fix that. Also, add an early exit when there's an error while scanning for the csrow thus decreasing indentation levels for better readability. Finally, fixup comments. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:12 +01:00
Borislav Petkov	bfc04aec7d	amd64_edac: add a leaner syndrome decoding algorithm Instead of using the whole syndrome tables for channel decoding, use a set of eigenvectors with which the tables can be generated to search for the syndrome in error. The algorithm operates independently of symbol size and can be used for both x4 and x8 syndromes. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:37:59 +01:00
Borislav Petkov	986a42a250	amd64_edac: remove early hw support check The .probe_valid_hardware low_ops member checked whether the DCTs are in DDR3 mode and bailed out if so. Now that all the needed changes for DDR3 support is in place, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00

108 Commits