Commit Graph

497 Commits

Author SHA1 Message Date
Borislav Petkov
6b4c0bdeb0 amd64_edac: detect DDR3 memory type
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:31 +01:00
Borislav Petkov
239642fe19 edac: add memory types strings for debugging
Instead of using deeply-nested conditionals for dumping the DIMM type in
debug mode, add a strings array of the supported DIMM types.

This is useful in cases where an edac driver supports multiple DRAM
types and is only defined in debug builds.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:31 +01:00
Borislav Petkov
cec7924f56 edac, mce: update AMD F10h revD check
F10h revD start with model number 8.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:30 +01:00
Borislav Petkov
1f6bcee75e amd64_edac: remove unneeded extract_error_address wrapper
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:30 +01:00
Borislav Petkov
44e9e2ee21 amd64_edac: rename StinkyIdentifier
SystemAddress -> sys_addr

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:30 +01:00
Borislav Petkov
ad858bfa14 amd64_edac: remove superfluous dbg printk
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:29 +01:00
Borislav Petkov
1433eb9903 amd64_edac: enhance address to DRAM bank mapping
Add cs mode to cs size mapping tables for DDR2 and DDR3 and F10
and all K8 flavors and remove klugdy table of pseudo values. Add a
low_ops->dbam_to_cs member which is family-specific and replaces
low_ops->dbam_map_to_pages since the pages calculation is a one liner
now.

Further cleanups, while at it:

- shorten family name defines
- align amd64_family_types struct members

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:29 +01:00
Borislav Petkov
d16149e8c3 amd64_edac: cleanup f10_early_channel_count
Do not read DCLR[01] again since this is done in
amd64_read_mc_registers() earlier. There can be more than two physical
DIMMs present so clamp the channels value to max 2. Also, do not report
DCT data width - it is also done earlier.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:29 +01:00
Borislav Petkov
8566c4df16 amd64_edac: dump DIMM sizes on K8 too
Extend f10_debug_display_dimm_sizes to dump the logical DIMMs
configuration on K8 revF too. Remove the ganged arg since we print the
DCT operating mode (ganged vs unganged) earlier.

Also, DCT csrow configuration is relevant therefore dump it as
KERN_DEBUG instead of only on debug builds. Remove misleading DIMM
output since there's no reliable way of mapping of chip selects to
actual physical DIMMs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:28 +01:00
Borislav Petkov
8de1d91e62 amd64_edac: cleanup rest of amd64_dump_misc_regs
Clarify bitfields description, add PCI config function/offset names to
registers for easy reference, simplify code layout, remove unneeded
info.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:28 +01:00
Borislav Petkov
68798e1760 amd64_edac: cleanup DRAM cfg low debug output
Carve out the register-specific debug statements into a separate
function, clarify meanings of the single bitfields in the register,
remove irrelevant output and macros.

There should be no functionality change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:28 +01:00
Borislav Petkov
6ba5dcdc44 amd64_edac: wrap-up pci config read error handling
Add a pci config read wrapper for signaling pci config space access
errors instead of them being visible only on a debug build. This is
important on amd64_edac since it uses all those pci config register
values to access the DRAM/DIMM configuration of the nodes.

In addition, the wrapper makes a _lot_ (look at the diffstat!) of
error handling code superfluous and improves much of the overall code
readability by removing error handling details out of the way.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:27 +01:00
Borislav Petkov
f6d6ae9657 amd64_edac: unify MCGCTL ECC switching
Unify almost identical code into one function and remove NUMA-specific
usage (specifically cpumask_of_node()) in favor of generic topology
methods.

Remove unused defines, while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:27 +01:00
Rusty Russell
ba578cb34a cpumask: use modern cpumask style in drivers/edac/amd64_edac.c
cpumask_t -> struct cpumask, and don't put one on the stack.  (Note: this
is actually on the stack unless CONFIG_CPUMASK_OFFSTACK=y).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:27 +01:00
Borislav Petkov
e97f8bb8ce amd64_edac: make DRAM regions output more human-readable
Do not shift the TOP_MEM and TOP_MEM2 values by 23 but rather save the
whole 64-bit value read from the MSR. Although the TOP_MEM/TOP_MEM2 bits
are only a subset of the 64bit register, the values are correct since
the remaining bits are Read-As-Zero and no shifting is needed.

Also, cleanup DRAM base/limit debug output.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:27 +01:00
Borislav Petkov
72381bd55e amd64_edac: clarify DRAM CTL debug reporting
Make debug info formulations about the DRAM and DCT configuration of the
machine more human readable.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-12-07 19:14:26 +01:00
Ingo Molnar
26fb20d008 Merge branch 'perf/mce' into perf/core
Merge reason: It's ready for v2.6.33.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03 20:11:06 +01:00
Borislav Petkov
17adea01b9 amd64_edac: fix CECCs reporting
Shift error type bits properly.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-11-04 14:04:06 +01:00
Li Hong
a3c4c58085 amd64_edac: fix a wrong goto clause in amd64_edac.c
In amd64_edac_init(void) in amd64_edac.c, cache_k8_northbridges() is
called before pci_register_driver. If it fails, should exit with err
directly.

Signed-off-by: Li Hong <lihong.hi@gmail.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-11-04 14:02:32 +01:00
Keith Mannthey
c2494ace99 edac: i5100 fix initialization code
Allow csrows to properly initialize when the topology only has active
channels on 2 and 3.  This new check allows proper detection and
initialization in this topology.  Only checking the first mrt that
represented channels 0 and 1 is not sufficient.

I also fixed up the related debug information path.  I can submit as a 2nd
patch if needed.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Acked-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:30 -07:00
Ira W. Snyder
0616fb003d edac: i5400 fix missing CONFIG_PCI define
When building without CONFIG_PCI the edac_pci_idx variable is unused,
causing a build-time warning.  Wrap the variable in #ifdef CONFIG_PCI,
just like the rest of the PCI support.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:30 -07:00
Jeff Roberson
156edd4aaa edac: i5400 fix csrow mapping
The i5400 EDAC driver has several bugs with chip-select row computation
which most likely lead to bugs in detailed error reporting.  Attempts to
contact the authors have gone mostly unanswered so I am presenting my diff
here.  I do not subscribe to lkml and would appreciate being kept in the
cc.

The most egregious problem was miscalculating the addresses of MTR
registers after register 0 by assuming they are 32bit rather than 16.
This caused the driver to miss half of the memories.  Most motherboards
tend to have only 8 dimm slots and not 16, so this may not have been
noticed before.

Further, the row calculations multiplied the number of dimms several
times, ultimately ending up with a maximum row of 32.  The chipset only
supports 4 dimms in each of 4 channels, so csrow could not be higher than
4 unless you use a row per-rank with dual-rank dimms.  I opted to
eliminate this behavior as it is confusing to the user and the error
reporting works by slot and not rank.  This gives a much clearer view of
memory by slot and channel in /sys.

Signed-off-by: Jeff Roberson <jroberson@jroberson.net>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:30 -07:00
Borislav Petkov
4997811e3b amd64_edac: fix DRAM base and limit extraction masks, v2
This is a proper fix as a follow-up to 66216a7 and 916d11b.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-16 18:51:22 +02:00
Borislav Petkov
fb2531953f mce, edac: Use an atomic notifier for MCEs decoding
Add an atomic notifier which ensures proper locking when conveying
MCE info to EDAC for decoding. The actual notifier call overrides a
default, negative priority notifier.

Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20091003065752.GA8935@liondog.tnic>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 12:24:45 +02:00
Linus Torvalds
624235c5b3 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, pci: Correct spelling in a comment
  x86: Simplify bound checks in the MTRR code
  x86: EDAC: carve out AMD MCE decoding logic
  initcalls: Add early_initcall() for modules
  x86: EDAC: MCE: Fix MCE decoding callback logic
2009-10-08 12:06:36 -07:00
Borislav Petkov
94baaee494 amd64_edac: beef up DRAM error injection
When injecting DRAM ECC errors (F3xBC_x8), EccVector[15:0] is a bitmask
of which bits should be error injected when written to and holds the
payload of 16-bit DRAM word when read, respectively.

Add /sysfs members to show the DRAM ECC section/word/vector.

Fail wrong injection values entered over /sysfs instead of truncating
them.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:51:28 +02:00
Borislav Petkov
66216a7a15 amd64_edac: fix DRAM base and limit extraction
On Fam10h and above, F1x[1, 0][7C:40] are DRAM Base/Limit registers
which specify the destination node of a DRAM address. Those address
boundaries are being extracted into ->dram_base[] and ->dram_limit[].
Correct the extraction masks to match the respective address bits.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:51:15 +02:00
Borislav Petkov
9d858bb10a amd64_edac: fix chip select handling
Different processor families support a different number of chip selects.
Handle this in a family-dependent way with the proper values assigned at
init time (see amd64_set_dct_base_and_mask).

Remove _DCSM_COUNT defines since they're used at one place and originate
from public documentation.

CC: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:50:50 +02:00
Keith Mannthey
2cff18c22c amd64_edac: simple fix to allow reporting of CECC errors
This allows the errors to be further decoded and mapped to csrows.
Tested with ECC debug dimms and an Rev F cpu based system.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:49:58 +02:00
Borislav Petkov
8edc544589 amd64_edac: fix K8 intlv_sel check
The check when DRAM interleaving is enabled should be done against the
pvt->dram_IntlvSel field and not against the ->dram_limit.

Simplify first loop and fixup printk formatting while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:49:43 +02:00
Borislav Petkov
72f158fe6f amd64_edac: fix interleave enable tests
The pvt->dram_IntlvEn saves the 3 "Interleave Enable" bits already
right-shifted by 8 so the check in find_mc_by_sys_addr() by shifting the
values to the left 8 bits is wrong.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:48:08 +02:00
Borislav Petkov
916d11b2b5 amd64_edac: fix DRAM base and limit address extraction
K8 DRAM base and limit addresses from F1x40 +8*i and F1x44 + 8*i, where
i in (0..7) are both bits 39-24 and therefore the shifting should be
done by 24 and not by 8.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:47:51 +02:00
Borislav Petkov
3011b20da9 amd64_edac: fix driver instance lookup table allocation
Allocate memory statically for 8-node machines max for simplicity
instead of relying on MAX_NUMNODES which is 0 on !CONFIG_NUMA builds.

Spotted by Jan Beulich.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-10-07 16:47:34 +02:00
Borislav Petkov
0d18b2e34b x86: EDAC: carve out AMD MCE decoding logic
This converts the MCE decoding logic into a standalone config
option which can be built-in or a module, the first one being the
default for MCEs happening early on in the boot process.

This, beyond being separated in a cleaner way, also saves RAM by
making the decoding logic modular.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091002133148.GD28682@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02 15:42:19 +02:00
Ingo Molnar
f436f8bb73 x86: EDAC: MCE: Fix MCE decoding callback logic
Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.

While looking at the interaction of decode_mce() with the other MCE
code i also noticed a few other things and made the following
cleanups/fixes:

 - Fixed the mce_decode() weak alias - a weak alias is really not
   good here, it should be a proper callback. A weak alias will be
   overriden if a piece of code is built into the kernel - not
   good, obviously.

 - The patch initializes the callback on AMD family 10h and 11h.

 - Added the more correct fallback printk of:

	No support for human readable MCE decoding on this CPU type.
	Transcribe the message and run it through 'mcelog --ascii' to decode.

   On CPUs that dont have a decoder.

 - Made the surrounding code more readable.

Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.

(there's no unregister needed as this is core code.)

version -v2 by Borislav Petkov:

 - add K8 to the set of supported CPUs

 - always build in edac_mce_amd since we use an early_initcall now

 - fix checkpatch warnings

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02 15:42:18 +02:00
Jesper Dangaard Brouer
458e5ff13e edac: core: remove completion-wait for complete with rcu_barrier
Module edac_core.ko uses call_rcu() callbacks in edac_device.c, edac_mc.c
and edac_pci.c.

They all use a wait_for_completion() scheme, but this scheme it not 100%
safe on multiple CPUs.  See the _rcu_barrier() implementation which
explains why extra precausion is needed.

The patch adds a comment about rcu_barrier() and as a precausion calls
rcu_barrier().  A maintainer needs to look at removing the
wait_for_completion code.

[dougthompson@xmission.com: remove the wait_for_completion code]
Signed-off-by Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:05 -07:00
Jason Uhlenkott
dd8ef1db87 edac: i3200 memory controller driver
A driver for the Intel 3200 and 3210 memory controllers.  It has only had
light testing so far, and currently makes no attempt to decode error
addresses at anything finer than csrow granularity.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Julia Lawall
30a61fff3a edac: fix resource size calculation
Use the function resource_size, which reduces the chance of introducing
off-by-one errors in calculating the resource size.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
struct resource *res;
@@

- (res->end - res->start) + 1
+ resource_size(res)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Ira W. Snyder
b484625172 edac: mpc85xx add mpc83xx support
Add support for the Freescale MPC83xx memory controller to the existing
driver for the Freescale MPC85xx memory controller.  The only difference
between the two processors are in the CS_BNDS register parsing code, which
has been changed so it will work on both processors.

The L2 cache controller does not exist on the MPC83xx, but the OF
subsystem will not use the driver if the device is not present in the OF
device tree.

I had to change the nr_pages calculation to make the math work out.  I
checked it on my board and did the math by hand for a 64GB 85xx using 64K
pages.  In both cases, nr_pages * PAGE_SIZE comes out to the correct
value.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Kumar Gala <galak@gate.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Yang Shi
a014554e66 edac: mpc85xx add P2020DS support
Based on Kumar's new compatible types patch, add P2020 into MPC85xx EDAC
compatible lists so that EDAC can recognize P2020 meomry controller and L2
cache controller and export the relevant fields to sysfs.

EDAC MPC85xx DDR3 support is needed if DDR3 memory stick is installed on a
P2020DS board so that EDAC core can recognize DDR3 memory type.

Signed-off-by: Yang Shi <yang.shi@windriver.com>
Acked-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Anand Gadiyar
411c940385 trivial: fix typo "for for" in multiple files
trivial: fix typo "for for" in multiple files

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-21 15:14:54 +02:00
Borislav Petkov
06724535f8 amd64_edac: check NB MCE bank enable on the current node properly
The old code was using smp_call_function_many which skips the current
cpu if it is in the supplied cpumask. Switch to the rdmsr_on_cpus()
interface which takes care of that.

In addition, add get_cpus_on_this_dct_cpumask helper which computes a
cpumask of all the cores on a node and thus on a DCT.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-16 13:05:46 +02:00
Wan Wei
57a30854c8 amd64_edac: Rewrite unganged mode code of f10_early_channel_count
Simplify the procedure by checking if there is any DIMM in each channel.
This patch will fix the bugs such as when there is no DIMMs under
certain node, two DIMMs in the same channel, and only one DIMM in each
channel of the node.

Borislav: minor fixups

Signed-off-by: Wan Wei <wanwei@mail.dawning.com.cn>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-16 12:42:55 +02:00
Borislav Petkov
be3468e8ff amd64_edac: cleanup amd64_check_ecc_enabled
Simplify code flow and make sure return value is always valid since
further driver init depends on it. Carve out long warning string and
make code more readable. Shorten some names, while at it.

There should be no functional change resulting from this patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-16 12:40:38 +02:00
Andreas Herrmann
6a8126911a x86, EDAC: Provide function to return NodeId of a CPU
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
2009-09-16 11:33:40 +02:00
Ingo Molnar
b9183f9b99 amd64_edac: build driver only on AMD hardware
-tip testing found the following build failure (config attached):

drivers/built-in.o: In function `amd64_check':
amd64_edac.c:(.text+0x3e9491): undefined reference to `amd_decode_nb_mce'
drivers/built-in.o: In function `amd64_init_2nd_stage':
amd64_edac.c:(.text+0x3e9b46): undefined reference to `amd_report_gart_errors'
amd64_edac.c:(.text+0x3e9b55): undefined reference to `amd_register_ecc_decoder'
drivers/built-in.o: In function `amd64_nbea_store':
amd64_edac_dbg.c:(.text+0x3ea22e): undefined reference to `amd_decode_nb_mce'
drivers/built-in.o: In function `amd64_remove_one_instance':
amd64_edac.c:(.devexit.text+0x3eea): undefined reference to `amd_report_gart_errors'
amd64_edac.c:(.devexit.text+0x3ef6): undefined reference to `amd_unregister_ecc_decoder'

the AMD EDAC code has a dependency on CONFIG_CPU_SUP_AMD facilities. The
patch below solves the problem here.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-16 11:31:57 +02:00
Borislav Petkov
53bd5fedca EDAC, AMD: decode FR MCEs
See Fam10h BKDG (31116, rev. 3.28), Table 101.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:37 +02:00
Borislav Petkov
f9350efd6f EDAC, AMD: decode load store MCEs
See Fam10h BKDG (31116, rev. 3.28), Table 100.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:33 +02:00
Borislav Petkov
56cad2d6fb EDAC, AMD: decode bus unit MCEs
... according to Table 69, Fam10h BKDG (31116, rev. 3.28).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:30 +02:00
Borislav Petkov
ab5535e70f EDAC, AMD: decode instruction cache MCEs
See Fam10h BKDG (31116, rev. 3.28), Table 95

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:27 +02:00
Borislav Petkov
5196624136 EDAC, AMD: decode data cache MCEs
Those get reported in MC0_STATUS, see Table 92, F10h BKDG (31116, rev.
3.28) for more details.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:23 +02:00
Borislav Petkov
d93cc222ad EDAC, AMD: carve out decoding of MCi_STATUS ErrorCode
This is the MCE error code from the MCi_STATUS banks, bits [15:0] which
describe what type of error was encountered: GART TLB, Memory or Bus
error. The semantics of those bits are identical across all MCE banks so
decode those separately, irrespectively of MCE type.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:20 +02:00
Borislav Petkov
b69b29de65 EDAC, AMD: carve out MCi_STATUS decoding
The MCi_STATUS registers have most field definitions in common so decode
them in the general path. Do not pass ecc_type along and compute it in
__amd64_decode_bus_error instead.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 19:01:07 +02:00
Borislav Petkov
549d042df2 x86, mce: pass mce info to EDAC for decoding
Move NB decoder along with required defines to EDAC MCE core. Add
registration routines for further decoding of the MCE info in the AMD64
EDAC module.

CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:59:17 +02:00
Borislav Petkov
ecaf5606de amd64_edac: cleanup amd64_decode_bus_error
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:58:37 +02:00
Borislav Petkov
b7225e4fc1 amd64_edac: remove memory and GART TLB error decoders
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:58:29 +02:00
Borislav Petkov
5110dbdeab amd64_edac: cleanup/complete NB MCE decoding
* don't dump info which mcheck already does
* update to newest BKDG
* mv amd64_process_error_info -> amd64_decode_nb_mce
* shorten error struct names
* remove redundant info ptr in amd64_process_error_info
* remove unused ErrorCodeExt[19:16] (MCx_STATUS) defines

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:58:25 +02:00
Borislav Petkov
ef44cc4c22 amd64_edac: cleanup amd64_process_error_info
* mv amd64_error_info_regs -> err_regs

* remove redundant info ptr

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:58:18 +02:00
Borislav Petkov
1c43f2e24d EDAC: beef up ErrorCodeExt error signatures
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:58:14 +02:00
Borislav Petkov
b70ef01016 EDAC: move MCE error descriptions to EDAC core
This is in preparation of adding AMD-specific MCE decoding functionality
to the EDAC core. The error decoding macros originate from the AMD64
EDAC driver albeit in a simplified and cleaned up version here.

While at it, add macros to generate the error description strings and
use them in the error type decoders directly which removes a bunch of
code and makes the decoding functions much more readable. Also, fix
strings and shorten macro names.

Remove superfluous htlink_msgs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:57:48 +02:00
Doug Thompson
c2718348b4 amd64_edac: print debug statements only on error
Add forgotten return calls for the successful cases.

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-08-04 12:10:06 +02:00
Doug Thompson
126b67b8d2 amd64_edac: fix ECC checking
On the good path of BIOS enabled ECC and no override, the value returned
is 1 by omission and thus is deemed failing by the probe-function.

Allow proper module initialization by clearing the retval explicitly.

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-08-03 16:54:20 +02:00
Lu Zhihe
3d768213a6 edac: x38 fix mchbar high register addr
Intel X38 MCHBAR is a 64bits register, base from 0x48, so its higher base
is 0x4C.

Signed-off-by: Lu Zhihe <tombowfly@gmail.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>		[2.6.30.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-29 19:10:34 -07:00
Wan Wei
4afcd2dcc6 amd64_edac: read the right F2 maskoffset reg
Signed-off-by: Wan Wei <onewayforever@gmail.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-07-27 14:42:24 +02:00
Yang Shi
b1cfebc923 edac: add DDR3 memory type for MPC85xx EDAC
Since some new MPC85xx SOCs support DDR3 memory now, so add DDR3 memory
type for MPC85xx EDAC.

Signed-off-by: Yang Shi <yang.shi@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-30 18:55:59 -07:00
Borislav Petkov
37da045067 amd64_edac: misc small cleanups
- cleanup debug calls
- shorten function names
- cleanup error exit paths

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-26 13:06:41 +02:00
Borislav Petkov
30c875cbc1 amd64_edac: fix ecc_enable_override handling
amd64_check_ecc_enabled() returns non-zero status when ECC
checking/correcting is disabled and this fails further loading of the
driver even when 'ecc_enable_override' boot param is used.

Fix that by clearing return status in that case.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-26 13:06:41 +02:00
Borislav Petkov
584fcff428 amd64_edac: check only ECC bit in amd64_determine_edac_cap
Checking whether the machine is using ECC enabled DRAM is done through
testing the DimmEccEn bit in the DRAM Cfg Low register (F2x[1,0]90). Do
that instead of testing all bits from the DimmEccEn upwards.

Also, remove mci->edac_cap assignment and use value returned from
amd64_determine_edac_cap().

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-26 13:06:40 +02:00
GeunSik Lim
e24aca672f edac: Kconfig: fix the meaning of EDAC abbreviation
Fix the meaning of EDAC(Error Detection And Correction) correctly.

[akpm@linux-foundation.org: add missing space]
Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:57 -07:00
Mike Frysinger
20ea8fad9e edac: add missing __devexit_p()
The remove function uses __devexit, so the .remove assignment needs
__devexit_p() to fix a build error with hotplug disabled.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:57 -07:00
Harry Ciao
1dc9b70d7d edac: add edac_device_alloc_index()
Add edac_device_alloc_index(), because for MAPLE platform there may
exist several EDAC driver modules that could make use of
edac_device_ctl_info structure at the same time. The index allocation
for these structures should be taken care of by EDAC core.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:56 -07:00
Harry Ciao
2a9036afff edac: add CPC925 Memory Controller driver
Introduce IBM CPC925 EDAC driver, which makes use of ECC, CPU and
HyperTransport Link error detections and corrections on the IBM
CPC925 Bridge and Memory Controller.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:56 -07:00
Martin Olsson
98a1708de1 trivial: fix typos s/paramter/parameter/ and s/excute/execute/ in documentation and source comments.
Signed-off-by: Martin Olsson <martin@minimum.se>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-06-12 18:01:46 +02:00
Borislav Petkov
9456ffffcf EDAC: do not enable modules by default
Prevent EDAC compilation units from being built by default and let the
user explicitly select the needed modules.

Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:19:41 +02:00
Borislav Petkov
3d37329045 amd64_edac: do not enable module by default
While at it, fix a link failure when !K8_NB.

Acked-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:19:40 +02:00
Doug Thompson
7d6034d321 amd64_edac: add module registration routines
Also, link into Kbuild by adding Kconfig and Makefile entries.

Borislav:
- Kconfig/Makefile splitting
- use zero-sized arrays for the sysfs attrs if not enabled
- rename sysfs attrs to more conform values
- shorten CONFIG_ names
- make multiple structure members assignment vertically aligned
- fix/cleanup comments
- fix function return value patterns
- fix err labels
- fix a memleak bug caught by Ingo
- remove the NUMA dependency and use num_k8_northbrides for initializing
  a driver instance per NB.
- do not copy the pvt contents into the mci struct in
  amd64_init_2nd_stage() and save it in the mci->pvt_info void ptr
  instead.
- cleanup debug calls
- simplify amd64_setup_pci_device()

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:19:28 +02:00
Doug Thompson
f9431992b6 amd64_edac: add ECC reporting initializers
Borislav:
- convert to the new {rd|wr}msr_on_cpus interfaces.
- convert pvt->old_mcgctl to a bitmask thus saving some bytes
- fix/cleanup comments
- fix function return value patterns
- add a proper bugfix found by Doug to amd64_check_ecc_enabled where we
  missed checking for the ECC enabled bit in NB CFG.
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:19:01 +02:00
Doug Thompson
0ec449ee95 amd64_edac: add EDAC core-related initializers
Borislav:

- add a amd64_free_mc_sibling_devices() helper instead of opencoding the
  release-path.
- fix/cleanup comments
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:19:00 +02:00
Doug Thompson
d27bf6fa36 amd64_edac: add error decoding logic
Borislav:

- fold amd64_error_info_valid() into its only user
- fix/cleanup comments
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:59 +02:00
Doug Thompson
b1289d6f9d amd64_edac: add ECC chipkill syndrome mapping table
Borislav:

- fix comments
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:58 +02:00
Doug Thompson
4d37607adb amd64_edac: add per-family descriptors
Borislav:

- fix comments
- fix function return value patterns

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:57 +02:00
Doug Thompson
f71d0a0500 amd64_edac: add F10h-and-later methods-p3
Borislav:

- compute dct_sel_base_off in f10_match_to_this_node() correctly since
it cannot be assumed that the Reserved bits are zero and they have to be
masked out instead.

- cleanup, remove StinkyIdentifiers, simplify logic
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:56 +02:00
Doug Thompson
6163b5d4fb amd64_edac: add F10h-and-later methods-p2
Borislav:

- fix a wrong negation in f10_determine_base_addr_offset()
- fix a wrong mask in f10_determine_base_addr_offset() which should
select DctSelBaseAddr[31:11] and not [31:16] as it was before
- remove StinkyIdentifiers, trivially simplify code.
- fix/cleanup comments
- fix function return value patterns

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:56 +02:00
Doug Thompson
1afd3c98b5 amd64_edac: add F10h-and-later methods-p1
Borislav:

Fail f10_early_channel_count() if error encountered while reading a NB
register since those cached register contents are accessed afterwards.

- fix/cleanup comments
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:55 +02:00
Doug Thompson
ddff876d20 amd64_edac: add k8-specific methods
Borislav:

- fix/cleanup/move comments
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:54 +02:00
Doug Thompson
94be4bff21 amd64_edac: assign DRAM chip select base and mask in a family-specific way
Borislav:

- cleanup/fix comments
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:53 +02:00
Doug Thompson
2da11654ea amd64_edac: add helper to dump relevant registers
Borislav:

- cleanup/fix comments
- fix function return value patterns
- cleanup dbg calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:52 +02:00
Doug Thompson
93c2df58b5 amd64_edac: add DRAM address type conversion facilities
Borislav:

- cleanup/fix comments, add BKDG refs
- fix function return value patterns
- cleanup dbg calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:51 +02:00
Doug Thompson
e2ce7255e8 amd64_edac: add functionality to compute the DRAM hole
Borislav:

- cleanup/fix comments, add BKDG refs
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:50 +02:00
Doug Thompson
6775763a23 amd64_edac: add sys addr to memory controller mapping helpers
Borislav:

- cleanup comments
- cleanup debug calls
- simplify find_mc_by_sys_addr's exit path

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:49 +02:00
Doug Thompson
2bc6541872 amd64_edac: add memory scrubber interface
Borislav:
- fix/cleanup comments
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:49 +02:00
Doug Thompson
b52401cece amd64_edac: add MCA error types
Borislav:
- cleanup comments

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:48 +02:00
Doug Thompson
eb919690be amd64_edac: add DRAM error injection logic using sysfs
Borislav:
- rename sysfs attrs to more conform names
- cleanup/fix comments according to BKDG text
- fix function return value patterns
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:47 +02:00
Doug Thompson
fd3d6780f7 amd64_edac: add debugging/testing code
This is for dumping different registers and testing the address mapping
logic using the ECC syndromes.

Borislav:

- split sysfs attrs per file
- use more conform names for the sysfs attrs
- fix function return value patterns
- cleanup/fix comments
- cleanup debug calls

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:46 +02:00
Doug Thompson
cfe40fdb4a amd64_edac: add driver header
Borislav:
- remove register bit descriptions (complete text in BKDG)
- cleanup and remove excessive/superfluous comments

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:45 +02:00
Borislav Petkov
d357cbb445 edac: fold __func__ into edac_debug_printk
This shortens debugfX() calls a bit.

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
CC: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:18:44 +02:00
Harry Ciao
715fe7af9f edac: AMD8111 & AMD8131 Kconfig fixup
The amd8111_edac.c driver will fail allmodconfig on architectures other
than PPC, introduce Kconfig dependency to avoid this, since both AMD8111
and AMD8131 chips are only adopted on Maple so far.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-29 08:40:03 -07:00
Harry Ciao
56ec0c7b88 edac: AMD8111 & AMD8131 use dev_name()
The "bus_id" member in the device structure has been obsolete, use
dev_name() instead.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-29 08:40:03 -07:00
Dave Jiang
55e5750b3e edac: ppc mpc85xx fix mc err detect
Error found by Jeff Haran.

The error detect register is 0s when no errors are detected.  The check
code is incorrect, so reverse check sense.

Reported-by: Jeff Haran <jharan@Brocade.COM>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-21 13:41:51 -07:00
Jean Delvare
fbeb438474 edac: use to_delayed_work()
The edac-core driver includes code which assumes that the work_struct
which is included in every delayed_work is the first member of that
structure.  This is currently the case but might change in the future, so
use to_delayed_work() instead, which doesn't make such an assumption.

linux-2.6.30-rc1 has the to_delayed_work() function that will allow this
patch to work

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 15:04:34 -07:00
Jeff Haran
e6da46b273 edac: fix local pci_write_bits32
Fix the edac local pci_write_bits32 to properly note the 'escape' mask if
all ones in a 32-bit word.

Currently no consumer of this function uses that mask, so there is no
danger to existing code.

Signed-off-by: Jeff Haran <jharan@Brocade.COM>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 15:04:33 -07:00
Harry Ciao
58b4ce6f24 edac: AMD8111 driver Kconfig & Makefile
Introduce Kconfig and Makefile options for AMD8111 EDAC driver.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:04 -07:00
Harry Ciao
e876558415 edac: AMD8131 driver Kconfig & Makefile
Introduce Kconfig and Makefile options for AMD8131 EDAC driver.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:04 -07:00
Harry Ciao
28d16272b1 edac: AMD8131 driver source file
Introduce AMD8131 EDAC driver source file, which makes use of error
detections on the PCI-X Bridge Controllers on the AMD8131 HyperTransport
PCI-X Tunnel.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:04 -07:00
Harry Ciao
a35a281880 edac: AMD8131 driver header file
Introduce AMD8131 EDAC driver header file, which adds register and bits
definitions for the PCI-X Bridge Controller on the AMD8131 HyperTransport
I/O Hub.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Harry Ciao
8641a3845d edac: Add edac_pci_alloc_index()
Add edac_pci_alloc_index(), because for MAPLE platform there may exist
several EDAC driver modules that could make use of edac_pci_ctl_info
structure at the same time.  The index allocation for these structures
should be taken care of by EDAC core.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Harry Ciao
697dab6484 edac: AMD8111 driver source file
Introduce AMD8111 EDAC driver source file, which makes use of error
detections on the LPC Bridge Controller and PCI Bridge Controller on the
AMD8111 HyperTransport I/O Hub.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Harry Ciao
ec2cf2e272 edac: AMD8111 driver header file
Introduce AMD8111 EDAC driver header file, which adds register and bits
definitions for the LPC Bridge Controller and PCI Bridge Controller on the
AMD8111 HyperTransport I/O Hub.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Grant Erickson
dba7a77c0e edac: new ppc4xx driver module
This adds support for an EDAC memory controller adaptation driver for the
"ibm,sdram-4xx-ddr2" ECC controller realized in the AMCC PowerPC 405EX[r].

At present, this driver has been developed and tested against the
controller realization in the AMCC PPC405EX[r] on the AMCC Kilauea and
Haleakala boards (256 MiB w/o ECC memory soldered onto the board) and a
proprietary board based on those designs (128 MiB ECC memory, also
soldered onto the board).

In the future, dynamic feature detection and handling needs to be added
for the other realizations of this controller found in the 440SP, 440SPe,
460EX, 460GT and 460SX.

Eventually, this driver will likely be evolved and adapted to the above
variant realizations of this controller as well as broken apart to handle
the other known ECC-capable controllers prevalent in other PPC4xx
processors:

  - IBM SDRAM (405GP, 405CR and 405EP) "ibm,sdram-4xx"
  - IBM DDR1 (440GP, 440GX, 440EP and 440GR) "ibm,sdram-4xx-ddr"
  - Denali DDR1/DDR2 (440EPX and 440GRX) "denali,sdram-4xx-ddr2"

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Grant Erickson <gerickson@nuovations.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Doug Thompson
4577ca5568 edac: remove EDAC's experimental status
After 3 years, this is a patch to remove the EXPERIMENTAL tag on EDAC.  We
now have many module drivers submitters in EDAC and believe EDAC is no
longer EXPERIMENTAL

Signed-off-by: Doug Thompson <dougthompson@xmission.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Hitoshi Mitake
cc18e3cd53 edac: add more verbose debug info
A patch for making a debugging information more verbose for use in
development debugging.

By enabling the new option "More verbose debugging", information about
source file and line number will be added to debugging message.

This is sample output,

EDAC MC0: Giving out device to 'e7xxx_edac' 'E7205': DEV 0000:00:00.0
EDAC DEBUG: in drivers/edac/edac_pci.c, line at 48: edac_pci_alloc_ctl_info()
EDAC DEBUG: in drivers/edac/edac_pci.c, line at 334: edac_pci_add_device()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:02 -07:00
Kay Sievers
031d551859 edac: struct device - replace bus_id with dev_name(), dev_set_name()
Cc: dougthompson@xmission.com
Cc: bluesmoke-devel@lists.sourceforge.net
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
2009-03-24 16:38:21 -07:00
Stephen Rothwell
4712fff9be powerpc: More printing warning fixes for the l64 to ll64 conversion
These are all powerpc specific drivers.

res.start in fsl_elbc_nand.c needs to be cast since it may be either 32
or 64 bit.  Thanks to Scott Wood for noticing.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de> call_edac bits in particular
Acked-by: Olof Johansson <olof@lixom.net> pasemi_nand peices
Acked-by: Scott Wood <scottwood@freescale.com> fsl_elbc fixes
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-28 17:15:52 +11:00
Mauro Carvalho Chehab
8375d4909a edac: driver for i5400 MCH (update)
Signed-off-by: Ben Woodard <woodard@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:30 -08:00
Mauro Carvalho Chehab
920c8df6ac edac: driver for i5400 MCH (Seaburg)
EDAC driver for i5400 MCH (Seaburg)

This driver adds support for i5400 MCH chipset.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Ben Woodard <woodard@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:30 -08:00
Kumar Gala
29d6cf26a7 edac: fix mpc85xx and add mpc8536 mpc8560
All other compatibles that are uniquely identifying the processor use a
prefix of the form fsl,mpc85...'.  We add support for it so we can
deprecate the older 'fsl,85...' that was improperly used here.

Additionally added mpc8536 & mpc8560 to the compatible lists.

This patch is based on Nate's 8572 patch.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Dave Jiang <djiang@mvista.com>
Cc: Nate Case <ncase@xes-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:30 -08:00
Kay Sievers
281efb17d8 edac: struct device: replace bus_id with dev_name(), dev_set_name()
This patch is part of a larger patch series which will remove the "char
bus_id[20]" name string from struct device.  The device name is managed in
the kobject anyway, and without any size limitation, and just needlessly
copied into "struct device".

[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:30 -08:00
Arjan van de Ven
1dca00bd02 pci: use pci_ioremap_bar() in drivers/edac
Use the newly introduced pci_ioremap_bar() function in drivers/edac.
pci_ioremap_bar() just takes a pci device and a bar number, with the goal
of making it really hard to get wrong, while also having a central place
to stick sanity checks.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:30 -08:00
Linus Torvalds
3c92ec8ae9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits)
  powerpc/44x: Support 16K/64K base page sizes on 44x
  powerpc: Force memory size to be a multiple of PAGE_SIZE
  powerpc/32: Wire up the trampoline code for kdump
  powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
  powerpc/32: Allow __ioremap on RAM addresses for kdump kernel
  powerpc/32: Setup OF properties for kdump
  powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
  powerpc: Prepare xmon_save_regs for use with kdump
  powerpc: Remove default kexec/crash_kernel ops assignments
  powerpc: Make default kexec/crash_kernel ops implicit
  powerpc: Setup OF properties for ppc32 kexec
  powerpc/pseries: Fix cpu hotplug
  powerpc: Fix KVM build on ppc440
  powerpc/cell: add QPACE as a separate Cell platform
  powerpc/cell: fix build breakage with CONFIG_SPUFS disabled
  powerpc/mpc5200: fix error paths in PSC UART probe function
  powerpc/mpc5200: add rts/cts handling in PSC UART driver
  powerpc/mpc5200: Make PSC UART driver update serial errors counters
  powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver
  powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver
  ...

Fix trivial conflict in drivers/char/Makefile as per Paul's directions
2008-12-28 16:54:33 -08:00
Harry Ciao
d519c8d9cc edac: fix edac core deadlock when removing a device
When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure.  Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed.  This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier.  The edac_poller will wake up sleepers when
it is found.

EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices.  They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device.  This is exactly
where edac_poller and rmmod would have a great chance to deadlock.

In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,

device_ctls_mutex would have already been grabbed by
edac_device_del_device().  So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.

edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex.  Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.

Moreover, an edac_dev.work should check first if it is being removed.  If
this is the case, then it should bail out immediately.  Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed.  The current edac_dev.op_state can
be used to serve this purpose.

The original deadlock problem and the solution have been witnessed and
tested on actual hardware.  Without the solution, rmmod an edac driver
would result in below deadlock:

root@localhost:/root> rmmod mv64x60_edac
EDAC DEBUG: mv64x60_dma_err_remove()
EDAC DEBUG: edac_device_del_device()
EDAC DEBUG: find_edac_device_by_dev()

(hang for a moment)

INFO: task edac-poller:2030 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
edac-poller   D 00000000     0  2030      2
Call Trace:
[df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable)
[df159e80] [c000a024] __switch_to+0x6c/0xa0
[df159ea0] [c03587d8] schedule+0x2f4/0x4d8
[df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174
[df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core]
[df159f60] [c003beb4] run_workqueue+0x114/0x218
[df159f90] [c003c674] worker_thread+0x5c/0xc8
[df159fd0] [c004106c] kthread+0x5c/0xa0
[df159ff0] [c0013538] original_kernel_thread+0x44/0x60
INFO: task rmmod:2062 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod         D 0ff2c9fc     0  2062   1839
Call Trace:
[df119c00] [c0437a74] 0xc0437a74 (unreliable)
[df119cc0] [c000a024] __switch_to+0x6c/0xa0
[df119ce0] [c03587d8] schedule+0x2f4/0x4d8
[df119d40] [c03591dc] schedule_timeout+0xb0/0xf4

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-23 15:58:21 -08:00
Benjamin Krill
def434c231 powerpc/cell: add QPACE as a separate Cell platform
Since the QPACE (Chromodynamics Parallel Computing on the
Cell Broadband Engine) platform doesn't use a iommu, doesn't
have PCI devices and a MPIC much lesser setup and
configurations are needed. So far all devices are detected
as OF device. A notifier function is used to set the dma_ops
for the of_platform bus. Further this patch splits the
PPC_CELL_NATIVE into PPC_CELL_COMMON which are parts that are
shared with the QPACE platform and the rest.

Signed-off-by: Benjamin Krill <ben@codiert.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2008-12-22 22:19:19 +01:00
Jarkko Lavinen
09a81269c7 i82875p_edac: fix module remove
Fix module removal bugs of i82875p_edac.  Also i82975x_edac code seems to
have the same module removal bugs as in i82875p_edac.

The problems were:

1. In module removal i82875p_remove_one() is never called.

   Variable i82875p_registered is newer changed from 1, which
   guarantees i82875p_remove_one() is not called (and even if it were
   called, it would be called in wrong order).

   As a result, the edac_mc workque is not stopped and keeps probing.
   If kernel debugging options are not enabled, user may not notice
   anything going wrong.

   if debugging options are enabled and I do "rmmod i82875p_edac", I
   get:

      edac debug: edac_pci_workq_function() checking
      BUG: unable to handle kernel paging request at f882d16f
      ...
      call trace:
       [<f8834df3>] ? edac_mc_workq_function+0x55/0x7e [edac_core]
       [<c0233974>] ? run_workqueue+0xd7/0x1a5
       [<c023392f>] ? run_workqueue+0x92/0x1a5
       [<f8834d9e>] ? edac_mc_workq_function+0x0/0x7e [edac_core]
       [<c0233af9>] ? worker_thread+0xb7/0xc3
       [<c0236a7b>] ? autoremove_wake_function+0x0/0x33
       [<c0233a42>] ? worker_thread+0x0/0xc3
       [<c0236809>] ? kthread+0x3b/0x61
       [<c02367ce>] ? kthread+0x0/0x61
       [<c0204587>] ? kernel_thread_helper+0x7/0x10

   Fix for this is to get rid of needles variable i82875p_registered
   altogether and run i82875p_remove_one() *before*
   pci_unregister_driver().

2. edac_mc_del_mc() uses mci after freeing mci

   edac_mc_del_mc() calls calls edac_remove_sysfs_mci_device().  The
   kobject refcount of mci drops to 0 and mci is freed.  After this
   mci is accessed via debug print and i82875p_remove_one() still
   uses mci->pvt and tries to free mci again with edac_mc_free().

   The fix for this is add kobject_get(&mci->edac_mci_kobj) after
   edac_mc_alloc(). Then the mci is still available after returning
   from edac_mc_del_mc() with refcount 1, and mci->pvt is still
   available. When i82875p_remove_one() finally calls edac_mc_free(),
   this will cause kobject_put() and mci is released properly.

Signed-off-by: Jarkko Lavinen <jlavi@iki.fi>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-01 19:55:25 -08:00
Jarkko Lavinen
307d114441 i82875p_edac: fix overflow device resource setup
When I do "modprobe i82875p_edac" on my Asus P4C800 MB on kernels 2.6.26
or later, the module load fails due to BAR 0 collision.  On 2.6.25 the
module loads just fine.

The overflow device on the MB seems to be hidden and its resources are not
allocated at normal PCI bus init.  Log shows the missing resource problem:

  EDAC DEBUG: i82875p_probe1()
  PCI: 0000:00:06.0 reg 10 32bit mmio: [fecf0000, fecf0fff]
  pci 0000:00:06.0: device not available because of BAR 0
[0xfecf0000-0xfecf0fff] collisions
  EDAC i82875p: i82875p_setup_overfl_dev(): Failed to enable overflow
device

The patch below fixes this by calling pci_bus_assign_resources() after
the overflow device is revealed and added to the bus. With this patch
I am again able to load and use the module.

Signed-off-by: Jarkko Lavinen <jlavi@iki.fi>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-01 19:55:25 -08:00
Darrick J. Wong
f0f7e0dc73 i5000-edac: hold reference to mci kobject
It turns out that edac_mc_del_mc will kobject_put the last kref on the
mci object.

If the timing is just right, that means that the mci object is freed
before before i5000_remove_one has a chance to free the resources
associated with it, causing a null pointer exceptions when unloading the
driver.  Insert a kobject_{get,put} pair so that this doesn't happen.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-12 17:17:16 -08:00
Benjamin Herrenschmidt
992b692dcf edac: fix enabling of polling cell module
The edac driver on cell turned out to be not enabled because of a missing
op_state.  This patch introduces it.  Verified to work on top of Ben's
next branch.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Osterkamp <jens@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:46 -07:00
Hitoshi Mitake
df8bc08c19 edac x38: new MC driver module
I wrote a new module for Intel X38 chipset.  This chipset is very similar
to Intel 3200 chipset, but there are some different points, so I copyed
i3200_edac.c and modified.

This is Intel's web page describing this chipset.
http://www.intel.com/Products/Desktop/Chipsets/X38/X38-overview.htm

I've tested this new module with broken memory, and it seems to be working
well.

Signed-off-by: Hitoshi Mitake <mitake@clustcom.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Benjamin Herrenschmidt
3b274f44d2 edac cell: fix incorrect edac_mode
The cell_edac driver is setting the edac_mode field of the csrow's to an
incorrect value, causing the sysfs show routine for that field to go out
of an array bound and Oopsing the kernel when used.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>		[2.6.27.x, 2.6.26.x. 2.6.25.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:40 -07:00
Aristeu Rozanski
8360e81b5d edac i5000: fix thermal issues
Make the Thermal messages (temperature got past Tmid) be displayed only
once because:

1) it's the BIOS job to configure and handle the memory throttling
2) if the BIOS is broken or is aware about the condition, flooding the
   system logs won't help anything.
3) According to the specification update for Intel 5000 MCHs, all the
   revisions of this MCH have problems on the thermal sensors, making
   not automatic (a.k.a. intelligent thermal throttling) impossible.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:48 -07:00
Aristeu Rozanski
c066740739 edac i5000: fix error messages
Update the i5000_edac messages, making everything pass through the EDAC
(so the log controls will work) and being more specific about the errors.
Also, it makes the miscellaneous errors optional and disabled by default.

As I didn't found anywhere information about M23ERR-M26ERR
(FERR_NF_THERMAL) on FERR_NF_FBD, I'm removing them.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:48 -07:00
Andrew Kilkenny
60be75515e edac mpc85xx: add support for mpc8572
This adds support for the dual-core MPC8572 processor.  We have
to support making SPR changes on each core.  Also, since we can
have multiple memory controllers sharing an interrupt, flag the
interrupts with IRQF_SHARED.

Signed-off-by: Andrew Kilkenny <akilkenny@xes-inc.com>
Signed-off-by: Nate Case <ncase@xes-inc.com>
Acked-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:48 -07:00
Vladislav Bogdanov
53a2fe5804 edac: make i82443bxgx_edac coexist with intel_agp
Fix 443BX/GX MCH suppport in a EDAC.

It makes i82443bxgx_edac coexist with intel_agp using the same approach as
several other EDAC drivers.

Tested on Intel's L443GX with redhat's 2.6.18 with whole EDAC subsystem
backported a while ago.

[root@host ~]# dmesg|grep -iE '(AGP|EDAC)'
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected an Intel 440GX Chipset.
agpgart: AGP aperture is 64M @ 0xf8000000
EDAC MC: Ver: 2.1.0 Jun 27 2008
EDAC MC0: Giving out device to 'i82443bxgx_edac' 'I82443BXGX': DEV 0000:00:00.0
EDAC PCI0: Giving out device to module 'i82443bxgx_edac' controller 'EDAC PCI controller': DEV '0000:00:00.0' (POLLED)

Signed-off-by: Vladislav Bogdanov <slava@nsys.by>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:48 -07:00
Adrian Bunk
7a8fc9b248 removed unused #include <linux/version.h>'s
This patch lets the files using linux/version.h match the files that
#include it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-23 12:14:12 -07:00
Dave Jiang
f87bd330ed edac: mpc85xx fix pci ofdev 2nd pass
Convert PCI err device from platform to open firmware of_dev to comply
with powerpc schemes.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Dave Jiang
fcb19171d1 edac: mv64x60 add pci fixup
Fixup of missing bit 0 on 64360 PCIx_ERR_MASK and errata FEr-#11 and
FEr-#16 for the 64460.  Bit 0 must remain 0.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Dave Jiang
596d394103 edac: mv64x60 fix get_property
Update get_property() call to use of_get_property() in order to fix compile

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Doug Thompson
10d33e9c36 edac: e752x fix too loud on nonmemory errors
This module harvests more than just memory errors, it also harvests
various bus and dma errors that the Chipset detects.  Previously, it would
report all such errors, which would cause output to be TOO loud.

This patches therefore adds a parameter which is used to turn off
NON-MEMORY error reports by default.  Or the reporting can be enabled via
the parameter

Also did code style cleanup: less than 80 characters per line rule

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Arthur Jones
124682c785 edac: core fix added newline to sysfs dimm labels
The channel DIMM label does not seem to be used much in the edac code.
However, where it is used (in the core code), it is assumed to not have a
newline embedded.  This leaves the sysfs file newline free which looks
funny when cat'ing it.  Here we just add the trailing newline to the sysfs
chX_dimm_label output...

[Doug Thompson note: the DIMM label is one of the primary uses of EDAC.
User space daemon scripts, edac-utils@sourceforge, populate the DIMM label
fields, via /sys/devices/system/edac attributes, with the silk screen
labels of the motherboard in use.  dmidecode access BIOS tables, but BIOS
tables are well known to be incorrect and useless in these respects.
edac-utils will strip off any newlines before its use of the output, when
displaying DIMM slot silk screen labels.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Arthur Jones
f9fc82adca edac: core fix static to dynamic kset
Static kobjects and ksets are not supported in Linux kernel.  Convert the
mc_kset from static to dynamic.  This patch depends on my previous patch
to remove the module parameter attributes from mc...

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Arthur Jones
327dafb1c6 edac: core fix redundant sysfs controls to parameters
/sys/devices/system/edac/mc has a few files which are duplicated in
/sys/module/edac_core/parameters.  Now that all the functionality is
duplicated between these two locations, we remove the former kobject
attributes and update the documentation.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Arthur Jones
096846e2b0 edac: core fix workq timer
When updating the edac_mc_poll_msec module parameter from the sysfs
/sys/module/edac_core/parameters/edac_mc_poll_msec file, we don't update
the workq timers.  So that, if we move from a big poll time to a small
one, the small one won't take effect until the big one has timed out.

Here we provide a new module parameter set method to call out to the
update routine.  This brings the /sys/module/edac_core/parameters
functionality up to that provided by the /sys/drivers/system/edac/mc sysfs
module parameter files so that we can remove them or at least link to the
/sys/module files...

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:49 -07:00
Arthur Jones
14cc571bb1 edac: core fix to use dynamic kobject
Static kobjects are not supported in linux kernel.  Convert the
edac_pci_top_main_kobj from static to dynamic.  This avoids the double
free of the edac_pci_top_main_kobj.name that we see on module reload of
the e752x edac driver (and probably others as well).

In addition Greg KH <greg@kroah.com> has pointed out that this code may be
cleaned up significantly.  I will look at that as a follow-on patch, for
now, I just want the minimum fix to get this double-free oops bug
squashed...

Many thanks to Greg KH for his patience in showing me what the
Documentation/kobject.txt already said (oops)...

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Arthur Jones
b238e57723 edac: i5100: cleanup
Some code cleanliness issues found by Andrew Morton (thanks!) which should
not affect functionality, but which should help make the code more
maintainable.

In particular, we now:

* convert all #define's w/ a parameter to static inlines
* use 1UL rather than 1ULL when calculating an unsigned long
* use pci_disable_device

The resulting code is tested and seems to work fine...

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Arthur Jones
178d5a7422 edac: i5100 fix unmask ecc bits
Explicitly unmask ECC errors we are interested in reporting.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Arthur Jones
43920a598f edac: i5100 fix enable ecc hardware
It is possible that the BIOS did not enable ECC at boot time.  We check
for that case and fail to load if it is true.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Arthur Jones
f7952ffcff edac: i5100 fix missing bits
The error mask we use to trigger ECC notifications is missing many bits of
interest.  We add these bits here so that all possible ECC errors can be
reported.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Arthur Jones
8f421c595a edac: i5100 new intel chipset driver
Preliminary support for the Intel 5100 MCH.  CE and UE errors are reported
along with the current DIMM label information and other memory parameters.

Reasons why this is preliminary:

1) This chip has 2 independent memory controllers which, for best
   perforance, use interleaved accesses to the DDR2 memory.  This
   architecture does not map very well to the current edac data structures
   which depend on symmetric channel access to the interleaved data.
   Without core changes, the best I could do for now is to map both memory
   controllers to different csrows (first all ranks of controller 0, then
   all ranks of controller 1).  Someone much more familiar with the edac
   core than I will probably need to come up with a more general data
   structure to handle the interleaving and de-interleaving of the two
   memory controllers.

2) I have not yet tackled the de-interleaving of the rank/controller
   address space into the physical address space of the CPU.  There is
   nothing fundamentally missing, it is just ending up to be a lot of
   code, and I'd rather keep it separate for now, esp since it doesn't
   work yet...

3) The code depends on a particular i5100 chip select to DIMM mainboard
   chip select mapping.  This mapping seems obvious to me in order to
   support dual and single ranked memory, but it is not unique and DIMM
   labels could be wrong on other mainboards.  There is no way to query
   this mapping that I know of.

4) The code requires that the i5100 is in 32GB mode.  Only 4 ranks per
   controller, 2 ranks per DIMM are supported.  I do not have hardware
   (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks
   per controller) mode.

5) The serial presence detect code should be broken out into a "real"
   i2c driver so that decode-dimms.pl can work.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Maxim Shchetynin
c134fd868f powerpc/cell/edac: Log a syndrome code in case of correctable error
If correctable error occurs the syndrome code was logged as 0. This patch
lets EDAC to log a correct syndrome code to make problem investigation
easier.

Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:36 +10:00
Kumar Gala
f99c90094b edac: mpc85xx: fix building as a module
including of <asm/mpc85xx.h> causes build problems since it doesn't exist.

Also removed warning:
drivers/edac/mpc85xx_edac.c:45: warning: 'mpc85xx_ctl_name' defined but not used

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:13 -07:00
Stephen Rothwell
17aa7e0344 dev_name introduction fall out fix
Commit 06916639e2 ("driver-core: add
dev_name() to help transition away from using bus_id") added a static
inline dev_name() and used it in dev_printk.

Unfortunately, drivers/edac/edac_core.h defines a macro called
dev_name().  Rename the latter.

Diagnosis by Tony Breeds and Michael Ellerman.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-05 15:08:38 -07:00
Stephen Rothwell
a94a630a4c pasemi_edac needs to include linux/edac.h
Commit c3c52bce69 ("edac: fix module
initialization on several modules 2nd time") added a call to opstate_init
but did not include linux/edac.h that declares it.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 19:06:57 -07:00
Hitoshi Mitake
c3c52bce69 edac: fix module initialization on several modules 2nd time
I implemented opstate_init() as a inline function in linux/edac.h.

added calling opstate_init() to:
	i82443bxgx_edac.c
	i82860_edac.c
	i82875p_edac.c
	i82975x_edac.c

I wrote a fixed patch of
edac-fix-module-initialization-on-several-modules.patch,
and tested building 2.6.25-rc7 with applying this. It was succeed.
I think the patch is now correct.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:26 -07:00
Adrian Bunk
1a45027d1a edac: remove unneeded functions and add static accessor
Collection of patches, merged into one, from Adrian that do the following:

1) This patch makes the following needlessly global functions static:
- edac_pci_get_log_pe()
- edac_pci_get_log_npe()
- edac_pci_get_panic_on_pe()
- edac_pci_unregister_sysfs_instance_kobj()
- edac_pci_main_kobj_setup()

2) Remove unneeded function edac_device_find()

3) Added #if 0 around function  edac_pci_find()

4) make the needlessly global edac_pci_generic_check() static

5) Removed function edac_check_mc_devices()

Doug Thompson modified Adrian's patches, to bettern represent
the direction of EDAC, and make them one patch.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:26 -07:00
Robert P. J. Day
ff6ac2a616 edac: use the shorter LIST_HEAD for brevity
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:26 -07:00
Peter Tyser
94ee1cf5a8 edac: add e752x parameter for sysbus_parity selection
Add a module parameter "sysbus_parity" to allow forcing system bus parity
error checking on or off.  Also add support to automatically disable system
bus parity errors for processors which do not support it.

If the sysbus_parity parameter is specified, sysbus parity detection will be
forced on or off.  If it is not specified, the driver will attempt to look at
the CPU identifier string and determine if the CPU supports system bus parity.
 A blacklist was used instead of a whitelist so that system bus parity would
be enabled by default and to minimize the chances of breaking things for those
people already using the driver which for some reason have a processor that
does not have a valid CPU identifier string.

[akpm@linux-foundation.org: coding-style fixes]
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:26 -07:00
Andrei Konovalov
5135b797c8 edac: new support for Intel 3100 chipset
Add Intel 3100 chipset support to e752x EDAC driver.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrei Konovalov <akonovalov@ru.mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:25 -07:00
Jason Uhlenkott
870897a5ab drivers/edac/i3000: document type promotion
By popular request, add a comment documenting the implicit type promotion
here.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Hitoshi Mitake
7ed31e0fa0 drivers/edac: i3000: missing init code
There is a missing sequence of initialization code during startup.

Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Doug Thompson <dougthompson@xmisson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Doug Thompson
cd4755c2a9 drivers/edac: mpc85xx: add static scope
Made a previous global variable, static in scope

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Jason Uhlenkott
f5c0454c86 drivers/edac: i3000: 64bit build
Modified to run on x86_64 as well as x86

i3000_edac builds (and runs) fine on x86_64.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Bryan Boatright
6b09ff9d78 drivers/edac: pci: broken parity regression
Using the EDAC code in kernel.org kernel version 2.6.23.8 I am seeing the
following problem:

    In the kernel there is a pci device attribute located in sysfs that is
    checked by the EDAC PCI scanning code. If that attribute is set,
    PCI parity/error scannining is skipped for that device. The attribute
    is:

            broken_parity_status

    as is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directorys for
    PCI devices.

I don't think this check was actually implemented.  I have a misbehaved card
that reports a parity error every 1000 ms:

Nov 25 07:28:43 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0
Nov 25 07:28:44 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0
Nov 25 07:28:45 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0

Setting that card's broken_parity_status bit did not mask the error:

echo "1" > /sys/bus/pci/devices/0000:05:01.0/broken_parity_status

I looked through the EDAC code and did not readily see any reference to
broken_parity_status at all (which makes sense based on the behavior I am
seeing).  I applied the following patch as a proof-of-concept and now EDAC's
PCI parity error reporting behaves as documented:

bryan

Good regression find, bryan. It used to work. sigh.
I added more logic to your patch, for more coverage of the error.

Doug T

Signed-off-by: Bryan Boatright <b1@omega71.com>
Signed-off-by: Doug Thompson <dougthompson@xmisson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Dave Jiang
4f4aeeabc0 drivers-edac: add marvell mv64x60 driver
Marvell mv64x60 SoC support for EDAC.  Used on PPC and MIPS platforms.
Development and testing done on PPC Motorola prpmc2800 ATCA board.

[akpm@linux-foundation.org: make mv64x60_ctl_name static]
Signed-off-by: Dave Jiang <djiang@mvista.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Dave Jiang
a9a753d532 drivers-edac: add freescale mpc85xx driver
EDAC chip driver support for Freescale MPC85xx platforms. PPC based.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by:	Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Jason Uhlenkott
4d2b165eca drivers-edac: i3000 replace macros with functions
Replace function-like macros with functions.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Jason Uhlenkott
ce783d70b9 drivers-edac: i3000 code tidying
Style cleanup, mostly just 80-column fixes.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Benjamin Herrenschmidt
48764e4143 drivers-edac: add Cell MC driver
Adds driver for the Cell memory controller when used without a Hypervisor such
as on the IBM Cell blades.  There might still be some improvements to do to
this such as finding if it's possible to properly obtain more details about
the address of the error but it's good enough already to report CE counts
which is our main priority at the moment.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Benjamin Herrenschmidt
1d5f726cbf drivers-edac: add Cell XDR memory types
Add the definitions for the Rambus XDR memory type used by the Cell processor.
It's a pre-requisite for the followup Cell EDAC patch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Anton Blanchard
c2ae24cfd1 drivers-edac: use round_jiffies_relative
When rounding a relative timeout we need to use round_jiffies_relative().

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Doug Thompson
56e61a9c5f drivers-edac: turn on edac device error logging
ENABLE the 'logging' of CE and UE events for the EDAC_DEVICE class of error
harvester in EDAC

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:23 -08:00
Joe Perches
6f042b50e0 drivers/edac/: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:12:34 +02:00
Paul Mackerras
bd45ac0c5d Merge branch 'linux-2.6' 2008-01-31 11:25:51 +11:00
Kay Sievers
af5ca3f4ec Driver core: change sysdev classes to use dynamic kobject names
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman
c10997f657 Kobject: convert drivers/* from kobject_unregister() to kobject_put()
There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman
b2ed215a33 Kobject: change drivers/edac to use kobject_init_and_add
Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.

Acked-by: Doug Thompson <dougthompson@xmission.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:28 -08:00
Greg Kroah-Hartman
3514faca19 kobject: remove struct kobj_type from struct kset
We don't need a "default" ktype for a kset.  We should set this
explicitly every time for each kset.  This change is needed so that we
can make ksets dynamic, and cleans up one of the odd, undocumented
assumption that the kset/kobject/ktype model has.

This patch is based on a lot of help from Kay Sievers.

Nasty bug in the block code was found by Dave Young
<hidave.darkstar@gmail.com>

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:10 -08:00
Olof Johansson
0d08a84770 [POWERPC] pasemi: Broaden specific references to 1682M
There will be more product numbers in the future than just PA6T-1682M,
but they will share much of the features. Remove some of the explicit
references and compatibility checks with 1682M, and replace most of them
with the more generic term "PWRficient".

Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Acked-by: Doug Thompson <dougthompson@xmission.com>
2007-11-29 22:30:47 -06:00
Darrick J. Wong
57510c2f93 i5000_edac: no need to __stringify() KBUILD_BASENAME
The i5000_edac driver's PCI registration structure has the name
""i5000_edac"" (with extra set of double-quotes) which is probably not
intentional.  Get rid of __stringify.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:41 -08:00
Stephen Rothwell
1b3e4c706c NULL terminate the pci_device_ids in pasemi_edac
Fixes:
drivers/edac/pasemi_edac: struct pci_device_id is 32 bytes.  The last of 1 is:
0x00 0x00 0x19 0x59 0x00 0x00 0xa0 0x0a 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
FATAL: drivers/edac/pasemi_edac: struct pci_device_id is not terminated with a NULL entry!

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:56 -07:00
Jiri Slaby
93043ece03 define global BIT macro
define global BIT macro

move all local BIT defines to the new globally define macro.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:42 -07:00
Greg Kroah-Hartman
34980ca8fa Drivers: clean up direct setting of the name of a kset
A kset should not have its name set directly, so dynamically set the
name at runtime.

This is needed to remove the static array in the kobject structure which
will be changed in a future patch.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 14:51:02 -07:00
Aristeu Rozanski
f9b5a5d193 drivers/edac: fix e752x correct return code
This patch changes the error code when dev0:fun1 was hidden by BIOS to one
more appropriate.

Signed-off-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Mark Gross <mark.gross@intel.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-11 17:21:19 -07:00
Doug Thompson
3c8bb2cfa2 drivers/edac: fix printk level down to debug from emerg
When EDAC is configured for EDAC DEBUGGING, the debug printk output level
was set TOO high (EMERG). This patch brings it down to a DEBUG level

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-11 17:21:19 -07:00
Doug Thompson
ddcc3050bd drivers/edac: fix pasemi kconfig depends
Fixed 'depends on PPC_PASEMI' in EDAC Kconfig.  Module PASEMI depends ONLY on
the PASEMI on PPC.

Was previously enabled for ALL PPC

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Egor N. Martovetsky <egor@pasemi.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:35:18 -07:00
Doug Thompson
d4c1465b7d drivers/edac: fix edac_pci sysfs
This patch fixes sysfs exit code for the EDAC PCI device in a similiar manner
and the previous fixes for EDAC_MC and EDAC_DEVICE.

It removes the old (and incorrect) completion model and uses reference counts
on per instance kobjects and on the edac core module.

This pattern was applied to the edac_mc and edac_device code, but the EDAC PCI
code was missed.  In addition, this fixes a system hang after a low level
driver was unloaded.  (A cleanup function was called twice, which really
screwed things up)

Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by:  Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:35:18 -07:00
Doug Thompson
bce19683c1 drivers/edac: fix reset edac_mc pollmsec
This fixes a deadlock that could occur on a 'setup' and 'teardown' sequence of
the workq for a edac_mc control structure instance.  A similiar fix was
previously implemented for the edac_device code.

In addition, the edac_mc device code there was missing code to allow the workq
period valu to be altered via sysfs control.

This patch adds that fix on the code, and allows for the changing of the
period value as well.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:35:18 -07:00
Andrew Morton
4c6a1c130e edac is bust on mips
drivers/edac/edac_stub.c:15:22: asm/edac.h: No such file or directory

was it even supposed to work?

Cc: Douglas Thompson <dougthompson@xmission.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:35:17 -07:00
Al Viro
0bd8496b59 drivers/ misc __iomem annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:11:57 -07:00
Doug Thompson
b2a4ac0c28 drivers/edac: fix edac_device sysfs corner case bug
Some simple fixes to properly reference counter values from the block
attribute level of edac_device objects.  Properly sequencing the array pointer
was added, resulting in correct identification of block level attributes from
their base class functions.

Added more verbose debug statement for event tracking.

Also during some corner testing, found a bug in the store/show sequence
of operations for the block attribute/controls management.

An old intermediate structure for 'blocks' was still in the processing
pipeline.  This patch removes that old structure and correctly utilizes the
new struct edac_dev_sysfs_block_attribute for passing control from the sysfs
to the low level store/show function of the edac driver.

Now the proper kobj pointer to passed downward to the store/show
functions.

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Ranganathan Desikan
420390f06a drivers/edac: new i82975x driver
New EDAC driver for the i82975x memory controller chipset Used on ASUS
motherboards

[akpm@linux-foundation.org: fix multiple coding-style bloopers]
Signed-off-by: <arvind@acarlab.com>
Signed-off-by: Ranganathan Desikan <rdesikan@jetzbroadband.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Doug Thompson
bf52fa4a26 drivers/edac: fix workq reset deadlock
Fix mutex locking deadlock on the device controller linked list.  Was calling
a lock then a function that could call the same lock.  Moved the cancel workq
function to outside the lock

Added some short circuit logic in the workq code

Added comments of description

Code tidying

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Doug Thompson
fb3fb20687 drivers/edac: code tidying on export-gpl
Change EXPORT_SYMBOLs to EXPORT_SYMBOLS_GPL
Tidy changes: blank lines, inline removal, add comment

Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Douglas Thompson
1c3631ff1f drivers/edac: fix edac_device sysfs completion code
With feedback, this patch corrects operation of the kobject release operation
on kobjects, attributes and controls for the edac_device.

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Doug Thompson
8096cfafbb drivers/edac: fix edac_mc sysfs completion code
This patch refactors the 'releasing' of kobjects for the edac_mc type of
device.  The correct pattern of kobject release is followed.

As internal kobjs are allocated they bump a ref count on the top level kobj.
It in turn has a module ref count on the edac_core module.  When internal
kobjects are released, they dec the ref count on the top level kobj.  When the
top level kobj reaches zero, it decrements the ref count on the edac_core
object, allow it to be unloaded, as all resources have all now been released.

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Doug Thompson
d45e7823ba drivers/edac: fix edac_device init apis
Refactoring of sysfs code necessitated the refactoring of the
edac_device_alloc() and edac_device_add_device() apis, of moving the index
value to the alloc() function.  This patch alters the in tree drivers to
utilize this new api signature.

Having the index value performed later created a chicken-and-the-egg issue.
Moving it to the alloc() function allows for creating the necessary sysfs
entries with the proper index number

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Doug Thompson
b8f6f97552 drivers/edac: fix edac_mc init apis
Refactoring of sysfs code necessitated the refactoring of the edac_mc_alloc()
and edac_mc_add_mc() apis, of moving the index value to the alloc() function.
This patch alters the in tree drivers to utilize this new api signature.

Having the index value performed later created a chicken-and-the-egg issue.
Moving it to the alloc() function allows for creating the necessary sysfs
entries with the proper index number

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Douglas Thompson
fd309a9d8e drivers/edac: fix leaf sysfs attribute
This patch fixes and enhances the driver level set of sysfs attributes that
can be added to the 'block' level of an edac_device type of driver.

There is a controller information structure, which contains one or more
instances of device.  Each instance will have one or more blocks of device
specific counters.  This patch fixes the ability to have more detailed
attributes/controls for each of the 'blocks', providing for the addition of
controls/attributes from the low level driver to user space via sysfs.

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Egor Martovetsky
7d8536fb48 drivers/edac: new pasemi driver
NEW EDAC driver for the memory controllers on PA Semi PA6T-1682M.

Changes since last submission:

* Rebased on top of 2.6.22-rc4-mm2 with the EDAC changes merged there.
* Minor checkpatch.pl cleanups
* Renamed ctl_name
* Added dev_name
* edac_mc.h -> edac_core.h

[akpm@linux-foundation.org: make printk more informative]
Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Egor Martovetsky <egor@pasemi.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Doug Thompson <dougthompson@xmission.com
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Mark Grondona
7297c2617f drivers/edac: fix e752x reversed csrows
Found a 'reversal' decoding bug in the driver.  This patch fixes that mapping
to correctly display the CSROW entries in their proper order.  Users will be
enable to correctly identifiy the failing DIMM with this fix.

[akpm@linux-foundation.org: unneeded (and undesirable) cast of void*]
Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Mark Grondona <mgrondona@llnl.gov>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Doug Thompson
0ca84761fa drivers/edac: fix edac_device semaphore to mutex
A previous patch changed the edac_mc src file from semaphore usage to mutex
This patch changes the edac_device src file as well, from semaphore use to
mutex operation.

Use a mutex primitive for mutex operations, as it does not require a
semaphore

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Douglas Thompson
7f065e723b drivers/edac: remove file edac_mc.h
Removed the no-longer-needed file edac_mc.h

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Douglas Thompson
494d0d55bc drivers/edac: mod edac_opt_state_to_string function
Refactored the function edac_op_state_toString() to be edac_op_state_to_string()
for consistent style, and its callers

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Douglas Thompson
7391c6dcab drivers/edac: mod edac_align_ptr function
Refactor the edac_align_ptr() function to reduce the noise of casting the
aligned pointer to the various types of data objects and modified its callers
to its new signature

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:56 -07:00
Douglas Thompson
52490c8d07 drivers/edac: edac_device code tidying
For the file edac_device.c perform some coding style enhancements
Add some function header comments
Made for better readability commands

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
b2ccaecad2 drivers/edac: i5000 code tidying
Various code style conformance patches on the i5000 driver

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
f044091ca4 drivers/edac: remove null from statics
Patches to conform to coding style, namely static don't need to be initialized
to NULL nor '0', as that is the default

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Marisuz Kozlowski
977c76bd68 drivers/edac: i5000 define typo
Found a typo in one of the #defines in the driver

MTR_DIM_RANKS --> MTR_DIMM_RANK

Signed-off-by: Marisuz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
1c52152b30 drivers/edac: fix ignored return i82875p
Compiling this module gave a warning that the return value of
'pci_bus_add_device()' was not checked.

This patch adds that check and an output message

Signed-off-by:	Douglas Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Jason Uhlenkott
654ede200f drivers/edac: mod race fix i82875p
If ERRSTS indicates that there's no error then we don't need to bother reading
the other registers.

In addition to making the common case faster, this actually fixes a small race
where we don't see an error but we clear the error bits anyway, potentially
wiping away info on an error that happened in the interim (or where a CE
arrives between the first and second read of ERRSTS, causing us to falsely
claim "UE overwrote CE").

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
b113a3f7e8 drivers/edac: add mips and ppc visibility
1) Remove an old CVS ID string

2) change EDAC from a tristate option to a simple bool option

3) In addition to the X86 arch, PPC and MIPS also have drivers in the
submission queue.  This patch turns on the EDAC flag for those archs.  Each
driver will have its respective 'depends on ARCH' set.

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
052dfb45cc drivers/edac: cleanup spaces-gotos after Lindent messup
This patch fixes some remnant spaces inserted by the use of Lindent.
Seems Lindent adds some spaces when it shoulded. These have been fixed.
In addition, goto targets have issues, these have been fixed
in this patch.

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
8cb2a39831 drivers/edac: add info kconfig
Kconfig - modified the help of EDAC

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
d391a7b814 drivers/edac: device output clenaup
The error handling output strings needed to be refactored for better
displaying of the error informaton.

Also needed to added offset_value for output as well

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
42a8e397a8 drivers/edac: add device sysfs attributes
Added new controls for the edac_device and edac_mc sysfs folder.
These can be initialized by the low level driver to provide misc
controls into the low level driver for its use

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Dave Jiang
456a2f9552 drivers/edac: drivers to use new PCI operation
Move x86 drivers to new pci controller setup

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
cddbfcacf0 drivers/edac: Lindent r82600
Run r82600_edac.c file through Lindent for cleanup

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
1111660109 drivers/edac: Lindent i82443bxgx
Run i82443bxgx.c file through Lindent for cleanup

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Dave Jiang
203333cbba drivers/edac: Lindent e752x
Run e752x_edac.c file through Lindent for cleanup

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
466b71d584 drivers/edac: Lindent i82875p
Lindent cleanup of i82875p_edac driver

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
b4e8b37201 drivers/edac: Lindent i82860
Lindent cleanup of i82860 edac driver

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
36b8289e24 drivers/edac: Lindent i3000
Lindent cleanup of i3000_edac driver

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
849a4c375a drivers/edac: Lindent e7xxx
Lindent cleanup of e7xxx_edac driver

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Douglas Thompson
f4aff42653 drivers/edac: Lindent i5000
Ran e752x_edac.c file through Lindent for cleanup

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Douglas Thompson
67cb2b6122 drivers/edac: Lindent amd76x
Ran this driver through Lindent for cleanup

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Douglas Thompson
86aa8cb7bc drivers/edac: cleanup workq ifdefs
The origin of this code comes from patches at sourceforge, that
allow EDAC to be updated to various kernels. With kernel version 2.6.20 a
new workq system was installed, thus the patches needed to be modified
based on the kernel version. For submitting to the latest kernel.org
those #ifdefs are removed

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Douglas Thompson
542b25881a drivers/edac: edac_device sysfs cleanup
Removal of some old dead and disabled code from the edac_device sysfs code

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Douglas Thompson
079708b917 drivers/edac: core Lindent cleanup
Run the EDAC CORE files through Lindent for cleanup

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
4de78c6877 drivers/edac: mod PCI poll names
Fixup poll values for MC and PCI.
Also make mc function names unique to mc.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmissin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
66ee2f940a drivers/edac: mod assert_error check
Change error check and clear variable from an atomic to an int

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
91b99041c1 drivers/edac: updated PCI monitoring
Moving PCI to a per-instance device model

This should include the correct sysfs setup as well. Please review.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
81d87cb13e drivers/edac: mod MC to use workq instead of kthread
Move the memory controller object to work queue based implementation from the
kernel thread based.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Jason Uhlenkott
535c6a5303 drivers/edac: new inte 30x0 MC driver
Here's a driver for the Intel 3000 and 3010 memory controllers,
relative to today's Sourceforge code drop.  This has only had light
testing (I've yet to actually see it handle a memory error) but it
detects my hardware correctly.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
c4192705fe drivers/edac: add dev_name getter function
Move dev_name() macro to a more generic interface since it's not possible
to determine whether a device is pci, platform, or of_device easily.

Now each low level driver sets the name into the control structure, and
the EDAC core references the control structure for the information.

Better abstraction.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
20bcb7a81d drivers/edac: mod use edac_core.h
In the refactoring of edac_mc.c into several subsystem files,
the header file edac_mc.h became meaningless. A new header file
edac_core.h was created. All the files that previously included
"edac_mc.h" are changed to include "edac_core.h".

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Dave Jiang
c0d1217202 drivers/edac: add new nmi rescan
Provides a way for NMI reported errors on x86 to notify the EDAC
subsystem pending ECC errors by writing to a software state variable.

Here's the reworked patch. I added an EDAC stub to the kernel so we can
have variables that are in the kernel even if EDAC is a module. I also
implemented the idea of using the chip driver to select error detection
mode via module parameter and eliminate the kernel compile option.
Please review/test. Thx!

Also, I only made changes to some of the chipset drivers since I am
unfamiliar with the other ones. We can add similar changes as we go.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Andrew Morton
28f96eeafc drivers/edac-new-i82443bxgz-mc-driver: mark as broken
It will claim the PCI devices from under intel_agp.ko's feet.  Greg is brewing
some fix for that.

Cc: Douglas Thompson <dougthompson@xmission.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Tim Small
5a2c675c89 drivers/edac: new i82443bxgz MC driver
This is a NEW EDAC Memory Controller driver for the 440BX chipset (I82443BXGX)
created and submitted by Timm Small

Signed-off-by: Tim Small <tim@buttersideup.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
522a94bd1e drivers/edac: core.h fix scrubdefs
Patch to fix some scrubbing #defines in the edac_core.h file

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Eric Wollesen
eb60705ac5 drivers/edac: new intel 5000 MC driver
Eric Wollesen ported the Bluesmoke Memory Controller driver (written by Doug
Thompson) for the Intel 5000X/V/P (Blackford/Greencreek) chipset to the in
kernel EDAC model.

This patch incorporates the module for the 5000X/V/P chipset family

[m.kozlowski@tuxland.pl: edac i5000 parenthesis balance fix]
Signed-off-by: Eric Wollesen <ericw@xmtp.net>
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Matthias Kaehlcke
63b7df9101 drivers/edac: change from semaphore to mutex operation
The EDAC core code uses a semaphore as mutex. use the mutex API
instead of the (binary) semaphore.

Matthaias wrote this, but since I had some patches ahead of it,
I need to modify it to follow my patches.

Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Dave Jiang
1a9b85e6b3 drivers/edac: mc sysfs add missing mem types
Adding missing mem types for use in the sysfs presentation file for
Memory Controller device objects.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
e27e3dac65 drivers/edac: add edac_device class
This patch adds the new 'class' of object to be managed, named: 'edac_device'.

As a peer of the 'edac_mc' class of object, it provides a non-memory centric
view of an ERROR DETECTING device in hardware. It provides a sysfs interface
and an abstraction for varioius EDAC type devices.

Multiple 'instances' within the class are possible, with each 'instance'
able to have multiple 'blocks', and each 'block' having 'attributes'.

At the 'block' level there are the 'ce_count' and 'ue_count' fields
which the device driver can update and/or call edac_device_handle_XX()
functions. At each higher level are additional 'total' count fields,
which are a summation of counts below that level.

This 'edac_device' has been used to capture and present ECC errors
which are found in a a L1 and L2 system on a per CORE/CPU basis.

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
7c9281d76c drivers/edac: split out functions to unique files
This is a large patch to refactor the original EDAC module in the kernel
and to break it up into better file granularity, such that each source
file contains a given subsystem of the EDAC CORE.

Originally, the EDAC 'core' was contained in one source file: edac_mc.c
with it corresponding edac_mc.h file.

Now, there are the following files:

edac_module.c	The main module init/exit function and other overhead
edac_mc.c	Code handling the edac_mc class of object
edac_mc_sysfs.c	Code handling for sysfs presentation
edac_pci_sysfs.c  Code handling for PCI sysfs presentation
edac_core.h	CORE .h include file for 'edac_mc' and 'edac_device' drivers
edac_module.h	Internal CORE .h include file

This forms a foundation upon which a later patch can create the 'edac_device'
class of object code in a new file 'edac_device.c'.

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
d56933e018 drivers/edac: add RDDR2 memory types
Add Registered RDDR2 memory types for displaying DDR2 memories

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Adrian Bunk
2da1c119fd drivers/edac: core: make functions static
This patch makes needlessly global code static, in the edac core

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
5da0831c59 drivers/edac: add edac_mc_find API
This simple patch adds an important CORE API for EDAC that EDAC drivers can
use to find their edac_mc control structure by passing a mem_ctl_info
'instance' value

Needed for subsequent patches

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Rafael J. Wysocki
8314418629 Freezer: make kernel threads nonfreezable by default
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves.  This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie.  to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear.  Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Jan Engelhardt
751cb5e564 Use menuconfig objects II - EDAC
Change Kconfig objects from "menu, config" into "menuconfig" so
that the user can disable the whole feature without having to
enter the menu first.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:40 -07:00
Martin Schwidefsky
e25df1205f [S390] Kconfig: menus with depends on HAS_IOMEM.
Add "depends on HAS_IOMEM" to a number of menus to make them
disappear for s390 which does not have I/O memory.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-10 15:46:07 +02:00
John Feeney
62456726d7 Fix 82875 PCI setup
The 82875 EDAC driver enables an otherwise-hidden PCI device, but doesn't
register it as a PCI device properly.  Therefore, the device list in
/proc/bus/pci/devices is different than the tree in /sys/bus/pci.  This
usually manifests as the X server failing to start, since it expects the
two lists to be consistent.

Signed-off-by: Adam Jackson <ajackson@redhat.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Doug Thompson <norsk5@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
eric wollesen
9794f33dde [PATCH] EDAC: Add Fully-Buffered DIMM APIs to core
Eric Wollesen ported the Bluesmoke Memory Controller driver for the Intel
5000X/V/P (Blackford/Greencreek) chipset to the in kernel EDAC model.

This patch incorporates those required changes to the edac_mc.c and edac_mc.h
core files by added new Fully Buffered DIMM interface to the EDAC Core module.

Signed-off-by: eric wollesen <ericw@xmtp.net>
Signed-off-by: doug thompson <norsk5@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:32 -08:00
Frithiof Jensen
4f423ddf56 [PATCH] EDAC: Add memory scrubbing controls API to core
This is an attempt of providing an interface for memory scrubbing control in
EDAC.

This patch modifies the EDAC Core to provide the Interface for memory
controller modules to implment.

The following things are still outstanding:

 - K8 is the first implemenation,

   The patch provide a method of configuring the K8 hardware memory scrubber
   via the 'mcX' sysfs directory.  There should be some fallback to a generic
   scrubber implemented in software if the hardware does not support
   scrubbing.

   Or .. the scrubbing sysfs entry should not be visible at all.

 - Only works with SDRAM, not cache,

   The K8 can scrub cache and l2cache also - but I think this is not so
   useful as the cache is busy all the time (one hopes).

   One would also expect that cache scrubbing requires hardware support.

 - Error Handling,

   I would like that errors are returned to the user in "terms of file
   system".

 - Presentation,

   I chose Bandwidth in Bytes/Second as a representation of the scrubbing
   rate for the following reasons:

   I like that the sysfs entries are sort-of textual, related to something
   that makes sense instead of magical values that must be looked up.

   "My People" wants "% main memory scrubbed per hour" others prefer "%
   memory bandwidth used" as representation, "bandwith used" makes it easy to
   calculate both versions in one-liner scripts.

   If one later wants to scrub cache, the scaling becomes wierd for K8
   changing from "blocks of 64 byte memory" to "blocks of 64 cache lines" to
   "blocks of 64 bit".  Using "bandwidth used" makes sense in all three cases,
   (I.M.O.  anyway ;-).

 - Discovery,

   There is no way to discover the possible settings and what they do
   without reading the code and the documentation.

   *I* do not know how to make that work in a practical way.

 - Bugs(??),

   other tools can set invalid values in the memory scrub control register,
   those will read back as '-1', requiring the user to reset the scrub rate.
   This is how *I* think it should be.

 - Afflicting other areas of code,

   I made changes to edac_mc.c and edac_mc.h which will show up globally -
   this is not nice, it would be better that the memory scrubbing fuctionality
   and interface could be entirely contained within the memory controller it
   applies to.

Frithiof Jensen

edac_mc.c and its .h file is a CORE helper module for EDAC
driver modules. This provides the abstraction for device specific
drivers. It is fine to modify this CORE to provide help for
new features of the the drivers

doug thompson

Signed-off-by: Frithiof Jensen <frithiof.jensen@ericson.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:32 -08:00