Commit Graph

38 Commits

Author SHA1 Message Date
Mauro Carvalho Chehab
77c5f5d2f2 ghes_edac: Register at EDAC core the BIOS report
Register GHES at EDAC MC core, in order to avoid other
drivers to also handle errors and mangle with error data.

The edac core will warrant that just one driver will be used,
so the first one to register (BIOS first) will be the one that
will be reporting the hardware errors.

For now, the EDAC driver does nothing but to register at the
EDAC core, preventing the hardware-driven mechanism to
interfere with GHES.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2013-02-25 19:42:12 -03:00
Ralf Baechle
f65aad4177 MIPS: Cavium: Add EDAC support.
Drivers for EDAC on Cavium.  Supported subsystems are:

 o CPU primary caches.  These are parity protected only, so only error
   reporting.
 o Second level cache - ECC protected, provides SECDED.
 o Memory: ECC / SECDEC if used with suitable DRAM modules.  The driver will
   will only initialize if ECC is enabled on a system so is safe to run on
   non-ECC memory.
 o PCI: Parity error reporting

Since it is very hard to test this sort of code the implementation is very
conservative and uses polling where possible for now.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
2012-12-12 16:48:49 +01:00
Rob Herring
69154d0698 edac: add support for Calxeda highbank L2 cache ecc
Add support for L2 ECC on Calxeda highbank platform.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-06-27 09:01:29 -03:00
Rob Herring
a1b01edb27 edac: add support for Calxeda highbank memory controller
Add support for memory controller on Calxeda Highbank platforms. Highbank
platforms support a single 4GB mini-DIMM with 1-bit correction and 2-bit
detection.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-06-27 09:00:57 -03:00
Mauro Carvalho Chehab
3d78c9af78 edac: sb_edac: Add it to the building system
Some changes on it were required due to changeset cd90cc84c6bf0, that
changed the glue with the MCE logic.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-11-01 10:01:54 -02:00
Borislav Petkov
4140c54266 i7core_edac: Drop the edac_mce facility
Remove edac_mce pieces and use the normal MCE decoder notifier chain by
retaining the same functionality with considerably less code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-11-01 10:01:24 -02:00
Chris Metcalf
5c77075548 drivers/edac: provide support for tile architecture
Add tile support for the EDAC driver, which provides unified system
error (memory, PCI, etc.) reporting. For now, the TILEPro port
reports memory correctable error (CE) only.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-03-10 13:30:14 -05:00
Tracey Dent
f570e1dd84 EDAC: Remove deprecated kbuild goal definitions
Change EDAC's Makefile to use <modules>-y instead of
<modules>-objs because -objs is deprecated and not mentioned in
Documentation/kbuild/makefiles.txt.

 [bp: Fixup commit message]
 [bp: Fixup indentation]

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-11-22 15:35:31 +01:00
Linus Torvalds
8de547e182 Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac
* 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac: (25 commits)
  i7300_edac: Properly initialize per-csrow memory size
  V4L/DVB: i7300_edac: better initialize page counts
  MAINTAINERS: Add maintainer for i7300-edac driver
  i7300-edac: CodingStyle cleanup
  i7300_edac: Improve comments
  i7300_edac: Cleanup: reorganize the file contents
  i7300_edac: Properly detect channel on CE errors
  i7300_edac: enrich FBD error info for corrected errors
  i7300_edac: enrich FBD error info for fatal errors
  i7300_edac: pre-allocate a buffer used to prepare err messages
  i7300_edac: Fix MTR x4/x8 detection logic
  i7300_edac: Make the debug messages coherent with the others
  i7300_edac: Cleanup: remove get_error_info logic
  i7300_edac: Add a code to cleanup error registers
  i7300_edac: Add support for reporting FBD errors
  i7300_edac: Properly detect the type of error correction
  i7300_edac: Detect if the device is on single mode
  i7300_edac: Adds detection for enhanced scrub mode on x8
  i7300_edac: Clear the error bit after reading
  i7300_edac: Add error detection code for global errors
  ...
2010-10-24 13:06:57 -07:00
Borislav Petkov
47ca08a40b EDAC, MCE: Rename files
Drop "edac_" string from the filenames since they're prefixed with edac/
in their pathname anyway.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:48:00 +02:00
Borislav Petkov
9cdeb404a1 EDAC, MCE: Rework MCE injection
Add sysfs injection facilities for testing of the MCE decoding code.
Remove large parts of amd64_edac_dbg.c, as a result, which did only
NB MCE injection anyway and the new injection code supports that
functionality already.

Add an injection module so that MCE decoding code in production kernels
like those in RHEL and SLES can be tested.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21 14:47:59 +02:00
Mauro Carvalho Chehab
fcaf780b2a i7300_edac: start a driver for i7300 chipset (Clarksboro)
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30 14:56:42 -03:00
Mauro Carvalho Chehab
696e409dbd edac_mce: Add an interface driver to report mce errors via edac
edac_mce module is an interface module that gets mcelog data and
forwards to any registered edac module that expects to receive data via
mce.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:49 -03:00
Mauro Carvalho Chehab
a0c36a1f0f i7core_edac: Add an EDAC memory controller driver for Nehalem chipsets
This driver is meant to support i7 core/i7core extreme desktop
processors and Xeon 35xx/55xx series with integrated memory controller.
It is likely that it can be expanded in the future to work with other
processor series based at the same Memory Controller design.

For now, it has just a few MCH status reads.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-10 11:44:45 -03:00
Borislav Petkov
0d18b2e34b x86: EDAC: carve out AMD MCE decoding logic
This converts the MCE decoding logic into a standalone config
option which can be built-in or a module, the first one being the
default for MCEs happening early on in the boot process.

This, beyond being separated in a cleaner way, also saves RAM by
making the decoding logic modular.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091002133148.GD28682@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02 15:42:19 +02:00
Ingo Molnar
f436f8bb73 x86: EDAC: MCE: Fix MCE decoding callback logic
Make decoding of MCEs happen only on AMD hardware by registering a
non-default callback only on CPU families which support it.

While looking at the interaction of decode_mce() with the other MCE
code i also noticed a few other things and made the following
cleanups/fixes:

 - Fixed the mce_decode() weak alias - a weak alias is really not
   good here, it should be a proper callback. A weak alias will be
   overriden if a piece of code is built into the kernel - not
   good, obviously.

 - The patch initializes the callback on AMD family 10h and 11h.

 - Added the more correct fallback printk of:

	No support for human readable MCE decoding on this CPU type.
	Transcribe the message and run it through 'mcelog --ascii' to decode.

   On CPUs that dont have a decoder.

 - Made the surrounding code more readable.

Note that the callback allows us to have a default fallback -
without having to check the CPU versions during the printout
itself. When an EDAC module registers itself, it can install the
decode-print function.

(there's no unregister needed as this is core code.)

version -v2 by Borislav Petkov:

 - add K8 to the set of supported CPUs

 - always build in edac_mce_amd since we use an early_initcall now

 - fix checkpatch warnings

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <20091001141432.GA11410@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02 15:42:18 +02:00
Jason Uhlenkott
dd8ef1db87 edac: i3200 memory controller driver
A driver for the Intel 3200 and 3210 memory controllers.  It has only had
light testing so far, and currently makes no attempt to decode error
addresses at anything finer than csrow granularity.

Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Borislav Petkov
b70ef01016 EDAC: move MCE error descriptions to EDAC core
This is in preparation of adding AMD-specific MCE decoding functionality
to the EDAC core. The error decoding macros originate from the AMD64
EDAC driver albeit in a simplified and cleaned up version here.

While at it, add macros to generate the error description strings and
use them in the error type decoders directly which removes a bunch of
code and makes the decoding functions much more readable. Also, fix
strings and shorten macro names.

Remove superfluous htlink_msgs.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-09-14 18:57:48 +02:00
Harry Ciao
2a9036afff edac: add CPC925 Memory Controller driver
Introduce IBM CPC925 EDAC driver, which makes use of ECC, CPU and
HyperTransport Link error detections and corrections on the IBM
CPC925 Bridge and Memory Controller.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:56 -07:00
Doug Thompson
7d6034d321 amd64_edac: add module registration routines
Also, link into Kbuild by adding Kconfig and Makefile entries.

Borislav:
- Kconfig/Makefile splitting
- use zero-sized arrays for the sysfs attrs if not enabled
- rename sysfs attrs to more conform values
- shorten CONFIG_ names
- make multiple structure members assignment vertically aligned
- fix/cleanup comments
- fix function return value patterns
- fix err labels
- fix a memleak bug caught by Ingo
- remove the NUMA dependency and use num_k8_northbrides for initializing
  a driver instance per NB.
- do not copy the pvt contents into the mci struct in
  amd64_init_2nd_stage() and save it in the mci->pvt_info void ptr
  instead.
- cleanup debug calls
- simplify amd64_setup_pci_device()

Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10 12:19:28 +02:00
Harry Ciao
715fe7af9f edac: AMD8111 & AMD8131 Kconfig fixup
The amd8111_edac.c driver will fail allmodconfig on architectures other
than PPC, introduce Kconfig dependency to avoid this, since both AMD8111
and AMD8131 chips are only adopted on Maple so far.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-29 08:40:03 -07:00
Grant Erickson
dba7a77c0e edac: new ppc4xx driver module
This adds support for an EDAC memory controller adaptation driver for the
"ibm,sdram-4xx-ddr2" ECC controller realized in the AMCC PowerPC 405EX[r].

At present, this driver has been developed and tested against the
controller realization in the AMCC PPC405EX[r] on the AMCC Kilauea and
Haleakala boards (256 MiB w/o ECC memory soldered onto the board) and a
proprietary board based on those designs (128 MiB ECC memory, also
soldered onto the board).

In the future, dynamic feature detection and handling needs to be added
for the other realizations of this controller found in the 440SP, 440SPe,
460EX, 460GT and 460SX.

Eventually, this driver will likely be evolved and adapted to the above
variant realizations of this controller as well as broken apart to handle
the other known ECC-capable controllers prevalent in other PPC4xx
processors:

  - IBM SDRAM (405GP, 405CR and 405EP) "ibm,sdram-4xx"
  - IBM DDR1 (440GP, 440GX, 440EP and 440GR) "ibm,sdram-4xx-ddr"
  - Denali DDR1/DDR2 (440EPX and 440GRX) "denali,sdram-4xx-ddr2"

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Grant Erickson <gerickson@nuovations.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:03 -07:00
Mauro Carvalho Chehab
920c8df6ac edac: driver for i5400 MCH (Seaburg)
EDAC driver for i5400 MCH (Seaburg)

This driver adds support for i5400 MCH chipset.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Ben Woodard <woodard@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:30 -08:00
Hitoshi Mitake
df8bc08c19 edac x38: new MC driver module
I wrote a new module for Intel X38 chipset.  This chipset is very similar
to Intel 3200 chipset, but there are some different points, so I copyed
i3200_edac.c and modified.

This is Intel's web page describing this chipset.
http://www.intel.com/Products/Desktop/Chipsets/X38/X38-overview.htm

I've tested this new module with broken memory, and it seems to be working
well.

Signed-off-by: Hitoshi Mitake <mitake@clustcom.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Arthur Jones
8f421c595a edac: i5100 new intel chipset driver
Preliminary support for the Intel 5100 MCH.  CE and UE errors are reported
along with the current DIMM label information and other memory parameters.

Reasons why this is preliminary:

1) This chip has 2 independent memory controllers which, for best
   perforance, use interleaved accesses to the DDR2 memory.  This
   architecture does not map very well to the current edac data structures
   which depend on symmetric channel access to the interleaved data.
   Without core changes, the best I could do for now is to map both memory
   controllers to different csrows (first all ranks of controller 0, then
   all ranks of controller 1).  Someone much more familiar with the edac
   core than I will probably need to come up with a more general data
   structure to handle the interleaving and de-interleaving of the two
   memory controllers.

2) I have not yet tackled the de-interleaving of the rank/controller
   address space into the physical address space of the CPU.  There is
   nothing fundamentally missing, it is just ending up to be a lot of
   code, and I'd rather keep it separate for now, esp since it doesn't
   work yet...

3) The code depends on a particular i5100 chip select to DIMM mainboard
   chip select mapping.  This mapping seems obvious to me in order to
   support dual and single ranked memory, but it is not unique and DIMM
   labels could be wrong on other mainboards.  There is no way to query
   this mapping that I know of.

4) The code requires that the i5100 is in 32GB mode.  Only 4 ranks per
   controller, 2 ranks per DIMM are supported.  I do not have hardware
   (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks
   per controller) mode.

5) The serial presence detect code should be broken out into a "real"
   i2c driver so that decode-dimms.pl can work.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00