Commit Graph

1141 Commits

Author SHA1 Message Date
Linus Torvalds 9cf5c095b6 Merge tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic cleanups from Arnd Bergmann:
 "The asm-generic changes for 4.4 are mostly a series from Christoph
  Hellwig to clean up various abuses of headers in there.  The patch to
  rename the io-64-nonatomic-*.h headers caused some conflicts with new
  users, so I added a workaround that we can remove in the next merge
  window.

  The only other patch is a warning fix from Marek Vasut"

* tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: temporarily add back asm-generic/io-64-nonatomic*.h
  asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
  gpio-mxc: stop including <asm-generic/bug>
  n_tracesink: stop including <asm-generic/bug>
  n_tracerouter: stop including <asm-generic/bug>
  mlx5: stop including <asm-generic/kmap_types.h>
  hifn_795x: stop including <asm-generic/kmap_types.h>
  drbd: stop including <asm-generic/kmap_types.h>
  move count_zeroes.h out of asm-generic
  move io-64-nonatomic*.h out of asm-generic
2015-11-06 14:22:15 -08:00
Linus Torvalds b831ef2cad Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS changes from Ingo Molnar:
 "The main system reliability related changes were from x86, but also
  some generic RAS changes:

   - AMD MCE error injection subsystem enhancements.  (Aravind
     Gopalakrishnan)

   - Fix MCE and CPU hotplug interaction bug.  (Ashok Raj)

   - kcrash bootup robustness fix.  (Baoquan He)

   - kcrash cleanups.  (Borislav Petkov)

   - x86 microcode driver rework: simplify it by unmodularizing it and
     other cleanups.  (Borislav Petkov)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init()
  x86/mce: Add a Scalable MCA vendor flags bit
  MAINTAINERS: Unify the microcode driver section
  x86/microcode/intel: Move #ifdef DEBUG inside the function
  x86/microcode/amd: Remove maintainers from comments
  x86/microcode: Remove modularization leftovers
  x86/microcode: Merge the early microcode loader
  x86/microcode: Unmodularize the microcode driver
  x86/mce: Fix thermal throttling reporting after kexec
  kexec/crash: Say which char is the unrecognized
  x86/setup/crash: Check memblock_reserve() retval
  x86/setup/crash: Cleanup some more
  x86/setup/crash: Remove alignment variable
  x86/setup: Cleanup crashkernel reservation functions
  x86/amd_nb, EDAC: Rename amd_get_node_id()
  x86/setup: Do not reserve crashkernel high memory if low reservation failed
  x86/microcode/amd: Do not overwrite final patch levels
  x86/microcode/amd: Extract current patch level read to a function
  x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC
  x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts
  ...
2015-11-03 17:51:33 -08:00
Tan Xiaojun 990995bad1 EDAC: Fix PAGES_TO_MiB macro misuse
The PAGES_TO_MiB macro is used for unit conversion but the
trace_mc_event() tracepoint expects a page address. Fix that.

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1445341538-24271-1-git-send-email-tanxiaojun@huawei.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-22 22:57:30 +02:00
Aravind Gopalakrishnan 1a6775c1a2 x86/amd_nb, EDAC: Rename amd_get_node_id()
This function doesn't give us the "Node ID" as the function name
suggests. Rather, it receives a PCI device as argument, checks
the available F3 PCI device IDs in the system and returns the
index of the matching Bus/Device IDs.

Rename it to amd_pci_dev_to_node_id().

No functional change is introduced.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1445246268-26285-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-21 11:10:55 +02:00
Dinh Nguyen 941fd2e709 EDAC, altera: SoCFPGA EDAC should not look for ECC_CORR_EN
The bootloader may or may not enable the ECC_CORR_EN bit. By
not enabling ECC_CORR_EN, when error happens, it is the user's
responsibility to perform a full SDRAM scrub.

Remove the check for ECC_CORR_EN.

Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Thor Thayer <tthayer@opensource.altera.com>
Link: http://lkml.kernel.org/r/1444864456-21778-1-git-send-email-dinguyen@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-15 11:57:23 +02:00
Christoph Hellwig 2f8e2c8777 move io-64-nonatomic*.h out of asm-generic
These are not implementations of default architecture code but helpers
for drivers. Move them to the place they belong to.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-10-15 00:21:07 +02:00
Tan Xiaojun 30f84a891b EDAC: Use edac_debugfs_remove_recursive()
debugfs_remove() is used to remove a file or a directory from the
debugfs filesystem, but mci->debugfs might not empty.

This can be triggered by the following sequence:

1) Enable CONFIG_EDAC_DEBUG
2) insmod an EDAC module (like i3000_edac or similar)
3) rmmod this module
4) we can see files remaining under <debugfs_mountpoint>/edac/ like
   "fake_inject", for example.

Removing edac_core then, causes a NULL pointer dereference.

Reported-by: Yun Wu (Abel) <wuyun.wu@huawei.com>
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1444787364-104353-1-git-send-email-tanxiaojun@huawei.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-14 18:50:32 +02:00
Luis de Bethencourt 90b3b37383 EDAC, ppc4xx_edac: Fix module autoload for OF platform driver
This platform driver has an OF device ID table but the OF module alias
information is not created so module autoloading won't work.

Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20150917114619.GA13145@goodgumbo.baconseed.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-03 12:19:42 +02:00
Aravind Gopalakrishnan 1a8bc7707e EDAC, amd64_edac: Update copyright and remove changelog
Git provides us all the changelogs anyway. So trim the comments section
here. Update the copyrights info while at it.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443440593-2316-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-29 13:41:04 +02:00
Aravind Gopalakrishnan da92110dfd EDAC, amd64_edac: Extend scrub rate support to F15hM60h
The scrub rate control register has moved to function 2 in PCI config
space and is at a different offset on family 0x15, models 0x60 and
later. The minimum recommended scrub rate has also changed. (Refer to
D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG).

Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate
this.

Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Cleanup conditionals. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-29 13:25:33 +02:00
Toshi Kani d0c9c93019 EDAC: Don't allow empty DIMM labels
Updating dimm_label to an empty string does not make much sense. Change
the sysfs dimm_label store operation to fail a request when an input
string is empty.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: elliott@hpe.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443124767.25474.172.camel@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-28 16:39:05 +02:00
Toshi Kani 438470b84c EDAC: Fix sysfs dimm_label store operation
Sysfs "dimm_label" and "chX_dimm_label" nodes have the following issues
in their store operation:

 1) A newline-terminated input string causes redundant newlines:

  # echo "test" > /sys/bus/mc0/devices/dimm0/dimm_label
  # cat  /sys/bus/mc0/devices/dimm0/dimm_label
  test

  #  od -bc /sys/bus/mc0/devices/dimm0/dimm_label
  0000000 164 145 163 164 012 012
            t   e   s   t  \n  \n
  0000006

 2) The original label string (31 characters) cannot be stored due to
    an improper size check:

  # echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0" > /sys/bus/mc0/devices/dimm0/dimm_label
  # cat /sys/bus/mc0/devices/dimm0/dimm_label

  # od -bc /sys/bus/mc0/devices/dimm0/dimm_label
   0000000 012 012
            \n  \n
   0000002

 3) An input string longer than the buffer size results a wrong label
    info as it allows a retry with the remaining string:

  # echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0_TEST" > /sys/bus/mc0/devices/dimm0/dimm_label
  # cat  /sys/bus/mc0/devices/dimm0/dimm_label
  _TEST

Fix these issues by making the following changes:
 1) Replace a newline character at the end by setting a null. It also
    assures that the string is null-terminated in the label buffer.
 2) Check the label buffer size with 'sizeof(dimm->label)'.
 3) Fail a request if its string exceeds the label buffer size.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Robert Elliott <elliott@hpe.com>
Link: http://lkml.kernel.org/r/1443121564.25474.160.camel@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 19:45:59 +02:00
Toshi Kani 1ea62c59c8 EDAC: Fix sysfs dimm_label show operation
After

  7d375bffa5 ("sb_edac: Fix support for systems with two home agents per socket")

sysfs "dimm_label" and "chX_dimm_label" show their label string without a
newline "\n" at the end.

  [root@orange ~]# cat /sys/bus/mc0/devices/dimm0/dimm_label
  CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]#

  [root@orange ~]# cat /sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label
  CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]#

The label strings now have 31 characters, which are the same as
EDAC_MC_LABEL_LEN. Since the snprintf()s in channel_dimm_label_show()
and dimmdev_label_show() limit the whole length by EDAC_MC_LABEL_LEN,
the newline in the format "%s\n" is ignored.

  [root@orange ~]# od -bc /sys/bus/mc0/devices/dimm0/dimm_label
  0000000 103 120 125 137 123 162 143 111 104 043 060 137 110 141 043 060
            C   P   U   _   S   r   c   I   D   #   0   _   H   a   #   0
  0000020 137 103 150 141 156 043 060 137 104 111 115 115 043 060 000
            _   C   h   a   n   #   0   _   D   I   M   M   #   0  \0
  0000037

Fix it by using 'sizeof(dimm->label) + 1' as the whole length in the
snprintf()s in channel_dimm_label_show() and dimmdev_label_show().

Reported-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1442933883-21587-2-git-send-email-toshi.kani@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 19:19:21 +02:00
Loc Ho f864b79ba2 EDAC, xgene: Add SoC support
Add support for the SoC component.

Signed-off-by: Loc Ho <lho@apm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Cc: ijc+devicetree@hellion.org.uk
Cc: jcm@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: patches@apm.com
Cc: robh+dt@kernel.org
Link: http://lkml.kernel.org/r/1443055261-8613-4-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:41:46 +02:00
Loc Ho 9bc1c0c0ec EDAC, xgene: Fix possible sprintf() overflow issue
Replace sprintf() with snprintf() to avoid possible string array
overflow.

Signed-off-by: Loc Ho <lho@apm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Cc: ijc+devicetree@hellion.org.uk
Cc: jcm@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: patches@apm.com
Cc: robh+dt@kernel.org
Link: http://lkml.kernel.org/r/1443116287-11752-1-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:39:52 +02:00
Loc Ho 9347473c7d EDAC, xgene: Add L3 support
Add EDAC support for the L3 component.

Signed-off-by: Loc Ho <lho@apm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Cc: ijc+devicetree@hellion.org.uk
Cc: jcm@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: patches@apm.com
Cc: robh+dt@kernel.org
Link: http://lkml.kernel.org/r/1443055261-8613-3-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:36:31 +02:00
Seth Jennings 2900ea6096 EDAC, sb_edac: Fix TAD presence check for sbridge_mci_bind_devs()
In commit

  7d375bffa5 ("sb_edac: Fix support for systems with two home agents per socket")

NUM_CHANNELS was changed to 8 and the channel space was renumerated to
handle EN, EP, and EX configurations.

The *_mci_bind_devs() functions - except for sbridge_mci_bind_devs() -
got a new device presence check in the form of saw_chan_mask. However,
sbridge_mci_bind_devs() still uses the NUM_CHANNELS for loop.

With the increase in NUM_CHANNELS, this loop fails at index 4 since
SB only has 4 TADs.  This results in the following error on SB machines:

  EDAC sbridge: Some needed devices are missing
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Couldn't find mci handle

This patch adapts the saw_chan_mask logic for sbridge_mci_bind_devs() as
well.

After this patch:

  EDAC MC0: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#0: DEV 0000:3f:0e.0 (POLLED)
  EDAC MC1: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#1: DEV 0000:7f:0e.0 (POLLED)

Signed-off-by: Seth Jennings <sjenning@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.2
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1438798561-10180-1-git-send-email-sjenning@redhat.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-24 20:40:50 +02:00
Aravind Gopalakrishnan 58a9c251c9 EDAC, ghes_edac: Remove redundant memory_type array
We already have edac_mem_types[] that enumerates the different kinds of
memory. So, use that and remove the redundant memory_type[] array here.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1442436811-23382-2-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-23 16:59:25 +02:00
Borislav Petkov 09bd1b4f81 EDAC, xgene: Convert to debugfs wrappers
Drop CONFIG_EDAC_DEBUG ifdeffery too, while at it.

Tested-by: Loc Ho <lho@apm.com>
Cc: linux-edac@vger.kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-23 11:37:45 +02:00
Borislav Petkov 52019e406c EDAC, i5100: Convert to debugfs wrappers
This driver creates its debugfs hierarchy under the toplevel debugfs dir
- see i5100_init() - so make it use edac_debugfs_create_dir_at( , NULL)
because we're not breaking userspace. Oh well.

Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 18:21:12 +02:00
Borislav Petkov bba3b31e44 EDAC, altera: Convert to debugfs wrappers
Use the EDAC-specific wrappers. Drop CONFIG_EDAC_DEBUG ifdeffery.

Cc: Thor Thayer <tthayer@opensource.altera.com>
Cc: <linux-edac@vger.kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 18:20:56 +02:00
Borislav Petkov 4397bcb4fa EDAC: Add debugfs wrappers
Later patches will convert EDAC users to those.

Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 18:10:22 +02:00
Borislav Petkov 7ac8bf9bc9 EDAC: Carve out debugfs functionality
... into a separate compilation unit and drop a couple of
CONFIG_EDAC_DEBUG ifdefferies. Rename edac_create_debug_nodes() to
edac_create_debugfs_nodes(), while at it.

No functionality change.

Cc: <linux-edac@vger.kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 12:29:46 +02:00
Linus Torvalds d9b44fe30f Merge tag 'edac/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull edac updates from Mauro Carvalho Chehab:
 "Two EDAC fixes for Intel systems (Haswell and Ivy Bridge)"

* tag 'edac/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  sb_edac: correctly fetch DIMM width on Ivy Bridge and Haswell
  sb_edac: look harder for DDRIO on Haswell systems
2015-09-11 16:21:12 -07:00
Aristeu Rozanski 12f0721c5a sb_edac: correctly fetch DIMM width on Ivy Bridge and Haswell
dimm_dev_type has been incorrectly determined in sb_edac. This patch fixes it
for Ivy Bridge and Haswell only since nothing like exists for Sandy Bridge.
We tested this patch in multiple systems matching the results with the
installed memory modules.

Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-09-08 20:33:48 -03:00