struct mce.cpuid contains CPUID(1).EAX which contains family, model and
stepping and thus has enough information for our purposes. Thus get rid
of some external dependencies which are not really needed.
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
On Deverton server, the P2SB PCI device (DEV:1F, FUN:1) is used by multiple
device drivers.
If it's hidden by some device driver (e.g. with the i801 I2C driver,
the commit
9424693035 ("i2c: i801: Create iTCO device on newer Intel PCHs")
unconditionally hid the P2SB PCI device wrongly) it will make the
pnd2_edac driver read out an invalid BAR value of 0xffffffff and then
fail on ioremap().
Therefore, store the presence state of P2SB PCI device before unhiding
it for reading BAR and restore the presence state after reading BAR.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-i2c@vger.kernel.org
Link: http://lkml.kernel.org/r/20170814154845.21663-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Basically, there are full memory mirroring and address range partial
memory mirroring (supported by Haswell EX and Broadwell EX) modes.
a) In full memory mirroring, the memory behind each memory controller
is mirrored, i.e. the memory is split into two identical mirrors
(primary and secondary), half of the memory is reserved for redundancy.
b) In address range partial memory mirroring, the memory size (range)
of primary and secondary behind each memory controller can be user
defined by the TAD0 register. The rest of memory ranges defined by
TAD1/TAD2/... in that memory controller are non-mirrored.
For more detail on memory mirroring, see the following link written by Tony Luck:
https://01.org/lkp/blogs/tonyluck/2016/address-range-partial-memory-mirroring-linux
Currently the sb_edac driver only supports address decoding in full
memory mirroring and non-mirroring modes. In address range partial
memory mirroring mode, it may fail to decode an address that falls in a
non-mirroring area (the following was one of this kind of failed logs).
mce: Uncorrected hardware memory error in user-access at 566d53a400
Memory failure: 0x566d53a: Killing einj_mem_uc:4647 due to hardware memory corruption
Memory failure: 0x566d53a: recovery action for dirty LRU page: Recovered
mce: [Hardware Error]: Machine check events logged
EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
EDAC sbridge MC1: CPU 48: Machine Check Event: 0 Bank 7: ec00000000010090
EDAC sbridge MC1: TSC 4b914aa5a99dab
EDAC sbridge MC1: ADDR 566d53a400
EDAC sbridge MC1: MISC 1443a0c86
EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1499712764 SOCKET 2 APIC 80
EDAC MC1: 0 UE Can't discover the memory rank for ch addr 0x7fb54e900 on any memory ( page:0x0 offset:0x0 grain:32)
mce: [Hardware Error]: Machine check events logged
Therefore, classify memory mirroring modes and make the address decoding
in address range partial memory mode correct.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170730180651.30060-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Xiaolong Ye reported the following failure on Broadwell D server:
EDAC sbridge: Some needed devices are missing
EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0
EDAC sbridge: Couldn't find mci handler
EDAC sbridge: Failed to register device with error -19.
Broadwell D (only IMC0 per socket) and Broadwell X (IMC0 and IMC1 per
socket) use the same PCI device IDs for IMC0 per socket, then they
share pci_dev_descr_broadwell_table (n_imcs_per_sock=2). In this case,
Broadwell D wrongly creates the nonexistent SOCK EDAC memory controller
and reports above error messages, since it has no IMC1 per socket.
Avoid creating the nonexistent SOCK memory controller.
Reported-and-tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170608113351.25323-1-qiuxu.zhuo@intel.com
[ Massage. ]
Signed-off-by: Borislav Petkov <bp@suse.de>