linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Tony Luck	c4fc1956fa	EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback Both of these drivers can return NOTIFY_BAD, but this terminates processing other callbacks that were registered later on the chain. Since the driver did nothing to log the error it seems wrong to prevent other interested parties from seeing it. E.g. neither of them had even bothered to check the type of the error to see if it was a memory error before the return NOTIFY_BAD. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-29 15:43:10 +02:00
Tony Luck	ea5dfb5fae	x86 EDAC, sb_edac.c: Take account of channel hashing when needed Haswell and Broadwell can be configured to hash the channel interleave function using bits [27:12] of the physical address. On those processor models we must check to see if hashing is enabled (bit21 of the HASWELL_HASYSDEFEATURE2 register) and act accordingly. Based on a patch by patrickg <patrickg@supermicro.com> Tested-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-22 10:10:01 +02:00
Tony Luck	ff15e95c82	x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address In commit: `eb1af3b71f` ("Fix computation of channel address") I switched the "sck_way" variable from holding the log2 value read from the h/w to instead be the actual number. Unfortunately it is needed in log2 form when used to shift the address. Tested-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `eb1af3b71f` ("Fix computation of channel address") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-22 10:10:01 +02:00
Linus Torvalds	047486d8e7	Merge tag 'edac_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: - Altera: L2 cache and On-Chip RAM support (Thor Thayer). - EDAC: Workqueue handling cleanups (Borislav Petkov). - Xgene: Register bus error handling (Loc Ho). - Misc small fixes. * tag 'edac_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: ARM: socfpga: Enable OCRAM ECC on startup ARM: socfpga: Enable L2 cache ECC on startup ARM: dts: Add Altera L2 Cache and OCRAM EDAC entries EDAC, altera: Add Altera L2 cache and OCRAM support EDAC: Use edac_debugfs_remove_recursive() in edac_debugfs_exit() EDAC, mpc85xx: Silence unused variable warning EDAC: Cleanup/sync workqueue functions EDAC: Kill workqueue setup/teardown functions EDAC: Balance workqueue setup and teardown arm64: Update the APM X-Gene EDAC node with the RB register resource EDAC, xgene: Add missing SoC register bus error handling Documentation, EDAC: Update xgene binding for missing register bus EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr()	2016-03-16 08:36:55 -07:00
Linus Torvalds	d88bfe1d68	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "Various RAS updates: - AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable MCA) error decoding support (Aravind Gopalakrishnan) - x86 memcpy_mcsafe() support, to enable smart(er) hardware error recovery in NVDIMM drivers, based on an extension of the x86 exception handling code. (Tony Luck)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: EDAC/sb_edac: Fix computation of channel address x86/mm, x86/mce: Add memcpy_mcsafe() x86/mce/AMD: Document some functionality x86/mce: Clarify comments regarding deferred error x86/mce/AMD: Fix logic to obtain block address x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors x86/mce: Move MCx_CONFIG MSR definitions x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries x86/mm: Expand the exception table logic to allow new handling options x86/mce/AMD: Set MCAX Enable bit x86/mce/AMD: Carve out threshold block preparation x86/mce/AMD: Fix LVT offset configuration for thresholding x86/mce/AMD: Reduce number of blocks scanned per bank x86/mce/AMD: Do not perform shared bank check for future processors x86/mce: Fix order of AMD MCE init function call	2016-03-14 18:43:51 -07:00
Luck, Tony	eb1af3b71f	EDAC/sb_edac: Fix computation of channel address Large memory Haswell-EX systems with multiple DIMMs per channel were sometimes reporting the wrong DIMM. Found three problems: 1) Debug printouts for socket and channel interleave were not interpreting the register fields correctly. The socket interleave field is a 2^X value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2, 2=3. 3=4). 2) Actual use of the socket interleave value didn't interpret as 2^X 3) Conversion of address to channel address was complicated, and wrong. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-10 18:31:55 +01:00
Aravind Gopalakrishnan	be0aec23bf	x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors For Scalable MCA enabled processors, errors are listed per IP block. And since it is not required for an IP to map to a particular bank, we need to use HWID and McaType values from the MCx_IPID register to figure out which IP a given bank represents. We also have a new bit (TCC) in the MCx_STATUS register to indicate Task context is corrupt. Add logic here to decode errors from all known IP blocks for Fam17h Model 00-0fh and to print TCC errors. [ Minor fixups. ] Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-08 11:48:14 +01:00
Hubert Chrzaniuk	83bdaad4d9	EDAC, sb_edac: Fix logic when computing DIMM sizes on Xeon Phi Correct a typo introduced by `d0cdf90031` ("EDAC, sb_edac: Add Knights Landing (Xeon Phi gen 2) support") As a result under some configurations DIMMs were not correctly recognized. Problem affects only Xeon Phi architecture. Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1457361045-26221-1-git-send-email-hubert.chrzaniuk@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-07 19:07:40 +01:00
Thor Thayer	c3eea1942a	EDAC, altera: Add Altera L2 cache and OCRAM support Add L2 Cache and On-Chip RAM EDAC support for the Altera SoCs. The SDRAM controller is using the Memory Controller model. Each type of ECC is individually configurable. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: galak@codeaurora.org Cc: grant.likely@linaro.org Cc: ijc+devicetree@hellion.org.uk Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-doc@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: pawel.moll@arm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/1455132384-17108-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-11 12:23:06 +01:00
Thor Thayer	9bf4f00567	EDAC: Use edac_debugfs_remove_recursive() in edac_debugfs_exit() debugfs_remove() is used to remove a file or a directory from the debugfs filesystem on an EDAC device exit. However edac_debugfs might not be empty. This is similar to `30f84a891b` ("EDAC: Use edac_debugfs_remove_recursive()") which changed the EDAC MCI code to use edac_debugfs_remove_recursive(). Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1455064165-3816-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-10 10:37:46 +01:00
Sudip Mukherjee	f2b59ac66f	EDAC, mpc85xx: Silence unused variable warning We were getting this build warning: drivers/edac/mpc85xx_edac.c:1247:6: warning: unused variable 'pvr' pvr is only used if CONFIG_FSL_SOC_BOOKE is defined. Declare it __maybe_unused. Suggested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1454427573-7994-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 18:53:15 +01:00
Borislav Petkov	06e912d4d4	EDAC: Cleanup/sync workqueue functions They're both running only when ->edac_check is initialized so remove that check from the workqueue function itself. Synchronize/generalize the ->op_state check between the two. Kill useless comments, while at it. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 11:38:50 +01:00
Borislav Petkov	626a7a4dba	EDAC: Kill workqueue setup/teardown functions We have the generic wrappers now, use those. edac_pci_workq_setup() had an unused argument anyway. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 11:38:43 +01:00
Borislav Petkov	0966760619	EDAC: Balance workqueue setup and teardown We use the ->edac_check function pointers to determine whether we need to setup a polling workqueue. However, the destroy path is not balanced and we might try to teardown an unitialized workqueue. Balance init and destroy paths by looking at ->edac_check in both cases. Set op_state to OP_OFFLINE before destroying anything. Reported-by: Zhiqiang Hou <Zhiqiang.Hou@freescale.com> Cc: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 11:04:29 +01:00
Loc Ho	4d67e3ce75	EDAC, xgene: Add missing SoC register bus error handling Add missing register bus error handling for APM X-Gene EDAC SoC and fix a checking condition for CE error promoted to UE. Signed-off-by: Loc Ho <lho@apm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: patches@apm.com Link: http://lkml.kernel.org/r/1453495625-28006-3-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-01-25 11:17:22 +01:00
Dan Carpenter	6f3508f61c	EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr() dct_sel_base_off is declared as a u64 but we're only using the lower 32 bits because of a shift wrapping bug. This can possibly truncate the upper 16 bits of DctSelBaseOffset[47:26], causing us to misdecode the CS row. Fixes: `c8e518d567` ('amd64_edac: Sanitize f10_get_base_addr_offset') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20160120095451.GB19898@mwanda Signed-off-by: Borislav Petkov <bp@suse.de>	2016-01-25 11:17:14 +01:00
Geliang Tang	1cac5503fb	EDAC, i5100: Use to_delayed_work() Use to_delayed_work() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/58c0e319c7263a10b692100c657c06c42814aecf.1451659910.git.geliangtang@163.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-01-01 18:31:34 +01:00
Hubert Chrzaniuk	45f4d3ab3e	EDAC, sb_edac: Set fixed DIMM width on Xeon Knights Landing Knights Landing does not come with register that could be used to fetch DIMM width. However the value is fixed for this architecture so it can be hardcoded. Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449840082-18673-1-git-send-email-hubert.chrzaniuk@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:58:32 +01:00
Borislav Petkov	c4cf3b454e	EDAC: Rework workqueue handling Hide the EDAC workqueue pointer in a separate compilation unit and add accessors for the workqueue manipulations needed. Remove edac_pci_reset_delay_period() which wasn't used by anything. It seems it got added without a user with `91b99041c1` ("drivers/edac: updated PCI monitoring") Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:43 +01:00
Borislav Petkov	e136fa016f	EDAC: Make edac_device workqueue setup/teardown functions static They're not used anywhere else. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:42 +01:00
Borislav Petkov	d4538000ca	EDAC: Remove edac_get_sysfs_subsys() error handling It cannot fail now. We either load EDAC core after having successfully initialized edac_subsys or we don't. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:41 +01:00
Borislav Petkov	a97d262701	EDAC: Unexport and make edac_subsys static ... and use the accessor instead. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:40 +01:00
Borislav Petkov	733476cf20	EDAC: Rip out the edac_subsys reference counting This was really dumb - reference counting for the main EDAC sysfs object. While we could've simply registered it as the first thing in the module init path and then hand it around to what needs it. Do that and rip out all the code around it, thus simplifying the whole handling significantly. Move the edac_subsys node back to edac_module.c. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:39 +01:00
Borislav Petkov	fcd5c4dd82	EDAC: Robustify workqueues destruction EDAC workqueue destruction is really fragile. We cancel delayed work but if it is still running and requeues itself, we still go ahead and destroy the workqueue and the queued work explodes when workqueue core attempts to run it. Make the destruction more robust by switching op_state to offline so that requeuing stops. Cancel any pending work synchronously too. EDAC i7core: Driver loaded. general protection fault: 0000 [#1] SMP CPU 12 Modules linked in: Supported: Yes Pid: 0, comm: kworker/0:1 Tainted: G IE 3.0.101-0-default #1 HP ProLiant DL380 G7 RIP: 0010:[<ffffffff8107dcd7>] [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0 < ... regs ...> Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600) Stack: ... Call Trace: call_timer_fn run_timer_softirq __do_softirq call_softirq do_softirq irq_exit smp_apic_timer_interrupt apic_timer_interrupt intel_idle cpuidle_idle_call cpu_idle Code: ... RIP __queue_work RSP <...> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org>	2015-12-11 16:56:39 +01:00
Borislav Petkov	12e26969b3	EDAC, mc_sysfs: Fix freeing bus' name I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ #48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: <stable@vger.kernel.org> # v3.6.. Fixes: `7a623c0390` ("edac: rewrite the sysfs code to use struct device")	2015-12-11 16:56:38 +01:00

1 2 3 4 5 ...

1173 Commits