At the time this was submitted by Leonardo, I confirmed -- or thought
I had confirmed -- with PowerVM partition firmware development that
the following RTAS functions:
- ibm,get-xive
- ibm,int-off
- ibm,int-on
- ibm,set-xive
were safe to call on multiple CPUs simultaneously, not only with
respect to themselves as indicated by PAPR, but with arbitrary other
RTAS calls:
https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/
Recent discussion with firmware development makes it clear that this
is not true, and that the code in commit b664db8e3f ("powerpc/rtas:
Implement reentrant rtas call") is unsafe, likely explaining several
strange bugs we've seen in internal testing involving DLPAR and
LPM. These scenarios use ibm,configure-connector, whose internal state
can be corrupted by the concurrent use of the "reentrant" functions,
leading to symptoms like endless busy statuses from RTAS.
Fixes: b664db8e3f ("powerpc/rtas: Implement reentrant rtas call")
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220907220111.223267-1-nathanl@linux.ibm.com
By default old pre-3.0 Freescale PCIe controllers reports invalid PCI Class
Code 0x0b20 for PCIe Root Port. It can be seen by lspci -b output on P2020
board which has this pre-3.0 controller:
$ lspci -bvnn
00:00.0 Power PC [0b20]: Freescale Semiconductor Inc P2020E [1957:0070] (rev 21)
!!! Invalid class 0b20 for header type 01
Capabilities: [4c] Express Root Port (Slot-), MSI 00
Fix this issue by programming correct PCI Class Code 0x0604 for PCIe Root
Port to the Freescale specific PCIe register 0x474.
With this change lspci -b output is:
$ lspci -bvnn
00:00.0 PCI bridge [0604]: Freescale Semiconductor Inc P2020E [1957:0070] (rev 21) (prog-if 00 [Normal decode])
Capabilities: [4c] Express Root Port (Slot-), MSI 00
Without any "Invalid class" error. So class code was properly reflected
into standard (read-only) PCI register 0x08.
Same fix is already implemented in U-Boot pcie_fsl.c driver in commit:
d18d06ac35
Fix activated by U-Boot stay active also after booting Linux kernel.
But boards which use older U-Boot version without that fix are affected and
still require this fix.
So implement this class code fix also in kernel fsl_pci.c driver.
Cc: stable@vger.kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706101043.4867-1-pali@kernel.org
Merge our fixes branch. In particular this brings in commit
9864816180 ("powerpc/book3e: Fix PUD allocation size in
map_kernel_page()") which fixes a build failure in next, because commit
2db2008e63 ("powerpc/64e: Rewrite p4d_populate() as a static inline
function") depends on it.
kasan detects access beyond the end of the xibm->bitmap allocation:
BUG: KASAN: slab-out-of-bounds in _find_first_zero_bit+0x40/0x140
Read of size 8 at addr c00000001d1d0118 by task swapper/0/1
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc2-00001-g90df023b36dd #28
Call Trace:
[c00000001d98f770] [c0000000012baab8] dump_stack_lvl+0xac/0x108 (unreliable)
[c00000001d98f7b0] [c00000000068faac] print_report+0x37c/0x710
[c00000001d98f880] [c0000000006902c0] kasan_report+0x110/0x354
[c00000001d98f950] [c000000000692324] __asan_load8+0xa4/0xe0
[c00000001d98f970] [c0000000011c6ed0] _find_first_zero_bit+0x40/0x140
[c00000001d98f9b0] [c0000000000dbfbc] xive_spapr_get_ipi+0xcc/0x260
[c00000001d98fa70] [c0000000000d6d28] xive_setup_cpu_ipi+0x1e8/0x450
[c00000001d98fb30] [c000000004032a20] pSeries_smp_probe+0x5c/0x118
[c00000001d98fb60] [c000000004018b44] smp_prepare_cpus+0x944/0x9ac
[c00000001d98fc90] [c000000004009f9c] kernel_init_freeable+0x2d4/0x640
[c00000001d98fd90] [c0000000000131e8] kernel_init+0x28/0x1d0
[c00000001d98fe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64
Allocated by task 0:
kasan_save_stack+0x34/0x70
__kasan_kmalloc+0xb4/0xf0
__kmalloc+0x268/0x540
xive_spapr_init+0x4d0/0x77c
pseries_init_irq+0x40/0x27c
init_IRQ+0x44/0x84
start_kernel+0x2a4/0x538
start_here_common+0x1c/0x20
The buggy address belongs to the object at c00000001d1d0118
which belongs to the cache kmalloc-8 of size 8
The buggy address is located 0 bytes inside of
8-byte region [c00000001d1d0118, c00000001d1d0120)
The buggy address belongs to the physical page:
page:c00c000000074740 refcount:1 mapcount:0 mapping:0000000000000000 index:0xc00000001d1d0558 pfn:0x1d1d
flags: 0x7ffff000000200(slab|node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000200 c00000001d0003c8 c00000001d0003c8 c00000001d010480
raw: c00000001d1d0558 0000000001e1000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
c00000001d1d0000: fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
c00000001d1d0080: fc fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
>c00000001d1d0100: fc fc fc 02 fc fc fc fc fc fc fc fc fc fc fc fc
^
c00000001d1d0180: fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc fc
c00000001d1d0200: fc fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc
This happens because the allocation uses the wrong unit (bits) when it
should pass (BITS_TO_LONGS(count) * sizeof(long)) or equivalent. With small
numbers of bits, the allocated object can be smaller than sizeof(long),
which results in invalid accesses.
Use bitmap_zalloc() to allocate and initialize the irq bitmap, paired with
bitmap_free() for consistency.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220623182509.3985625-1-nathanl@linux.ibm.com
The kexec code paths involve code that necessarily run in real mode, as
CPUs are disabled and control is transferred to the new kernel. Disable
address sanitization for the kexec code and the functions called in real
mode on CPUs being disabled.
[paulus@ozlabs.org: combined a few work-in-progress commits of
Daniel's and wrote the commit message.]
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Move pseries_machine_kexec() into kexec.c so setup.c can be instrumented]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTFSQ2TUSEaDdVC@cleo