The following testcase may result in a page table entries with a invalid
CCA field being generated:
static void *bindstack;
static int sysrqfd;
static void protect_low(int protect)
{
mprotect(bindstack, BINDSTACK_SIZE, protect);
}
static void sigbus_handler(int signal, siginfo_t * info, void *context)
{
void *addr = info->si_addr;
write(sysrqfd, "x", 1);
printf("sigbus, fault address %p (should not happen, but might)\n",
addr);
abort();
}
static void run_bind_test(void)
{
unsigned int *p = bindstack;
p[0] = 0xf001f001;
write(sysrqfd, "x", 1);
/* Set trap on access to p[0] */
protect_low(PROT_NONE);
write(sysrqfd, "x", 1);
/* Clear trap on access to p[0] */
protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);
write(sysrqfd, "x", 1);
/* Check the contents of p[0] */
if (p[0] != 0xf001f001) {
write(sysrqfd, "x", 1);
/* Reached, but shouldn't be */
printf("badness, shouldn't happen but does\n");
abort();
}
}
int main(void)
{
struct sigaction sa;
sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);
if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
perror("sigprocmask");
return 0;
}
sa.sa_sigaction = sigbus_handler;
sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
if (sigaction(SIGBUS, &sa, NULL)) {
perror("sigaction");
return 0;
}
bindstack = mmap(NULL,
BINDSTACK_SIZE,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (bindstack == MAP_FAILED) {
perror("mmap bindstack");
return 0;
}
printf("bindstack: %p\n", bindstack);
run_bind_test();
printf("done\n");
return 0;
}
There are multiple ingredients for this:
1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
on all platforms except SB1 where it's CCA 5.
2) _page_cachable_default must have bits set which are not set
_CACHE_CACHABLE_NONCOHERENT.
3) Either the defective version of pte_modify for XPA or the standard
version must be in used. However pte_modify for the 36 bit address
space support is no affected.
In that case additional bits in the final CCA mode may generate an invalid
value for the CCA field. On the R10000 system where this was tracked
down for example a CCA 7 has been observed, which is Uncached Accelerated.
Fixed by:
1) Using the proper CCA mode for PAGE_NONE just like for all the other
PAGE_* pte/pmd bits.
2) Fix the two affected variants of pte_modify.
Further code inspection also shows the same issue to exist in pmd_modify
which would affect huge page systems.
Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
and pmd_modify issue found by me.
The history of this goes back beyond Linus' git history. Chris Dearman's
commit 351336929c ("[MIPS] Allow setting of
the cache attribute at run time.") missed the opportunity to fix this
but it was originally introduced in lmo commit
d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.")
and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option
CONFIG_MIPS_UNCACHED.")
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
When emulating TLB miss / invalid exceptions during CACHE instruction
emulation, be sure to set up the correct PC and host_cp0_badvaddr state
for the kvm_mips_emlulate_tlb*_ld() function to pick up for guest EPC
and BadVAddr.
PC needs to be rewound otherwise the guest EPC will end up pointing at
the next instruction after the faulting CACHE instruction.
host_cp0_badvaddr must be set because guest CACHE instructions trap with
a Coprocessor Unusable exception, which doesn't update the host BadVAddr
as a TLB exception would.
This doesn't tend to get hit when dynamic translation of emulated
instructions is enabled, since only the first execution of each CACHE
instruction actually goes through this code path, with subsequent
executions hitting the SYNCI instruction that it gets replaced with.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a CACHE instruction is emulated by kvm_mips_emulate_cache(), the PC
is first updated to point to the next instruction, and afterwards it
falls through the "dont_update_pc" label, which rewinds the PC back to
its original address.
This works when dynamic translation of emulated instructions is enabled,
since the CACHE instruction is replaced with a SYNCI which works without
trapping, however when dynamic translation is disabled the guest hangs
on CACHE instructions as they always trap and are never stepped over.
Roughly swap the meanings of the "done" and "dont_update_pc" to match
kvm_mips_emulate_CP0(), so that "done" will roll back the PC on failure,
and "dont_update_pc" won't change PC at all (for the sake of exceptions
that have already modified the PC).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When faulting guest addresses are matched against guest segments with
the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to
include bit 31.
This is mainly for safety's sake, as it prevents a rogue BadVAddr in the
host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from
matching the guest kseg0 segment (e.g. 0x4*******), triggering an
internal KVM error instead of allowing the corresponding guest kseg0
page to be mapped into the host vmalloc space.
Such a rogue BadVAddr was observed to happen with the host MIPS kernel
running under QEMU with KVM built as a module, due to a not entirely
transparent optimisation in the QEMU TLB handling. This has already been
worked around properly in a previous commit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
get a TLB refill exception in it when KVM is built as a module.
This was observed to happen with the host MIPS kernel running under
QEMU, due to a not entirely transparent optimisation in the QEMU TLB
handling where TLB entries replaced with TLBWR are copied to a separate
part of the TLB array. Code in those pages continue to be executable,
but those mappings persist only until the next ASID switch, even if they
are marked global.
An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
switching to the guest exception base. Subsequent TLB mapped kernel
instructions just prior to switching to the guest trigger a TLB refill
exception, which enters the guest exception handlers without updating
EPC. This appears as a guest triggered TLB refill on a host kernel
mapped (host KSeg2) address, which is not handled correctly as user
(guest) mode accesses to kernel (host) segments always generate address
error exceptions.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In microMIPS kernels, handle_signal() sets the isa16 mode bit in the
vdso address so that the sigreturn trampolines (which are offset from
the VDSO) get executed as microMIPS.
However commit ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
changed the offsets to come from the VDSO image, which already have the
isa16 mode bit set correctly since they're extracted from the VDSO
shared library symbol table.
Drop the isa16 mode bit handling from handle_signal() to fix sigreturn
for cores which support both microMIPS and normal MIPS. This doesn't fix
microMIPS only cores, since the VDSO is still built for normal MIPS, but
thats a separate problem.
Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/13348/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Avoid an aliasing issue causing a build error in VDSO:
In file included from include/linux/srcu.h:34:0,
from include/linux/notifier.h:15,
from ./arch/mips/include/asm/uprobes.h:9,
from include/linux/uprobes.h:61,
from include/linux/mm_types.h:13,
from ./arch/mips/include/asm/vdso.h:14,
from arch/mips/vdso/vdso.h:27,
from arch/mips/vdso/gettimeofday.c:11:
include/linux/workqueue.h: In function 'work_static':
include/linux/workqueue.h:186:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
return *work_data_bits(work) & WORK_STRUCT_STATIC;
^
cc1: all warnings being treated as errors
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
with a CONFIG_DEBUG_OBJECTS_WORK configuration and GCC 5.2.0. Include
`-fno-strict-aliasing' along with compiler options used, as required for
kernel code, fixing a problem present since the introduction of VDSO
with commit ebb5e78cc6 ("MIPS: Initial implementation of a VDSO").
Thanks to Tejun for diagnosing this properly!
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.3+
Patchwork: https://patchwork.linux-mips.org/patch/13357/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The Hardware page Table Walker (HTW) is being misconfigured on 64-bit
kernels. The PWSize.PS (pointer size) bit determines whether pointers
within directories are loaded as 32-bit or 64-bit addresses, but was
never being set to 1 for 64-bit kernels where the unsigned long in pgd_t
is 64-bits wide.
This actually reduces rather than improves performance when the HTW is
enabled on P6600 since the HTW is initiated lots, but walks are all
aborted due I think to bad intermediate pointers.
Since we were already taking the width of the PTEs into account by
setting PWSize.PTEW, which is the left shift applied to the page table
index *in addition to* the native pointer size, we also need to reduce
PTEW by 1 when PS=1. This is done by calculating PTEW based on the
relative size of pte_t compared to pgd_t.
Finally in order for the HTW to be used when PS=1, the appropriate
XK/XS/XU bits corresponding to the different 64-bit segments need to be
set in PWCtl. We enable only XU for now to enable walking for XUSeg.
Supporting walking for XKSeg would be a bit more involved so is left for
a future patch. It would either require the use of a per-CPU top level
base directory if supported by the HTW (a bit like pgd_current but with
a second entry pointing at swapper_pg_dir), or the HTW would prepend bit
63 of the address to the global directory index which doesn't really
match how we split user and kernel page directories.
Fixes: cab25bc753 ("MIPS: Extend hardware table walking support to MIPS64")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13364/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Simplify the DSP instruction wrapper macros which use explicit encodings
for microMIPS and normal MIPS by using the new encoding macros and
removing duplication.
To me this makes it easier to read since it is much shorter, but it also
ensures .insn is used, preventing objdump disassembling the microMIPS
code as normal MIPS.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13314/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Hardcoded MIPS instruction encodings are provided for tlbinvf, mfhc0 &
mthc0 instructions, but microMIPS encodings are missing. I doubt any
microMIPS cores exist at present which support these instructions, but
the microMIPS encodings exist, and microMIPS cores may support them in
the future. Add the missing microMIPS encodings using the new macros.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13313/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When the toolchain doesn't support MSA we encode MSA instructions
explicitly in assembly. Unfortunately we use .word for both MIPS and
microMIPS encodings which is wrong, since 32-bit microMIPS instructions
are made up from a pair of halfwords.
- The most significant halfword always comes first, so for little endian
builds the halves will be emitted in the wrong order.
- 32-bit alignment isn't guaranteed, so the assembler may insert a
16-bit nop instruction to pad the instruction stream to a 32-bit
boundary.
Use the new instruction encoding macros to encode microMIPS MSA
instructions correctly.
Fixes: d96cc3d1ec ("MIPS: Add microMIPS MSA support.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <Paul.Burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13312/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Toolchains may be used which support microMIPS but not VZ instructions
(i.e. binutis 2.22 & 2.23), so extend the explicitly encoded versions of
the guest COP0 register & guest TLB access macros to support microMIPS
encodings too, using the new macros.
This prevents non-microMIPS instructions being executed in microMIPS
mode during CPU probe on cores supporting VZ (e.g. M5150), which cause
reserved instruction exceptions early during boot.
Fixes: bad50d7925 ("MIPS: Fix VZ probe gas errors with binutils <2.24")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13311/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
To allow simplification of macros which use inline assembly to
explicitly encode instructions, add a few simple abstractions to
mipsregs.h which expand to specific microMIPS or normal MIPS encodings
depending on what type of kernel is being built:
_ASM_INSN_IF_MIPS(_enc) : Emit a 32bit MIPS instruction if microMIPS is
not enabled.
_ASM_INSN32_IF_MM(_enc) : Emit a 32bit microMIPS instruction if enabled.
_ASM_INSN16_IF_MM(_enc) : Emit a 16bit microMIPS instruction if enabled.
The macros can be used one after another since the MIPS / microMIPS
macros are mutually exclusive, for example:
__asm__ __volatile__(
".set push\n\t"
".set noat\n\t"
"# mfgc0 $1, $%1, %2\n\t"
_ASM_INSN_IF_MIPS(0x40610000 | %1 << 11 | %2)
_ASM_INSN32_IF_MM(0x002004fc | %1 << 16 | %2 << 11)
"move %0, $1\n\t"
".set pop"
: "=r" (__res)
: "i" (source), "i" (sel));
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13310/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>