When the kernel binary becomes large enough (32M or more), errors
may occur during the final link stage. This happens because
the build system uses short relocations for ARC by default.
The problem is easily resolved by passing the -mlong-calls
option to GCC so that it uses long absolute jumps (j) instead of
short relative branches (b).
However, some fragments of pure assembler code use branches in
inappropriate places and cause a link error because of
relocation overflow.
The first of these fragments is the .fixup insertion in futex.h and
unaligned.c. It places code containing a branch instruction in a
separate section (.fixup), which leads to a link error when the
kernel becomes large.
The second fragment is the call to scheduler functions
(common kernel code) from ARC's entry.S. When the kernel
binary becomes large this may lead to a link error because
the scheduler may end up far from the ARC code in the final
binary.
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Callers of cmpxchg_futex_value_locked() in futex code expect a bimodal
return value:
!0 (essentially -EFAULT as failure)
0 (success)
Before this patch, the success return value was the old value of the
futex, which could very well be non-zero, causing the caller to possibly
take the failure path erroneously.
Fix that by returning 0 for success.
(This fix was done back in 2011 for all upstream arches, which ARC
obviously missed.)
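The convention can be modelled in plain userspace C. This is only a
sketch: demo_cmpxchg_locked() and DEMO_EFAULT are hypothetical names, a
NULL pointer stands in for a faulting user access, and nothing here is
atomic or taken from the ARC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EFAULT 14  /* stand-in for the kernel's EFAULT value */

/*
 * Hypothetical model of the expected convention: the return value only
 * signals success (0) or failure (non-zero). The value previously held
 * by the futex word is handed back through *uval instead.
 */
static int demo_cmpxchg_locked(uint32_t *uval, uint32_t *uaddr,
                               uint32_t expected, uint32_t newval)
{
    if (uaddr == NULL)
        return -DEMO_EFAULT;    /* models a faulting user access */

    assert(uval != NULL);
    *uval = *uaddr;             /* report the old value out of band */
    if (*uaddr == expected)
        *uaddr = newval;
    return 0;                   /* success, even if the old value is non-zero */
}
```

A caller that tests the return value for non-zero now takes the failure
path only on a real fault, and compares *uval against the expected value
to learn whether the exchange happened.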
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The atomic ops on futex need to provide a full barrier just like
regular atomics in the kernel.
Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
as core code already does that.
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
On an ARCv2 CPU the following configurations are possible, each
affecting cache handling for data exchanged with peripherals
via DMA:
[1] Only L1 cache exists
[2] Both L1 and L2 exist, but no IO coherency unit
[3] L1, L2 caches and IO coherency unit exist
The current implementation takes care of [1] and [2].
Moreover, support for [2] is implemented with a run-time check
for SLC existence, which is not optimal.
This patch introduces support for [3] and reworks DMA ops
usage. Instead of doing a run-time check every time a particular
DMA op is executed, we'll have 3 different implementations of
DMA ops and select the appropriate one during init.
For IOC support we need to:
[a] Implement empty DMA ops because IOC takes care of cache
coherency with DMAed data
[b] Route dma_alloc_coherent() via dma_alloc_noncoherent()
This is required to make IOC work in the first place and also
serves as an optimization, as LD/ST to coherent buffers can be
serviced from caches w/o going all the way to memory
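The "select once at init" idea can be sketched in plain C with a table
of function pointers. All demo_* names below are hypothetical and the
counters merely stand in for real cache maintenance; this is not the
kernel's dma_map_ops nor the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table with a single hook, for illustration only. */
struct demo_dma_ops {
    void (*sync_for_device)(void *buf, size_t len);
};

static int demo_l1_flushes;   /* counts simulated L1 maintenance */
static int demo_slc_flushes;  /* counts simulated L2 (SLC) maintenance */

static void demo_sync_l1(void *buf, size_t len)
{
    (void)buf; (void)len;
    demo_l1_flushes++;
}

static void demo_sync_l1_slc(void *buf, size_t len)
{
    (void)buf; (void)len;
    demo_l1_flushes++;
    demo_slc_flushes++;
}

static void demo_sync_ioc(void *buf, size_t len)
{
    /* Empty on purpose: with IOC the hardware keeps caches coherent. */
    (void)buf; (void)len;
}

static const struct demo_dma_ops demo_ops_l1     = { demo_sync_l1 };
static const struct demo_dma_ops demo_ops_l1_slc = { demo_sync_l1_slc };
static const struct demo_dma_ops demo_ops_ioc    = { demo_sync_ioc };

static const struct demo_dma_ops *demo_active_ops;

/* Pick one implementation during init, matching cases [1], [2], [3]. */
static void demo_dma_init(bool has_slc, bool has_ioc)
{
    if (has_ioc)
        demo_active_ops = &demo_ops_ioc;
    else if (has_slc)
        demo_active_ops = &demo_ops_l1_slc;
    else
        demo_active_ops = &demo_ops_l1;
}

/* The hot path has no feature checks, only an indirect call. */
static void demo_dma_sync_for_device(void *buf, size_t len)
{
    assert(demo_active_ops != NULL);
    demo_active_ops->sync_for_device(buf, len);
}
```

The trade-off in this sketch is one indirect call per operation in
exchange for dropping the per-call feature test.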
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta:
-Added some comments about IOC gains
-Marked dma ops as static,
-Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.2. No area particularly stands
out but we have two unpleasant ones:
- Kernel ptes are marked with a global bit which allows the kernel to
share kernel TLB entries between all processes. For this to work
both entries of an adjacent even/odd pte pair need to have the
global bit set. There has been a subtle race in setting the other
entry's global bit since ~ 2000 but it takes particularly
pathological workloads that essentially do mostly vmalloc/vfree to
trigger this.
This pull request fixes the 64-bit case but leaves the case of 32
bit CPUs with 64 bit ptes unsolved for now. The unfixed cases
affect hardware that is not available in the field yet.
- Instruction emulation requires loading instructions from user space
but the current fast but simplistic approach will fail on pages
that are PROT_EXEC but !PROT_READ. For this reason we temporarily
do not permit this permission and will map pages with PROT_EXEC |
PROT_READ.
The remainder of this pull request is more or less across the field
and the short log explains them well"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Make set_pte() SMP safe.
MIPS: Replace add and sub instructions in relocate_kernel.S with addiu
MIPS: Flush RPS on kernel entry with EVA
Revert "MIPS: BCM63xx: Provide a plat_post_dma_flush hook"
MIPS: BMIPS: Delete unused Kconfig symbol
MIPS: Export get_c0_perfcount_int()
MIPS: show_stack: Fix stack trace with EVA
MIPS: do_mcheck: Fix kernel code dump with EVA
MIPS: SMP: Don't increment irq_count multiple times for call function IPIs
MIPS: Partially disable RIXI support.
MIPS: Handle page faults of executable but unreadable pages correctly.
MIPS: Malta: Don't reinitialise RTC
MIPS: unaligned: Fix build error on big endian R6 kernels
MIPS: Fix sched_getaffinity with MT FPAFF enabled
MIPS: Fix build with CONFIG_OF=y for non OF-enabled targets
CPUFREQ: Loongson2: Fix broken build due to incorrect include.
Pull ARC fixes from Vineet Gupta:
"Here's a late pull request for accumulated ARC fixes which came out of
extended testing of the new ARCv2 port with LTP etc. The llock/scond
livelock workaround has been reviewed by PeterZ. The changes look like
a lot but I've crafted them into finer grained patches for better
tracking later.
I have some more fixes (ARC Futex backend) ready to go but those will
have to wait for tglx to return from vacation.
Summary:
- Enable a reduced config of HS38 (w/o div-rem, ll64...)
- Add software workaround for LLOCK/SCOND livelock
- Fallout of a recent pt_regs update"
* tag 'arc-v4.2-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff
ARC: Make pt_regs regs unsigned
ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle
ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff
ARC: LLOCK/SCOND based rwlock
ARC: LLOCK/SCOND based spin_lock
ARC: refactor atomic inline asm operands with symbolic names
Revert "ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock"
ARCv2: [axs103_smp] Reduce clk for Quad FPGA configs
ARCv2: Fix the peripheral address space detection
ARCv2: allow selection of page size for MMUv4
ARCv2: lib: memset: Don't assume 64-bit load/stores
ARCv2: lib: memcpy: Missing PREFETCHW
ARCv2: add knob for DIV_REV in Kconfig
ARC/time: Migrate to new 'set-state' interface
Pull USB fixes from Greg KH:
"Here are some USB and PHY fixes for 4.2-rc6 that resolve some reported
issues.
All of these have been in the linux-next tree for a while, full
details on the patches are in the shortlog below"
* tag 'usb-4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
ARM: dts: dra7: Add syscon-pllreset syscon to SATA PHY
drivers/usb: Delete XHCI command timer if necessary
xhci: fix off by one error in TRB DMA address boundary check
usb: udc: core: add device_del() call to error pathway
phy: ti-pipe3: i783 workaround for SATA lockup after dpll unlock/relock
phy-sun4i-usb: Add missing EXPORT_SYMBOL_GPL for sun4i_usb_phy_set_squelch_detect
USB: sierra: add 1199:68AB device ID
usb: gadget: f_printer: actually limit the number of instances
usb: gadget: f_hid: actually limit the number of instances
usb: gadget: f_uac2: fix calculation of uac2->p_interval
usb: gadget: bdc: fix a driver crash on disconnect
usb: chipidea: ehci_init_driver is intended to call one time
USB: qcserial: Add support for Dell Wireless 5809e 4G Modem
USB: qcserial/option: make AT URCs work for Sierra Wireless MC7305/MC7355
If we have a series of events from userspace, with %fprs=FPRS_FEF,
as follows:
ETRAP
ETRAP
VIS_ENTRY(fprs=0x4)
VIS_EXIT
RTRAP (kernel FPU restore with fpu_saved=0x4)
RTRAP
We will not restore the user registers that were clobbered by the
FPU-using kernel code in the inner-most trap.
Traps allocate FPU save slots in the thread struct, and FPU using
sequences save the "dirty" FPU registers only.
This works at the initial trap level because all of the registers
get recorded into the top-level FPU save area, and we'll return
to userspace with the FPU disabled so that any FPU use by the user
will take an FPU disabled trap wherein we'll load the registers
back up properly.
But this is not how trap returns from kernel to kernel operate.
The simplest fix for this bug is to always save all FPU register state
for anything other than the top-most FPU save area.
Getting rid of the optimized inner-slot FPU saving code ends up
making VISEntryHalf degenerate into plain VISEntry.
Longer term we need to do something smarter to reinstate the partial
save optimizations. Perhaps the fundamental error is having trap entry
and exit allocate FPU save slots and restore register state. Instead,
the VISEntry et al. calls should be doing that work.
This bug is about two decades old.
Reported-by: James Y Knight <jyknight@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function may copy the si_addr_lsb, si_lower and si_upper fields to
user mode when they haven't been initialized, which can leak kernel
stack data to user mode.
Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals. This is solved by
checking the value of si_signo in addition to si_code.
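As a minimal sketch of why both fields must be checked: on Linux the
codes SEGV_BNDERR and ILL_ILLADR share the numeric value 3. The DEMO_*
constants and demo_has_bounds_fields() below are hypothetical stand-ins,
not the kernel's copy routine.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Stand-ins for the Linux si_code values; both are 3. */
#define DEMO_SEGV_BNDERR 3
#define DEMO_ILL_ILLADR  3

/*
 * Hypothetical helper: si_lower/si_upper are only meaningful for a
 * bounds error on SIGSEGV, so the signal number is tested together
 * with the code before those fields would be copied out.
 */
static bool demo_has_bounds_fields(int si_signo, int si_code)
{
    assert(si_signo > 0);
    return si_signo == SIGSEGV && si_code == DEMO_SEGV_BNDERR;
}
```

With a check on si_code alone, a SIGILL carrying code 3 would be treated
like a bounds error and the uninitialized fields would be copied.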
Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function can leak kernel stack data when the user siginfo_t has a
positive si_code value. The top 16 bits of si_code describe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.
copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control over the top 16 bits
of si_code.
This fixes the following information leaks:
x86: 8 bytes leaked when sending a signal from a 32-bit process to
itself. This leak grows to 16 bytes if the process uses x32.
(si_code = __SI_CHLD)
x86: 100 bytes leaked when sending a signal from a 32-bit process to
a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
64-bit process. (si_code = any)
parisc and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process. These bugs are also fixed for consistency.
Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kishon writes:
phy: for 4.2-rc6
*) Fix compiler error when sun4i usb phy driver is built as module
*) Fix SATA Lockup issue in dra7 SoC
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Pull KVM fixes from Paolo Bonzini:
"Just two very small & simple patches"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MTRR: Use default type for non-MTRR-covered gfn before WARN_ON
KVM: s390: Fix hang VCPU hang/loop regression
On MIPS the GLOBAL bit of the PTE must have the same value in any
aligned pair of PTEs. These pairs of PTEs are referred to as
"buddies". In an SMP system it is possible for two CPUs to be calling
set_pte() on adjacent PTEs at the same time. There is a race between
setting the PTE and a different CPU setting the GLOBAL bit in its
buddy PTE.
This race can be observed when multiple CPUs are executing
vmap()/vfree() at the same time.
Make setting the buddy PTE's GLOBAL bit an atomic operation to close
the race condition.
The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*
handled.
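The atomic update of the buddy can be sketched with C11 atomics. This
is an illustration under assumptions, not the MIPS implementation:
DEMO_PAGE_GLOBAL and demo_set_buddy_global() are hypothetical, the bit
position is arbitrary, and the sketch assumes that only an empty buddy
still lacks the bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_GLOBAL 0x1u  /* hypothetical bit position */

/*
 * Install the GLOBAL bit in the buddy entry only if the buddy is still
 * empty, as one atomic step. If another CPU has written the buddy in
 * the meantime, the compare fails and that write is left untouched.
 */
static void demo_set_buddy_global(_Atomic uint64_t *buddy)
{
    uint64_t expected = 0;  /* an empty ("none") entry */

    assert(buddy != NULL);
    atomic_compare_exchange_strong(buddy, &expected,
                                   (uint64_t)DEMO_PAGE_GLOBAL);
}
```

A plain read followed by a separate write would leave a window in which
the other CPU's set_pte() on the buddy could be overwritten; the single
compare-and-exchange removes that window.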
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: <stable@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10835/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
KGDB fails to build after f51e2f1911 ("ARC: make sure instruction_pointer()
returns unsigned value").
The hack to force one specific reg to unsigned backfired. There's no
reason to keep the regs signed after all.
| CC arch/arc/kernel/kgdb.o
| ../arch/arc/kernel/kgdb.c: In function 'kgdb_trap':
| ../arch/arc/kernel/kgdb.c:180:29: error: lvalue required as left operand of assignment
| instruction_pointer(regs) -= BREAK_INSTR_SIZE;
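The error comes from a C rule: the result of a cast is not an lvalue,
so it cannot be the target of "-=". The sketch below assumes the earlier
hack wrapped the register in a cast; struct demo_regs, the demo_* macro
and the size value are hypothetical.

```c
#include <assert.h>

/* Hypothetical register frame with the field already unsigned. */
struct demo_regs {
    unsigned long ret;
};

/*
 * With an unsigned field the accessor is a plain member access, which
 * is an lvalue. A definition using a cast, such as
 *     ((unsigned long)(regs)->ret)
 * would yield a value only, and "-=" on it would not compile.
 */
#define demo_instruction_pointer(regs) ((regs)->ret)

#define DEMO_BREAK_INSTR_SIZE 2  /* arbitrary value for the sketch */

/* Step the saved PC back over the breakpoint instruction. */
static void demo_rewind_breakpoint(struct demo_regs *regs)
{
    assert(regs->ret >= DEMO_BREAK_INSTR_SIZE);
    demo_instruction_pointer(regs) -= DEMO_BREAK_INSTR_SIZE;
}
```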
Reported-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Fixes: f51e2f1911 ("ARC: make sure instruction_pointer() returns unsigned value")
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>