With the previous attempt reverted this switches to conditionalizing the
end address. Nominally VMALLOC_END, but extended for P3_ADDR_MAX in the
store queue case.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This reverts commit 20e7c297ef.
With store queues enabled the area above P4SEG has special properties
from the MMU's point of view, which was causing fixmap failure. We'll
have to do something else to satisfy the vmalloc range check.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The SSR.MD status amongst other things are already made available, which
can be used for encoding a more precise fault code value.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This unifies the fast-path TLB miss handler, allowing for further cleanup
and eventual utilization of a shared _32/_64 handler.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that the fast-path handler has been moved, we also need to update the
Makefile to ensure that the same restrictions for caller-save registers
are observed.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This brings the sh64 version in line with the sh32 one with regards to
how errors are handled. Base work for further unification of the
implementations.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that we have a method for finding out if we're handling an ITLB fault
or not without passing it all the way down the chain, it's possible to
use the __update_tlb() interface in place of a special __do_tlb_refill().
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This moves the now generic _32 page fault handling code to a shared place
and adapts the _64 implementation to make use of it.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This was reworked some time ago to go through fixmaps instead, leaving
the range itself unused. As such, kill off the remaining references and
hand over the remaining space for fixmaps directly. This also makes it
possible to simplify the vmalloc fault case as we no longer have to care
about the special section.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
At the moment the top of the fixmap space is calculated from P4SEG, which
places it at the end of the store queue space when that API is enabled.
Make sure we use P3_ADDR_MAX here instead to find the proper address
limit. With this done, it's also possible to switch to the generic
vmalloc address range check now that VMALLOC_START/END encapsulate the
translatable areas that we care about.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a simple interface modelled after sparc64/m32r to encode
the error code in the upper byte of thread_info for finer-grained
handling in the page fault path.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the x86 changes for tidying up the page fault error paths.
We'll build on top of this for _32/_64 unification.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
While the trap number and error code are passed around for debugging
purposes, this occurs wholly independently of the thread struct values.
These values were never part of the sigcontext ABI and are thus never
passed anywhere, so we can just kill them off across the board.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
In some cases the opps error reporting doesn't give enough information
to diagnose the problem, only printing information if it is thought
to be valid. Replace the current code with more detailed output.
This code is based on the ARM reporting, with minor changes for the SH.
[lethal@linux-sh.org: fixed up for 64-bit PTEs and pte_offset_kernel()]
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The problem is caused by the interaction of two features in the Linux
memory management code.
A processes address space is described by a struct mm_struct, and
every thread has a pointer to the mm it should run in. The exception
to this are kernel threads, which don't have an mm, and so borrow
the mm from the last thread which ran. The system is bootstrapped
by the initial kernel thread using init's mm (even though init hasn't
been created yet, its mm is the static init_mm).
The other feature is how the kernel handles the page table which
describes the portion of the address space which is only visible when
executing inside the kernel, and which is shared by all threads. On
the SH4 the only portion of the kernel's address space which described
using the page table is called P3, from 0xc0000000 to 0xdfffffff. This
portion of the address space is divided into three:
- mappings for dma_alloc_coherent()
- mappings for vmalloc() and ioremap()
- fixmap mappings, primarily used in copy_user_pages() to create
kernel mappings of user pages with the correct cache colour.
To optimise the TLB miss handler we don't want to add an additional
condition which checks whether the faulting address is in the user or
the kernel portion of the address space, and so all page tables have a
common portion which describes the kernel part of the address
space. As the SH4 uses a two level page table, only the kernel portion
of first level page table (the pgd entries) is duplicated. These all
point to the same second level entries (the pte's), and so no memory
is wasted.
The reference page table for the kernel is called the swapper_pg_dir,
and when a new page table is created for a new process the kernel
portion of the page table is copied from swapper_pg_dir. This works
fine when changes only occur in the second level of the kernel's page
table, or the first level entries are created before any new user
processes. However if a change occurs to the first level of the page
table, and there are existing processes which don't have this entry in
their page table, this new entry needs to be added. This is done on
demand, when the kernel accesses a P3 address which isn't mapped using
the current page table, the code in vmalloc_fault() copies the entry
from the reference page table (swapper_pg_dir) into the current
processes page table.
The bug which this patch addresses is that the code in vmalloc_fault()
was not copying addresses which fell in the dma_alloc_coherent()
portion of the address space, and it should have been copying any P3
address.
Why we hadn't seen this before, and what made this hard to reproduce,
is that normally the kernel will have called dma_alloc_coherent(), and
accessed the memory mapping created, before any user process
runs. Typically drivers such as USB or SATA will have created and used
mappings of this type during the kernel initialisation, when probing
for the attached devices, before init runs. Ethernet is slightly
different, as it normally only creates and accesses
dma_alloc_coherent() mappings when the network is brought up, but if
kernel level IP configuration is used this will also occur before any
user space process runs. So the first reproduction of this problem
which we saw was occurred when USB and SATA were removed from the
kernel, and then bring up Ethernet from user space using ifconfig.
I'd like to thank Joseph Bormolini who did the hard work reducing the
problem to this simple to reproduce criteria.
In your case the situation is slightly different, and turns out to
depends on the exact kernel configuration (which we had) and your
ramdisk contents (which we didn't - hence the need for some assumptions).
In this case the problem is a side effect of kernel level module
loading. Kernel subsystems sometimes trigger the load of kernel
modules directly, for example the crypto subsystem tries to load the
cryptomgr and MTD tries to load modules for Flash partitioning if
these are not built into the kernel. This is done by the kernel
creating a user process which runs insmod to try and load the
appropriate module.
In order for this to cause problems the system must be running with a
initrd or initramfs, which contains an insmod executable - if the
kernel can't find an insmod to run, no user process is created, and
the problem doesn't occur. If an insmod is found, a process is
created to run it, which will inherit the kernel portion of the
swapper_pg_dir first level page table. It doesn't matter whether the
inmod is successful or not, but when the the kernel scheduler context
switches back to the kernel initialisation thread, the insmod's mm is
'borrowed' by the kernel thread, as it doesn't have an address space
of its own. (Reference counting is used to ensure this mm is not
destroyed, even though the user process which caused its creation may no
longer exist.) If this address space doesn't have a first level page
table entry for the consistent mappings, and a driver tries to access
such a mapping, we are in the same situation as described above,
except this time in a kernel thread rather than a user thread
executing inside the kernel.
See bugzilla: 15425, 15836, 15862, 16106, 16793
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
In the future we'll be unifying some of the 32/64 page fault path, so
start to tidy up the _64 one by killing off some of the unused debug
cruft.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Commit d065bd810b
(mm: retry page fault when blocking on disk transfer) and
commit 37b23e0525
(x86,mm: make pagefault killable)
The above commits introduced changes into the x86 pagefault handler
for making the page fault handler retryable as well as killable.
These changes reduce the mmap_sem hold time, which is crucial
during OOM killer invocation.
Port these changes to the 32-bit SH platform.
Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Pull SuperH fixes from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: fix clock-sh7757 for the latest sh_mobile_sdhi driver
serial: sh-sci: use serial_port_in/out vs sci_in/out.
sh: vsyscall: Fix up .eh_frame generation.
sh: dma: Fix up device attribute mismatch from sysdev fallout.
sh: dwarf unwinder depends on SHcompact.
sh: fix up fallout from system.h disintegration.
Pull DMA mapping branch from Marek Szyprowski:
"Short summary for the whole series:
A few limitations have been identified in the current dma-mapping
design and its implementations for various architectures. There exist
more than one function for allocating and freeing the buffers:
currently these 3 are used dma_{alloc, free}_coherent,
dma_{alloc,free}_writecombine, dma_{alloc,free}_noncoherent.
For most of the systems these calls are almost equivalent and can be
interchanged. For others, especially the truly non-coherent ones
(like ARM), the difference can be easily noticed in overall driver
performance. Sadly not all architectures provide implementations for
all of them, so the drivers might need to be adapted and cannot be
easily shared between different architectures. The provided patches
unify all these functions and hide the differences under the already
existing dma attributes concept. The thread with more references is
available here:
http://www.spinics.net/lists/linux-sh/msg09777.html
These patches are also a prerequisite for unifying DMA-mapping
implementation on ARM architecture with the common one provided by
dma_map_ops structure and extending it with IOMMU support. More
information is available in the following thread:
http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819
More works on dma-mapping framework are planned, especially in the
area of buffer sharing and managing the shared mappings (together with
the recently introduced dma_buf interface: commit d15bd7ee44
"dma-buf: Introduce dma buffer sharing mechanism").
The patches in the current set introduce a new alloc/free methods
(with support for memory attributes) in dma_map_ops structure, which
will later replace dma_alloc_coherent and dma_alloc_writecombine
functions."
People finally started piping up with support for merging this, so I'm
merging it as the last of the pending stuff from the merge window.
Looks like pohmelfs is going to wait for 3.5 and more external support
for merging.
* 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
common: DMA-mapping: add NON-CONSISTENT attribute
common: DMA-mapping: add WRITE_COMBINE attribute
common: dma-mapping: introduce mmap method
common: dma-mapping: remove old alloc_coherent and free_coherent methods
Hexagon: adapt for dma_map_ops changes
Unicore32: adapt for dma_map_ops changes
Microblaze: adapt for dma_map_ops changes
SH: adapt for dma_map_ops changes
Alpha: adapt for dma_map_ops changes
SPARC: adapt for dma_map_ops changes
PowerPC: adapt for dma_map_ops changes
MIPS: adapt for dma_map_ops changes
X86 & IA64: adapt for dma_map_ops changes
common: dma-mapping: introduce generic alloc() and free() methods