Fix several bugs related to prefetch, ordering & speculation:
- GRU cch_allocate() instruction causes cacheable memory
to be created. Add a barriers to prevent speculation
from prefetching data before it exists.
- Add memory barriers before cache-flush instructions to ensure
that previously stored data is included in the line flushed to memory.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix bug caused by failure to allocate a GRU gts structure. The old code
failed to handle the case where the vma was invalid.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Improve existing driver self-tests. Add a new debugging test to the SGI
GRU driver for verifying the global GRU copy function.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a debug option to the SGI GRU driver for flushing GRU cache lines from
memory. In theory this is not needed but it is useful for debugging.
This has no use by end users.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Under some conditions, mmu_notifier_register() will fail to register a
mmu_notifier. Fix the GRU driver to correctly handle these failures.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Increase the maximum address supported by the SGI GRU driver to a full 64
bits. Note that GRU addresses are not always the same as socket virtual
addresses. Sockets may not necessarily support the full 64 bits.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch builds on the infrastructure introduced in the patches that
allow user specification of GRU blades & chiplets for context allocation.
This patch simplifies the algorithms for migrating GRU contexts between
blades.
No new functionality is introduced.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support to the GRU driver to allow users to specify the blade &
chiplet for allocation of GRU contexts. Add new statistics for context
loading/unloading/retargeting. Also deleted a few GRU stats that were no
longer being unused.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add table & user request infrastructure that is needed to allow users to
specify the blade and chiplet for allocation of GRU contexts. Use of this
information is in a subsequent patch.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Do not use alloc_pages_exact_node() to allocate GRU tables. If a blade
has no local memory, nid will be -1. Use alloc_pages_node() instead.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TLB dropins require updates to the CBR instruction istatus field. This is
needed to resolve race conditions in the chip.
The code currently uses the user address of the CBR. This works but opens
up additional endcases related to stealing of contexts and accessing the
CBR from tasks that do not have access to the user address space. (Some
of this non-user task access is debug code that is not currently being
pushed to the community).
User CBRs are also directly accessible using the kernel mapping of the
CBR. Change the TLB dropin code to use the the kernel mapping of the CBR.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add comments from previous code reviews. The comments help explain some
of the more esoteric aspects of the driver.
Move a free() to the other side of an unlock.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change the GRU initialization code to initialize based on blade topology
instead of node topology. The result is the same but blade-based
initialization is cleaner.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was a difficult bug to trip. XPC was in the middle of sending an
acknowledgement for a received message.
In xpc_received_payload_uv():
.
ret = xpc_send_gru_msg(ch->sn.uv.cached_notify_gru_mq_desc, msg,
sizeof(struct xpc_notify_mq_msghdr_uv));
if (ret != xpSuccess)
XPC_DEACTIVATE_PARTITION(&xpc_partitions[ch->partid], ret);
msg->hdr.msg_slot_number += ch->remote_nentries;
at the point in xpc_send_gru_msg() where the hardware has dispatched the
acknowledgement, the remote side is able to reuse the message structure
and send a message with a different slot number. This problem is made
worse by interrupts.
The adjustment of msg_slot_number and the BUG_ON in
xpc_handle_notify_mq_msg_uv() which verifies the msg_slot_number is
consistent are only used for debug purposes. Since a fix for this that
preserves the debug functionality would either have to infringe upon the
payload or allocate another structure just for debug, I decided to remove
it entirely.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many times while the initial connection is being made, the contacted
partition will send back both the ACTIVATING and the ACTIVE
remote_act_state changes in very close succescion. The 1/4 second delay
in the make first contact loop is large enough to nearly always miss the
ACTIVATING state change.
Since either state indicates the remote partition has acknowledged our
state change, accept either.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Under heavy load conditions, our set of xpc messages may become exhausted.
The code handles this correctly with the exception of the management code
which hits a NULL pointer dereference.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The UV BIOS has moved the location of some of their pointers to the
"partition reserved page" from memory into a uv hub MMR. The GRU does not
support bcopy operations from MMR space so we need to special case the MMR
addresses using VLOAD operations.
Additionally, the BIOS call for registering a message queue watchlist has
removed the 'blade' value and eliminated the structure that was being
passed in. This is also reflected in this patch.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The BIOS has decided to store a pointer to the partition reserved page in
a scratch MMR. The GRU is only able to read an MMR using a vload
instruction. The gru_read_gpa() function will implemented.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The UV BIOS has been updated to implement some of our interface
functionality differently than originally expected. These patches update
the kernel to the bios implementation and include a few minor bug fixes
which prevent us from doing significant testing on real hardware.
This patch:
For SGI UV systems, translate from a global physical address back to a
socket physical address. This does nothing to ensure the socket physical
address is actually addressable by the kernel. That is the responsibility
of the user of the function.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently the locking in blockdev_direct_IO is a mess, we have three
different locking types and very confusing checks for some of them. The
most complicated one is DIO_OWN_LOCKING for reads, which happens to not
actually be used.
This patch gets rid of the DIO_OWN_LOCKING - as mentioned above the read
case is unused anyway, and the write side is almost identical to
DIO_NO_LOCKING. The difference is that DIO_NO_LOCKING always sets the
create argument for the get_blocks callback to zero, but we can easily
move that to the actual get_blocks callbacks. There are four users of the
DIO_NO_LOCKING mode: gfs already ignores the create argument and thus is
fine with the new version, ocfs2 only errors out if create were ever set,
and we can remove this dead code now, the block device code only ever uses
create for an error message if we are fully beyond the device which can
never happen, and last but not least XFS will need the new behavour for
writes.
Now we can replace the lock_type variable with a flags one, where no flag
means the DIO_NO_LOCKING behaviour and DIO_LOCKING is kept as the first
flag. Separate out the check for not allowing to fill holes into a
separate flag, although for now both flags always get set at the same
time.
Also revamp the documentation of the locking scheme to actually make
sense.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Elder <aelder@sgi.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>