Several interfaces were missing and others misnumbered or
improperly documented.
Also, make sure to check the return value when registering
the kernel TSBs with the hypervisor. This helped to find
the 4MB kernel TSB alignment bug fixed in a previous changeset.
Signed-off-by: David S. Miller <davem@davemloft.net>
Hypervisor interfaces need to be negotiated in order to use
some API calls reliably. So add a small set of interfaces
to request API versions and query current settings.
This allows us to fix some bugs in the hypervisor console:
1) If we can negotiate API group CORE of at least major 1
minor 1 we can use con_read and con_write which can improve
console performance quite a bit.
2) When we do a console write request, we should hold the
spinlock around the whole request, not a byte at a time.
What would happen is that it's easy for output from
different cpus to get mixed with each other.
3) Use consistent udelay() based polling, udelay(1) each
loop with a limit of 1000 polls to handle stuck hypervisor
console.
Signed-off-by: David S. Miller <davem@davemloft.net>
When I added the entries for the robust futex syscall entries, I
forgot to bump NR_SYSCALLS. The current situation is error-prone
because NR_SYSCALLS lives in entry.S where the system call limit
checks are enforced. Move the definition to asm/unistd.h in order to
make this mistake much more difficult to make.
And wire up sys_migrate_pages since the powerpc folks implemented the
compat wrapper for us.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the long overdue conversion of sparc64 over to
the generic IRQ layer.
The kernel image is slightly larger, but the BSS is ~60K
smaller due to the reduced size of struct ino_bucket.
A lot of IRQ implementation details, including ino_bucket,
were moved out of asm-sparc64/irq.h and are now private to
arch/sparc64/kernel/irq.c, and most of the code in irq.c
totally disappeared.
One thing that's different at the moment is IRQ distribution,
we do it at enable_irq() time. If the cpu mask is ALL then
we round-robin using a global rotating cpu counter, else
we pick the first cpu in the mask to support single cpu
targetting. This is similar to what powerpc's XICS IRQ
support code does.
This works fine on my UP SB1000, and the SMP build goes
fine and runs on that machine, but lots of testing on
different setups is needed.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the first in a series of cleanups that will hopefully
allow a seamless attempt at using the generic IRQ handling
infrastructure in the Linux kernel.
Define PIL_DEVICE_IRQ and vector all device interrupts through
there.
Get rid of the ugly pil0_dummy_{bucket,desc}, instead vector
the timer interrupt directly to a specific handler since the
timer interrupt is the only event that will be signaled on
PIL 14.
The irq_worklist is now in the per-cpu trap_block[].
Signed-off-by: David S. Miller <davem@davemloft.net>
There were several bugs in the SUN4V cpu mondo dispatch code.
In fact, if we ever got a EWOULDBLOCK or other error from
the hypervisor call, we'd potentially send a cpu mondo multiple
times to the same cpu and even worse we could loop until the
timeout resending the same mondo over and over to such cpus.
So let's bulletproof this thing as follows:
1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h
2) Don't build and update the cpulist using inline functions, this
was causing the cpu mask to not get updated in the caller.
3) Disable interrupts during the entire mondo send, otherwise our
cpu list and/or mondo block could get overwritten if we take
an interrupt and do a cpu mondo send on the current cpu.
4) Check for all possible error return types from the cpu_mondo_send()
hypervisor call. In particular:
HV_EOK) Our work is done, all cpus have received the mondo.
HV_CPUERROR) One or more of the cpus in the cpu list we passed
to the hypervisor are in error state. Use cpu_state()
calls over the entries in the cpu list to see which
ones. Record them in "error_mask" and report this
after we are done sending the mondo to cpus which are
not in error state.
HV_EWOULDBLOCK) We need to keep trying.
Any other error we consider fatal, we report the event and exit
immediately.
5) We only timeout if forward progress is not made. Forward progress
is defined as having at least one cpu get the mondo successfully
in a given cpu_mondo_send() call. Otherwise we bump a counter
and delay a little. If the counter hits a limit, we signal an
error and report the event.
Also, smp_call_function_mask() error handling reports the number
of cpus incorrectly.
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to use the real hardware processor ID when
targetting interrupts, not the "define to 0" thing
the uniprocessor build gives us.
Also, fill in the Node-ID and Agent-ID fields properly
on sun4u/Safari.
Signed-off-by: David S. Miller <davem@davemloft.net>
On uniprocessor, it's always zero for optimize that.
On SMP, the jmpl to the stub kills the return address stack in the cpu
branch prediction logic, so expand the code sequence inline and use a
code patching section to fix things up. This also always better and
explicit register selection, which will be taken advantage of in a
future changeset.
The hard_smp_processor_id() function is big, so do not inline it.
Fix up tests for Jalapeno to also test for Serrano chips too. These
tests want "jbus Ultra-IIIi" cases to match, so that is what we should
test for.
Signed-off-by: David S. Miller <davem@davemloft.net>
Just flip the bit off of whatever it's currently set to.
PSTATE_IE is guarenteed to be enabled when we get here.
Signed-off-by: David S. Miller <davem@davemloft.net>
UltraSPARC has special sets of global registers which are switched to
for certain trap types. There is one set for MMU related traps, one
set of Interrupt Vector processing, and another set (called the
Alternate globals) for all other trap types.
For what seems like forever we've hard coded the values in some of
these trap registers. Some examples include:
1) Interrupt Vector global %g6 holds current processors interrupt
work struct where received interrupts are managed for IRQ handler
dispatch.
2) MMU global %g7 holds the base of the page tables of the currently
active address space.
3) Alternate global %g6 held the current_thread_info() value.
Such hardcoding has resulted in some serious issues in many areas.
There are some code sequences where having another register available
would help clean up the implementation. Taking traps such as
cross-calls from the OBP firmware requires some trick code sequences
wherein we have to save away and restore all of the special sets of
global registers when we enter/exit OBP.
We were also using the IMMU TSB register on SMP to hold the per-cpu
area base address, which doesn't work any longer now that we actually
use the TSB facility of the cpu.
The implementation is pretty straight forward. One tricky bit is
getting the current processor ID as that is different on different cpu
variants. We use a stub with a fancy calling convention which we
patch at boot time. The calling convention is that the stub is
branched to and the (PC - 4) to return to is in register %g1. The cpu
number is left in %g6. This stub can be invoked by using the
__GET_CPUID macro.
We use an array of per-cpu trap state to store the current thread and
physical address of the current address space's page tables. The
TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this
table, it uses __GET_CPUID and also clobbers %g1.
TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load
the current processor's IRQ software state into %g6. It also uses
__GET_CPUID and clobbers %g1.
Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the
current address space's page tables into %g7, it clobbers %g1 and uses
__GET_CPUID.
Many refinements are possible, as well as some tuning, with this stuff
in place.
Signed-off-by: David S. Miller <davem@davemloft.net>
Also, the Solaris syscall table is sized differrently,
and does not go beyond entry 255, so trim off the excess
entries.
Signed-off-by: David S. Miller <davem@davemloft.net>