Commit Graph

38 Commits

Author SHA1 Message Date
Sam Ravnborg 4787083fcc [SPARC64]: Fix inconsistent .section usage in lib/
A few places missed the "a" specifier for the __ex_table section. Add
these so we avoid generation an additional section at link time.

Latest modpost would otherwise complain like this:

WARNING: vmlinux.o (__ex_table.2): section name inconsistency.
(.[number]+) following section name.
Did you forget to use "ax"/"aw" in a .S file?
Note that for example <linux/init.h> contains
section definitions for use in .S files.
WARNING: vmlinux.o (__ex_table.4): section name inconsistency.
(.[number]+) following section name.
Did you forget to use "ax"/"aw" in a .S file?
Note that for example <linux/init.h> contains
section definitions for use in .S files.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:32:44 -08:00
Sam Ravnborg c6d64c16bb [SPARC/SPARC64]: Fix usage of .section .sched.text in assembler code.
ld will generate an unique named section when assembler do not use
"ax" but gcc does. Add the missing annotation.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:32:43 -08:00
David S. Miller 24f287e412 [SPARC64]: Implement atomic backoff.
When the cpu count is high and contention hits an atomic object, the
processors can synchronize such that some cpus continually get knocked
out and cannot complete the atomic update.

So implement an exponential backoff when SMP.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 16:24:55 -07:00
David S. Miller d060db63fd [SPARC64]: Fix register usage in xor_raid_4().
Some typos led to using %i6/%i7 instead of %l6/%l7 in loads which is
really really bad because those are the frame pointer and return PC.

Based upon a raid5 crash report by Bertrand Joel.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-13 21:53:14 -07:00
David S. Miller a4aa2e867c [SPARC64]: Don't use in/local regs for ldx/stx data in N1 memcpy.
It doesn't matter for use in 64-bit objects, but when used in
32-bit environments the top 32-bits of the local and in
registers will get chopped off on the next register window
spill/restore which leads to difficult to track down and
subtle bugs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-02 16:17:17 -07:00
David S. Miller 25e5566ed3 [SPARC64]: Fix missing load-twin usage in Niagara-1 memcpy.
For the case where the source is not aligned modulo 8
we don't use load-twins to suck the data in and this
kills performance since normal loads allocate in the
L1 cache (unlike load-twin) and thus big memcpys swipe
the entire L1 D-cache.

We need to allocate a register window to implement this
properly, but that actually simplifies a lot of things
as a nice side-effect.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-02 01:03:09 -07:00
David S. Miller cf5adce117 [SPARC64]: Niagara-2 optimized copies.
The bzero/memset implementation stays the same as Niagara-1.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-08-16 01:47:25 -07:00
David S. Miller 6c70b6fc7b [SPARC64]: Do not assume sun4v chips have load-twin/store-init support.
Check the cpu type in the OBP device tree before committing to
using the optimized Niagara memcpy and memset implementation.

If we don't recognize the cpu type, use a completely generic
version.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-08-08 17:33:45 -07:00
David S. Miller 8b99cfb8cc [SPARC64]: More sensible udelay implementation.
Take a page from the powerpc folks and just calculate the
delay factor directly.

Since frequency scaling chips use a system-tick register,
the value is going to be the same system-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:02 -07:00
David S. Miller 24d559cac4 [SPARC64]: store-init needs trailing membar.
The manual says that it is required and we actually have crash reports
where loads see stale data due to not having membars here.

In one case the networking does:

	memset(skb, 0, offsetof(struct sk_buff, truesize));

and then some code later checks skb->nohdr for zero, but it's still
the value that was there before the memset().

Note that arch/sparc64/lib/xor.S already got this right.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-19 13:27:33 -07:00
Jörn Engel 6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
David S. Miller ae5de0ff0b [SPARC64]: Fix missing fold at end of checksums.
Both csum_partial() and the csum_partial_copy*() family of routines
forget to do a final fold on the computed checksum value on sparc64.
So do the standard Sparc "add + set condition codes, add carry"
sequence, then make sure the high 32-bits of the return value are
clear.

Based upon some excellent detective work and debugging done by
Richard Braun and Samuel Thibault.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-04 21:32:01 -07:00
Akinobu Mita 2d78d4beb6 [PATCH] bitops: sparc64: use generic bitops
- remove __{,test_and_}{set,clear,change}_bit() and test_bit()
- remove ffz()
- remove __ffs()
- remove generic_fls()
- remove generic_fls64()
- remove sched_find_first_bit()
- remove ffs()

- unless defined(ULTRA_HAS_POPULATION_COUNT)

  - remove generic_hweight{64,32,16,8}()

- remove find_{next,first}{,_zero}_bit()
- remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit()
- remove minix_{test,set,test_and_clear,test,find_first_zero}_bit()

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:14 -08:00
David S. Miller bb8646d834 [SPARC64]: Optimized TSB table initialization.
We only need to write an invalid tag every 16 bytes,
so taking advantage of this can save many instructions
compared to the simple memset() call we make now.

A prefetching implementation is implemented for sun4u
and a block-init store version if implemented for Niagara.

The next trick is to be able to perform an init and
a copy_tsb() in parallel when growing a TSB table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:41 -08:00
David S. Miller 3634476239 [SPARC64]: Niagara optimized XOR functions for RAID.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:03 -08:00
David S. Miller 8ca2557c48 [SPARC64]: Niagara optimized memset/bzero/clear_user.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:50 -08:00
David S. Miller 3763be32d5 [SPARC64]: Define ARCH_HAS_READ_CURRENT_TIMER.
This gives more consistent bogomips and delay() semantics,
especially on sun4v.  It gives weird looking values though...

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:29 -08:00
David S. Miller c857e3fdbc [SPARC64]: __bzero_noasi --> __clear_user
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:28 -08:00
David S. Miller 6241e5cc6a [SPARC64]: Fix branch signedness bug in all code patching.
The bug that hit SUN4V TLB patching exists elsewhere.
Make sure we cure all such cases.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:29 -08:00
David S. Miller c4bce90ea2 [SPARC64]: Deal with PTE layout differences in SUN4V.
Yes, you heard it right, they changed the PTE layout for
SUN4V.  Ho hum...

This is the simple and inefficient way to support this.
It'll get optimized, don't worry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:25 -08:00
David S. Miller 0d4bc95b9c [SPARC64]: Fix some Niagara memcpy() bugs.
We need to restore the %asi register properly.
For the kernel this means get_fs(), for user this
means ASI_PNF.

Also, NGcopy_to_user.S was including U3memcpy.S instead
of NGmemcpy.S, oops :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:20 -08:00
David S. Miller 8591e30272 [SPARC64]: Niagara copy/clear page.
Happily we have no D-cache aliasing issues on these
chips, so the implementation is very straightforward.

Add a stub in bootup which will be where the patching
calls will be made for niagara/sun4v/hypervisor.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:54 -08:00
David S. Miller 398d108308 [SPARC64]: Niagara optimized memcpy() and copy_{to,from}_user().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:42 -08:00
David S. Miller 4da808c352 [SPARC64]: Fix bogus flush instruction usage.
Some of the trap code was still assuming that alternate
global %g6 was hard coded with current_thread_info().
Let's just consistently flush at KERNBASE when we need
a pipeline synchronization.  That's locked into the TLB
and will always work.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:22 -08:00
David S. Miller 4d000d5b96 [SPARC64]: Mark __ex_table section correctly.
We must use the "a" (allocate) attribute every time we
emit an entry into the __ex_table section.

For consistency, use "a" instead of #alloc which is some
Solaris compat cruft GNU as provides on Sparc.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 23:23:56 -08:00