Henrik Rydgård
c794f4bd41
Add an unrelated comment and some casts
2024-06-05 08:35:09 +02:00
Henrik Rydgård
737ec3e90b
NEON buildfix
2023-11-28 18:40:10 +01:00
Henrik Rydgård
4ec2d76bc9
NEON-optimize matrix tranposes
2023-11-27 23:57:26 +01:00
Henrik Rydgård
45aae7b9da
ARM32: Backport a lot of previously 64-bit-only NEON optimizations to ARM32.
2023-11-27 23:51:10 +01:00
Unknown W. Brackets
d5b4c98f96
softgpu: Reduce some non-SIMD lighting math.
...
Small perf improvement for vertex/lighting heavy (i.e. 3D) scenes.
2023-07-16 10:31:44 -07:00
fp64
cd9f01c4df
Remove SSE4 path from Vec4<int>::operator*
2023-06-30 22:07:26 -04:00
fp64
f133739cd0
Replace some signed divison in SoftGPU
...
This also adds a few bitwise operations to Vec4<int> and further
SIMDifies it.
Also, fixes unrelated warning.
2023-06-29 16:43:21 -04:00
Unknown W. Brackets
dfe113e846
Merge pull request #17634 from fp64/macro-x86-loadu
...
Streamline x86 SSE workaround
2023-06-27 23:01:41 -07:00
M4xw
99ce3125df
[Softgpu] Fix AArch64 oversight
2023-06-27 17:20:11 +02:00
fp64
436b49c4f2
Streamline x86 SSE workaround
...
Seems clearer than using #ifdef's at each site. Also rationale
is clearly spelled out, one 'Go to definition' away from any instance.
2023-06-27 00:30:01 -04:00
Henrik Rydgård
08d578dce9
Merge pull request #17618 from unknownbrackets/softgpu-opt-cast
...
Optimize casts in softgpu
2023-06-25 07:55:30 +02:00
Unknown W. Brackets
15b66ba6c0
softgpu: Make SIMD on x86_32 a bit safer.
2023-06-24 14:49:23 -07:00
Unknown W. Brackets
795de9b164
softgpu: Use SIMD for more Vec4 casts.
...
A number of these were falling back to some pretty terrible code.
Thanks to fp64 for noticing.
2023-06-24 12:36:44 -07:00
Unknown W. Brackets
a7fa37d114
softgpu: Use SIMD more for dot products.
2023-06-12 19:54:32 -07:00
Unknown W. Brackets
b55dbdab7f
softgpu: Use NEON for some color conv.
2023-01-07 19:06:34 -08:00
Unknown W. Brackets
a7b7bf7826
Global: Set many read-only params as const.
...
This makes what they do and which args to use clearer, if nothing else.
2022-12-10 21:13:36 -08:00
Henrik Rydgård
37b0c90a2d
Silence address-sanitizer warnings in Math3D.h on ARM64 (not very serious but good to fix)
2022-12-09 23:47:42 +01:00
Unknown W. Brackets
b2e6a086dc
softgpu: Reduce size of VertexData texture coords.
...
There's no real benefit to this with only two values.
Not much of a gain perf wise, but still good to transfer less data.
2022-09-12 21:10:46 -07:00
Unknown W. Brackets
b90fc7137f
softgpu: Correct accuracy of fog calculation.
...
This matches values from a PSP exactly, with the help of immediate mode
vertex values (since this directly allows specifying the fog factor
without any floating point math.)
2022-09-11 08:24:40 -07:00
Henrik Rydgård
cd92151de7
Add ARM64_NEON compile arch flag
...
This allows doing ARM64 builds without NEON support, and allows simplifying some checks.
2022-06-25 07:29:20 +02:00
Unknown W. Brackets
163fa352e8
softgpu: Avoid some unaligned access on x86_32.
2022-03-13 12:44:58 -07:00
Unknown W. Brackets
8a00c2d233
GPU: Allow gcc/clang/icc runtime SSE4 usage.
...
All our builds before were only using SSE4 in jit...
2022-01-08 17:09:09 -08:00
Unknown W. Brackets
43f71884ee
softgpu: Clarify internal matrix multiply usage.
2022-01-07 17:53:24 -08:00
Unknown W. Brackets
e7d66f2029
softgpu: Reuse SSE/NEON matrix code.
2022-01-06 21:19:47 -08:00
Unknown W. Brackets
079b67e7ed
softgpu: Use common SIMD matrix multiplies.
2022-01-06 21:19:47 -08:00