Henrik Rydgård
c2bf09744a
SoftGPU: Fix refactoring mistake where we could return an uninitialized value. Oops.
2023-09-15 10:01:28 +02:00
Henrik Rydgård
10f93875c6
Fix the semantics of DenseHashMap to be consistent even when inserting nulls
2023-09-11 12:07:18 +02:00
Henrik Rydgård
9a7579f8fa
Typo fix, fixes #18038
2023-09-06 17:10:17 +02:00
Henrik Rydgård
f0fd9e85aa
Try dirtying CULL_PLANES in Execute_BoundingBox in SoftGPU
2023-07-30 18:35:18 +02:00
Henrik Rydgård
fd656c629d
More dirtying
2023-07-30 17:45:19 +02:00
Unknown W. Brackets
8f404a1961
softgpu: Fix minor typo.
2023-07-25 19:42:36 -07:00
Unknown W. Brackets
d6a5e84db5
softgpu: Fix worldpos skipping.
...
Oops, was reversed. We need worldpos for non-directional lights.
2023-07-16 10:59:44 -07:00
Unknown W. Brackets
47c29e0874
sopftgpu: Disable lights if all else disabled.
...
Tiny gain, but seeing it happen so might as well.
2023-07-16 10:31:58 -07:00
Unknown W. Brackets
d5b4c98f96
softgpu: Reduce some non-SIMD lighting math.
...
Small perf improvement for vertex/lighting heavy (i.e. 3D) scenes.
2023-07-16 10:31:44 -07:00
fp64
b0f71e08f4
Simplify projective texcoord calculation
...
As mentioned in https://github.com/hrydgard/ppsspp/issues/17613#issuecomment-1613583152 .
2023-07-03 10:59:09 -04:00
fp64
cd9f01c4df
Remove SSE4 path from Vec4<int>::operator*
2023-06-30 22:07:26 -04:00
fp64
f133739cd0
Replace some signed divison in SoftGPU
...
This also adds a few bitwise operations to Vec4<int> and further
SIMDifies it.
Also, fixes unrelated warning.
2023-06-29 16:43:21 -04:00
fp64
436b49c4f2
Streamline x86 SSE workaround
...
Seems clearer than using #ifdef's at each site. Also rationale
is clearly spelled out, one 'Go to definition' away from any instance.
2023-06-27 00:30:01 -04:00
Unknown W. Brackets
fedb92b0e9
softgpu: Ensure early depth test uses SIMD.
2023-06-25 10:18:21 -07:00
Henrik Rydgård
08d578dce9
Merge pull request #17618 from unknownbrackets/softgpu-opt-cast
...
Optimize casts in softgpu
2023-06-25 07:55:30 +02:00
Henrik Rydgård
ec92675c5e
Merge pull request #17619 from unknownbrackets/softgpu-opt-z
...
softgpu: Improve Z interpolation SIMD
2023-06-25 07:55:03 +02:00
Unknown W. Brackets
d42642edd2
softgpu: Improve Z interpolation SIMD.
2023-06-24 22:17:11 -07:00
Unknown W. Brackets
ae9d34370e
softgpu: Move wsum_recip out of the triangle loop.
...
Seems like a small benefit, but not seeing any issues from this.
Noticed by fp64.
2023-06-24 12:38:05 -07:00
fp64
159faaa2ec
softgpu: Optimize (bi-)linear texture filtering
...
Seeing as SampleLinearLevel is near the top in the profiler,
optimize actual bilinear filtering using SSE2. Solid win in the
synthetic benchmark (https://godbolt.org/z/fqh3xvbGx , also doubles
as correctness check), no visible difference in actual PPSSPP.
Note: profiler suggests that hot part of SampleLinearLevel is
elsewhere.
2023-06-21 20:02:34 +03:00
Unknown W. Brackets
efd8565ffe
Merge pull request #17592 from fp64/anymask-movemask
...
Use _mm_movemask_ps for AnyMask
2023-06-17 09:48:09 -07:00
fp64
ab85c46161
Use _mm_movemask_ps for AnyMask
...
Probably very minor speed improvement, but it's rather neat.
2023-06-17 01:05:02 -04:00
Henrik Rydgård
5b4fa06b00
Revert Dot33 on 32-bit x86 only. See #17584
2023-06-16 23:43:33 +02:00
fp64
f0d844a5a3
Convert Dot33 to SSE2
...
Simpler, lower requirements, and doesn't seem to hurt speed. See #17571 .
2023-06-14 22:02:50 -04:00
Henrik Rydgård
963ca50ba7
Merge pull request #17567 from hrydgard/uvscale-as-argument
...
Pass uvScale in as a fourth argument to the vertex decoder
2023-06-13 09:49:31 +02:00
Unknown W. Brackets
a7fa37d114
softgpu: Use SIMD more for dot products.
2023-06-12 19:54:32 -07:00