Henrik Rydgård
10f93875c6
Fix the semantics of DenseHashMap to be consistent even when inserting nulls
2023-09-11 12:07:18 +02:00
fp64
cd9f01c4df
Remove SSE4 path from Vec4<int>::operator*
2023-06-30 22:07:26 -04:00
fp64
f133739cd0
Replace some signed division in SoftGPU
This also adds a few bitwise operations to Vec4<int> and further
SIMDifies it.
Also fixes an unrelated warning.
2023-06-29 16:43:21 -04:00
fp64
159faaa2ec
softgpu: Optimize (bi-)linear texture filtering
Seeing as SampleLinearLevel is near the top in the profiler,
optimize actual bilinear filtering using SSE2. Solid win in the
synthetic benchmark (https://godbolt.org/z/fqh3xvbGx , also doubles
as correctness check), no visible difference in actual PPSSPP.
Note: profiler suggests that hot part of SampleLinearLevel is
elsewhere.
2023-06-21 20:02:34 +03:00
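For reference, the bilinear blend that 159faaa2ec speeds up with SSE2 can be written in scalar form roughly as below. This is a minimal sketch, not PPSSPP's code: the name `BilinearRGBA`, the 4-bit fraction range, and the truncating rounding are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical scalar reference for a bilinear blend of four RGBA8888 texels.
// fracU and fracV are assumed to be 4-bit weights in [0, 15].
inline uint32_t BilinearRGBA(uint32_t topLeft, uint32_t topRight,
                             uint32_t bottomLeft, uint32_t bottomRight,
                             int fracU, int fracV) {
    uint32_t result = 0;
    // Process each 8-bit channel independently.
    for (int shift = 0; shift < 32; shift += 8) {
        int tl = (topLeft >> shift) & 0xFF;
        int tr = (topRight >> shift) & 0xFF;
        int bl = (bottomLeft >> shift) & 0xFF;
        int br = (bottomRight >> shift) & 0xFF;

        // Blend horizontally first (weights sum to 16)...
        int top = tl * (16 - fracU) + tr * fracU;
        int bottom = bl * (16 - fracU) + br * fracU;
        // ...then vertically, dividing out both weight sums (16 * 16 = 256).
        int channel = (top * (16 - fracV) + bottom * fracV) >> 8;

        result |= static_cast<uint32_t>(channel) << shift;
    }
    return result;
}
```

A SIMD version would compute the four channels in parallel lanes instead of looping over them; the arithmetic per channel stays the same.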
Герман Семенов
122b63b9a8
GPU: using if constexpr C++17 optimization
2023-04-02 16:36:37 +02:00
Unknown W. Brackets
cd3fc26190
samplerjit: Prevent thread local stale cache read.
If the generation count happens to match, would still get a stale pointer
and crash. Let's just make the generation count static so it always
increases.
2023-02-22 21:15:03 -08:00
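The hazard described in cd3fc26190 can be illustrated with a small sketch. Everything here is hypothetical (the `JitCache` class, its members, and the `AddOne`/`AddTwo` stand-ins are not PPSSPP's code); it only shows why a per-instance generation counter can collide with a thread-local "last lookup" entry, and why a static, always-increasing counter avoids that.

```cpp
#include <unordered_map>

using Func = int (*)(int);

// Stand-ins for compiled functions (hypothetical).
inline int AddOne(int x) { return x + 1; }
inline int AddTwo(int x) { return x + 2; }

class JitCache {
public:
    // Each instance takes a fresh generation from the shared static counter.
    JitCache() : generation_(++nextGeneration_) {}

    void Add(int key, Func func) { funcs_[key] = func; }

    // Clearing also bumps the generation so cached lookups are invalidated.
    void Clear() {
        funcs_.clear();
        generation_ = ++nextGeneration_;
    }

    Func Lookup(int key) {
        // Per-thread memo of the last lookup. It outlives any one JitCache,
        // so it must be validated against a generation that never repeats.
        thread_local int lastGeneration = 0;
        thread_local int lastKey = 0;
        thread_local Func lastFunc = nullptr;

        if (lastGeneration == generation_ && lastKey == key)
            return lastFunc;

        auto it = funcs_.find(key);
        Func func = it == funcs_.end() ? nullptr : it->second;
        lastGeneration = generation_;
        lastKey = key;
        lastFunc = func;
        return func;
    }

private:
    // Static: shared by all instances, so values only ever increase.
    // Not synchronized here; real multi-threaded code would need that.
    static int nextGeneration_;
    int generation_;
    std::unordered_map<int, Func> funcs_;
};

int JitCache::nextGeneration_ = 0;
```

If `generation_` instead started from a per-instance counter, a second `JitCache` created after the first was destroyed could reuse the same generation value, and `Lookup` would return the first cache's stale pointer.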
Unknown W. Brackets
62fe03dcb4
softgpu: Use NEON for texture blending.
2023-01-07 19:06:35 -08:00
Unknown W. Brackets
49f6c461ad
Reporting: Fix some header includes.
Particularly in Common, avoid including Core/Reporting.h.
2022-12-27 14:58:20 -08:00
Unknown W. Brackets
d9522a7ac5
softgpu: Avoid clear hazard for last cached funcs.
2022-12-06 21:23:56 -08:00
Unknown W. Brackets
eda3ce556e
softgpu: Avoid atomic structs.
Apparently we don't link libatomic and rather than fighting that, I'll
just use thread local values.
2022-12-06 20:35:07 -08:00
Unknown W. Brackets
400f6abf9a
softgpu: Optimize lookup of last jit func.
This is common (for example, maybe a pixel state is updated but sampler is
not), and reduces time spent in ComputeRasterizerState() quite a bit in
Darkstalkers, where jits are available (i.e. Intel currently.)
2022-12-06 19:16:19 -08:00
Unknown W. Brackets
87fb9eef37
softgpu: Remove std::function usage.
Wanted to avoid coupling these, but don't like the std::function
construct/destructs showing in profiles...
2022-12-06 19:15:57 -08:00
Unknown W. Brackets
38eb0a7a82
softgpu: Check for queued compile.
Rarely, we could have queued compiling the same one, which would crash on
a double insert.
2022-12-03 12:15:58 -08:00
Unknown W. Brackets
778a0487cb
softjit: Switch to DenseHashMap.
2022-12-02 20:59:13 -08:00
Unknown W. Brackets
4d06400548
softgpu: Fix compile hazard while running.
This prevents any clearing of cache while other threads may be using
previously cached funcs, and avoids wx exclusive hazards.
2022-11-20 12:04:02 -08:00
Unknown W. Brackets
ce51942508
softgpu: Correct WX-exclusive platform hazards.
Should mainly affect BSD at this point.
2022-11-20 10:55:35 -08:00
Unknown W. Brackets
79b1d1d35f
softgpu: Better approximate slope mip level mode (#16276)
* samplerjit: Remove unused x/y parameters.
Still need to tune the accuracy of filtering, but those were not the
right way.
* softgpu: Better approximate slope mip level mode.
This isn't exactly right, but it's closer.
* softgpu: Calculate auto from largest difference.
Direction shouldn't matter.
2022-10-23 10:15:43 +02:00
Unknown W. Brackets
167213c746
softgpu: Cache texture bufws at 16 bit.
Reducing the size of state a bit.
2022-09-12 21:57:00 -07:00
Unknown W. Brackets
90e009edb9
softgpu: Clamp/wrap textures at 512 pixels.
A texture larger than 512 is "valid", but simply wraps/clamps at 512.
Importantly, the texture coords are still calculated at the specified
size, which can be up to 32768.
2022-09-10 20:23:09 -07:00
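The behaviour described in 90e009edb9 (coordinates scaled by the specified size, addressing limited to 512) can be sketched as follows. This is an illustrative assumption, not the actual softgpu code: `TexelIndex`, `kMaxAddressedSize`, and the normalized-float input are hypothetical, and texture sizes are assumed to be powers of two.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical constant: addressing wraps/clamps at this many texels.
constexpr int kMaxAddressedSize = 512;

// u:        normalized texture coordinate
// texSize:  the size specified for the texture (assumed power of two,
//           may be larger than 512)
// clamp:    true for clamp mode, false for wrap mode
inline int TexelIndex(float u, int texSize, bool clamp) {
    // The coordinate is still scaled by the full specified size.
    int texel = static_cast<int>(std::floor(u * static_cast<float>(texSize)));

    // Only the clamp/wrap limit is capped at 512.
    int limit = std::min(texSize, kMaxAddressedSize);

    if (clamp)
        return std::max(0, std::min(texel, limit - 1));

    // Wrap: limit is a power of two, so masking keeps the low bits.
    return texel & (limit - 1);
}
```

For a texture specified as 1024 wide, `u = 0.75` scales to texel 768, which clamps to 511 or wraps to 256, rather than addressing texel 768 directly.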
Unknown W. Brackets
a88c9a0680
softgpu: Remove incorrect offsetting for X/Y.
2022-02-20 09:13:20 -08:00
Unknown W. Brackets
2479d52202
Global: Reduce includes of common headers.
In many places, string, map, or Common.h were included but not needed.
2022-01-30 16:35:33 -08:00
Unknown W. Brackets
d200ef40de
samplerjit: Compile sampler funcs together.
We can't have the cache clear between nearest/linear, because then we'll
call a bunch of int3's.
2022-01-29 20:28:20 -08:00
Unknown W. Brackets
99d6d569f0
samplerjit: Reduce transfers in nearest texel calc.
This benefits a few games, mostly where there's lots of UI or similar.
2022-01-24 21:28:04 -08:00
Unknown W. Brackets
c1e657ed47
samplerjit: Better vectorize UV linear calc.
Gives about 1-2% when mips are used.
2022-01-24 20:42:07 -08:00
Unknown W. Brackets
c2985bca31
softjit: Centralize some common funcs from sampler.
No need to duplicate this code.
2022-01-19 00:03:59 -08:00