Unknown W. Brackets
fc3688d273
samplerjit: Small AVX optimization to modulate.
Only gives about 0.5% but it's still something.
2021-12-31 08:10:04 -08:00
Henrik Rydgård
244b0a86f6
Merge pull request #15262 from unknownbrackets/samplerjit-vec
samplerjit: Use SSSE3/SSE4 in linear filtering
2021-12-31 09:29:59 +01:00
Unknown W. Brackets
33e9841a4a
softgpu: Skip zero size triangles.
These were drawing before, incorrectly, which caused artifacts.
Noticeable in Blade Dancer.
2021-12-31 00:20:12 -08:00
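The zero size triangle skip described in the entry above can be illustrated with a degenerate-area test. This is only a minimal sketch, not the code from the commit: `ScreenPos`, `TwiceSignedArea`, and `IsZeroSizeTriangle` are hypothetical names, and it assumes "zero size" means the three vertices enclose no area in integer screen coordinates.

```cpp
#include <cstdint>

// Hypothetical vertex position in integer screen (subpixel) units.
struct ScreenPos {
	int32_t x;
	int32_t y;
};

// Twice the signed area of triangle (a, b, c), from the 2D cross product
// of the edges a->b and a->c. Widened to 64 bits to avoid overflow.
inline int64_t TwiceSignedArea(const ScreenPos &a, const ScreenPos &b, const ScreenPos &c) {
	int64_t abx = (int64_t)b.x - a.x;
	int64_t aby = (int64_t)b.y - a.y;
	int64_t acx = (int64_t)c.x - a.x;
	int64_t acy = (int64_t)c.y - a.y;
	return abx * acy - aby * acx;
}

// Assumption: a triangle with no area covers no pixels and can be skipped
// before rasterization.
inline bool IsZeroSizeTriangle(const ScreenPos &a, const ScreenPos &b, const ScreenPos &c) {
	return TwiceSignedArea(a, b, c) == 0;
}
```

A rasterizer could call `IsZeroSizeTriangle` before setting up the triangle and return early when it is true.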
Unknown W. Brackets
1addf84e90
samplerjit: Use SSSE3/SSE4 in linear filtering.
2021-12-30 23:22:56 -08:00
Unknown W. Brackets
147b81d6f7
x64jit: Add AVX/AVX2 encodings.
Also fix the FMA double encodings, which were passing W incorrectly.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
4bd94a4e5e
samplerjit: Pass funcs as an argument.
Computing the ID shows up in some profiles, so avoid computing it per
thread/invocation.
2021-12-29 07:11:53 -08:00
Unknown W. Brackets
28cfbe0e5a
samplerjit: Add an alternate profiling method.
This is more useful to group common operations together for profiling.
2021-12-29 07:11:39 -08:00
Unknown W. Brackets
3aedea89eb
samplerjit: Correct level lookup offset.
2021-12-29 07:09:36 -08:00
Unknown W. Brackets
bf06342f9d
samplerjit: Minor SSE4 optimizations.
These seem to be a bit faster.
2021-12-29 07:07:35 -08:00
Unknown W. Brackets
631706a8ba
samplerjit: Set stackArgPos_ early.
Unfortunately, this has to match the value set lower...
2021-12-28 20:21:21 -08:00
Unknown W. Brackets
74eb450e76
samplerjit: Move texture function into jit.
Could also do this for nearest, which might end up with a third set of
functions there for a direct sample lookup (for debug funcs).
2021-12-28 17:52:17 -08:00
Unknown W. Brackets
940e6bb1d7
samplerjit: Lookup both mip tex values.
2021-12-28 16:22:54 -08:00
Unknown W. Brackets
6b55d328e5
samplerjit: Use regcache for linear filtering.
This makes it easier to reuse for mipmap filtering.
2021-12-28 15:37:25 -08:00
Unknown W. Brackets
cdf14c8579
samplerjit: Calculate mip level U/V/offsets.
Not actually doing the sampling for the second mip level in the single jit
pass yet, but close.
2021-12-28 14:12:58 -08:00
Unknown W. Brackets
a4558a5736
samplerjit: Take texptr/bufw as arrays.
Prep for moving mip map sampling into linear.
2021-12-28 12:04:16 -08:00
Unknown W. Brackets
4864850b3b
samplerjit: Handle mipmap width/height in S/T calc.
2021-12-28 11:29:29 -08:00
Unknown W. Brackets
a84accf713
samplerjit: Move S/T calculation into jit.
Gives a pretty decent 5-10% improvement in many places.
2021-12-28 09:58:23 -08:00
Unknown W. Brackets
476dfdf731
samplerjit: Add more bits for S/T, skip multiply.
For now, we're not using those other bits yet.
2021-12-27 18:24:37 -08:00
Unknown W. Brackets
9cc0883d53
softgpu: Correct non-SSE T clamp.
2021-12-27 15:31:37 -08:00
Unknown W. Brackets
39d5b1c221
softgpu: Reduce mipmap fraction to 4 bits.
For CONST (and SLOPE with flat w), this produces accurate values.
SLOPE is still wrong in its handling of w, and AUTO seems to calculate
using a different and less accurate ramp. But they both produce values
with 16 steps, in any case.
2021-12-27 11:37:33 -08:00
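The 4-bit mipmap fraction from the entry above gives 16 blend steps between two adjacent mip levels. The following is a rough sketch of that idea only. `MipLevel`, `QuantizeMipLevel`, and `BlendChannel` are hypothetical names, and truncating (rather than rounding) to 1/16 steps is an assumption; the commit message states only that the result has 16 steps.

```cpp
#include <cstdint>

// Hypothetical result type: integer mip level plus a 4-bit fraction (0..15).
struct MipLevel {
	int level;
	uint8_t frac;
};

// Quantize a floating point level of detail to 1/16 steps, clamped to
// [0, maxLevel]. Truncation is an assumption made for this sketch.
inline MipLevel QuantizeMipLevel(float lod, int maxLevel) {
	if (lod <= 0.0f)
		return {0, 0};
	if (lod >= (float)maxLevel)
		return {maxLevel, 0};
	int fixed = (int)(lod * 16.0f);
	return {fixed >> 4, (uint8_t)(fixed & 15)};
}

// Blend one 8-bit channel of the two mip samples using the 4-bit fraction.
// frac == 0 returns c0 exactly; each step moves 1/16 of the way toward c1.
inline uint8_t BlendChannel(uint8_t c0, uint8_t c1, uint8_t frac) {
	return (uint8_t)((c0 * (16 - frac) + c1 * frac) >> 4);
}
```

For example, a level of detail of 1.5 maps to level 1 with fraction 8, so the sample from level 1 and the sample from level 2 are weighted equally.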
Unknown W. Brackets
d6b6ef4cb1
softgpu: Correct nearest filtering too.
Turns out to have the same behavior as linear, when it comes to the
subpixel offset.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
1dfaea9062
softgpu: Remove no longer possible report.
Also, how this behaves is known now.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
75f105f84b
softgpu: Make linear filtering more accurate.
This matches tests for various u/v offsets and x/y subpixel offsets.
Mipmaps are probably still wrong.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3cd19b02ac
samplerjit: Handle unswizzled offsets too.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
820361f34b
samplerjit: Calculate texel byte offset as vector.
2021-12-27 11:37:32 -08:00