Unknown W. Brackets
355bad666c
softjit: Optimize common case bloom blending.
...
Bloom often uses fixed ONE + ONE blending, which is a lot less work for us.
Bloom also tends to run over the same pixels again and again, so the
savings add up.
2022-01-02 08:47:04 -08:00
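For readers unfamiliar with the ONE + ONE case mentioned in the bloom commit above: with both blend factors fixed at one, the blend reduces to a saturating add per channel, with no multiplies. The following is a minimal sketch of that idea, not the softjit code itself. The function name `BlendAddOneOne`, the packed 8-bit channel layout, and keeping the destination's top byte unchanged are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not taken from the source tree.
// Computes src * ONE + dst * ONE on the three low 8-bit channels,
// saturating each channel at 255.
// Assumption: the top byte of dst is passed through untouched.
inline uint32_t BlendAddOneOne(uint32_t src, uint32_t dst) {
    uint32_t out = dst & 0xFF000000u;
    for (int shift = 0; shift < 24; shift += 8) {
        uint32_t s = (src >> shift) & 0xFFu;
        uint32_t d = (dst >> shift) & 0xFFu;
        // No multiply needed when both factors are one, only add and clamp.
        out |= std::min<uint32_t>(s + d, 255u) << shift;
    }
    return out;
}
```

A general blend path has to compute a factor for each side and multiply; the fixed case skips both steps, which is presumably where the saving comes from.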
Unknown W. Brackets
496545e55c
softgpu: Add code for tracking GPU writes.
...
Unfortunately, it has a pretty noticeable speed impact, even at the basic
"assume everything's written" level. Compiled off by default, but at
least it's there.
Doesn't account for tests (e.g. the alpha test skipping a write), so it's
still not perfectly accurate.
2022-01-02 08:28:30 -08:00
Henrik Rydgård
cb1f26122d
Merge pull request #15269 from unknownbrackets/softgpu-opt
...
softgpu: Reduce interpolation if not needed
2022-01-02 09:47:19 +01:00
Henrik Rydgård
da38c027b5
Merge pull request #15268 from unknownbrackets/samplerjit-nearest
...
Implement nearest in samplerjit, like linear
2022-01-02 09:46:29 +01:00
Unknown W. Brackets
025ac99f2f
softgpu: Reduce interpolation if not needed.
...
About 3% gain in some areas.
2022-01-01 18:34:04 -08:00
Unknown W. Brackets
40240be91c
samplerjit: Update nearest args, temp disable jit.
...
This temporarily disables jit for nearest, but refactors to use the new
arg structure. It now matches linear.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
06e954fe2a
samplerjit: Create a separate fetch func.
...
This allows nearest to become more similar to linear, where it applies the
texture function.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
d41e42d247
softgpu: Correct off-by-one scissor mask.
...
Fixes Brave Story in the software renderer. Was overwriting display list
data in the stride gap.
2022-01-01 16:42:36 -08:00
Unknown W. Brackets
b35ca3d472
softgpu: Cleanup min/max tri range handling.
...
The previous code looked like it had off-by-one errors. This is simpler.
2022-01-01 16:42:36 -08:00
Unknown W. Brackets
12405709f0
softgpu: Skip processing scissored triangles.
...
If only one side was scissored (common), we might even put it on a thread,
which added a lot of overhead. Gives a 3-4% improvement in some places.
2022-01-01 16:40:34 -08:00
Unknown W. Brackets
33e9841a4a
softgpu: Skip zero size triangles.
...
These were drawing before, incorrectly, which caused artifacts.
Noticeable in Blade Dancer.
2021-12-31 00:20:12 -08:00
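One common way to detect a zero-size triangle like the ones skipped in the commit above is to check whether its signed area is zero. This is only a sketch under that assumption. The commit does not say which test it uses, and `Vec2i`, `TwiceSignedArea`, and `IsZeroSizeTriangle` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical screen-space vertex type, for illustration only.
struct Vec2i {
    int x;
    int y;
};

// Twice the signed area of triangle (a, b, c), computed as a 2D cross product.
// The result is zero when the vertices are collinear or coincident.
inline int64_t TwiceSignedArea(Vec2i a, Vec2i b, Vec2i c) {
    return static_cast<int64_t>(b.x - a.x) * (c.y - a.y) -
           static_cast<int64_t>(b.y - a.y) * (c.x - a.x);
}

// A triangle with zero area covers no pixels and can be skipped before
// any per-pixel work is set up.
inline bool IsZeroSizeTriangle(Vec2i a, Vec2i b, Vec2i c) {
    return TwiceSignedArea(a, b, c) == 0;
}
```

Rejecting these early avoids both the wasted setup and any stray pixels a rasterizer might emit for a degenerate edge setup.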
Unknown W. Brackets
4bd94a4e5e
samplerjit: Pass funcs as an argument.
...
Computing the ID shows up in some profiles, so we want to avoid computing
it per thread/invocation.
2021-12-29 07:11:53 -08:00
Unknown W. Brackets
74eb450e76
samplerjit: Move texture function into jit.
...
Could also do this for nearest. That might end up with a third set of
functions there for a direct sample lookup (for debug funcs).
2021-12-28 17:52:17 -08:00
Unknown W. Brackets
940e6bb1d7
samplerjit: Lookup both mip tex values.
2021-12-28 16:22:54 -08:00
Unknown W. Brackets
6b55d328e5
samplerjit: Use regcache for linear filtering.
...
This makes it easier to reuse for mipmap filtering.
2021-12-28 15:37:25 -08:00
Unknown W. Brackets
a4558a5736
samplerjit: Take texptr/bufw as arrays.
...
Prep for moving mip map sampling into linear.
2021-12-28 12:04:16 -08:00
Unknown W. Brackets
a84accf713
samplerjit: Move S/T calculation into jit.
...
Gives a pretty decent 5-10% improvement in many places.
2021-12-28 09:58:23 -08:00
Unknown W. Brackets
9cc0883d53
softgpu: Correct non-SSE T clamp.
2021-12-27 15:31:37 -08:00
Unknown W. Brackets
39d5b1c221
softgpu: Reduce mipmap fraction to 4 bits.
...
For CONST (and SLOPE with flat w), this produces accurate values.
SLOPE is still wrong in its handling of w, and AUTO seems to calculate
using a different and less accurate ramp. But they both produce values
with 16 steps, in any case.
2021-12-27 11:37:33 -08:00
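To illustrate what a 4-bit mipmap fraction means in the commit above: the blend weight between two mip levels can take only 16 distinct values. The sketch below assumes truncation when quantizing and a simple integer lerp; the commit does not specify either detail, and `QuantizeMipFrac4` and `BlendMipChannel` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical: quantize a mip level fraction in [0, 1) to 4 bits (0..15).
// Assumption: the fraction is truncated, not rounded.
inline int QuantizeMipFrac4(float frac) {
    int f = static_cast<int>(frac * 16.0f);
    return f < 0 ? 0 : (f > 15 ? 15 : f);
}

// Blend one 8-bit channel sampled from two adjacent mip levels using the
// 4-bit fraction as the weight of the second level.
inline uint8_t BlendMipChannel(uint8_t level0, uint8_t level1, int frac4) {
    return static_cast<uint8_t>((level0 * (16 - frac4) + level1 * frac4) >> 4);
}
```

With this scheme a fraction of 0.5 maps to step 8, so the two levels are weighted equally.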
Unknown W. Brackets
d6b6ef4cb1
softgpu: Correct nearest filtering too.
...
Turns out to have the same behavior as linear, when it comes to the
subpixel offset.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
1dfaea9062
softgpu: Remove no longer possible report.
...
Also, how this behaves is now known.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
75f105f84b
softgpu: Make linear filtering more accurate.
...
This matches tests for various u/v offsets and x/y subpixel offsets.
Mipmaps are probably still wrong.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
b00a66e34c
samplerjit: Pass u/v coords as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3180e6c043
softgpu: Correct alpha on add + invalid texfuncs.
2021-12-05 16:28:37 -08:00
Unknown W. Brackets
325a1f75aa
softgpu: Match texenv blend texfunc accurately.
2021-12-05 16:09:26 -08:00