605 Commits

Author SHA1 Message Date
Henrik Rydgård
e0e29a1556 Merge pull request #16197 from hrydgard/more-uniform-optimization
More uniform optimization, fixes
2022-10-12 01:00:27 +02:00
Unknown W. Brackets
416265431b GE Debugger: Display if tex is framebuf.
Rather than guessing based on size, let's show explicitly.
2022-10-10 22:35:42 -07:00
Henrik Rydgård
ee46f8992e Don't use fragmentShaderInt32Support as a replacement for checking for bitwiseOps
2022-10-10 18:02:19 +02:00
Unknown W. Brackets
55d5dc3834 GPU: Rename readback and buffer write operations.
Avoid download/upload and pack, which don't have clear directions.
2022-10-09 13:49:41 -07:00
Unknown W. Brackets
7d331f1928 GPU: Ignore depth when masked and ALWAYS.
Seen in NFS Pro Street, for example.  Shouldn't be interpreted as depth
usage.
2022-10-08 17:49:25 -07:00
Unknown W. Brackets
3aa863ec41 GPU: Clip against neg Z even w/o cull support.
This should fix rendering issues on Apple devices.
2022-10-06 00:34:02 -07:00
Henrik Rydgård
d6bd08cae7 Merge pull request #16162 from unknownbrackets/geo-shader
Implement negative Z clipping in geometry shader
2022-10-06 01:00:41 +02:00
Unknown W. Brackets
f24edbe8a8 Compat: Remove DisableRangeCulling.
This hack was used because culling previously incorrectly handled Z, which
was fixed in #14833.
2022-10-04 22:19:40 -07:00
Unknown W. Brackets
a1efed31b9 GPU: Use flags to fix triggered upload/download.
No longer using mirror hacks.
2022-10-03 20:17:25 -07:00
Unknown W. Brackets
878a049f60 GPU: Add dirtying for geo shader state.
Not yet used, but dirtied at the right times.
2022-10-02 07:42:16 -07:00
Unknown W. Brackets
24999e792a Ge: Report and save Edram translation value.
See #16126 for some details on its usage and effects.
2022-10-01 23:18:42 -07:00
Unknown W. Brackets
7cf05d0a46 GPU: Fix missed dirtying when fast loading tgen.
2022-09-29 22:31:07 -07:00
Unknown W. Brackets
904fb38003 GPU: Restore matrices with dirtying.
Without this, it's possible we might not notice or apply a change,
whether in uniforms or elsewhere.
2022-09-29 22:31:02 -07:00
Unknown W. Brackets
6b20c0318d softgpu: Correct matrix value update wrapping.
The values read back when saving a context or getting matrix data are set
differently than the actual values used for rendering.

This implements the wrapping and bleeding between matrices within softgpu,
but leaves hardware rendering to only use the rendering registers for
speed.
2022-09-27 22:29:55 -07:00
Unknown W. Brackets
95d2083f04 Ge: Move matrix reading into GPU.
Let's keep managing its state / registers internal.
2022-09-27 22:23:02 -07:00
Henrik Rydgård
07ca9e4656 Fold the "materialUpdate" flag into the light ubershader part.
This reduces the number of vertex shaders and thus pipelines by quite a
bit more in a few games, like Tekken and GoW, continuing the fight
against shader compile stutter.

The perf impact should be minimal if not positive due to fewer pipeline
changes.

GLES fixes

Make the vertex input declarations match (always declare fog input). Fixes D3D11 validation.

Tess fix
2022-09-26 12:06:16 +02:00
Henrik Rydgård
76f03d30bf Remove suspicious dirty flag
2022-09-26 11:21:40 +02:00
Henrik Rydgård
9d1355e137 Always do the vertex shader part of the fog computation.
In #16104, we drastically reduced the number of shader variants for
games that use flexible lighting setups. I looked at a few games and it
seems that a lot of games have the same shaders with fog on/off, while
fog is super cheap to compute. So let's just always do it, reducing
vertex shader variants further (though the number of pipelines will probably
remain the same, since we still specialize the fragment shader).

Might also be worth adding a dynamic bool for the fragment shader, but
if so, that should be done separately.
2022-09-26 09:30:54 +02:00
Henrik Rydgård
7adba20fac Experiment: Generate "Ubershaders" that can handle all lighting configurations
This drastically reduces the shader compile stutter that happens when a lot of new
light setups are created, like on the first punch in Tekken 6.

There's more stuff that might benefit from being made dynamic like this.
These branches are very cheap on modern GPUs since they're branching on
a uniform variable, so no divergence.

Only tested on Vulkan. I think we'll need to keep the old path too for
GPUs like Mali-450...
2022-09-25 23:35:01 +02:00
Unknown W. Brackets
fc39f042ae softgpu: Avoid unnecessary flushing for curves.
We don't need to flush all drawing between curves in softgpu; let them
queue up.
2022-09-22 00:08:38 -07:00
Henrik Rydgård
aa19712fc3 Unify depth texture and framebuffer fetch checks
2022-09-20 10:47:49 +02:00
Henrik Rydgård
09bcf3ec13 Unify range culling detection
2022-09-20 10:15:04 +02:00
Henrik Rydgård
1ae7c0132c Start unifying setting of the GPU feature flags, now that thin3d has feature detection.
2022-09-20 10:07:01 +02:00
Unknown W. Brackets
9f84cde062 GPU: Fix crash on imm vert triangles.
Was crashing because the frag and vert shaders didn't match up.
2022-09-18 06:16:26 -07:00
Unknown W. Brackets
6877ff1af2 softgpu: Fix state/continuation for imm prims.
2022-09-18 06:16:26 -07:00