1212 Commits

Author SHA1 Message Date
Unknown W. Brackets
212e730e98 samplerjit: Fix some Linux register issues. 2022-01-22 00:14:15 -08:00
Unknown W. Brackets
c0c3f7284a softgpu: Avoid flush texturing from stride.
This generally detects overlap more accurately using a dirty rectangles
approach.  Also detects render to self much more accurately, including
with depth.
2022-01-20 18:39:01 -08:00
Unknown W. Brackets
dec0ba7b79 softgpu: Flush framebuf only on change.
Sometimes games are reasserting the same framebuf, which was causing
unnecessary flushing.
2022-01-20 17:02:23 -08:00
Unknown W. Brackets
c4c54730bf softgpu: Remove bin asserts.
These are active in release and used in tight loops.
2022-01-20 16:59:38 -08:00
Unknown W. Brackets
55c11425e4 softgpu: Use persistent bin task state.
It's constant, so it's better to avoid the copying and allocation.  A
small win, but removes new from the profile.
2022-01-20 16:58:43 -08:00
Unknown W. Brackets
3e4d768e7a softgpu: Pack vertexdata a bit better.
This reduces the BinItem size by 15%.
2022-01-19 23:17:09 -08:00
Unknown W. Brackets
6ec819878a samplerjit: Reduce prolog/epilog spill.
Track reg usage so we only push/pop what we need.
2022-01-19 00:03:59 -08:00
Unknown W. Brackets
357e2e9d68 softjit: Simplify constant writes. 2022-01-19 00:03:59 -08:00
Unknown W. Brackets
c2985bca31 softjit: Centralize some common funcs from sampler.
No need to duplicate this code.
2022-01-19 00:03:59 -08:00
Henrik Rydgård
b1d158e3e6 Merge pull request #15327 from unknownbrackets/softjit-const
softjit: Switch to constant pool for draw pixel
2022-01-18 09:08:44 +01:00
Unknown W. Brackets
ac2b96cec0 softjit: Switch to constant pool.
This is simpler without RIP access checks, and tends to be fast.
2022-01-17 19:50:37 -08:00
Unknown W. Brackets
0ba2d05da5 samplerjit: Simplify AVX shift-copies.
These have been the most common and the fallback is safe.  Let's just add
a helper.
2022-01-17 15:15:36 -08:00
Henrik Rydgård
4ea1c08551 Merge pull request #15323 from unknownbrackets/softgpu-opt2
softgpu: Guide more SSE light factor handling
2022-01-17 15:56:46 +01:00
Unknown W. Brackets
7218fbbe97 softgpu: Guide more SSE light factor handling.
Missed these others in computed state.  Helps mostly to do this inside
Process().
2022-01-17 06:25:52 -08:00
Unknown W. Brackets
abef17caca softgpu: Simplify mask check.
This performs a bit better.
2022-01-16 23:40:57 -08:00
Unknown W. Brackets
89bc87a388 softgpu: Reduce copying during clipping.
Common case is nothing needs to be clipped.
2022-01-16 23:33:46 -08:00
Henrik Rydgård
128e2fa14e Merge pull request #15318 from unknownbrackets/softgpu-opt
softgpu: Heuristic to avoid over-draining
2022-01-17 07:43:34 +01:00
Henrik Rydgård
5c15054181 Merge pull request #15321 from unknownbrackets/debugger
Debugger: Fix crash in software renderer
2022-01-17 07:41:59 +01:00
Henrik Rydgård
e603e201da Merge pull request #15320 from unknownbrackets/softgpu-flush
softgpu: Fix block transfer flush detection
2022-01-17 07:41:01 +01:00
Unknown W. Brackets
653c036ac8 Debugger: Fix crash in software renderer.
The clut isn't set by sampler state, it's set normally by the binner.
2022-01-16 21:53:55 -08:00
Unknown W. Brackets
206d586c1f softgpu: Fix block transfer flush detection.
Fixes video graphics in Gods Eater Burst.
2022-01-16 21:40:19 -08:00
Unknown W. Brackets
fcc3b7684e softgpu: Use SSE in lighting param computation.
The compiler couldn't figure this out.  Halves time in this func.
2022-01-16 21:31:53 -08:00
Unknown W. Brackets
73c143c44c softgpu: Precompute some of screen space multiply.
This at least avoids the shifts and makes it easier to vectorize.
Only helps a little.
2022-01-16 21:31:53 -08:00
Unknown W. Brackets
31745110e8 softgpu: Premultiply matrix transforms.
Where possible, we can skip some multiplies per vertex.
2022-01-16 21:31:52 -08:00
Unknown W. Brackets
12a4c63fc7 softgpu: Precompute state for vertex transform.
Doesn't help a ton, but with lots of verts can improve a percent or two.
2022-01-16 21:31:52 -08:00