Henrik Rydgard
|
4b25afb7b4
|
x86 Jit: SIMD some more instructions
|
2014-11-26 22:30:06 +01:00 |
|
Henrik Rydgard
|
804de50711
|
x86 jit: SIMD-ify VFPU register file writebacks where possible
|
2014-11-26 01:33:05 +01:00 |
|
Henrik Rydgard
|
b3c8a82c49
|
x86 jit: SIMD-ify some more
|
2014-11-25 23:56:46 +01:00 |
|
Henrik Rydgard
|
b5ee47a80c
|
x86 jit: SIMD-ify lv.q and sv.q
|
2014-11-25 23:28:29 +01:00 |
|
Henrik Rydgård
|
4db6b7f3e2
|
SIMD-ify a couple instructions a bit
|
2014-11-25 22:47:26 +01:00 |
|
Unknown W. Brackets
|
5347431c20
|
x86jit: Initial simd for VecDo3(). Broken.
I'm not sure why/where it's broken...
|
2014-11-16 13:33:15 -08:00 |
|
Unknown W. Brackets
|
2862367927
|
x86jit: Add force-non-simd to all current ops.
Unless they already use MapRegs, because that will automatically handle it.
|
2014-11-16 13:33:12 -08:00 |
|
Henrik Rydgard
|
bfcd3690b6
|
x86 jit: Fix+enable quaternion product, optimize "sw zero, *"
|
2014-11-16 18:37:38 +01:00 |
|
Henrik Rydgard
|
1c78e29c79
|
x86 jit: For clarity, use TEMPREG where it doesn't matter that it's EAX.
Might have missed a few places.
|
2014-11-16 17:38:26 +01:00 |
|
Henrik Rydgard
|
8b90f881b8
|
x86 jit: A tiny optimization and a tiny bugfix
|
2014-11-16 16:46:35 +01:00 |
|
Unknown W. Brackets
|
096b41cceb
|
x86jit: Interleave reg usage in vcmp.
|
2014-11-10 23:22:04 -08:00 |
|
Unknown W. Brackets
|
0e1aa35e84
|
x86jit: Just do the ES/NS compare once.
|
2014-11-10 23:04:38 -08:00 |
|
Unknown W. Brackets
|
2758e8fa3c
|
x86jit: Optimize vcmp for single and simd.
|
2014-11-10 23:04:37 -08:00 |
|
Unknown W. Brackets
|
27d8108bb2
|
x86jit: Optimize loads of 0 into fp regs.
|
2014-11-08 18:41:16 -08:00 |
|
Unknown W. Brackets
|
57caa95273
|
x86jit: Implement round.w.s and friends.
They are not terribly fast, though, updating MXCSR.
|
2014-11-08 17:59:38 -08:00 |
|
Unknown W. Brackets
|
671dee85c7
|
x86jit: Micro optimize vi2f a little bit.
This didn't help overall perf much but micro benchmarks are better.
|
2014-11-08 13:07:01 -08:00 |
|
Unknown W. Brackets
|
c29b126357
|
x86jit: Oops, can't have an imm here.
|
2014-11-08 12:41:48 -08:00 |
|
Unknown W. Brackets
|
c0be19edb6
|
x86jit: Simplify vavg a bit.
|
2014-11-08 12:40:04 -08:00 |
|
Unknown W. Brackets
|
761e269e5f
|
x86jit: Avoid some regcache pollution.
|
2014-11-08 12:38:08 -08:00 |
|
Unknown W. Brackets
|
bc7497857a
|
x86jit: Micro optimize vi2x a bit with ssse3/sse4.
Both are small wins.
|
2014-11-08 12:13:26 -08:00 |
|
Unknown W. Brackets
|
0e646f748a
|
x86jit: Implement vi2x instructions.
Also, my opcodes were wrong in the test (shifted the pair bit the wrong way, oops.)
AFAICT, there's no reason PSRAD/etc. were not encoding REX...
|
2014-11-08 12:13:26 -08:00 |
|
Unknown W. Brackets
|
ddc90ee550
|
x86jit: Implement vfad and vavg.
|
2014-11-08 12:13:25 -08:00 |
|
Unknown W. Brackets
|
5ae43defd9
|
Oops, these should be signed.
|
2014-11-08 09:39:17 -08:00 |
|
Unknown W. Brackets
|
316e923b40
|
x86jit: Implement other forms of vx2i.
Gains 3.2% performance in Grand Knights History.
|
2014-11-08 00:39:40 -08:00 |
|
Unknown W. Brackets
|
097a483d77
|
x86jit: Micro optimize vs2i a bit.
|
2014-11-06 22:45:54 -08:00 |
|