Commit Graph

160 Commits

Author SHA1 Message Date
mitaclaw
ffc7bcfbf8 Emitters: Define Trivial Getters Inline 2024-07-21 21:35:29 -07:00
mitaclaw
28f8ab9e8a Arm64FloatEmitter: 64-Bit Assert In ABI_PushRegisters 2024-05-07 13:51:50 -07:00
Pokechu22
fbbfea8e8e Replace Common::BitCast with std::bit_cast 2024-05-03 18:43:51 -07:00
JosJuice
de33831783 Arm64Emitter: Fix shadowed variable
A lambda at the end of ARM64XEmitter::ParallelMoves named its parameter
`move`.
2024-04-21 16:20:59 +02:00
JosJuice
e140491fa9 Arm64Emitter: Fix incorrect assert category 2024-04-21 16:19:10 +02:00
JosJuice
b5c5371848 Arm64Emitter: Don't optimize ADD to MOV for SP
Unlike ADD (immediate), MOV (register) treats SP as ZR. Therefore the
ADDI2R optimization that was added in 67791d227c can't optimize ADD to
MOV when exactly one of the registers is SP.

There currently isn't any code in Dolphin that calls ADDI2R with
parameters that would trigger this case.
2024-02-06 21:58:07 +01:00
JosJuice
d8c78f2a92 JitArm64: Fix the "do nothing" cases of ANDI2R and friends
So somehow I forgot that AArch64 uses three-operand encoding...

Fixes a regression from 6303416201 which manifested in various ways,
such as incorrect rendering of the Wind Waker title screen.
2023-12-21 20:51:32 +01:00
JosJuice
dc60bc5f1e JitArm64: Improve codegen in ANDI2R and friends
The codegen for the functions themselves, not for the emitted code.

This seems to save 32 bytes per function. We also get rid of the oddity
we had before where ANDI2R would do masking for 32-bit operations but
the other functions wouldn't.
2023-12-17 18:13:32 +01:00
JosJuice
a8e1e1ae48 JitArm64: Optimize additional cases of ANDI2R and friends
Now we'll never need a scratch register for values that are all zeroes
or all ones.
2023-12-17 18:13:32 +01:00
JosJuice
6303416201 JitArm64: Optimize ANDI2R and friends to no-ops when possible
This optimizes rlwnmx with mask == 0xFFFFFFFF.
2023-12-17 18:13:30 +01:00
JosJuice
e0eb4ef5bc JitArm64: Use enum class for LogicalImm size parameter
This should prevent issues like the one fixed in the previous commit
from happening again.
2023-12-16 16:48:26 +01:00
JosJuice
67791d227c JitArm64: Add special zero case to ADDI2R
This normally doesn't reduce the instruction count, but is nonetheless
useful on CPUs that can do 0-cycle moves.
2023-12-01 21:31:11 +01:00
JosJuice
25ffb0dbfc JitArm64: Mask input to 32-bit ADDI2R
In case the input was a s32 that got sign extended as part of conversion
to u64.
2023-12-01 21:26:37 +01:00
JosJuice
c248a69268 JitArm64: Add utility for calling a function with arguments
With this, situations where multiple arguments need to be moved
from multiple registers become easy to handle, and we also get
compile-time checking that the number of arguments is correct.
2023-11-01 19:01:58 +01:00
JosJuice
6e88c44d5d Move SmallVector to Common
We had one implementation of this type of data structure in Arm64Emitter
and one in VideoCommon. This moves the Arm64Emitter implementation to
its own file and adds begin and end functions to it, so that VideoCommon
can use it.

You may notice that the license header for the new file is CC0. I wrote
the Arm64Emitter implementation of SmallVector, so this should be no
problem.
2023-08-22 13:19:49 +02:00
Lioncash
784a216927 Common/MathUtil: Move IntLog2 into MathUtil namespace
Gets this out of the global namespace.
2023-04-15 03:35:05 -04:00
JosJuice
b5b8871bce Arm64Emitter: Fix SHRN/SHRN2
The "vector shift by immediate" category encodes the shift amount for
right shifts as `size - amount`, whereas left shifts use `amount`.

We're not actually using SHRN/SHRN2 anywhere, which is why this has gone
undetected.
2022-12-10 11:20:23 +01:00
JosJuice
06e60ac327 JitArm64: Implement accurate NaNs
For quite some time now, we've had a setting on x86-64 that makes Dolphin
handle NaNs in a more accurate but slower way. There's only one game that
cares about this, Dragon Ball: Revenge of King Piccolo, and what that game
cares about more specifically is that the default NaN (or "generated NaN"
as I believe it's called in PowerPC documentation) is the same as on
PowerPC. On ARM, the default NaN is the same as on PowerPC, so for the
longest time we didn't need to do anything special to get Dragon Ball:
Revenge of King Piccolo working. However, in 93e636a I changed how we
handle FMA instructions in a way that resulted in the sign of NaNs
becoming inverted for nmadd/nmsub instructions, breaking the game.
To fix this, let's implement the AccurateNaNs setting, like on x86-64.
2022-12-03 19:41:32 +01:00
JosJuice
f45d3a6a2c JitArm64: Optimize ps_mergeXX
1. In some cases, ps_merge01 can be implemented using one instruction.
2. When we need two instructions for ps_merge01, it's best to start with
   a MOV to avoid false dependencies on the destination register.
3. ps_merge10 can be implemented using a single EXT instruction.
2022-11-26 18:14:58 +01:00
JosJuice
4dbf0b8e90 JitArm64: Reimplement Force25BitPrecision
The previous implementation of Force25BitPrecision was essentially a
translation of the x86-64 implementation. It worked, but we can make a
more efficient implementation by using an AArch64 instruction I don't
believe x86-64 has an equivalent of: URSHR. The latency is the same as
before, but the instruction count and register count are both reduced.
2022-10-22 10:03:52 +02:00
JosJuice
84375a91d9 Arm64Emitter: Combine immh and immb for Emit(Scalar)ShiftImm
This simplifies the callers of EmitShiftImm and EmitScalarShiftImm.
2022-10-19 20:20:39 +02:00
Pokechu22
a34d5e5960 Arm64Emitter: Add additional alignment assertions
Before, unaligned values would be silently ignored in most cases.
2022-09-18 23:33:24 -07:00
JosJuice
52661dcc76 Arm64Emitter: Fix encoding of size for ADD (vector)
This was causing a bug in the rounding of paired single multiplication
operands. If Force25BitPrecision was called for quad registers, the
element size of its ADD instruction would get treated as if it was 16
instead of the intended 64, which would cause the result of the
calculation to be incorrect if the carry had to pass a 16-bit boundary.

Fixes one of the two bugs reported in
https://bugs.dolphin-emu.org/issues/12998.
2022-08-05 21:49:28 +02:00
Pokechu22
1a92699455 Cast to int for enums that are not formattable 2022-01-13 11:11:08 -08:00
Pokechu22
558de04cfc Common/Assert: Actually use the ASSERT_MSG's log type parameter
Since it was unused, nonexistent values were used in a few places.  I've replaced them.
2022-01-09 12:44:14 -08:00