But still throw hlsl_fixme() when there is more than one.
Prioritizing among multiple compatible function overloads in the same way
as the native compiler would require systematic testing.
This was originally left alone in order to allow functions without early return
to succeed, since in that case we would already emit the correct bytecode
despite not handling the HLSL_IR_JUMP_RETURN instruction.
Now that we lower return statements, however, any unhandled instructions are
either definitely going to result in invalid bytecode, or rare enough that it's
not worth returning success anyway.
Vectors cannot be used as array indexes, however, single-component
swizzles (such as vec.x) can be used.
This suggests that single-component swizzles should actually be
scalars and not vectors of dimx = 1.
It is worth noting that the use of single-component swizzles on scalars
should still be allowed.
Co-authored-by: Francisco Casas <fcasas@codeweavers.com>
Co-authored-by: Zebediah Figura <zfigura@codeweavers.com>
Because copy_propagation_transform_object_load() replaces a deref
instead of an instruction, it is currently prone to two problems:
1- It can replace a deref with the same deref, returning true every
time and getting the compilation stuck in an endless loop of
copy-propagation iterations.
2- When performed multiple times in the same deref, the second time it
can replace the deref with a deref from a temp that is only valid in
another point of the program execution, resulting in an incorrect value.
This patch preempts this by avoiding replacing derefs when the new deref
doesn't point to a uniform variable. Because, uniform variables cannot
be written to.
Reinterpret min16float, min10float, min16int, min12int, and min16uint
as their regular counterparts: float, float, int, int, uint,
respectively.
A proper implementation would require adding minimum precision
indicators to all the dxbc-tpf instructions that use these types.
Consider the output of fxc 10.1 with the following shader:
uniform int i;
float4 main() : sv_target
{
min16float4 a = {0, 1, 2, i};
min16int2 b = {4, i};
min10float3 c = {6.4, 7, i};
min12int d = 9.4;
min16uint4x2 e = {14.4, 15, 16, 17, 18, 19, 20, i};
return mul(e, b) + a + c.xyzx + d;
}
However, if the graphics driver doesn't have minimum precision support,
it ignores the minimum precision indicators and runs at 32-bit
precision, which is equivalent as working with regular types.