The handling of write masks and swizzles for 64 bit data types is
currently irregular: write masks are always 64 bit, while swizzles
are usually 32 bit, except for SSA registers with are 64 bit.
With this change we always use 64 bit swizzles, in order to make
the situation less surprising and make it easier to convert
registers between SSA and TEMP.
64 bit swizzles are always required to have X in their last two
components.
For example, this occurred in a shader:
reg_idx write_mask
0 xyz
1 xyzw
2 xyzw
3 xyz
The dcl_indexrange instruction covered only xyz, so once merged, searching for
xyzw failed.
It is impossible to declare an input array where elements have different
component counts, but the optimiser can create this case. One way for
this to occur is to dynamically index input values via a local array
containing copies of the input values. The optimiser converts this to
dynamically indexed inputs.
They were originally made const because no optimization/normalization
pass existed. Now having to cast away const all the time is becoming
more and more burdening.
For relative addressing, the vkd3d_shader_registers must point to
another vkd3d_shader_src_param. For now, use the sm4_instruction to save
them, since the only purpose of this struct is to be used as paramter
for write_sm4_instruction.
RWBuffer objects would trigger a vkd3d_unreachable() in sm4_base_type().
It would be easy enough to add the required case there, but (manual,
unfortunately) tests show that we aren't supposed to write constant
buffer entries for objects in the first place, as you'd expect.
This particular path ends up being exercised by vkd3d's internal UAV
clear shaders, but unfortunately it looks like our RDEF data may have
more issues; the ability to write tests for it would seem helpful.
Co-authored-by: Evan Tang <etang@codeweavers.com>
Evan Tang reported that new fixmes appeared on the shader_runner when
running some of his tests after
f50d0ae2cb.
vkd3d:652593:fixme:shader_sm4_read_src_param Unhandled mask 0x4.
The change to blame seems to be this added line in
sm4_src_from_constant_value().
+ src->swizzle = VKD3D_SHADER_NO_SWIZZLE;
On tpf binaries the last 12 bits of each src register in an instruction
specify the swizzle, and there are 5 possible combinations:
Dimension NONE
-------- 00
Dimension SCALAR
-------- 01
Dimension VEC4, with a 4 bit writemask:
---- xxxx 00 01
Dimension VEC4, with an 8 bit swizzle:
xx xx xx xx 01 01
Dimension VEC4, with a 2bit scalar dimension number:
------ xx 10 01
So far, we have only seen src registers use 4 bit writemasks in a
single case: for vec4 constants, and it is always zero.
So we expect this:
---- 0000 00 01
Now, I probably wanted to initialize src->swizzle to zero when writing
constants, but VKD3D_SHADER_NO_SWIZZLE is not zero, it is actually the
default swizzle:
11 10 01 00
And the last 4 bits (0x4) get written in the mask part, which causes
the reader to complain.