They were originally made const because no optimization/normalization
pass existed. Now having to cast away const all the time is becoming
more and more burdening.
For relative addressing, the vkd3d_shader_registers must point to
another vkd3d_shader_src_param. For now, use the sm4_instruction to save
them, since the only purpose of this struct is to be used as paramter
for write_sm4_instruction.
RWBuffer objects would trigger a vkd3d_unreachable() in sm4_base_type().
It would be easy enough to add the required case there, but (manual,
unfortunately) tests show that we aren't supposed to write constant
buffer entries for objects in the first place, as you'd expect.
This particular path ends up being exercised by vkd3d's internal UAV
clear shaders, but unfortunately it looks like our RDEF data may have
more issues; the ability to write tests for it would seem helpful.
Co-authored-by: Evan Tang <etang@codeweavers.com>
Evan Tang reported that new fixmes appeared on the shader_runner when
running some of his tests after
f50d0ae2cb.
vkd3d:652593:fixme:shader_sm4_read_src_param Unhandled mask 0x4.
The change to blame seems to be this added line in
sm4_src_from_constant_value().
+ src->swizzle = VKD3D_SHADER_NO_SWIZZLE;
On tpf binaries the last 12 bits of each src register in an instruction
specify the swizzle, and there are 5 possible combinations:
Dimension NONE
-------- 00
Dimension SCALAR
-------- 01
Dimension VEC4, with a 4 bit writemask:
---- xxxx 00 01
Dimension VEC4, with an 8 bit swizzle:
xx xx xx xx 01 01
Dimension VEC4, with a 2bit scalar dimension number:
------ xx 10 01
So far, we have only seen src registers use 4 bit writemasks in a
single case: for vec4 constants, and it is always zero.
So we expect this:
---- 0000 00 01
Now, I probably wanted to initialize src->swizzle to zero when writing
constants, but VKD3D_SHADER_NO_SWIZZLE is not zero, it is actually the
default swizzle:
11 10 01 00
And the last 4 bits (0x4) get written in the mask part, which causes
the reader to complain.
Adding extra bits to instr->opcode doesn't seem correct, given that it
is an enum.
For instance, get_opcode_info() would return NULL if additional bits are
added to instr->opcode. This is not a problem now because that function
is called when reading and not writing.
Components only span across a single regset, so instead of expecting the
regset as input for the offset, hlsl_type_get_component_offset() can
actually retrieve it.