I think the main argument for preallocating instructions and
passing them to helpers is that this simplifies error handling.
However, it seems that the simplification is close to negligible,
while the current solution makes it harder to use the iterator
abstraction layer for the instruction array, and it also makes
the code harder to read and check.
The main reason is to avoid making false assumptions in the code.
I don't know how that could be used, say, to introduce a security
bug, but I think validating untrusted input should be done by
default.
Conveniently, this also documents, for anyone who needs to know,
which fields we expect to find in a well-known structure.
Emission of code into individual block instruction arrays was done to
enable construction of a control flow graph. A graph is constructed from
the flat instruction array in a later pass, so blocks are not needed.
It is possible to emit instructions directly into the array in struct
vsir_program instead of from sm6_function_emit_instructions(), but since
the patch constant function occurs first in DXIL hull shaders, this would
reverse the current order of functions in the flat array. That may be
acceptable, but it is left for a later patch in case any issues arise.
Instead of an int3.
Gather operations expect an offset with only two components.
Currently the following field (which is the gather channel) is
parsed as a third component, which leads to wrong and invalid
results.
This fixes a crash on a shader from WRC Generations.
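
As a minimal sketch of the operand layout described above, assuming the
two offset components are immediately followed by the gather channel:
the names parse_gather_operands() and struct gather_operands are
hypothetical and not taken from vkd3d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the parsed fields; not a vkd3d type. */
struct gather_operands
{
    int32_t offset[2]; /* Gather offsets only have x and y components. */
    uint32_t channel;  /* The field following the offset. */
};

/* Assumed layout: operands[first] and operands[first + 1] hold the offset,
 * operands[first + 2] holds the gather channel. Returns 0 on success and
 * -1 if fewer than three operands are available. */
static int parse_gather_operands(const int32_t *operands, size_t count,
        size_t first, struct gather_operands *out)
{
    assert(operands && out);

    if (first > count || count - first < 3)
        return -1;

    /* Read exactly two offset components. Reading a third one here would
     * consume the channel field as if it were offset.z. */
    out->offset[0] = operands[first];
    out->offset[1] = operands[first + 1];
    out->channel = (uint32_t)operands[first + 2];
    return 0;
}
```

With the offset read as two components, the channel keeps its own slot
and is no longer misinterpreted as part of the offset.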
The input DXIL can sometimes contain constant arrays not referenced by the
resulting vsir program. It doesn't hurt much to generate ICBs for those
anyway, but it's a little pointless.
Currently, on what we consider normalized vsir, destination write masks
are not relative to the signature element's mask, even though source
swizzles are. Also, for most instructions, the source swizzles are masked
by the destination write mask, as given by vsir_src_is_masked().
The DXIL parser, however, does not derelativize the destination write
masks for system value signature elements, so we fix that to make it
consistent with how other front-ends are handled.
For instance, when the test introduced in commit
ca5bc63e5e is compiled to DXIL using DXC,
and then parsed using vkd3d-compiler, we get the following store
instructions:
    vs_6_0
    .input
    .param POSITION.xyzw, v0.xyzw, float
    .output
    .param SV_Position.xyzw, o0.xyzw, float, POS
    .param SV_CullDistance.x, o1.x, float, CULLDST
    .param SV_ClipDistance.y, o1.y, float, CLIPDST
    .descriptors
    .text
    label l1
    ...
    mov o1.x <v4:f32>, sr1 <s:f32>
    mov o2.x <v4:f32>, sr2 <s:f32> // Note the .x write mask!
    ret
whereas, when compiling using FXC and parsing the TPF using
vkd3d-compiler we get:
    vs_4_0
    .input
    .param POSITION.xyzw, v0.xyzw, float
    .output
    .param SV_POSITION.xyzw, o0.xyzw, float, POS
    .param SV_CULLDISTANCE.x, o1.x, float, CULLDST
    .param SV_CLIPDISTANCE.y, o1.y, float, CLIPDST
    .descriptors
    .text
    label l1
    mov o0.xyzw <v4:f32>, v0.xyzw <v4:f32>
    mov o1.x <v4:f32>, v0.x <v4:f32>
    mov o2.y <v4:f32>, v0.y <v4:f32> // Note the .y write mask.
    ret
This only really matters for cases where we have a system value semantic
whose mask doesn't start at .x, which is very rare. For instance, it
happens with the clip/cull distance combination, where both share a
register, so one of them pushes the other to start on another component.
According to the tests, the only thing relying on this behaviour is the
handling of private variables for system value semantics in the SPIR-V
backend, which expects destination write masks as if the element started
at .x even though it might not. That handling is adjusted accordingly.
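
As an illustration of the mask conversion described above, here is a
minimal sketch. It assumes 4-bit component masks with .x in bit 0 and
.w in bit 3; the helper names are hypothetical and not the ones used in
the DXIL parser.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the first component set in a 4-bit mask
 * (.x is bit 0, .y is bit 1, and so on). */
static unsigned int first_component_index(uint32_t mask)
{
    unsigned int i = 0;

    assert(mask & 0xfu);
    while (!(mask & 1u))
    {
        mask >>= 1;
        ++i;
    }
    return i;
}

/* Convert a write mask that is relative to the signature element's mask
 * into one that is relative to the register itself, by shifting it to the
 * element's first component. */
static uint32_t derelativize_write_mask(uint32_t relative_mask, uint32_t element_mask)
{
    return (relative_mask << first_component_index(element_mask)) & 0xfu;
}
```

For the SV_ClipDistance.y element in the listings, a relative .x write
mask (0x1) combined with the element mask .y (0x2) yields 0x2, i.e. the
.y write mask seen in the TPF output.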