We're already using functions not made available through either
metal_common or metal_texture. This doesn't seem to be an issue for the
Metal shader runner, possibly because the "online" compiler includes the
required headers by default. In any case, including metal_stdlib should
make all of MSLib available.
Unlike SPIR-V, MSL doesn't seem to have any special handling for
undefined values, so we just emit zeros.
UNDEF registers are sometimes created by the DXIL parser,
for example in sm6_parser_emit_composite_construct().
For an if block

    if (cond)
    {
        <then_block>
    }
    else
    {
        <else_block>
    }

We flatten it by first replacing any store instruction `v[[k]] = x`
in the then_block with the following:

    1: load(v[[k]])
    2: cond ? x : @1
    3: v[[k]] = @2

Similarly, we replace any store instruction `v[[k]] = x` in the
else_block with the following:

    1: load(v[[k]])
    2: cond ? @1 : x
    3: v[[k]] = @2

Then we can concatenate <then_block> and <else_block> together and
get rid of the if block.
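
The rewrite above can be sketched on a toy model. This is only an
illustration, not the actual implementation: struct store,
run_branching() and run_flattened() are hypothetical names, variables
are plain ints, and each block is assumed to contain nothing but
stores of constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: a store writes the value x to variable v[k]. */
struct store
{
    size_t k;
    int x;
};

/* Reference semantics: execute exactly one of the two blocks. */
static void run_branching(int *v, int cond,
        const struct store *then_block, size_t then_count,
        const struct store *else_block, size_t else_count)
{
    size_t i;

    if (cond)
    {
        for (i = 0; i < then_count; ++i)
            v[then_block[i].k] = then_block[i].x;
    }
    else
    {
        for (i = 0; i < else_count; ++i)
            v[else_block[i].k] = else_block[i].x;
    }
}

/* Flattened semantics: execute both blocks unconditionally, with every
 * store rewritten as load / select / store. */
static void run_flattened(int *v, int cond,
        const struct store *then_block, size_t then_count,
        const struct store *else_block, size_t else_count)
{
    size_t i;

    for (i = 0; i < then_count; ++i)
    {
        int old = v[then_block[i].k];               /* 1: load(v[[k]]) */
        int sel = cond ? then_block[i].x : old;     /* 2: cond ? x : @1 */
        v[then_block[i].k] = sel;                   /* 3: v[[k]] = @2 */
    }
    for (i = 0; i < else_count; ++i)
    {
        int old = v[else_block[i].k];               /* 1: load(v[[k]]) */
        int sel = cond ? old : else_block[i].x;     /* 2: cond ? @1 : x */
        v[else_block[i].k] = sel;                   /* 3: v[[k]] = @2 */
    }
}
```

For either value of cond, the stores of the block that would not have
been taken write back the value they just loaded, so both functions
leave v in the same state.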
The input DXIL can sometimes contain constant arrays not referenced by the
resulting vsir program. It doesn't hurt much to generate ICBs for those
anyway, but it's a little pointless.
Currently, in what we consider normalized vsir, destination write masks
are not relative to the signature element's mask, even though source
swizzles are. Also, for most instructions, the source swizzles are masked
by the destination write mask, as given by vsir_src_is_masked().
The DXIL parser, however, does not derelativize the destination write
masks for system value signature elements, so we fix that to make it
consistent with how other front-ends are handled.
For instance, when the test introduced in commit ca5bc63e5e is compiled
to DXIL using DXC and then parsed using vkd3d-compiler, we get the
following store instructions:

    vs_6_0
    .input
    .param POSITION.xyzw, v0.xyzw, float
    .output
    .param SV_Position.xyzw, o0.xyzw, float, POS
    .param SV_CullDistance.x, o1.x, float, CULLDST
    .param SV_ClipDistance.y, o1.y, float, CLIPDST
    .descriptors
    .text
    label l1
    ...
    mov o1.x <v4:f32>, sr1 <s:f32>
    mov o2.x <v4:f32>, sr2 <s:f32> // Note the .x write mask!
    ret

whereas, when compiling using FXC and parsing the TPF using
vkd3d-compiler we get:

    vs_4_0
    .input
    .param POSITION.xyzw, v0.xyzw, float
    .output
    .param SV_POSITION.xyzw, o0.xyzw, float, POS
    .param SV_CULLDISTANCE.x, o1.x, float, CULLDST
    .param SV_CLIPDISTANCE.y, o1.y, float, CLIPDST
    .descriptors
    .text
    label l1
    mov o0.xyzw <v4:f32>, v0.xyzw <v4:f32>
    mov o1.x <v4:f32>, v0.x <v4:f32>
    mov o2.y <v4:f32>, v0.y <v4:f32> // Note the .y write mask.
    ret

This only matters when a system value semantic has a mask that doesn't
start at .x, which is very rare. For instance, it requires the clip/cull
distance combination, where both share a register, so one of them pushes
the other to start at a later component.
According to the tests, the only thing relying on the old behaviour is the
handling of private variables for system value semantics in the SPIR-V
backend, which expects destination write masks as if the element started
at .x even though it might not. That handling is adjusted accordingly.
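
As a rough illustration of what derelativizing a write mask means here,
the following sketch shifts a mask expressed as if the element started
at .x up to the element's first component. The helper names are
hypothetical and are not taken from the vkd3d sources; component bits
are assumed to be x = 1, y = 2, z = 4, w = 8.

```c
#include <assert.h>

/* Index (0..3) of the lowest component set in a 4-bit component mask.
 * Returns 0 for an empty mask. */
static unsigned int first_component(unsigned int mask)
{
    unsigned int i;

    for (i = 0; i < 4; ++i)
    {
        if (mask & (1u << i))
            return i;
    }
    return 0;
}

/* Hypothetical helper: convert a write mask that is relative to the
 * signature element (as if the element started at .x) into a mask
 * expressed in terms of the actual register components. */
static unsigned int derelativize_write_mask(unsigned int relative_mask,
        unsigned int element_mask)
{
    return (relative_mask << first_component(element_mask)) & 0xfu;
}
```

With SV_ClipDistance occupying .y (element mask 0x2), a relative .x
write mask (0x1) becomes .y (0x2), matching the write mask seen in the
TPF listing above. Elements that start at .x are left unchanged.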