The following table shows how each intrinsic maps to d3d assembly and the
flags that appear in the tpf bytecode, in binary.
GroupMemoryBarrier() sync_g 0010
GroupMemoryBarrierWithGroupSync() sync_g_t 0011
DeviceMemoryBarrier() sync_uglobal 1000
DeviceMemoryBarrierWithGroupSync() sync_uglobal_t 1001
AllMemoryBarrier() sync_uglobal_g 1010
AllMemoryBarrierWithGroupSync() sync_uglobal_g_t 1011
The idea is that sm6_value and related structures should not commit
to a specific vsir register representation yet, because at the time
they are created some information might be still unavailable. For
instance, it might be not yet known whether the program is using
native 16-bit types or minimum precision types. vsir registers
should only be synthesized during instruction emission.
Field "reg" in struct sm6_value will be handled in a later commit.
All stream output objects need to have a stream index allocated,
whether they are used or not.
We allocate stream outputs here, before other objects are allocated,
because the stream index is needed to create the appropriate output
semantic variables during append_output_copy(), which will be called
in a lowering pass for the Append() method.