Move the temp allocation back to hlsl_codegen.c.
Note that the DCL_TEMPS instructions wouldn't be necessary if we had the
capacity to store the temp_count for both the main program and the patch
constant program (or more generally speaking, a temp_count for all
phases).
The plan is to eventually also move the HS_CONTROL_POINT and
HS_FORK_PHASE markers to the vsir_program, making it able to contain
both functions.
This node type will be deleted (again) once the hlsl->vsir->tpf
translation is complete. It serves the purpose of allowing to keep
both real hlsl_ir_nodes and vsir_instructions in the hlsl_block,
until all the former can be translated into the latter.
Currently the mutable descriptor set is repeated many times in the
pipeline layout in order to cover the indices for all the
descriptor types that would be present if mutable descriptors were
not used. This is useless and wasteful, but was necessary before
the descriptor sets backing the SRV-UAV-CBV heap were moved at the
end of the allocation table because descriptor set indices are
currently a compile-time constant in many places.
Now this is not needed any more and we can just avoid putting
many copies of the mutable descriptor set in the pipeline layout,
making it easier to meet Vulkan implementation limits.
So that when mutable descriptors are in use we can avoid putting
the other descriptor sets backing the SRV-UAV-CBV descriptor heap
in the pipeline layout altogether.
So we avoid hardcoding that it is number zero. There are two
goals here: first making the code easier to understand and
second allow reshuffling the descriptor set allocation in a
later commit.
This is required to properly optimize signatures, because these
semantics must be alligned while being packed:
- Array elements.
- The first fields of structs.
- Major vectors of a matrix.
For now this has no effect since semantics are allocated with reg_size
4, but will have effect when optimizing interstage signatures.
The generated pixel shader input signature must be consistent with the
generated vertex shader output signature for the same data type.
Since the interpolation mode affects allocation order, the allocator
needs to know the modifiers for both input and output signature elements.
The descriptor heap implementation is a rather central behavior element
in vkd3d, so it's useful to have all the relevant information logged
in a single place.
The goal is to make a requirement for VSIR that signature element
masks are always contiguous. The SPIR-V backend already implicitly
makes that assumption, since it just consider the LSB and popcount
of the mask.
For example, consider this HLSL pixel shader:
float4 main(float4 color : COLOR) : SV_Target
{
return float4(color.x, 10.0f, 11.0f, color.w);
}
Currently the parser describes the input signature element
corresponding to semantic COLOR as having mask .xw, which is
sensible. However, the SPIR-V parser will interpret that as
a mask starting at x and with popcount 2, and assuming it is
contiguous it will implicitly act as if it were .xy. This is
not correct, because the wrong component will be loaded from
the vertex stage.
Slightly simplifies descriptor write addressing, and makes layouts
essentially the same as array layouts, differing only in the binding
details, and therefore easier to understand. This also simplifies the
addition of storage buffer bindings, which can all be added onto the end.
Allows descriptor set layouts to be created after all bindings are
mapped. This is less complex and fragile than the current scheme, and in
a future patch it will support separating descriptor types into different
sets. Descriptors on virtual heaps are currently allocated from pools
which contain an equal number of each descriptor type used by vkd3d, and
this can waste a significant amount of device memory.
Here, we implement single inheritance by inserting a field at the
beginning of the derived struct with name "$super".
For the following struct declarations
struct a
{
float4 aa;
float4 bb;
};
struct b : a
{
float4 cc;
};
struct c : b
{
float4 bb;
};
this commit generates the following:
struct a
{
float4 aa;
float4 bb;
};
struct b
{
struct a $super;
float4 cc;
};
struct c
{
struct b $super;
float4 bb;
};
Similarly to the already done split from
HLSL IR -> d3dbc
to
HLSL IR -> vsir -> d3bc
we now start splitting the
HLSL IR -> tpf
translation into
HLSL IR -> vsir -> tpf
So hlsl_sm4_write is split into two functions, sm4_generate_vsir() and
tpf_compile().
This translation should be completed once tpf_compile() no longer needs
the hlsl_ctx and entry_func parameters.
Specifically we should write the sysval semantic as an instruction idx
for the following instructions:
VKD3D_SM4_OP_DCL_INPUT_SGV
VKD3D_SM4_OP_DCL_INPUT_PS_SGV
VKD3D_SM4_OP_DCL_INPUT_SIV
VKD3D_SM4_OP_DCL_INPUT_PS_SIV
VKD3D_SM4_OP_DCL_OUTPUT_SIV
and not the following ones:
VKD3D_SM4_OP_DCL_INPUT
VKD3D_SM4_OP_DCL_PS_INPUT
VKD3D_SM4_OP_DCL_OUTPUT
Which is consistent with what we do when reading these instructions in
the following functions:
shader_sm4_read_declaration_register_semantic()
shader_sm4_read_dcl_input_ps_siv()
and
shader_sm4_read_dcl_input_ps()
shader_sm4_read_declaration_dst()
for the non-SGV and non-SIV cases.
Note that the non-SGV and non-SIV instructions don't need/use this
extra information because they rely on the dst register type and index.
I suggest to introduce this change because the here replaced check is
brittle, and we might be omitting the sysval semantic in some cases.