This field is now analogous to vkd3d_shader_register_index.rel_addr.
It also makes sense to rename it now, because the constant part of the
offset is now handled by hlsl_deref.const_offset. Consequently, the
field may also be NULL now.
This uint will be used for the following:
- Since SM4's relative addressing (the capability of passing a register
as an index to another register) only has whole-register granularity,
we will need to make the offset node express the offset in whole
registers and specify the register component in this uint;
otherwise we would have to add additional / and % operations in the
output binary.
- If, after constant folding and copy propagation, we determine that the
offset is a single constant node, we can store the whole offset in this
uint and remove the offset src.
This allows DCE to remove many of the nodes that were previously required
only for the offset constants, which makes the output more lightweight
and readable, and simplifies the implementation of relative addressing
when writing tpf in the following patches.
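As a rough sketch, the relevant hlsl_deref fields would then look
something like this (the new name "rel_offset" and the surrounding
layout are assumptions based on the description above):

struct hlsl_deref
{
    struct hlsl_ir_var *var;

    /* Variable part of the offset, for now expressed in register
     * components; its .node may be NULL once the whole offset is
     * constant. */
    struct hlsl_src rel_offset;
    /* Constant part of the offset, in register components. */
    unsigned int const_offset;
};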
In dump_deref(), we use "c" to indicate components instead of whole
registers. Since, for now, both the offset node and the offset uint are
expressed in components, a lowered deref looks like:
var[@42c + 2c]
But once we express the offset node in whole registers, we will remove
the "c" from the node part:
var[@22 + 3c]
Some functions work with dereferences and need to know whether they have
been lowered yet.
This can be checked by testing whether deref->offset.node is NULL or
whether deref->data_type is NULL. I am using the latter, since it keeps
working even after the following patches split deref->offset into
constant and variable parts.
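For instance, a hypothetical helper expressing that check could look
like this:

static bool deref_is_lowered(const struct hlsl_deref *deref)
{
    /* data_type is only set once the deref has been lowered to an offset,
     * and this check keeps working after the offset is split into constant
     * and variable parts. */
    return deref->data_type != NULL;
}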
We have to distinguish between the "bind count" and the "allocation size"
of variables.
The "allocation size" affects the starting register id for the resource to
be allocated next, while the "bind count" is determined by the last
element actually used. The former may be larger than the latter.
What we are currently calling hlsl_reg.bind_count is actually the
"allocation size", so a rename is in order.
The real "bind count", which will be introduced in following patches,
is important because it is what should be shown in the RDEF table and
some resource allocation rules depend on it.
For instance, for this shader:
texture2D texs[3];
texture2D tex;

float4 main() : sv_target
{
    return texs[0].Load(int3(0, 0, 0)) + tex.Load(int3(0, 0, 0));
}
the variable "texs" has a "bind count" of 1, but an "allocation size" of
3:
// Resource Bindings:
//
// Name                                 Type  Format         Dim      HLSL Bind  Count
// ------------------------------ ---------- ------- ----------- -------------- ------
// texs                              texture  float4          2d             t0      1
// tex                               texture  float4          2d             t3      1
We are using the hlsl_ir_var.is_uniform flag to indicate when an object
is a uniform copy created from a variable with the HLSL_STORAGE_UNIFORM
modifier.
We should be checking this flag instead of the HLSL_STORAGE_UNIFORM
modifier, which is also set for the original variables; there should be
no reason to use the modifier instead of "is_uniform" after the uniform
copies and combined/separated samplers are created.
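Concretely, after those passes, a check along these lines is the
intended one (a sketch; "storage_modifiers" is an assumed field name):

static bool var_is_uniform_copy(const struct hlsl_ir_var *var)
{
    /* is_uniform is only set on the uniform copies; the
     * HLSL_STORAGE_UNIFORM modifier is also set on the original variables,
     * so checking var->storage_modifiers & HLSL_STORAGE_UNIFORM is not
     * equivalent. */
    return var->is_uniform;
}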
After lowering the deref's path to a single offset node, there was no way
of knowing the type of the referenced part of the variable. This small
modification avoids having to pass the data type around everywhere, and
it is required to support instructions that reference object components
within struct types.
Since deref->data_type allows us to retrieve the type of the deref,
deref->offset_regset is no longer necessary.
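For instance, where the register set of a deref is needed, it can be
derived from the type instead (a sketch; hlsl_type_get_regset() is
assumed to be available elsewhere in the compiler):

static enum hlsl_regset deref_get_regset(const struct hlsl_deref *deref)
{
    /* The regset follows from the referenced type, so a separate
     * offset_regset field is not needed. */
    return hlsl_type_get_regset(deref->data_type);
}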
lower_narrowing_casts() currently creates a new cast by calling
hlsl_new_cast(). This cast may be redundant, but it is not folded, which
makes SM1 emit an unnecessary fixme in some shaders:
Aborting due to not yet implemented feature: SM1 "cast" expression.
Other passes that call hlsl_new_cast() are lower_int_division() and
lower_int_modulus(), so the new fold_redundant_casts() pass is called
after these as well.
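For reference, a shader along these lines (a hypothetical example)
contains a narrowing cast and could previously hit that fixme when
compiled for SM1:

float4 f;

float2 main() : COLOR
{
    // Implicit truncation from float4 to float2 is a narrowing cast.
    return f;
}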
Non-constant vector indexing cannot be solved with relative addressing
in the register indexes, because that kind of addressing cannot operate
at the level of individual register components.
Mathematical operations must be used instead.
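For example, a hypothetical shader that indexes a vector with a value
not known at compile time:

uniform float4 v;
uniform uint i;

float4 main() : sv_target
{
    // v[i] addresses a single component of v's register, which relative
    // addressing of register indexes cannot express.
    return v[i];
}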
Variables that contain more than one object (arrays or structs) require
the allocation of contiguous registers in the respective object
register spaces.
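For instance (a hypothetical example), both members of "pair" below need
contiguous t registers, just like the elements of a resource array:

struct two_texs
{
    Texture2D a;
    Texture2D b;
};

two_texs pair;

float4 main() : sv_target
{
    return pair.a.Load(int3(0, 0, 0)) + pair.b.Load(int3(0, 0, 0));
}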
This patch makes index expressions on resources hlsl_ir_index nodes
instead of hlsl_ir_resource_load nodes, because it is not known if they
will be used later as the lhs of an hlsl_ir_resource_store.
For now, the only benefit is consistency.
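A hypothetical example of why this matters:

RWTexture2D<float4> u;

float4 main() : sv_target
{
    // When "u[uint2(0, 0)]" is parsed, it is not yet known whether it will
    // end up as the lhs of a resource store or as a regular resource load.
    u[uint2(0, 0)] = float4(1.0, 2.0, 3.0, 4.0);
    return u[uint2(1, 0)];
}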
Since in SM1 all vector types use 4 register components, and since SM1
doesn't consider vectors of different dimx incompatible, it is necessary
to ensure that the semantic var is created with dimx=4, and to add a
cast node.
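For instance (a hypothetical example), for an SM1 pixel shader such as:

float3 main() : COLOR
{
    return float3(0.2, 0.3, 0.4);
}

the COLOR semantic variable has to be created with dimx=4, with the
added cast node bridging it to the float3 value computed by the shader.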
The use of the hlsl_semantic.reported_duplicated_output_next_index field
allows reporting multiple overlapping indexes, such as in the following
vertex shader:
void main(out float1x3 x : OVERLAP0, out float1x3 y : OVERLAP1)
{
    x = float3(1.0, 2.0, 3.2);
    y = float3(5.0, 6.0, 5.0);
}
apple.hlsl:1:41: E5013: Output semantic "OVERLAP1" is used multiple times.
apple.hlsl:1:13: First use of "OVERLAP1" is here.
apple.hlsl:1:41: E5013: Output semantic "OVERLAP2" is used multiple times.
apple.hlsl:1:13: First use of "OVERLAP2" is here.
While at the same time avoiding reporting overlaps more than once for
large arrays:
struct apple
{
    float2 p : sv_position;
};

void main(out apple aps[4])
{
}
apple.hlsl:3:8: E5013: Output semantic "sv_position0" is used multiple times.
apple.hlsl:3:8: First use of "sv_position0" is here.