This is achieved by means of creating a variable storing zero,
loading every array element, comparing if the non-constant index
matches the index of that element at runtime, and in that case
store the corresponding element in the variable.
This seems to be the same strategy that the native compiler uses.
We are currently using &offset_node->loc when offset_node is NULL.
A NULL dereference of rel_offset can also happen if
hlsl_offset_from_deref() fails because the dereference is out of
bounds.
As the newly added documentation describes, this reroll serves two purposes:
* to allow shader parameters to be used for any target type (which allows using
parameters for things like Direct3D 8-9 alpha test),
* to allow the union in struct vkd3d_shader_parameter to contain types larger
than 32 bits (by specifying them indirectly through a pointer).
The idea is to start splitting the
HLSL IR -> d3dbc
translation into
HLSL IR -> vsir -> d3dbc
So hlsl_sm1_write is split into two functions, sm1_generate_vsir()
which should handle the first part and d3dbc_compile() which should
handle the second part.
This translation should be completed once the hlsl_ctx and entry_func
are no longer used in d3dbc_compile().
While on SM1 a register reservation reserves the whole size in
registers of the variable's data type, overlapping conflicts are only
checked up to the bind_count (used size) for each variable.
track_object_components_usage() had to be improved to also
register derefs on resource stores.
It was not doing it because it assumed that for every resource store
there was a resource load already, which was true, before calling DCE.
As the diffstat shows, HLSL_CLASS_OBJECT does not really have much in common.
Resource types (TEXTURE, SAMPLER, UAV) sometimes behave similarly to each other,
but do not generally behave similarly to effect-specific types (string, shader,
state, view). Most consumers of HLSL_CLASS_OBJECT subsequently check the base
type anyway.
Hence we want to replace HLSL_TYPE_* with individual classes for object types.
As a first step, change the last few places that only check HLSL_CLASS_OBJECT.
Properly passing the inverse-trig.shader_test tests whose qualifiers
have been removed requires making spirv.c capable of handling ABS.
The same happens for the ps_3_0 equality test in
float-comparison.shader_test.
Otherwise we may get a failing
"hlsl_types_are_equal(arg1->data_type, arg2->data_type)"
assertion on hlsl_new_binary_expr() when creating the MUL.
This happens in the test introducing in the following patch:
int a, b, c;
void main(out float4 res : COLOR1, in float4 pos : position, out float4 out_pos : sv_position)
{
out_pos = pos;
res = a ? b/1000.0 : c/1000.0;
}
I realized that it is better to lower casts to int to FLOOR+REINTERPET
instead of appending a FLOOR to all casts to int and assuming that this
is the case for all of them in d3dbc.c.
This in case we introduce new passes in the future that add casts that
we forget to lower, after the lower_casts_to_bool pass.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=56162
Storing to a vector component using a non-constant index is not allowed
on profiles lower than 6.0. Unless this happens inside a loop that can be
unrolled, which we are not doing yet.
For this reason, a validate_nonconstant_vector_store_derefs pass is
added to detect these cases.
Ideally we would want to emit an hlsl_error on this pass, but before
implementing loop unrolling, we could reach this point on valid HLSL.
Also, as pointed out by Nikolay in the mentioned bug, currently
new_offset_from_path_index() fails an assertion when this happens,
because it expects an hlsl_ir_constant, so a check is added.
It also felt correct to emit an hlsl_fixme there, despite the
redundancy.
This preempts us from replacing a swizzle incorrectly, as in the
following example:
1: A.x = 1.0
2: A
3: A.x = 2.0
4: @2.x
were @4 ends up being 2.0 instead of 1.0, because that's the value stored in
A.x at time 4, and we should be querying it at time 2.
This also helps us to avoid replacing a swizzle with itself in copy-prop
which can result in infinite loops, as with the included tests this commit.
Consider the following sequence of instructions:
1 : A
2 : B = @1
3 : B
4 : A = @3
5 : @1.x
Current copy-prop would replace 5 so it points to @3 now:
1 : A
2 : B = @1
3 : B
4 : A = @3
5 : @3.x
But in the next iteration it would make it point back to @1, keeping it
spinning infinitively.
The solution is to index the instructions and don't replace the swizzle
if the new load happens after the current load.
Instead of only storing the value that each variable's component has at
the moment of the instruction currently handled by copy-prop, we store
the trace of all the historic values with their timestamps, i.e. the
instruction index on which the value was stored.
This would allow us to query the value that the variable had at the time
of execution of previous instructions.