The new fixmes can be triggered in presence of object components within
structs (for SM5).
In shaders such as this one:
struct apple
{
Texture2D tex : TEX;
float4 color : COLOR;
};
float4 main(struct apple input) : sv_target
{
return input.tex.Load(int3(1, 2, 3));
}
Or this one:
struct
{
Texture2D tex;
float4 color;
} s;
float4 main() : sv_target
{
return s.tex.Load(int3(1, 2, 3));
}
Co-authored-by: Francisco Casas <fcasas@codeweavers.com>
Co-authored-by: Zebediah Figura <zfigura@codeweavers.com>
Because copy_propagation_transform_object_load() replaces a deref
instead of an instruction, it is currently prone to two problems:
1- It can replace a deref with the same deref, returning true every
time and getting the compilation stuck in an endless loop of
copy-propagation iterations.
2- When performed multiple times in the same deref, the second time it
can replace the deref with a deref from a temp that is only valid in
another point of the program execution, resulting in an incorrect value.
This patch preempts this by avoiding replacing derefs when the new deref
doesn't point to a uniform variable. Because, uniform variables cannot
be written to.
If a hlsl_ir_load loads a variable whose components are stored from different
instructions, copy propagation doesn't replace it.
But if all these instructions are constants (which currently is the case
for value constructors), the load could be replaced with a constant value.
Which is expected in some other instructions, e.g. texel_offsets when
using aoffimmi modifiers.
For instance, this shader:
```
sampler s;
Texture2D t;
float4 main() : sv_target
{
return t.Gather(s, float2(0.6, 0.6), int2(0, 0));
}
```
results in the following IR before applying the patch:
```
float | 6.00000024e-01
float | 6.00000024e-01
uint | 0
| = (<constructor-2>[@4].x @2)
uint | 1
| = (<constructor-2>[@6].x @3)
float2 | <constructor-2>
int | 0
int | 0
uint | 0
| = (<constructor-5>[@11].x @9)
uint | 1
| = (<constructor-5>[@13].x @10)
int2 | <constructor-5>
float4 | gather_red(resource = t, sampler = s, coords = @8, offset = @15)
| return
| = (<output-sv_target0> @16)
```
and this IR afterwards:
```
float2 | {6.00000024e-01 6.00000024e-01 }
int2 | {0 0 }
float4 | gather_red(resource = t, sampler = s, coords = @2, offset = @3)
| return
| = (<output-sv_target0> @4)
```
Note that in the future we should call
validate_static_object_references() after DCE and pruning branches,
because shaders such as these compile (at least in more modern versions
of the native compiler):
Branch pruning:
```
static RWTexture2D<float> tex;
float4 main() : sv_target
{
if (0)
{
tex[int2(0, 0)] = 2;
}
return 0;
}
```
DCE:
```
static Texture2D tex;
uniform uint i;
float4 main() : sv_target
{
float4 unused = tex.Load(int3(0, 1, 2));
return 0;
}
```
These are "todo" tests in hlsl-static-initializer.shader_test
that depend on this.
Otherwise, for instance, the added test results in:
debug_hlsl_writemask: Assertion `!(writemask & ~VKD3DSP_WRITEMASK_ALL)' failed.
Which happens in allocate_variable_temp_register() when the variable's
type reg_size is <= 4 but its component count is larger, which may
happen if it contains objects.
Do not rely on a draw or dispatch command to do this.
This allows more efficiently testing syntax, in cases where testing the actual
shader functionality is not interesting.
It is responsibility of the shader's programmer to ensure that
object references can be solved statically.
Resource arrays for ps_5_1 and vs_5_1 are an exception which is not
properly handled yet. They probably deserve a different object type.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>