Prevent them from being ever looked up.
Our naming scheme for synthetic variables already effectively prevents this, but
this is better for clarity. We also will need to be able to move some named
variables into a dummy scope to account for complexities around function
definition and declarations.
Co-authored-by: Francisco Casas <fcasas@codeweavers.com>
Co-authored-by: Zebediah Figura <zfigura@codeweavers.com>
Because copy_propagation_transform_object_load() replaces a deref
instead of an instruction, it is currently prone to two problems:
1- It can replace a deref with the same deref, returning true every
time and getting the compilation stuck in an endless loop of
copy-propagation iterations.
2- When performed multiple times in the same deref, the second time it
can replace the deref with a deref from a temp that is only valid in
another point of the program execution, resulting in an incorrect value.
This patch preempts this by avoiding replacing derefs when the new deref
doesn't point to a uniform variable. Because, uniform variables cannot
be written to.
Reinterpret min16float, min10float, min16int, min12int, and min16uint
as their regular counterparts: float, float, int, int, uint,
respectively.
A proper implementation would require adding minimum precision
indicators to all the dxbc-tpf instructions that use these types.
Consider the output of fxc 10.1 with the following shader:
uniform int i;
float4 main() : sv_target
{
min16float4 a = {0, 1, 2, i};
min16int2 b = {4, i};
min10float3 c = {6.4, 7, i};
min12int d = 9.4;
min16uint4x2 e = {14.4, 15, 16, 17, 18, 19, 20, i};
return mul(e, b) + a + c.xyzx + d;
}
However, if the graphics driver doesn't have minimum precision support,
it ignores the minimum precision indicators and runs at 32-bit
precision, which is equivalent as working with regular types.
If the offset of a gather resource load can be represented as an
aoffimmi (vectori of ints from -8 to 7), use one.
This is of particular importance for 4.0 profiles, where this is the only
valid way of representing offsets for this operation.
If a hlsl_ir_load loads a variable whose components are stored from different
instructions, copy propagation doesn't replace it.
But if all these instructions are constants (which currently is the case
for value constructors), the load could be replaced with a constant value.
Which is expected in some other instructions, e.g. texel_offsets when
using aoffimmi modifiers.
For instance, this shader:
```
sampler s;
Texture2D t;
float4 main() : sv_target
{
return t.Gather(s, float2(0.6, 0.6), int2(0, 0));
}
```
results in the following IR before applying the patch:
```
float | 6.00000024e-01
float | 6.00000024e-01
uint | 0
| = (<constructor-2>[@4].x @2)
uint | 1
| = (<constructor-2>[@6].x @3)
float2 | <constructor-2>
int | 0
int | 0
uint | 0
| = (<constructor-5>[@11].x @9)
uint | 1
| = (<constructor-5>[@13].x @10)
int2 | <constructor-5>
float4 | gather_red(resource = t, sampler = s, coords = @8, offset = @15)
| return
| = (<output-sv_target0> @16)
```
and this IR afterwards:
```
float2 | {6.00000024e-01 6.00000024e-01 }
int2 | {0 0 }
float4 | gather_red(resource = t, sampler = s, coords = @2, offset = @3)
| return
| = (<output-sv_target0> @4)
```
Rename it to copy_propagation_replace_with_single_instr() accordingly.
The idea is to introduce a constant vector replacement pass which will do the
same thing.
copy_propagation_compute_replacement() is not doing very much for us, and
conceptually is a bit of an odd fit anyway, since it's meant to deal with
multi-component types.
validate_static_object_references() validates that uninitialized static
objects are not referenced in the shader.
In case a static variable contains both numeric and object types, the
"Static variables cannot have both numeric and resource components."
error should preempt uninitialized numeric values to reach further
compilation steps.
Note that in the future we should call
validate_static_object_references() after DCE and pruning branches,
because shaders such as these compile (at least in more modern versions
of the native compiler):
Branch pruning:
```
static RWTexture2D<float> tex;
float4 main() : sv_target
{
if (0)
{
tex[int2(0, 0)] = 2;
}
return 0;
}
```
DCE:
```
static Texture2D tex;
uniform uint i;
float4 main() : sv_target
{
float4 unused = tex.Load(int3(0, 1, 2));
return 0;
}
```
These are "todo" tests in hlsl-static-initializer.shader_test
that depend on this.
We are currently not initializing static values to zero by default.
Consider the following shader:
```hlsl
static float4 va;
float4 main() : sv_target
{
return va;
}
```
we get the following output:
```
ps_5_0
dcl_output o0.xyzw
dcl_temps 2
mov r0.xyzw, r1.xyzw
mov o0.xyzw, r0.xyzw
ret
```
where r1.xyzw is not initialized.
This patch solves this by assigning the static variable the value of an
uint 0, and thus, relying on complex broadcasts.
This seems to be the behaviour of the 9.29.952.3111 version of the native
compiler, since it retrieves the following error on a shader that lacks
an initializer on a data type with object components:
```
error X3017: cannot convert from 'uint' to 'struct <unnamed>'
```
We have a different system of generating intrinsics, which makes it easier to
deal with "polymorphic" arithmetic functions.
Defining and storing intrinsics as hlsl_ir_function_decls would also require
more space in memory (and more optimization passes to get rid of the parameter
variables), and doesn't really save us any effort in terms of source code.
Using add_unary_arithmetic_expr() instead of hlsl_new_unary_expr()
allows the intrinsic to work with matrices.
Otherwise we get:
E5017: Aborting due to not yet implemented feature: Copying from unsupported node type.
because an HLSL_IR_EXPR reaches split_matrix_copies().
Atomic ops on images with Unknown type will cause SPIR-V validation failure,
and assertion failure in Mesa debug builds. D3D12 allows atomics on typed
buffers, and this requires a distinction to be made between UAV reads and
atomic ops.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=53874
Unlike compatible_data_types() and implicit_compatible_data_types(),
this function is intended to be symmetrical. So it makes sense to
preserve the names "t1" and "t2" for the arguments.
Otherwise, for instance, the added test results in:
debug_hlsl_writemask: Assertion `!(writemask & ~VKD3DSP_WRITEMASK_ALL)' failed.
Which happens in allocate_variable_temp_register() when the variable's
type reg_size is <= 4 but its component count is larger, which may
happen if it contains objects.
We would like to generate SPIR-V for input formats other than DXBC.
The "vkd3d_" prefix is dropped, partly to make names shorter, and partly to help
clarify what is an internal function.
I prefer avoiding the vkd3d_* prefix on all internal functions, for these
reasons. However, I'm open to restoring it.
The function has far too many arguments, including multiple different arguments
with the same type. Use a structure for clarity and to avoid errors.
Merge hlsl_new_sample_lod() into hlsl_new_resource_load() accordingly.
This should silence warnings about some branches non returning any value
without requiring additional "return 0" statement or similar.
Also, in theory this might enable to compiler to optimize the program
a little bit more, though that's unlikely to have any measurable effect.
Also, TextureCube and TextureCubeArray don't support the offset
argument, so this check is updated here too.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Also, TextureCube and TextureCubeArray don't support the offset
argument, so this check is updated.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT (zero) is used as a temporal value
for elements_count for implicit size arrays.
This value is replaced by the correct one after parsing the initializer.
In case the implicit array is not initialized correctly, hlsl_error()
is called but the array size is kept at 0. So the rest of the code
must handle these cases.
In shader model 5.1, unlike in 5.0, declaring a multi-dimensional
object-type array with the last dimension implicit results in
an error. This happens even in presence of an initializer.
So, both gen_struct_fields() and declare_vars() first check if the
shader model is 5.1, the array elements are objects, and if there is
at least one implicit array size to handle the whole type as an
unbounded resource array.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
It is responsibility of the shader's programmer to ensure that
object references can be solved statically.
Resource arrays for ps_5_1 and vs_5_1 are an exception which is not
properly handled yet. They probably deserve a different object type.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Otherwise we get false in implicit_compatible_data_types() when passing
types that are equal but not convertible according to
convertible_data_type(); e.g. getting:
"Can't implicitly convert from Texture2D<float4> to Texture2D<float4>."
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Don't assume that enums and uint32_t parameters are identical. Clang
16 changes the diagonstic for incompatible function pointer types
from a warning into an error by default.
This fixes the following error, when built (for aarch64, but probably
also for other architectures) in MSVC mode:
../src/libs/vkd3d/libs/vkd3d-shader/spirv.c:1083:13: error: incompatible function pointer types passing 'uint32_t (struct vkd3d_spirv_builder *, uint32_t, SpvDim, uint32_t, uint32_t, uint32_t, uint32_t, SpvImageFormat)' (aka 'unsigned int (struct vkd3d_spirv_builder *, unsigned int, enum SpvDim_, unsigned int, unsigned int, unsigned int, unsigned int, enum SpvImageFormat_)') to parameter of type 'vkd3d_spirv_build7_pfn' (aka 'unsigned int (*)(struct vkd3d_spirv_builder *, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int)') [-Wincompatible-function-pointer-types]
vkd3d_spirv_build_op_type_image);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/libs/vkd3d/libs/vkd3d-shader/spirv.c:612:68: note: passing argument to parameter 'build_pfn' here
SpvOp op, const uint32_t *operands, vkd3d_spirv_build7_pfn build_pfn)
^
Signed-off-by: Martin Storsjö <martin@martin.st>
hlsl_new_store() and hlsl_new_load() are deleted, so now there are no more
direct ways to create derefs with offsets in hlsl.c and hlsl.h.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
This can be done now, to ensure that register offsets are no longer used
in hlsl.c and hlsl.h.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
At this point, the parse code is free of offsets; it only uses index
paths.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
The transform_deref_paths_into_offsets pass turns these index paths back
into register offsets.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
At this point add_load() is split into add_load_component() and
add_load_index(); register offsets are hidden for these functions.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Specifying R32 for UAVs created with a vector format, e.g. R32G32B32A32_FLOAT,
results in only the red being loaded/stored, potentially causing images to
contain only the red component.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
When using PE vkd3d through Wine, debug output may be swallowed by writing to
Win32 stderr. Avoid this by providing a way to hook up vkd3d log output to Wine
output.
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Otherwise we can get failed assertions:
assert(node->type == HLSL_IR_LOAD);
because broadcasts to these types are not implemented yet.
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
The current workaround is broken for texture cube arrays, which already have 4
components.
Sampling with the components not packed together apparently succeeds with newer
NVidia drivers, so just remove the workaround. Tested with 470.103.01 on a GTX
1060.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=52886
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
This commit includes work by Francisco Casas.
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
This commit includes work by Francisco Casas.
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
With this change it is possible to store booleans as 0xffffffff,
similarly as what happens at runtime.
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
Signed-off-by: Matteo Bruni <mbruni@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
This is especially a problem when e.g. it introduces a whitespace
before a #pragma directive, breaking shader compilation.
Signed-off-by: Matteo Bruni <mbruni@codeweavers.com>
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>