Commit Graph

2414 Commits

Author SHA1 Message Date
fa971f32bc vkd3d-shader/hlsl: Write 'exp' instructions for SM1. 2023-01-26 21:52:25 +01:00
335f741630 vkd3d-shader/hlsl: Add a helper to write per-component unary instructions. 2023-01-26 21:52:24 +01:00
f33ca836d7 vkd3d-shader/hlsl: Make single-component swizzles retrieve a scalar. 2023-01-26 21:52:18 +01:00
d2f8a576a8 vkd3d-shader/hlsl: Avoid infinite loop and invalid derefs in copy-prop.
Co-authored-by: Francisco Casas <fcasas@codeweavers.com>
Co-authored-by: Zebediah Figura <zfigura@codeweavers.com>

Because copy_propagation_transform_object_load() replaces a deref
instead of an instruction, it is currently prone to two problems:

1- It can replace a deref with the same deref, returning true every
time and getting the compilation stuck in an endless loop of
copy-propagation iterations.

2- When performed multiple times in the same deref, the second time it
can replace the deref with a deref from a temp that is only valid in
another point of the program execution, resulting in an incorrect value.

This patch preempts this by avoiding replacing derefs when the new deref
doesn't point to a uniform variable. Because, uniform variables cannot
be written to.
2023-01-26 21:52:07 +01:00
653cc02f4c vkd3d-shader/hlsl: Write SM4 thread ID registers. 2023-01-25 22:47:46 +01:00
404a2d6a3d vkd3d-shader/hlsl: Reinterpret minimum precision types as their regular counterparts.
Reinterpret min16float, min10float, min16int, min12int, and min16uint
as their regular counterparts: float, float, int, int, uint,
respectively.

A proper implementation would require adding minimum precision
indicators to all the dxbc-tpf instructions that use these types.
Consider the output of fxc 10.1 with the following shader:

    uniform int i;

    float4 main() : sv_target
    {
        min16float4 a = {0, 1, 2, i};
        min16int2 b = {4, i};
        min10float3 c = {6.4, 7, i};
        min12int d = 9.4;
        min16uint4x2 e = {14.4, 15, 16, 17, 18, 19, 20, i};

        return mul(e, b) + a + c.xyzx + d;
    }

However, if the graphics driver doesn't have minimum precision support,
it ignores the minimum precision indicators and runs at 32-bit
precision, which is equivalent as working with regular types.
2023-01-25 22:10:23 +01:00
3c23e1713c vkd3d-shader/hlsl: Implement sqrt() for SM1. 2023-01-25 22:10:15 +01:00
b84b9349bf vkd3d-shader/hlsl: Handle RSQ output for SM1. 2023-01-25 22:10:13 +01:00
3e6fccdbf9 vkd3d-shader/hlsl: Support frac() intrinsic.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=34242
Signed-off-by: Nikolay Sivov <nsivov@codeweavers.com>
2023-01-25 22:10:05 +01:00
3db509383b vkd3d: Store a heap array index in each CBV/SRV/UAV descriptor.
A pointer to the containing descriptor heap can be derived from this
information.

PE build of vkd3d uses Windows critical sections for synchronisation,
and these slow down on the very high lock/unlock rate during multithreaded
descriptor copying in Shadow of the Tomb Raider. This patch speeds up the
demo by about 8%. By comparison, using SRW locks in the allocators and
locking them for read only where applicable is about 4% faster.
2023-01-25 22:10:01 +01:00
d14f42be9d vkd3d-shader/spirv: Pass a parser pointer to spirv_compiler_generate_spirv(). 2023-01-24 18:11:16 +01:00
2a5ae0a8c6 vkd3d-shader/sm4: Use the instruction array interface in compile_dxbc_tpf(). 2023-01-24 18:11:14 +01:00
2d3f05184f vkd3d-shader/glsl: Use the instruction array interface in vkd3d_glsl_generator_generate(). 2023-01-24 18:11:13 +01:00
2559d622de vkd3d-shader: Use the instruction array interface in scan_with_parser(). 2023-01-24 18:11:12 +01:00
e9a2642d6a vkd3d-shader/trace: Use the instruction array interface in vkd3d_dxbc_binary_to_text(). 2023-01-24 18:11:10 +01:00
e8cb90608d vkd3d-shader: Initialise the instruction array in vkd3d_shader_parser_init(). 2023-01-24 18:11:10 +01:00
a9aaa59df0 vkd3d-shader/sm4: Store parsed instructions in an array. 2023-01-24 18:11:08 +01:00
007f894b94 vkd3d-shader/sm1: Store parsed instructions in an array. 2023-01-24 18:11:06 +01:00
6b82ba9488 vkd3d-shader/hlsl: Fold swizzle chains. 2023-01-24 18:10:53 +01:00
b7d34e8307 vkd3d-shader/hlsl: Apply copy propagation to swizzled loads. 2023-01-24 18:10:50 +01:00
18adf0d726 vkd3d-shader/hlsl: Use aoffimmis when writing gather resource loads.
If the offset of a gather resource load can be represented as an
aoffimmi (vectori of ints from -8 to 7), use one.
This is of particular importance for 4.0 profiles, where this is the only
valid way of representing offsets for this operation.
2023-01-24 18:10:49 +01:00
c2a7a40d3a vkd3d-shader/hlsl: Replace loads with constants in copy prop.
If a hlsl_ir_load loads a variable whose components are stored from different
instructions, copy propagation doesn't replace it.

But if all these instructions are constants (which currently is the case
for value constructors), the load could be replaced with a constant value.
Which is expected in some other instructions, e.g. texel_offsets when
using aoffimmi modifiers.

For instance, this shader:

```
sampler s;
Texture2D t;

float4 main() : sv_target
{
    return t.Gather(s, float2(0.6, 0.6), int2(0, 0));
}
```

results in the following IR before applying the patch:
```
  float | 6.00000024e-01
  float | 6.00000024e-01
   uint | 0
        | = (<constructor-2>[@4].x @2)
   uint | 1
        | = (<constructor-2>[@6].x @3)
 float2 | <constructor-2>
    int | 0
    int | 0
   uint | 0
        | = (<constructor-5>[@11].x @9)
   uint | 1
        | = (<constructor-5>[@13].x @10)
   int2 | <constructor-5>
 float4 | gather_red(resource = t, sampler = s, coords = @8, offset = @15)
        | return
        | = (<output-sv_target0> @16)
```

and this IR afterwards:
```
 float2 | {6.00000024e-01 6.00000024e-01 }
   int2 | {0 0 }
 float4 | gather_red(resource = t, sampler = s, coords = @2, offset = @3)
        | return
        | = (<output-sv_target0> @4)
```
2023-01-24 18:10:45 +01:00
8c2b8ff245 vkd3d-shader/hlsl: Synthesize the swizzle and replace the instruction inside of copy_propagation_compute_replacement().
Rename it to copy_propagation_replace_with_single_instr() accordingly.

The idea is to introduce a constant vector replacement pass which will do the
same thing.
2023-01-24 18:10:41 +01:00
5d34790402 vkd3d-shader/hlsl: Call copy_propagation_get_value() directly in copy_propagation_transform_object_load().
copy_propagation_compute_replacement() is not doing very much for us, and
conceptually is a bit of an odd fit anyway, since it's meant to deal with
multi-component types.
2023-01-24 18:10:40 +01:00
8fd30aa87d vkd3d-shader/hlsl: Add some swizzle manipulation definitions. 2023-01-24 18:10:39 +01:00