Commit Graph

3876 Commits

Author SHA1 Message Date
Francisco Casas
321fda9c26 vkd3d-shader/hlsl: Only use the temp copy for variables that are written.
This can save a significant amount of temp registers because it allows to
avoid referencing the temp (and having to store it) when not needed.

For instance, this patch lowers the number of required temps for the
following ps_2_0 shader from 24 to 19:

    int i;
    float3x3 mats[4];

    float4 main() : sv_target
    {
        return mul(mats[i], float3(1, 2, 3)).xyzz;
    }

Also, it is needed for SM1 vertex shader relative addressing since
non-constant loads are required to be directly on the uniform ('c'
registers) instead of the temp, and non-constant loads cannot be
transformed by copy propagation.
2025-02-20 15:44:09 +01:00
Elizabeth Figura
8e6ddb0c1a vkd3d-shader/hlsl: Don't mark extern variables with an explicit first_write or last_read.
Fix the last few places that care.
2025-02-20 15:44:09 +01:00
Francisco Casas
1d74ff075e vkd3d-make/hlsl: Trace the number of registers allocated in allocate_temp_registers(). 2025-02-20 15:44:04 +01:00
Nikolay Sivov
f830ac1206 vkd3d-shader/preproc: Do not attempt to load empty included files.
Signed-off-by: Nikolay Sivov <nsivov@codeweavers.com>
2025-02-20 15:40:34 +01:00
Francisco Casas
3aecbc5ac1 vkd3d-shader/hlsl: Also dump preprocessed shaders.
This could be useful since there are many shaders that contain `#include`
directives or use parameter-defined macros and we can't reproduce bugs
from the source alone.
2025-02-19 17:34:24 +01:00
Giovanni Mascellani
665c29f0be vkd3d-shader/tpf: Allow I/O index ranges to not intersect a signature element for a given register.
The current TPF validator enforces that for each register involved
in a DCL_INDEX_RANGE instruction there must be a signature element
for that register and the DCL_INDEX_RANGE write mask. This is an
excessively strong request, and causes some shaders from The Falconeer
to be invalidly rejected.

The excessively strong check was needed to avoid triggering a bug
in the I/O normaliser. Since that bug is now solved, the check
can be relaxed.
2025-02-19 17:30:25 +01:00
Giovanni Mascellani
4b84fb486b vkd3d-shader/ir: Handle index ranges that do not touch a signature element for each register.
A good part of the I/O normaliser job is to merge together signature
elements that are spanned by DCL_INDEX_RANGE instructions. The
current algorithm assumes that each index range touches exactly
one signature element for each index spanned by the range. The
assumption is used in shader_signature_merge() in the form of
expecting that, if the index range is N registers long, then,
once you find the first signature element of an index range, the
other elements that will have to be merged with it are exactly the
following N-1 according to the order given by
signature_element_register_compare() or
signature_element_mask_compare(), depending on the signature type.

This doesn't necessarily happen. For example, The Falconeer has a
few hull shaders in which this happens:

    hs_fork_phase
    dcl_hs_fork_phase_instance_count 13
    dcl_input vForkInstanceId
    dcl_output o4.z
    dcl_output o5.z
    dcl_output o6.z
    dcl_output o7.z
    dcl_output o12.z
    dcl_output o13.z
    dcl_output o14.z
    dcl_output o15.z
    dcl_output o16.z
    dcl_output o17.z
    dcl_output o18.z
    dcl_output o19.z
    dcl_output o20.z
    dcl_temps 1
    dcl_index_range o4.z 17
    iadd r0.x, vForkInstanceId.x, l(4)
    ult r0.y, vForkInstanceId.x, l(4)
    movc r0.x, r0.y, vForkInstanceId.x, r0.x
    mov o[r0.x + 4].z, l(0)
    ret

Here the index range "skips" o8.z through o11.z, because those
registers only use mask .xy. The current algorithm fails on such
a shader.

Even depending on the signature element order doesn't look ideal.
I don't have a full counterexample for that, but it looks fragile,
especially given that the register allocation algorithm in FXC is
notoriously full of unexpected corner cases.

We solve both problems by slightly changing the architecture of
the normaliser: first we move computing the masks for the merge
signature element from signature_element_range_expand_mask(),
which is executed while merging signature, to
io_normaliser_add_index_range(), which is executed before merging
signatures. Then, while we are merging signatures, we can decide
for each single signature element whether it has to be retained
or not, and how it should be patched. The algorithm becomes
independent of the order, because each signature element can be
processed individually.
2025-02-19 17:30:00 +01:00
Giovanni Mascellani
b5350f9387 vkd3d-shader/ir: Use a structure to record range data in the I/O normaliser.
In order to make it able to have other fields.
2025-02-19 17:01:54 +01:00
Giovanni Mascellani
ec7bac7ba7 vkd3d-shader/ir: Report errors in the I/O normaliser instead of asserting.
The assumptions the I/O normaliser makes on its input program are
rather intricated. In theory the VSIR validator checks should be
strong enough, but the validator isn't run by default anyway.
Whether the TPF parser validation is strong enough is not
completely clear to me, and considering that the I/O normaliser
could end up being used on different programs as well it's
probably better to revalidate locally just in case.
2025-02-19 17:01:54 +01:00
Henri Verbeet
985d317e0e Release 1.15. 2025-02-19 12:00:00 +01:00
Henri Verbeet
872a96e599 vkd3d-shader/ir: Remove vForkInstanceId and vJoinInstanceId declarations in vsir_program_flatten_hull_shader_phases().
Commit 4717775abb removed the code turning
the corresponding DCL_INPUT instructions into NOPs, presumably because
vsir_program_remove_io_decls() now already removes them. That function
only removes the instructions though, not the declarations as such; we
now need to remove these from the "io_dcls" bitmap instead.
2025-02-19 10:44:18 +01:00
Elizabeth Figura
fe52e69662 vkd3d-shader/hlsl: Use a block in hlsl_normalize_binary_exprs(). 2025-02-05 13:53:53 +01:00
Elizabeth Figura
d7cd33fd88 vkd3d-shader/hlsl: Use a block in prepend_input_var_copy(). 2025-02-05 13:53:53 +01:00
Elizabeth Figura
b7d7deb983 vkd3d-shader/hlsl: Pass the block to add_zero_mipmap_level(). 2025-02-05 13:18:42 +01:00
Elizabeth Figura
602103dcf0 vkd3d-shader/hlsl: Handle error instructions in add_switch(). 2025-02-05 13:18:42 +01:00
Elizabeth Figura
fb290f3847 vkd3d-shader/hlsl: Add an add_switch() helper. 2025-02-05 13:18:42 +01:00
Shaun Ren
ec6b4ed4ff vkd3d-shader/hlsl: Generate vsir registers from patch variable derefs. 2025-02-03 16:36:16 +01:00
Shaun Ren
2a1e3b100b vkd3d-shader/hlsl: Allocate semantic registers for patch variables. 2025-02-03 16:15:11 +01:00
Shaun Ren
2ddbc69f1a vkd3d-shader/hlsl: Declare semantics for patch variables in vsir. 2025-02-03 16:15:03 +01:00
Shaun Ren
f127f0849e vkd3d-shader/hlsl: Generate vsir signature entries for patch variables. 2025-02-03 16:04:21 +01:00
Shaun Ren
aa29d0a2e5 vkd3d-shader/tpf: Improve readability of compat mapping in sm4_sysval_semantic_from_semantic_name(). 2025-02-03 16:04:21 +01:00
Shaun Ren
29abe73918 vkd3d-shader/hlsl: Implement input semantic variable copies for patch variables.
The semantic variables from a patch parameter are split as usual, with
the difference being that the semantic variable being added is a patch
variable itself, with the type being the split variable type, and its
number of control points being equal to the original patch variable's
number of control points. It is then stored in the original patch
variable as follows:

  for (i = 0; i < n; ++i)
      patch[i][f] := <inputpatch-sem-var>[i]

where n is the number of control points of "patch", and f is the field
index in patch corresponding to "<inputpatch-sem-var>".

We use special prefixes, "inputpatch-" or "outputpatch-", when adding
the semantic patch variables, in order to distinguish them from
non-patch semantic variables of the same name.
2025-02-03 16:04:11 +01:00
Shaun Ren
f5d216835a vkd3d-shader/hlsl: Add an "is_patch_constant_func" field to struct hlsl_ctx.
In anticipation of the need for is_patch_constant_func in
sm4_generate_vsir_reg_from_deref(), in order to generate vsir
registers from InputPatch/OutputPatch dereferences.
2025-02-03 16:00:38 +01:00
Elizabeth Figura
f1412e422c vkd3d-shader/hlsl: Handle error instructions in add_shader_compilation(). 2025-01-29 17:58:00 +01:00
Elizabeth Figura
fbd17266cf vkd3d-shader/hlsl: Do not abort on variable redefinition.
There is no harm in defining two variables with the same name.
2025-01-29 17:58:00 +01:00