wine/vkd3d

mirror of https://gitlab.winehq.org/wine/vkd3d.git synced 2025-12-15 08:03:30 -08:00

Author	SHA1	Message	Date
Henri Verbeet	fe4143ad19	vkd3d-shader/dxil: Generate I/O signatures with 16-bit component types for native 16-bit shaders. Which incidentally matches the I/O signatures from the DXBC container.	2025-02-24 15:10:08 +01:00
Henri Verbeet	f5d702b09a	vkd3d-shader/dxbc: Validate component types in shader_parse_signature().	2025-02-24 15:10:08 +01:00
Henri Verbeet	b8d740ebfc	vkd3d-shader/dxbc: Output messages for invalid semantic name references in shader_parse_signature().	2025-02-24 15:10:08 +01:00
Henri Verbeet	3bcdb85ddc	vkd3d-shader/dxbc: Set the "elements_capacity" field as well in shader_parse_signature(). Leaving it as 0 mostly ends up doing the right thing in practice, but isn't quite right.	2025-02-24 15:10:08 +01:00
Giovanni Mascellani	2feb3a3bba	vkd3d: Take the root signature from shaders when creating graphics pipelines. If the root signature wasn't explicitly specified. This fixes a failure in The Touryst.	2025-02-20 16:00:55 +01:00
Henri Verbeet	0796af7b4b	vkd3d: Avoid vkd3d_shader_parse_input_signature().	2025-02-20 15:57:26 +01:00
Henri Verbeet	2e62e9ea7e	vkd3d-shader: Handle arrayed elements in vkd3d_shader_signature_from_shader_signature().	2025-02-20 15:57:26 +01:00
Henri Verbeet	4e28d1c658	vkd3d-shader/dxbc: Do not extract I/O signatures for DXIL shaders. The DXIL parser doesn't need them.	2025-02-20 15:57:26 +01:00
Henri Verbeet	f4a3d17269	vkd3d-shader/dxil: Avoid using the I/O signatures from the DXBC container. We currently generate our own I/O signatures inside the DXIL parser, but use the element counts from the DXBC container signatures to allocate the input_params/output_params/patch_constant_params arrays. That happens to work for well-behaved inputs, but it's asking for trouble.	2025-02-20 15:57:26 +01:00
Elizabeth Figura	d5a2ff5c12	vkd3d-shader/hlsl: Add a hlsl_block_add_int_constant() helper.	2025-02-20 15:56:31 +01:00
Elizabeth Figura	992d20def3	vkd3d-shader/hlsl: Add a hlsl_block_add_uint_constant() helper.	2025-02-20 15:50:13 +01:00
Elizabeth Figura	79ad8c9354	vkd3d-shader/hlsl: Handle error instructions in hlsl_new_swizzle(). We already check for error instructions when parsing swizzles, but if allocation fails at codegen time we would like to avoid asserting when subsequently constructing a swizzle.	2025-02-20 15:49:40 +01:00
Elizabeth Figura	4072aa4a4b	vkd3d-shader/hlsl: Remove the type equality assertions in hlsl_new_ternary_expr(). Similar to `d1c2ae3f0e`, this is a bit too strict and may prevent e.g. simultaneous use of float and float1 at codegen time. However, in this case the inciting factor is that in the case of allocation failure at codegen time, we would like to allow one or more arguments to have error type.	2025-02-20 15:48:25 +01:00
Elizabeth Figura	ba868ed4a6	vkd3d-shader/hlsl: Skip transformation passes on error. The primary motivation here is to avoid needing to worry about instructions potentially pointing to the preallocated error instruction in the case of allocation failure. This doesn't cover all passes, but none of the other passes make assumptions about instruction sources.	2025-02-20 15:48:24 +01:00
Francisco Casas	153b7c8460	vkd3d-shader/hlsl: Run folding passes again after lower_nonconstant_array_loads. This is because lower_nonconstant_array_loads() can potentially turn nonconstant loads into constant loads, allowing copy-prop to turn these loads into previous instructions, which might help other passes as well. This patch lowers the number of required temps for the following ps_2_0 shader from 19 to 16: int i; float3x3 mats[4]; float4 main() : sv_target { return mul(mats[i], float3(1, 2, 3)).xyzz; }	2025-02-20 15:44:09 +01:00
Francisco Casas	321fda9c26	vkd3d-shader/hlsl: Only use the temp copy for variables that are written. This can save a significant number of temp registers because it allows us to avoid referencing the temp (and having to store it) when not needed. For instance, this patch lowers the number of required temps for the following ps_2_0 shader from 24 to 19: int i; float3x3 mats[4]; float4 main() : sv_target { return mul(mats[i], float3(1, 2, 3)).xyzz; } Also, it is needed for SM1 vertex shader relative addressing since non-constant loads are required to be directly on the uniform ('c' registers) instead of the temp, and non-constant loads cannot be transformed by copy propagation.	2025-02-20 15:44:09 +01:00
Elizabeth Figura	8e6ddb0c1a	vkd3d-shader/hlsl: Don't mark extern variables with an explicit first_write or last_read. Fix the last few places that care.	2025-02-20 15:44:09 +01:00
Francisco Casas	1d74ff075e	vkd3d-shader/hlsl: Trace the number of registers allocated in allocate_temp_registers().	2025-02-20 15:44:04 +01:00
Nikolay Sivov	f830ac1206	vkd3d-shader/preproc: Do not attempt to load empty included files. Signed-off-by: Nikolay Sivov <nsivov@codeweavers.com>	2025-02-20 15:40:34 +01:00
Giovanni Mascellani	07b7975d09	vkd3d: Put all root descriptors in a single Vulkan descriptor set when using Vulkan heaps. Since `4a94bfc2f6` we segregate different D3D12 descriptor types in different Vulkan descriptor sets. This change was introduced to reduce descriptor wasting when allocating a new descriptor pool; that can be very useful when using virtual heaps, which have to often cycle through many descriptors, but it is expected to have limited impact for Vulkan heaps, given that in that case most descriptors are allocated through the descriptor heap rather than through the command allocator. Instead, it has a rather detrimental effect with Vulkan heaps, because it tends to use many more Vulkan descriptor sets than necessary, often with just a handful of descriptors each. This causes a regression on some Vulkan implementations that support too few descriptor sets. With this change we revert to a situation similar to before, stuffing all the descriptors that do not live in a root descriptor table in as few descriptor sets as possible (at most one or two, depending on whether push descriptors are used).	2025-02-19 17:58:23 +01:00
Giovanni Mascellani	6415c6b0e0	vkd3d: Rename push_descriptor_set to root_descriptor_set. Soon it won't be used necessarily for push descriptors anymore, but it will still contain root descriptors.	2025-02-19 17:57:15 +01:00
Giovanni Mascellani	a7337bc999	vkd3d: Require extension VK_KHR_maintenance2. We're already implicitly using it for image layouts in which either depth or stencil is writeable and the other is not. Correspondingly, add the _KHR suffix in those cases, so the extension usage is more evident. According to the Vulkan Hardware Database, only four reports without this extension were filed since 2023, and all of them for configurations we likely don't target.	2025-02-19 17:41:30 +01:00
Francisco Casas	3aecbc5ac1	vkd3d-shader/hlsl: Also dump preprocessed shaders. This could be useful since there are many shaders that contain `#include` directives or use parameter-defined macros and we can't reproduce bugs from the source alone.	2025-02-19 17:34:24 +01:00
Giovanni Mascellani	665c29f0be	vkd3d-shader/tpf: Allow I/O index ranges to not intersect a signature element for a given register. The current TPF validator enforces that for each register involved in a DCL_INDEX_RANGE instruction there must be a signature element for that register and the DCL_INDEX_RANGE write mask. This is an excessively strong requirement, and causes some shaders from The Falconeer to be incorrectly rejected. The excessively strong check was needed to avoid triggering a bug in the I/O normaliser. Since that bug is now solved, the check can be relaxed.	2025-02-19 17:30:25 +01:00
Giovanni Mascellani	4b84fb486b	vkd3d-shader/ir: Handle index ranges that do not touch a signature element for each register. A good part of the I/O normaliser job is to merge together signature elements that are spanned by DCL_INDEX_RANGE instructions. The current algorithm assumes that each index range touches exactly one signature element for each index spanned by the range. The assumption is used in shader_signature_merge() in the form of expecting that, if the index range is N registers long, then, once you find the first signature element of an index range, the other elements that will have to be merged with it are exactly the following N-1 according to the order given by signature_element_register_compare() or signature_element_mask_compare(), depending on the signature type. This doesn't necessarily happen. For example, The Falconeer has a few hull shaders in which this happens:
    hs_fork_phase
    dcl_hs_fork_phase_instance_count 13
    dcl_input vForkInstanceId
    dcl_output o4.z
    dcl_output o5.z
    dcl_output o6.z
    dcl_output o7.z
    dcl_output o12.z
    dcl_output o13.z
    dcl_output o14.z
    dcl_output o15.z
    dcl_output o16.z
    dcl_output o17.z
    dcl_output o18.z
    dcl_output o19.z
    dcl_output o20.z
    dcl_temps 1
    dcl_index_range o4.z 17
    iadd r0.x, vForkInstanceId.x, l(4)
    ult r0.y, vForkInstanceId.x, l(4)
    movc r0.x, r0.y, vForkInstanceId.x, r0.x
    mov o[r0.x + 4].z, l(0)
    ret
Here the index range "skips" o8.z through o11.z, because those registers only use mask .xy. The current algorithm fails on such a shader. Even depending on the signature element order doesn't look ideal. I don't have a full counterexample for that, but it looks fragile, especially given that the register allocation algorithm in FXC is notoriously full of unexpected corner cases.
We solve both problems by slightly changing the architecture of the normaliser: first we move the computation of the masks for the merged signature element from signature_element_range_expand_mask(), which is executed while merging signatures, to io_normaliser_add_index_range(), which is executed before merging signatures. Then, while we are merging signatures, we can decide for each single signature element whether it has to be retained or not, and how it should be patched. The algorithm becomes independent of the order, because each signature element can be processed individually.	2025-02-19 17:30:00 +01:00
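
The gap described in commit 4b84fb486b can be illustrated with a small C sketch. It is not vkd3d code: `struct sig_element`, `falconeer_outputs`, `range_touches_register()` and `count_untouched_registers()` are hypothetical names made up for this illustration, and the `.xy` entries for o8 through o11 are an assumption based on the commit message's statement that those registers only use mask `.xy`. The sketch checks each register of an index range individually, without relying on element order, and counts the registers for which no signature element intersects the range's write mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Write-mask bits, one per component. */
enum { MASK_X = 0x1, MASK_Y = 0x2, MASK_Z = 0x4, MASK_W = 0x8 };

/* Hypothetical, simplified stand-in for a signature element.
 * This is NOT vkd3d's struct signature_element. */
struct sig_element
{
    unsigned int register_index;
    uint8_t mask;
};

/* Outputs modelled on the hull shader quoted in the commit message:
 * o4..o7 and o12..o20 use .z; o8..o11 are assumed to use only .xy. */
const struct sig_element falconeer_outputs[] =
{
    {4, MASK_Z}, {5, MASK_Z}, {6, MASK_Z}, {7, MASK_Z},
    {8, MASK_X | MASK_Y}, {9, MASK_X | MASK_Y},
    {10, MASK_X | MASK_Y}, {11, MASK_X | MASK_Y},
    {12, MASK_Z}, {13, MASK_Z}, {14, MASK_Z}, {15, MASK_Z}, {16, MASK_Z},
    {17, MASK_Z}, {18, MASK_Z}, {19, MASK_Z}, {20, MASK_Z},
};
const size_t falconeer_output_count = sizeof(falconeer_outputs) / sizeof(falconeer_outputs[0]);

/* True if some element lives in the given register and shares at least
 * one component with the index range's write mask. */
bool range_touches_register(const struct sig_element *elements, size_t count,
        unsigned int register_index, uint8_t range_mask)
{
    size_t i;

    for (i = 0; i < count; ++i)
    {
        if (elements[i].register_index == register_index && (elements[i].mask & range_mask))
            return true;
    }
    return false;
}

/* Count the registers in [first_register, first_register + register_count)
 * that the index range spans without touching any signature element. */
unsigned int count_untouched_registers(const struct sig_element *elements, size_t count,
        unsigned int first_register, unsigned int register_count, uint8_t range_mask)
{
    unsigned int r, untouched = 0;

    assert(register_count > 0);
    for (r = first_register; r < first_register + register_count; ++r)
    {
        if (!range_touches_register(elements, count, r, range_mask))
            ++untouched;
    }
    return untouched;
}
```

For `dcl_index_range o4.z 17`, calling `count_untouched_registers(falconeer_outputs, falconeer_output_count, 4, 17, MASK_Z)` reports the four registers o8 through o11 as untouched, which is the case an "exactly the following N-1 elements" assumption cannot represent.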

5412 Commits