The generated Vulkan calls look right and do not trigger any
validation error, but the returned timestamp is 0. A valid
timestamp is returned if the CopyResource() call is commented,
or the second EndQuery() call is moved before CopyResource(),
or the first EndQuery() call is commented. I am not seeing any
sensible pattern here, so I guess there is just a bug in
MoltenVK.
Specifically, MoltenVK seems to be able to load from stencil, but
the specific replicating swizzle (repeating the stencil value on
all the channels) is not honored. The stencil value is read only
on the red channel.
The relative-addressed case in shader_register_normalise_arrayed_addressing()
leaves the control point id in idx[0], while for constant register
indices it is placed in idx[1]. The latter case could be fixed instead,
but placing the control point count in the outer dimension is more
logical.
The FXC optimiser sometimes converts a local array of input values into
direct array addressing of the inputs, which can result in a
dcl_indexrange instruction spanning input elements with different masks.
Apparently Metal doesn't support specifying a bias directly in the
sampler, and, with "nearest" mip filtering, it doesn't switch
precisely at LOD 0.5 (though still between 0.5 and 0.6).
In Shader Model 6 each signature element can span a range of register
indices, or 'rows', and system values do not share a register index with
non-system values. Inputs and outputs are referenced by element index
instead of register index. This patch merges multiple signature elements
into a single element under the following conditions:
- The register index in a load or store is specified dynamically by
including a relative address parameter with a base register index. The
dcl_index_range instruction is used to identify these.
- A register declaration is split across multiple elements which declare
different components of the register.
- A patch constant function writes tessellation factors. These are an
array in SPIR-V, but in SM 5.x each factor is declared as a separate
register, and these are dynamically indexed by the fork/join instance
id. Elimination of multiple fork/join phases converts the indices to
constants, but merging the signature elements into a single arrayed
element matches the SPIR-V output.
All references to input/output register indices are converted to element
indices. If a relative address is present, the element index is moved up
a slot so it cannot be confused with a constant offset. Existing code
only handles register index relative addressing for tessellation factors.
This patch adds generic support for it.
Some drivers (AMD Radeon RX 6700 XT, with radeonsi from Mesa 22.2.0-rc3) emit
less than one invocation per pixel, presumably because they detect that the
shader control flow is uniform for all pixels. Having the control flow depend on
SV_Position avoids this test failure.
Cf. 34bd0dd0704c613abef8a9aa3ba2a2507ed02843 in wine.
The expected use case where a heap is freed before its contained
resources is not reasonably testable, so the ability to place a new
resource is tested instead.
Until vkd3d-shader is patched, an atomic op on a typed buffer where
StorageImageReadWithoutFormat is available will cause SPIR-V validation
failure, and assertion in Mesa debug builds, because the image will be
declared with Unknown format.
Fixes reflections in Control appearing with only their red component.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=52146
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
This currently fails if the shader loads from the UAV, because it causes
vkd3d-shader to specify the R32f format instead of Unknown.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Move the "resource" field to a new "d3d12_resource_readback" structure
encapsulating struct resource_readback.
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Strictly increasing timeline values must be mapped to fence virtual values
to avoid invalid use of Vulkan timeline semaphores. In particular, non-
increasing values and value jumps of >= 4G are permitted in d3d12.
Different virtual D3D12 command queues may map to the same Vulkan queue.
If a wait of value N is submitted on one command queue, and then a signal
for >= N is submitted on another, but they are sent to the same Vk queue,
the wait will never complete. The solution is to buffer out-of-order waits
and any subsequent queue commands until an unblocking signal value is
submitted to a different D3D12 queue, or signaled on the CPU.
Buffering out-of-order waits also fixes the old fence implementation so it
is fully functional, though a bit less efficient than timeline semaphores.
Based in part on vkd3d-proton patches by Hans-Kristian Arntzen. Unlike the
vkd3d-proton implementation, this patch does not use worker threads for
submissions to the Vulkan queue.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>