Commit Graph

407 Commits

Author SHA1 Message Date
Conor McCarthy
e63201a7a3 vkd3d: Delay writing Vulkan descriptors until submitted to a queue.
Eliminates vk_sets_mutex. Performance on average may be lower until
the descriptor mutexes are replaced and Vulkan writes are buffered
to reduce thunk calls.
2023-04-25 22:20:09 +02:00
Zebediah Figura
dea212688a vkd3d: Remove a double space in a trace message. 2023-02-23 21:46:49 +01:00
Giovanni Mascellani
8e087b0f17 vkd3d: Use a dedicated mutex to protect the blocked queues. 2023-02-13 22:16:44 +01:00
Giovanni Mascellani
552926cfca vkd3d: Do not allow synchronization primitives to fail.
In practice they never fail. If they fail, it means that there
is some underlying platform problem and there is little we can do
anyway. Under pthreads function prototypes allow returning failure,
but that's only used for "error checking" mutexes, which we
don't use.

On the other hand, error handling in vkd3d is rather inconsistent:
sometimes the errors are ignored, sometimes logged, sometimes
passed to the caller. It's hard to handle failures appropriately
if you can't even keep your state consistent, so I think it's
better to avoid trying, assume that synchronization primitives do
not fail and at least have consistent logging if something goes
wrong.
2023-02-02 20:51:27 +01:00
Conor McCarthy
3db509383b vkd3d: Store a heap array index in each CBV/SRV/UAV descriptor.
A pointer to the containing descriptor heap can be derived from this
information.

PE build of vkd3d uses Windows critical sections for synchronisation,
and these slow down on the very high lock/unlock rate during multithreaded
descriptor copying in Shadow of the Tomb Raider. This patch speeds up the
demo by about 8%. By comparison, using SRW locks in the allocators and
locking them for read only where applicable is about 4% faster.
2023-01-25 22:10:01 +01:00
Zebediah Figura
27a6963d6a vkd3d: Avoid an unused variable warning when building for Win32. 2022-09-27 20:14:35 +02:00
Conor McCarthy
4afe69d04a vkd3d: Send typed UAV unknown format read support info to vkd3d-shader.
Fixes reflections in Control appearing with only their red component.

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=52146
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2022-08-09 22:14:28 +02:00
Conor McCarthy
971ab01add vkd3d: Check specific formats for typed UAV load feature support.
Vulkan's shaderStorageImageExtendedFormats includes more formats than are
required by D3D12.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2022-08-09 22:14:28 +02:00
Conor McCarthy
8cae046803 vkd3d: Map timeline semaphore values to fence virtual values and buffer out-of-order waits.
Strictly increasing timeline values must be mapped to fence virtual values
to avoid invalid use of Vulkan timeline semaphores. In particular, non-
increasing values and value jumps of >= 4G are permitted in d3d12.

Different virtual D3D12 command queues may map to the same Vulkan queue.
If a wait of value N is submitted on one command queue, and then a signal
for >= N is submitted on another, but they are sent to the same Vk queue,
the wait will never complete. The solution is to buffer out-of-order waits
and any subsequent queue commands until an unblocking signal value is
submitted to a different D3D12 queue, or signaled on the CPU.

Buffering out-of-order waits also fixes the old fence implementation so it
is fully functional, though a bit less efficient than timeline semaphores.

Based in part on vkd3d-proton patches by Hans-Kristian Arntzen. Unlike the
vkd3d-proton implementation, this patch does not use worker threads for
submissions to the Vulkan queue.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-05-13 18:20:10 +02:00
Conor McCarthy
488722b9b5 vkd3d: Create one fence worker thread per command queue.
Simplifies the handling of GPU waits, and in vkd3d-proton is reported
to increase performance when support for multiple Vulkan queues is
enabled, because it avoids the problem of fences being signaled while
they sit in the pending buffer waiting to be moved to the wait buffer.

Based on a vkd3d-proton patch by Philip Rebohle.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-05-02 20:19:32 +02:00
Conor McCarthy
6b893b597b vkd3d: Prevent a null pointer dereference when a descriptor is not a UAV.
Fixes crashes in Shadow of the Tomb Raider, GRID 2019 and probably others.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-04-25 22:12:04 +02:00
Zebediah Figura
a58e713314 include: Move vkd3d_dl*() helpers to vkd3d_common.h.
Signed-off-by: Zebediah Figura <zfigura@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-04-20 16:41:37 +02:00
Conor McCarthy
22d8665300 vkd3d: Use Vulkan timeline semaphores for D3D12 fences.
D3D12 supports signalling a fence to a lower value, while Vulkan timeline
semaphores do not. On the GPU side this is handled by simply submitting
the signal anyway, if a test for this passes on device creation, because
working around this is impractical. For CPU signals the Vulkan semaphore
is replaced with a new one at the lower value only if no waits and/or
signals are pending on the GPU. Otherwise, a fixme is emitted.

Partly based on a vkd3d-proton patch by Hans-Kristian Arntzen (not
including the handling of lower fence values).

The old implementation is used if KHR_timeline_semaphore is not
available or GPU signals do not work for a lower value.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-24 19:47:24 +01:00
Conor McCarthy
0627462192 vkd3d: Use Vulkan null descriptors if EXT_robustness2 is available.
This implements all remaining unsupported image view dimensions and saves
a small amount of resources because null buffers and images are no longer
needed. It matches the D3D12 requirement that all reads return zero,
which is not strictly true of the existing implementation using resources
of small but non-zero size. Warnings on null view creation are silenced
because there should no longer be a difference from D3D12 behaviour.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-24 19:46:13 +01:00
Conor McCarthy
f34168481d vkd3d: Remove an invalid NULL check.
The pointer is never NULL.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-23 18:40:58 +01:00
Conor McCarthy
ae2219a7f7 vkd3d: Do not enable Vulkan-backed descriptor heaps if required update-after-bind features are missing.
descriptorBindingUniformBufferUpdateAfterBind is false for Intel Skylake
(and maybe others).

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-23 18:40:56 +01:00
Conor McCarthy
2b0fd2a055 vkd3d: Do not copy descriptors having identical views.
Improves performance in Control, which copies large numbers of descriptors
per frame where often only ~10% are not identical.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-18 12:40:02 +01:00
Conor McCarthy
5e4f1e1ead vkd3d: Optimise descriptor copying for Vulkan-backed heaps.
Source descriptors are copied to separate arrays to facilitate use of
pre-initialised Vulkan structures, and allow arrayed writes where
possible.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-18 12:40:02 +01:00
Conor McCarthy
2b71ea406f vkd3d: Back descriptor heaps with Vulkan descriptor sets if descriptor indexing is available.
The existing implementation using virtual descriptor heaps, where Vk
descriptor sets are created for the bindings in the root descriptor tables,
is inefficient when multiple command lists are used with large descriptor
heaps. It also cannot support updating a descriptor set after it is bound.

This patch creates Vk sets for each D3D12 heap. Because D3D12 heaps
can contain CBV, SRV and UAV descriptors in the same heap, multiple Vk sets
are needed for each heap, however the total number of populated descriptors
is never more than (heap size + UAV counter count).

A new 'virtual_heaps' config option is introduced to make the old
implementation available when needed. It's not always possible to determine
if this is necessary when the device is created.

Up to nine Vk descriptor sets may be used. It's theoretically possible to
reduce this to eight by placing immutable samplers in the push descriptor
set layout, but contradictions in earlier versions of the Vulkan spec made
driver support inconsistent. The documentation was corrected in version
1.2.203.

This patch also adds support for UAV counter descriptor arrays. It's not
practical to add this in a separate patch due to complications with
combining the old UAV counter implementation with the new descriptor heap
implementation.

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=47713
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=47154
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-18 12:38:53 +01:00
Francois Gouget
419c746806 vkd3d: Fix the spelling of a couple of trace message.
Signed-off-by: Francois Gouget <fgouget@free.fr>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-07 16:31:28 +01:00
Conor McCarthy
65e353d5df vkd3d: Use device descriptor limits when creating descriptor pools.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-02-21 20:32:22 +01:00
Alexandre Julliard
c78174f004 vkd3d: Add a create thread implementation for Windows.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
2022-02-07 17:33:26 +01:00
Alexandre Julliard
129b0be7ac vkd3d: Add inline wrappers for the pthread synchronization functions.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
2022-02-04 16:46:03 +01:00
Conor McCarthy
ecb854c6c1 vkd3d: Add DXGI_FORMAT_UNKNOWN to the array of vkd3d_format objects.
This results in a valid format instead of NULL being returned for
buffers and any other case where DXGI_FORMAT_UNKNOWN is specified.
In some cases invalid use of a buffer or DXGI_FORMAT_UNKNOWN will
not result in E_INVALIDARG, and would need to be tested explicitly
if proven to be an issue.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-01-19 17:33:42 +01:00
Conor McCarthy
66bc2672a1 vkd3d: Implement ID3D12CommandQueue_GetClockCalibration().
Extends vkd3d_instance_create_info with struct vkd3d_host_time_domain_info
to allow host ticks per second to be changed from the default 10000000.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-01-18 09:22:56 +01:00