449 Commits

Author SHA1 Message Date
Conor McCarthy
0b67481496 vkd3d: Validate tiled resources tier for 3D textures. 2023-06-27 22:33:58 +02:00
Conor McCarthy
1a0d85b8d6 vkd3d: Validate tiled resources support during reserved resource creation.
Check directly for Vulkan support because the D3D12 tiled resources
tier may in future be modified by a config option.
2023-06-27 22:33:57 +02:00
Conor McCarthy
f039c86aac vkd3d: Create smaller UAV-only descriptor pools in the allocator if Vulkan-backed heaps are enabled.
In this case d3d12_command_allocator_allocate_descriptor_set() is
only called for clearing UAVs. This helps on platforms with limited
descriptor maximum counts.
2023-05-08 20:22:02 +02:00
Conor McCarthy
5366ca7001 vkd3d: Synchronise concurrent descriptor heap binding by multiple command lists.
It is possible for multiple command lists to use the same heap, and
submit it simultaneously to multiple d3d12 queues.
2023-04-28 21:04:02 +02:00
Conor McCarthy
fa63da6030 vkd3d: Track all descriptor heaps bound during command list recording and flush their writes.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54895
2023-04-28 21:04:02 +02:00
Conor McCarthy
f50e53e7c9 vkd3d: Use atomic exchange for descriptor writes.
The descriptor component of struct d3d12_desc is replaced with a union
containing a pointer which can be swapped out using
InterlockedExchangePointer(). To make it safe to increment the refcount
of such an object it is necessary to cache freed objects. Elimination
of the descriptor mutexes on games which use multithreaded descriptor
writes nearly doubles framerate on recent hardware.
2023-04-25 22:20:15 +02:00
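
The pointer-swap idea in the message above can be sketched roughly as follows. This is a minimal illustration using C11 atomics rather than InterlockedExchangePointer(); the names view_sketch, desc_sketch, desc_sketch_write() and view_sketch_release() are hypothetical and do not come from vkd3d. The cache of freed objects mentioned in the message is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the refcounted object a descriptor points to. */
struct view_sketch
{
    atomic_int refcount;
    int payload;
};

/* Hypothetical descriptor slot: one atomically swappable pointer. */
struct desc_sketch
{
    _Atomic(struct view_sketch *) view;
};

/* Publish a new view without taking a mutex. The previous pointer (possibly
 * NULL) is returned, and the caller must release the reference it carried. */
static struct view_sketch *desc_sketch_write(struct desc_sketch *desc,
        struct view_sketch *view)
{
    if (view)
        atomic_fetch_add(&view->refcount, 1);
    return atomic_exchange(&desc->view, view);
}

/* Drop one reference and return the remaining count. In this sketch an
 * object that reaches zero is simply left alone; the commit describes
 * caching such objects instead of freeing them immediately. */
static int view_sketch_release(struct view_sketch *view)
{
    int previous = atomic_fetch_sub(&view->refcount, 1);

    assert(previous > 0);
    return previous - 1;
}
```

A writer swaps the pointer in a single atomic step, so concurrent writers to the same slot never observe a half-written descriptor; whichever thread receives the old pointer is the one responsible for releasing it.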
Conor McCarthy
e63201a7a3 vkd3d: Delay writing Vulkan descriptors until submitted to a queue.
Eliminates vk_sets_mutex. Performance on average may be lower until
the descriptor mutexes are replaced and Vulkan writes are buffered
to reduce thunk calls.
2023-04-25 22:20:09 +02:00
Conor McCarthy
505c8c5a2f vkd3d: Ensure descriptors are pointer aligned.
The descriptor structure contains pointer and size types.
2023-04-25 22:20:06 +02:00
Conor McCarthy
0526f232cd vkd3d: Support null address for SRV/UAV root descriptors. 2023-04-19 20:46:00 +02:00
Conor McCarthy
88667098eb vkd3d: Do not destroy a heap until its resource count is zero.
Fixes a crash on exit in Horizon Zero Dawn (which requires added SM 6.0 support).

Placed resources should hold a reference to their heap:
https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12device-createheap
2023-04-03 17:59:41 +02:00
Giovanni Mascellani
bb2fa97c33 vkd3d: Do not keep the CS queue locked while processing it.
d3d12_command_queue_flush_ops() can re-enter itself while processing signal
events. Since we don't use recursive mutexes, we currently have to check
some of the queue variables without holding the mutex, which is not safe.

This is solved by allowing the queue to release its mutex while it is
processing entries: when flushing, the queue is briefly locked, the
is_flushing flag is set, the queue content is copied away and the
queue is unlocked again. After having processed the entries, the
queue is locked again to check if something else was added in the
meantime. This is repeated until the queue is empty (or a wait operation
is blocking it).

This should also remove some latency when a thread pushes to the queue
while another one is processing it, but I didn't try to measure any
impact. While it is expected that with this patch the queue mutex
will be locked and unlocked more frequently, it should also remain
locked for less time, hopefully creating little contention.
2023-03-08 20:14:39 +01:00
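
The flush loop described in the message above might look roughly like the sketch below. It assumes a pthread mutex and a fixed-size queue of integers; queue_sketch and its functions are hypothetical names, not the vkd3d implementation, and the case of a wait operation blocking the queue is left out. A negative entry stands in for a signal event that queues more work and re-enters the flush.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_SKETCH_MAX 16

struct queue_sketch
{
    pthread_mutex_t mutex;
    bool is_flushing;
    int ops[QUEUE_SKETCH_MAX];
    size_t op_count;
    int processed_sum; /* Touched only by the flushing thread. */
};

static void queue_sketch_flush(struct queue_sketch *q);

static bool queue_sketch_push(struct queue_sketch *q, int op)
{
    bool ok;

    pthread_mutex_lock(&q->mutex);
    if ((ok = q->op_count < QUEUE_SKETCH_MAX))
        q->ops[q->op_count++] = op;
    pthread_mutex_unlock(&q->mutex);
    return ok;
}

/* Runs with the mutex released, so it may push and re-enter the flush. */
static void queue_sketch_process(struct queue_sketch *q, int op)
{
    if (op < 0)
    {
        queue_sketch_push(q, -op);
        queue_sketch_flush(q);
        return;
    }
    q->processed_sum += op;
}

static void queue_sketch_flush(struct queue_sketch *q)
{
    int local[QUEUE_SKETCH_MAX];
    size_t count, i;

    pthread_mutex_lock(&q->mutex);
    if (q->is_flushing)
    {
        /* Someone is already flushing; it will pick up new entries. */
        pthread_mutex_unlock(&q->mutex);
        return;
    }
    q->is_flushing = true;

    while (q->op_count)
    {
        /* Copy the content away, then unlock while processing it. */
        count = q->op_count;
        assert(count <= QUEUE_SKETCH_MAX);
        memcpy(local, q->ops, count * sizeof(*local));
        q->op_count = 0;
        pthread_mutex_unlock(&q->mutex);

        for (i = 0; i < count; ++i)
            queue_sketch_process(q, local[i]);

        /* Relock to check whether anything was added in the meantime. */
        pthread_mutex_lock(&q->mutex);
    }

    q->is_flushing = false;
    pthread_mutex_unlock(&q->mutex);
}
```

Because is_flushing is only read and written under the mutex, a nested call returns immediately instead of deadlocking on a non-recursive mutex, and the outer loop processes whatever the nested call queued.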
Giovanni Mascellani
8e087b0f17 vkd3d: Use a dedicated mutex to protect the blocked queues. 2023-02-13 22:16:44 +01:00
Zebediah Figura
898fc9e198 vkd3d: Fix checking for failure from SleepConditionVariableCS().
Fixes: 552926cfca64db45e9731f675c65a7214bfa6441
2023-02-07 22:15:06 +01:00
Giovanni Mascellani
552926cfca vkd3d: Do not allow synchronization primitives to fail.
In practice they never fail. If they fail, it means that there
is some underlying platform problem and there is little we can do
anyway. Under pthreads function prototypes allow returning failure,
but that's only used for "error checking" mutexes, which we
don't use.

On the other hand, error handling in vkd3d is rather inconsistent:
sometimes the errors are ignored, sometimes logged, sometimes
passed to the caller. It's hard to handle failures appropriately
if you can't even keep your state consistent, so I think it's
better to avoid trying, assume that synchronization primitives do
not fail and at least have consistent logging if something goes
wrong.
2023-02-02 20:51:27 +01:00
Conor McCarthy
3db509383b vkd3d: Store a heap array index in each CBV/SRV/UAV descriptor.
A pointer to the containing descriptor heap can be derived from this
information.

PE build of vkd3d uses Windows critical sections for synchronisation,
and these slow down on the very high lock/unlock rate during multithreaded
descriptor copying in Shadow of the Tomb Raider. This patch speeds up the
demo by about 8%. By comparison, using SRW locks in the allocators and
locking them for read only where applicable is about 4% faster.
2023-01-25 22:10:01 +01:00
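
As a rough illustration of how a stored index can lead back to the containing heap, the sketch below steps back from a descriptor to the first element of the array and then subtracts the array's offset within the heap structure. The structures and the fixed array size are hypothetical and much simpler than the real vkd3d types.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SKETCH_DESC_COUNT 4

/* Hypothetical descriptor that records its own position in the heap array. */
struct desc_entry_sketch
{
    unsigned int index;
    int payload;
};

/* Hypothetical heap that embeds its descriptor array. */
struct heap_sketch
{
    int id;
    struct desc_entry_sketch descriptors[HEAP_SKETCH_DESC_COUNT];
};

static void heap_sketch_init(struct heap_sketch *heap, int id)
{
    unsigned int i;

    heap->id = id;
    for (i = 0; i < HEAP_SKETCH_DESC_COUNT; ++i)
    {
        heap->descriptors[i].index = i;
        heap->descriptors[i].payload = 0;
    }
}

/* Derive the heap pointer from a descriptor without any lookup or lock. */
static struct heap_sketch *heap_sketch_from_desc(struct desc_entry_sketch *desc)
{
    struct desc_entry_sketch *first;

    assert(desc->index < HEAP_SKETCH_DESC_COUNT);
    first = desc - desc->index;
    return (struct heap_sketch *)((char *)first
            - offsetof(struct heap_sketch, descriptors));
}
```

The lookup is pure pointer arithmetic, which is why it avoids the lock/unlock traffic the message attributes to the previous approach.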
Brendan Shanks
963ea98a52 vkd3d-common: Add a Windows implementation of vkd3d_set_thread_name(). 2022-10-25 21:25:38 +02:00
Giovanni Mascellani
4112c36076 vkd3d: Do not store the latch bit in an object that could be overwritten.
Once an event is signaled, the corresponding struct vkd3d_waiting_event
entry is considered dead and could be overwritten, so it's not safe to
keep a pointer to it in d3d12_fence_SetEventOnCompletion(). Instead,
keep the latch bit in d3d12_fence_SetEventOnCompletion() and put a
pointer to it in struct vkd3d_waiting_event.
2022-08-09 22:14:30 +02:00
Conor McCarthy
4afe69d04a vkd3d: Send typed UAV unknown format read support info to vkd3d-shader.
Fixes reflections in Control appearing with only their red component.

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=52146
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2022-08-09 22:14:28 +02:00
Giovanni Mascellani
5168929edc vkd3d: Remove unused field fence_destruction_cond. 2022-08-08 18:55:22 +02:00
Conor McCarthy
8cae046803 vkd3d: Map timeline semaphore values to fence virtual values and buffer out-of-order waits.
Strictly increasing timeline values must be mapped to fence virtual values
to avoid invalid use of Vulkan timeline semaphores. In particular, non-
increasing values and value jumps of >= 4G are permitted in d3d12.

Different virtual D3D12 command queues may map to the same Vulkan queue.
If a wait of value N is submitted on one command queue, and then a signal
for >= N is submitted on another, but they are sent to the same Vk queue,
the wait will never complete. The solution is to buffer out-of-order waits
and any subsequent queue commands until an unblocking signal value is
submitted to a different D3D12 queue, or signaled on the CPU.

Buffering out-of-order waits also fixes the old fence implementation so it
is fully functional, though a bit less efficient than timeline semaphores.

Based in part on vkd3d-proton patches by Hans-Kristian Arntzen. Unlike the
vkd3d-proton implementation, this patch does not use worker threads for
submissions to the Vulkan queue.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-05-13 18:20:10 +02:00
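
One simplified way to picture the mapping described above: every signal is assigned a fresh, strictly increasing timeline value regardless of the D3D12 value it carries, and a wait is translated by finding a recorded signal that satisfies it. The sketch below makes that assumption explicit; fence_sketch and its functions are hypothetical, the table is a small fixed array, and a wait with no satisfying signal is only reported as out of order rather than actually buffered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FENCE_SKETCH_MAX 8

struct fence_sketch
{
    uint64_t next_timeline_value;
    struct
    {
        uint64_t virtual_value;  /* Value as seen by the D3D12 application. */
        uint64_t timeline_value; /* Strictly increasing semaphore value. */
    } signals[FENCE_SKETCH_MAX];
    size_t signal_count;
};

/* Record a signal and return its timeline value, or 0 if the table is full.
 * Lower or wildly larger virtual values still get the next timeline value. */
static uint64_t fence_sketch_signal(struct fence_sketch *f, uint64_t virtual_value)
{
    size_t i = f->signal_count;

    if (i == FENCE_SKETCH_MAX)
        return 0;
    f->signals[i].virtual_value = virtual_value;
    f->signals[i].timeline_value = ++f->next_timeline_value;
    f->signal_count = i + 1;
    return f->signals[i].timeline_value;
}

/* Translate a wait. Returns false when no submitted signal can satisfy it,
 * in which case the wait is out of order and would have to be buffered. */
static bool fence_sketch_map_wait(const struct fence_sketch *f,
        uint64_t virtual_value, uint64_t *timeline_value)
{
    size_t i;

    assert(f && timeline_value);
    for (i = 0; i < f->signal_count; ++i)
    {
        if (f->signals[i].virtual_value >= virtual_value)
        {
            *timeline_value = f->signals[i].timeline_value;
            return true;
        }
    }
    return false;
}
```

In this sketch a signal of 3 after a signal of 10 still advances the timeline, so the semaphore never sees a non-increasing value, and a jump of 4G or more in the virtual value costs only one timeline step.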
Conor McCarthy
07e38212ec vkd3d: Replace the signaled semaphore list with a resizable array.
Order does not need to be preserved here, and another function will add
to this array when mapped timeline semaphores are implemented.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-05-02 20:19:35 +02:00
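
Since order does not need to be preserved, removal from such an array can be done in constant time by moving the last element into the vacated slot. The following is a minimal sketch of that pattern with hypothetical names and plain int elements; it does not use vkd3d's own array helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct sem_array_sketch
{
    int *items;
    size_t count, capacity;
};

/* Append an item, doubling the capacity when the array is full. */
static bool sem_array_sketch_push(struct sem_array_sketch *a, int item)
{
    if (a->count == a->capacity)
    {
        size_t new_capacity = a->capacity ? a->capacity * 2 : 4;
        int *new_items = realloc(a->items, new_capacity * sizeof(*new_items));

        if (!new_items)
            return false;
        a->items = new_items;
        a->capacity = new_capacity;
    }
    a->items[a->count++] = item;
    return true;
}

/* Remove one item by overwriting it with the last one. Order is not kept. */
static void sem_array_sketch_remove(struct sem_array_sketch *a, size_t index)
{
    assert(index < a->count);
    a->items[index] = a->items[--a->count];
}
```

Compared with a linked list, this keeps the elements contiguous and needs no per-element allocation.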
Conor McCarthy
488722b9b5 vkd3d: Create one fence worker thread per command queue.
Simplifies the handling of GPU waits, and in vkd3d-proton is reported
to increase performance when support for multiple Vulkan queues is
enabled, because it avoids the problem of fences being signaled while
they sit in the pending buffer waiting to be moved to the wait buffer.

Based on a vkd3d-proton patch by Philip Rebohle.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-05-02 20:19:32 +02:00
Conor McCarthy
34e7b87966 vkd3d: Introduce an internal refcount to d3d12_fence to replace the thread waiting mechanism.
Simplifies the preservation of fence objects until worker threads are
done with them, and will be needed when threaded queue submission is
added.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-05-02 20:19:29 +02:00
Conor McCarthy
22d8665300 vkd3d: Use Vulkan timeline semaphores for D3D12 fences.
D3D12 supports signalling a fence to a lower value, while Vulkan timeline
semaphores do not. On the GPU side this is handled by simply submitting
the signal anyway, if a test for this passes on device creation, because
working around this is impractical. For CPU signals the Vulkan semaphore
is replaced with a new one at the lower value only if no waits and/or
signals are pending on the GPU. Otherwise, a fixme is emitted.

Partly based on a vkd3d-proton patch by Hans-Kristian Arntzen (not
including the handling of lower fence values).

The old implementation is used if KHR_timeline_semaphore is not
available or GPU signals do not work for a lower value.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-24 19:47:24 +01:00
Conor McCarthy
0627462192 vkd3d: Use Vulkan null descriptors if EXT_robustness2 is available.
This implements all remaining unsupported image view dimensions and saves
a small amount of resources because null buffers and images are no longer
needed. It matches the D3D12 requirement that all reads return zero,
which is not strictly true of the existing implementation using resources
of small but non-zero size. Warnings on null view creation are silenced
because there should no longer be a difference from D3D12 behaviour.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
2022-03-24 19:46:13 +01:00