Add basic DX12 Work Graph support.
For this first pass there is no exposed RHI functionality for directly dispatching a work graph. Instead, shader bundles have been extended to support a work graph based implementation.
Nanite compute materials can now use work graph shader bundles on D3D12 when r.Nanite.AllowWorkGraphMaterials and r.Nanite.Bundle.Shading are both set. Both currently default to off.
DataDrivenPlatformInfo now also exposes bSupportsWorkGraphs. This is currently false everywhere, but will be enabled for D3D12_SM6 as soon as the latest DXC shader compiler with lib_6_8 support has been submitted.
#rb Kenzo.Terelst, Yuriy.ODonnell
[CL 32196717 by jeremy moore in ue5-main branch]
While the mobile team wants to keep independent samples disabled for PCD3D_ES3_1, it makes sense to move this into the DDSPI configuration as a platform-dependent feature.
#rb brian.white, christopher.waters, Yuriy.ODonnell
[FYI] Dmitriy.Dyomin, Carl.Lloyd
#rnx
[CL 31764896 by laura hermanns in ue5-main branch]
- Some static variables made this tricky, so added some support to make things simpler for users of those static variables.
#rb David.Harvey
[CL 31431441 by josh adams in ue5-main branch]
- D3D12 PC Bindless needs descriptor heaps managed on the CPU; we cannot update them on the GPU timeline.
- Each context now has a FD3D12ContextBindlessState that contains the per-context GPU descriptor heap as well as descriptor rollbacks to apply to the heap before submission.
- We have to roll descriptors back to their values before the incoming view updates were applied so that we can leverage the CPU heap copy at all times. This isn't deferring the updates; it's storing the values before the updates and making sure the heap is used with those values.
- When a context encounters a draw/dispatch and there were any descriptor updates, the previously used heap is updated with the correct set of descriptors before a new heap is created for the subsequent draws/dispatches.
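The rollback scheme described above can be sketched roughly as follows. This is a simplified single-context model, not the D3D12 RHI code: `BindlessContextSketch`, `Heap`, `Descriptor` and every member name are hypothetical, descriptors are modelled as plain integers, and a "GPU heap" is just a vector that gets its contents when it is finalized. It only illustrates that updates hit the CPU heap immediately while the old values are remembered, so the heap used by earlier draws can be built from the CPU copy and then rolled back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a descriptor is an integer, a heap is a vector.
using Descriptor = std::uint32_t;
using Heap = std::vector<Descriptor>;

class BindlessContextSketch
{
public:
    explicit BindlessContextSketch(Heap& InCpuHeap) : CpuHeap(InCpuHeap) {}

    // A view update is applied to the CPU heap right away. The value the
    // slot held before the first pending update is kept as a rollback.
    void UpdateDescriptor(std::size_t Slot, Descriptor NewValue)
    {
        assert(Slot < CpuHeap.size());
        Rollbacks.emplace(Slot, CpuHeap[Slot]); // no-op if already recorded
        CpuHeap[Slot] = NewValue;
    }

    // Returns the index of the heap this draw/dispatch will use.
    std::size_t Draw()
    {
        if (!Rollbacks.empty())
        {
            if (bCurrentHeapUsed)
            {
                FinalizeCurrentHeap(); // earlier draws need the old values
            }
            else
            {
                Rollbacks.clear();     // nothing has used the old values
            }
        }
        bCurrentHeapUsed = true;
        return FinalizedHeaps.size();
    }

    // Before submission the last heap in use is finalized as well.
    void Submit()
    {
        if (bCurrentHeapUsed)
        {
            FinalizeCurrentHeap();
        }
        Rollbacks.clear();
    }

    const std::vector<Heap>& GetFinalizedHeaps() const { return FinalizedHeaps; }

private:
    // Copy the CPU heap, then restore the pre-update values on the copy.
    void FinalizeCurrentHeap()
    {
        Heap Copy = CpuHeap;
        for (const auto& Entry : Rollbacks)
        {
            Copy[Entry.first] = Entry.second;
        }
        FinalizedHeaps.push_back(std::move(Copy));
        Rollbacks.clear();
        bCurrentHeapUsed = false;
    }

    Heap& CpuHeap;                               // shared, always up to date
    std::map<std::size_t, Descriptor> Rollbacks; // slot -> value before update
    std::vector<Heap> FinalizedHeaps;            // heaps ready for submission
    bool bCurrentHeapUsed = false;
};
```

In this model, a draw followed by two updates to the same slot and a second draw yields two heaps: the first holds the slot's original value and the second holds the latest one.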
Additional changes
- Allowing RHIs to override FRHITextureReference for custom bindless implementations.
- Adding FRHIDescriptorAllocator::GetAllocatedRange to allow managers to find the smallest range of descriptors that need to be copied to new heaps.
- DescriptorCache now holds reference counted pointers to the bindless heaps to avoid potential use after free scenarios.
- Adding ED3D12DescriptorHeapFlags to mirror D3D12_DESCRIPTOR_HEAP_FLAGS while adding new flags.
- Adding ability to pool descriptor heaps to avoid high OS overhead when constantly allocating new ones.
- Pooling descriptor heaps required more descriptor heap managers to implement CleanupResources.
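As an illustration of the "smallest range" idea behind FRHIDescriptorAllocator::GetAllocatedRange, here is a minimal sketch. It is not the engine's implementation or signature: `AllocatedRangeSketch` and `FindAllocatedRange` are hypothetical names, and the allocator is assumed to be a simple per-slot occupancy bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: inclusive [First, Last] range of used slots.
struct AllocatedRangeSketch
{
    std::size_t First = 0;
    std::size_t Last = 0;
    bool bValid = false; // false when nothing is allocated
};

// Scans an occupancy bitmap and returns the tightest range that covers
// every allocated slot, so only that range has to be copied to a new heap.
inline AllocatedRangeSketch FindAllocatedRange(const std::vector<bool>& bAllocated)
{
    AllocatedRangeSketch Range;
    for (std::size_t Index = 0; Index < bAllocated.size(); ++Index)
    {
        if (!bAllocated[Index])
        {
            continue;
        }
        if (!Range.bValid)
        {
            Range.First = Index;
            Range.bValid = true;
        }
        Range.Last = Index;
    }
    assert(!Range.bValid || Range.First <= Range.Last);
    return Range;
}
```

With slots 2 and 5 allocated out of 8, the sketch reports the range 2..5, so four descriptors would be copied instead of eight.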
#jira UE-162014
#rb Luke.Thatcher
[CL 30183702 by christopher waters in ue5-main branch]
- Adding EShaderCodeFeatures::Barycentrics
- Adding GRHIGlobals.SupportsBarycentricsSemantic
- Adding FDataDrivenShaderPlatformInfo::GetSupportsBarycentricsIntrinsics to control COMPILER_SUPPORTS_BARYCENTRIC_INTRINSICS
- Adding FDataDrivenShaderPlatformInfo::GetSupportsBarycentricsSemantic to control PLATFORM_SUPPORTS_BARYCENTRICS_SEMANTIC
#jira UE-193429
#rb graham.wihlidal, mihnea.balta
[CL 29771745 by christopher waters in ue5-main branch]
OpenGL ES and Metal use framebuffer fetch.
Vulkan uses dual source blending.
For Vulkan and OpenGL ES there is a fallback shader permutation for drivers that don't support this. The fallback is the same as the existing solution that uses regular blending (i.e. looks different).
Other platforms use dual source blending, and we force the use of DXC for those shaders.
#rb Dmitriy.Dyomin, Florin.Pascu
[CL 29245271 by florian penzkofer in ue5-main branch]
This functionality is now well covered by Lumen reflections which are the supported path going forward.
#jira UE-198247
#rb Aleksander.Netzel,Yuriy.ODonnell
[CL 28866153 by chris kulla in ue5-main branch]
- Added support for up to 32 samplers to the D3D12 RHI.
- Added 'MaxSamplers' to ShaderPlatform in DataDrivenPlatformInfo. This defaults to 16 for all shader platforms and can be modified in the DDPI ini files. This value sets the shader compiler define 'PLATFORM_MAX_SAMPLERS'.
- Added a 'RequiredSamplersSwitch' material editor node to branch based on a given shader platform's maximum sampler count.
- To support more than 16 samplers on feature level SM5 platforms, Dxc and sm6.0 are used to compile the shader. On platforms that don't support Dxc and sm6.0, the preview menu item will be disabled.
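The decision described in the bullets above could be modelled roughly like this. It is a hypothetical illustration, not the engine's logic: `ShaderPlatformSketch`, `ECompilerChoice` and `ChooseCompiler` are invented names, and only the numbers 16 (default limit) and the Dxc/sm6.0 requirement above 16 samplers come from the description.

```cpp
#include <cassert>

// Hypothetical view of the per-platform data: the configured sampler limit
// and whether Dxc with sm6.0 is available on that platform.
struct ShaderPlatformSketch
{
    int MaxSamplers = 16;          // default for all shader platforms
    bool bSupportsDxcSm60 = false; // assumed capability flag
};

enum class ECompilerChoice
{
    Default,     // 16 samplers or fewer: no special handling needed
    DxcSm60,     // more than 16 samplers: compile with Dxc and sm6.0
    Unsupported  // platform cannot provide that many samplers
};

inline ECompilerChoice ChooseCompiler(const ShaderPlatformSketch& Platform, int RequiredSamplers)
{
    assert(RequiredSamplers >= 0);
    if (RequiredSamplers > Platform.MaxSamplers)
    {
        return ECompilerChoice::Unsupported;
    }
    if (RequiredSamplers <= 16)
    {
        return ECompilerChoice::Default;
    }
    return Platform.bSupportsDxcSm60 ? ECompilerChoice::DxcSm60
                                     : ECompilerChoice::Unsupported;
}
```

A platform configured with MaxSamplers of 32 and Dxc support would take the DxcSm60 path for a 24-sampler material, while a platform left at the default of 16 would report it as unsupported.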
#jira UE-191404
#rb florin.pascu
[CL 27850133 by michael wanderson in ue5-main branch]
Allocating and freeing slots can be contended. Getting and setting a slot value is fast but has an additional indirection compared with OS TLS. Used only for the Windows editor for now, because it's a bit slower than the OS TLS implementation.
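The trade-off above can be sketched with standard C++ only. This is an assumed design for illustration, not the submitted implementation: `TlsSlotsSketch`, the fixed slot count of 64 and all member names are hypothetical. Slot allocation takes a lock (the contended part), while get/set only index into a per-thread array (the extra indirection).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

class TlsSlotsSketch
{
public:
    static constexpr std::size_t MaxSlots = 64;        // assumed fixed capacity
    static constexpr std::size_t InvalidSlot = MaxSlots;

    // Allocation and freeing share one lock, so they can be contended.
    std::size_t AllocSlot()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        for (std::size_t Index = 0; Index < MaxSlots; ++Index)
        {
            if (!bUsed[Index])
            {
                bUsed[Index] = true;
                return Index;
            }
        }
        return InvalidSlot;
    }

    // Note: this sketch does not reset the values other threads stored.
    void FreeSlot(std::size_t Slot)
    {
        assert(Slot < MaxSlots);
        std::lock_guard<std::mutex> Lock(Mutex);
        bUsed[Slot] = false;
    }

    // Get/set are lock-free: one thread_local lookup plus an array index.
    static void SetValue(std::size_t Slot, void* Value)
    {
        assert(Slot < MaxSlots);
        PerThreadValues()[Slot] = Value;
    }

    static void* GetValue(std::size_t Slot)
    {
        assert(Slot < MaxSlots);
        return PerThreadValues()[Slot];
    }

private:
    // Each thread gets its own zero-initialized array of slot values.
    static std::array<void*, MaxSlots>& PerThreadValues()
    {
        thread_local std::array<void*, MaxSlots> Values{};
        return Values;
    }

    std::mutex Mutex;
    std::array<bool, MaxSlots> bUsed{};
};
```

A caller would allocate a slot once, then use SetValue and GetValue from any thread; each thread sees only the pointer it stored itself.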
#rb dmytro.vovk, francis.hurteau
[CL 27827939 by andriy tylychko in ue5-main branch]