This change includes significant refactor work performed in //UE5/Dev-ParallelRendering. A brief summary of the work is as follows:
Refactored RHI command lists
- Removal of the "immediate" async compute command list
- Introduced an "active pipe" on each command list, allowing RHICmdLists to record work for either graphics or async compute. Pipes can be selected using the SwitchPipeline() function, or the FRHICommandListScopedPipeline helper.
- New explicit command list submission RHI API (RHIFinalizeContext, RHISubmitCommandLists). The IRHICommandContextContainer type has been removed.
- Explicit GPU submission is automatically appended to the immediate command list when it is dispatched to the RHI thread.
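The "active pipe" idea above can be illustrated with a small standalone sketch. The types here (EPipe, FCommandList, FScopedPipe, RecordExample) are hypothetical stand-ins that only mimic the shape of SwitchPipeline() and FRHICommandListScopedPipeline; they are not the engine's types, and the real signatures may differ.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real RHI types.
enum class EPipe { Graphics, AsyncCompute };

struct FCommandList {
    EPipe ActivePipe = EPipe::Graphics;
    std::vector<std::pair<EPipe, std::string>> Commands;

    // Selects the pipe that subsequent commands are recorded for.
    // Returns the previously active pipe so callers can restore it.
    EPipe SwitchPipeline(EPipe NewPipe) {
        const EPipe Previous = ActivePipe;
        ActivePipe = NewPipe;
        return Previous;
    }

    // Records a named command against whichever pipe is currently active.
    void Record(const std::string& Name) {
        Commands.emplace_back(ActivePipe, Name);
    }
};

// RAII helper: switches the pipe on construction, restores it on destruction.
class FScopedPipe {
public:
    FScopedPipe(FCommandList& InList, EPipe Pipe)
        : List(InList), Previous(InList.SwitchPipeline(Pipe)) {}
    ~FScopedPipe() { List.SwitchPipeline(Previous); }

    FScopedPipe(const FScopedPipe&) = delete;
    FScopedPipe& operator=(const FScopedPipe&) = delete;

private:
    FCommandList& List;
    EPipe Previous;
};

// Records graphics work, then async compute work inside a scope, then graphics again.
inline FCommandList RecordExample() {
    FCommandList List;
    List.Record("Draw");
    {
        FScopedPipe Scope(List, EPipe::AsyncCompute);
        List.Record("Dispatch");
    }
    List.Record("Resolve");
    return List;
}
```

The scoped helper restores the previous pipe even on early return, which is the usual reason to prefer it over paired manual switches.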
Platform RHI implementations
- The new submission API has been implemented across all platforms. Some platforms required a significant refactor.
#rb Mihnea.Balta,Kenzo.Terelst
#jira UE-139550
#preflight 6332e3641003050806d802ef
[CL 22239063 by luke thatcher in ue5-main branch]
- Passes are added to a queue that is consumed by a task to perform pass setup actions.
- Moved CompilePassBarriers to overlap with resource collection.
- Converted most tasks to new task system.
#preflight 630e5e2e660db81edb9a0562
[CL 21710710 by zach bethel in ue5-main branch]
- Refactored the 'Finalized Access' feature into a more flexible per-resource toggle between 'External' and 'Internal' access modes.
- Resources can transition between modes multiple times within the graph.
- Supports async compute pipeline.
- Supports queueing of requests to avoid back-to-back helper passes.
- This feature is needed to support conversion of GPU scene buffers.
- Deprecated the ReadOnly and ForceTracking resource flags and added a 'SkipTracking' flag instead.
- Previous semantics were confusing and error-prone.
- New model requires a manual flag to tell RDG never to transition a resource.
- This flag is used for read-only dummy resources as an optimization.
- Renamed some of the auxiliary 'FinalizedResource' utilities since the name no longer matches the semantics.
#preflight 6266cc6d0634d0904ce4ba46
[CL 19904734 by zach bethel in ue5-main branch]
- Added resource pool counters and events.
- Added AllocatePooledBuffer method and refactored pool to no longer take a command list.
- Refactored swap chain barrier logic to be a bit cleaner.
- Added helper methods to cast between views.
- Added power of two alignment option to buffer pool.
- Added GetTypeHash implementations for RDG SRV | UAV descriptors.
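As a rough illustration of the power-of-two alignment option, the sketch below rounds a requested byte size up to the next power of two. RoundUpToPowerOfTwo is a hypothetical helper written for this note, not the buffer pool's API. Rounding presumably lets requests of similar size share the same pooled allocation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds a non-zero byte size up to the next power of two.
// Sizes that are already a power of two are returned unchanged.
inline std::uint64_t RoundUpToPowerOfTwo(std::uint64_t Size) {
    // Guard against zero and against sizes whose next power of two would overflow.
    assert(Size > 0 && Size <= (std::uint64_t{1} << 63));
    std::uint64_t Result = 1;
    while (Result < Size) {
        Result <<= 1;
    }
    return Result;
}
```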
#preflight 62631046006fa20b683d130f
[CL 19873407 by zach bethel in ue5-main branch]
- New RHI command list SetTrackedAccess method for the user to supply a current whole-resource state.
- New RHI command context GetTrackedAccess method for querying the tracked access in RHIBeginTransitions / RHIEndTransitions on the RHI thread.
- Hooked RHICmdList.Transition and FRHICommandListExecutor::Transition to assign tracked state automatically.
- Refactored RDG and resource pools to use new RHI tracking.
- FRDGPooledBuffer / FRDGPooledTexture no longer contain tracked state. RDG temp-allocates state through the graph allocator instead.
- All prologue transitions are 'Unknown', and all epilogue transitions coalesce into a whole resource state.
- Implemented platform support for patching the 'before' state with the tracked state.
- Implemented various RHI validation checks:
- Asserts that the user-assigned tracked state matches the RHI validation tracked state, for all subresources.
- Asserts that tracked state is not assigned or queried from a parallel translation context.
- Added FRHIViewableResource and FRHIView base classes to RHI. FRHIView contains a pointer to an FRHIViewableResource. This is currently a raw pointer, but should be extended to a full reference in a later CL.
NOTE on RHI thread constraint:
Transition evaluation is now restricted to the RHI thread (i.e. no parallel translation contexts). Transitions aren't performed in parallel translate contexts anyway, so this is not a problem. If, however, we decide to refactor parallel translation to be more general, this implementation could be extended to track the state per context and update from the 'dispatch' thread.
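A minimal sketch of the tracking model described above, assuming a simplified whole-resource state and hypothetical types (EAccess, FResource, FTransition, FTrackedAccessTable). It only shows how an 'Unknown' before state could be patched from the tracked state and how a transition updates the tracked state afterwards; the real RHI types and signatures differ.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical, simplified access states; the real ERHIAccess is a bit mask.
enum class EAccess { Unknown, SRV, UAV, RTV, CopyDest };

struct FResource { int Id; };

struct FTransition {
    const FResource* Resource;
    EAccess Before;
    EAccess After;
};

// Single-threaded table, mirroring the note that tracking lives on one thread only.
class FTrackedAccessTable {
public:
    // Lets the user supply the current whole-resource state.
    void SetTrackedAccess(const FResource& Resource, EAccess Access) {
        Tracked[&Resource] = Access;
    }

    // Returns the tracked state, or Unknown if none was ever assigned.
    EAccess GetTrackedAccess(const FResource& Resource) const {
        const auto It = Tracked.find(&Resource);
        return It != Tracked.end() ? It->second : EAccess::Unknown;
    }

    // Patches an Unknown 'before' state with the tracked state, then records
    // the 'after' state as the new tracked state.
    FTransition Resolve(FTransition Transition) {
        if (Transition.Before == EAccess::Unknown) {
            Transition.Before = GetTrackedAccess(*Transition.Resource);
        }
        SetTrackedAccess(*Transition.Resource, Transition.After);
        return Transition;
    }

private:
    std::unordered_map<const FResource*, EAccess> Tracked;
};
```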
#preflight 6233b4396666d7e753a16aaf
#rb kenzo.terelst
[CL 19513316 by zach bethel in ue5-main branch]
- Deprecated legacy members from FPooledRenderTargetDesc.
- Deprecated ETextureRenderTarget and removed from RDG.
- TargetableTexture always equals ShaderResourceTexture.
- Simplified render target pool FindFreeElement.
- Create pooled buffers and textures with a known state.
#rb graham.wihlidal
#preflight 61f8488568795b2f45852274
#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18796880 in //UE5/Release-5.0/... via CL 18797840 via CL 18799070
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)
[CL 18799188 by zach bethel in ue5-main branch]
- Implemented common transient page allocator in RHICore.
- Implemented Xbox-specific GPU page table mapping allocator.
- Extended RDG insights to support viewing heap visualization or page pool visualization.
#preflight 61d356682e0e436c725818bf
[CL 18504626 by zach bethel in ue5-main branch]
This should fully address UniformBufferLayout/UniformBuffer crashes that randomly happen during shader compiling in the Editor.
#jira none
#rb arciel.rekman, ben.ingram, mihnea.balta, stu.mckenna, will.damon
#preflight 611eb6c6008be90001f8b031
[CL 17243608 by christopher waters in ue5-main branch]
- New Drain() method on FRDGBuilder; will flush all pending work.
- Drained passes are not culled; resource lifetimes are extended; async compute fences are optimized as best as possible but fence joining may occur after the drain.
- Batch up and pre-build all resource transitions. This is a prerequisite for parallel command lists.
- Replaced ServiceLocalQueue passes with the built-in RDG AddDispatchHint().
#jira UE-114622
[CL 16393495 by zach bethel in ue5-main branch]
- Simplified texture subresource tracking.
- Removed map lookup for each resource in SetupPass.
- Improved Compile / CollectPassResources to reduce cache misses.
- Added some container reservations to reduce reallocation costs.
- Added snapping of buffers to page boundaries to improve re-use.
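Snapping a buffer size to a page boundary is the usual align-up computation. The sketch below assumes a power-of-two page size; AlignUp is a hypothetical helper, and the 64 KiB figure in the usage note is an assumption for illustration rather than the value the engine uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds Size up to the next multiple of PageSize.
// PageSize must be a non-zero power of two so the mask trick is valid.
inline std::uint64_t AlignUp(std::uint64_t Size, std::uint64_t PageSize) {
    assert(PageSize != 0 && (PageSize & (PageSize - 1)) == 0);
    return (Size + PageSize - 1) & ~(PageSize - 1);
}
```

With an assumed 64 KiB page, requests of 1 byte and 60000 bytes both snap to 65536 bytes, so either buffer can reuse the other's allocation.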
#rb none
[CL 16208311 by zach bethel in ue5-main branch]
- Views are cached on RHI transient resources; view renames are no longer necessary.
- RHI Transient resources utilize a single cache per heap keyed off of the descriptor + offset. Resource caches and heaps are garbage collected.
- CPU performance is effectively equivalent to the existing pooled resource method.
- Added common RHI transient resource allocator implementation in RHI core; significantly reduces the amount of platform code.
- Resource aliasing overlaps are tracked by the RHI and submitted through an acquire operation.
- Fixed D3D12 implementation to support multi-GPU.
- Removed condition that excluded small (<64k) buffers in the transient allocator.
- RHI validation now checks that resource overlaps are valid; i.e. if an overlap occurs between resource A and B during an acquire of B, validation checks that A has been discarded.
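The overlap validation rule in the last bullet can be sketched as follows, assuming allocations are byte ranges within a single heap. FAllocation, Overlaps and ValidateAcquire are hypothetical names for illustration; this is not the RHI validation layer's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of a resource placed in a transient heap.
struct FAllocation {
    std::uint64_t Offset;
    std::uint64_t Size;
    bool bDiscarded;  // true once the resource's contents have been discarded
};

// Half-open range test: [Offset, Offset + Size) of A against that of B.
inline bool Overlaps(const FAllocation& A, const FAllocation& B) {
    return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}

// Returns true only if every existing allocation that overlaps Incoming
// has already been discarded, i.e. acquiring Incoming is valid.
inline bool ValidateAcquire(const std::vector<FAllocation>& Existing,
                            const FAllocation& Incoming) {
    for (const FAllocation& Allocation : Existing) {
        if (Overlaps(Allocation, Incoming) && !Allocation.bDiscarded) {
            return false;
        }
    }
    return true;
}
```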
#rb graham.wihlidal, luke.thatcher, kenzo.terelst
[CL 16076280 by zach bethel in ue5-main branch]