Re-enabled async compute for Nanite. Fixed a submission bug with async compute and parallel RDG where the async compute command list was not being submitted in the correct order.
#rb jamie.hayes, luke.thatcher
[FYI] graham.wihlidal
#jira UE-114775
#ROBOMERGE-SOURCE: CL 16937065 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v838-16927207)
[CL 16937073 by zach bethel in ue5-release-engine-test branch]
- Added ERDGBuilderFlags::AllowParallelExecute to tag specific builders to attempt parallel execution. This avoids cases where small graphs fork tasks and end up causing contention. Only the main scene render graphs are tagged.
- Moved RHI transition creation to an async task.
- Moved parallel execute setup and dispatch to an async task.
- Fixed RDG draining asserts with a short-term workaround: the relevant scene textures are tagged as non-transient.
- Deprecated RDG AddPass utilities without names and fixed up last remnants.
- Enabled parallel RDG execution by default.
[FYI] christopher.waters
#ROBOMERGE-SOURCE: CL 16925941 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)
[CL 16925957 by zach bethel in ue5-release-engine-test branch]
- Refactored RDG to support free-threaded execution of passes.
- Refactored the renderer to use specific RHI command list variants in pass lambdas. Immediate command list passes are forced to stay on the render thread, while other variants can be parallelized.
#rb christopher.waters
#ROBOMERGE-SOURCE: CL 16838717 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)
[CL 16838724 by zach bethel in ue5-release-engine-test branch]
- Use the InstanceCullingLoadBalancer to pre-distribute the work on the CPU to ensure an even load.
- Make instance culling use the instance data offset in the mesh draw command (MDC) instead of translating primitive IDs.
- Track single-instance draws separately from instanced to optimize handling (disable culling for single-instance primitives).
#rb graham.wihlidal, andrew.lauritzen
[FYI] dmitriy.dyomin
#preflight 60d0eafa2ab2180001269160
#ROBOMERGE-SOURCE: CL 16733827 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v835-16672529)
[CL 16733837 by ola olsson in ue5-release-engine-test branch]
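The CPU-side pre-distribution described in the changelist above can be illustrated with a minimal sketch. This is not the InstanceCullingLoadBalancer implementation; `FWorkItem`, `DistributeInstances`, and the `MaxPerItem` limit are hypothetical names chosen for illustration. The assumption is that each draw's instance run is split into bounded chunks so that no single work item is much larger than the others.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical work item: a contiguous run of instances belonging to one draw.
struct FWorkItem
{
    std::uint32_t DrawIndex;
    std::uint32_t FirstInstance;
    std::uint32_t NumInstances;
};

// Splits each draw's instances into chunks of at most MaxPerItem instances.
// Draws with zero instances produce no work items.
std::vector<FWorkItem> DistributeInstances(
    const std::vector<std::uint32_t>& InstanceCounts,
    std::uint32_t MaxPerItem)
{
    assert(MaxPerItem > 0);

    std::vector<FWorkItem> Items;
    for (std::size_t DrawIndex = 0; DrawIndex < InstanceCounts.size(); ++DrawIndex)
    {
        std::uint32_t Remaining = InstanceCounts[DrawIndex];
        std::uint32_t First = 0;
        while (Remaining > 0)
        {
            const std::uint32_t Count = std::min(Remaining, MaxPerItem);
            Items.push_back({static_cast<std::uint32_t>(DrawIndex), First, Count});
            First += Count;
            Remaining -= Count;
        }
    }
    return Items;
}
```

In this sketch a single-instance draw always yields exactly one work item, which is one reason such draws can be handled on a cheaper, separate path.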
- Added a Drain() method to FRDGBuilder that flushes all pending work.
- Drained passes are not culled; resource lifetimes are extended; async compute fences are optimized as far as possible, but fence joining may occur after the drain.
- Batch up and pre-build all resource transitions. This is a prerequisite for parallel command lists.
- Replaced ServiceLocalQueue passes with the built-in RDG AddDispatchHint().
#jira UE-114622
[CL 16393495 by zach bethel in ue5-main branch]
- Simplified texture subresource tracking.
- Removed map lookup for each resource in SetupPass.
- Improved Compile / CollectPassResources to reduce cache misses.
- Added some container reservations to reduce reallocation costs.
- Added snapping of buffers to page boundaries to improve re-use.
#rb none
[CL 16208311 by zach bethel in ue5-main branch]
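The buffer snapping mentioned in the changelist above amounts to rounding a requested size up to a page multiple, so that buffers of slightly different sizes map to the same allocation size and can be re-used. A minimal sketch follows; `SnapToPageBoundary` is a hypothetical helper, and the page size is a parameter because the changelist does not state the value used.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds SizeInBytes up to the next multiple of PageSize.
// PageSize is assumed to be a non-zero power of two.
constexpr std::uint64_t SnapToPageBoundary(std::uint64_t SizeInBytes, std::uint64_t PageSize)
{
    return (SizeInBytes + PageSize - 1) & ~(PageSize - 1);
}
```

For example, with an assumed 64 KiB page, requests of 1 byte and 65536 bytes both snap to 65536 bytes and become interchangeable in a pool.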
- Views are cached on RHI transient resources; view renames are no longer necessary.
- RHI transient resources use a single cache per heap, keyed by descriptor and offset. Resource caches and heaps are garbage collected.
- CPU performance is effectively equivalent to the existing pooled resource method.
- Added common RHI transient resource allocator implementation in RHI core; significantly reduces the amount of platform code.
- Resource aliasing overlaps are tracked by the RHI and submitted through an acquire operation.
- Fixed D3D12 implementation to support multi-GPU.
- Removed condition that excluded small (<64k) buffers in the transient allocator.
- RHI validation now checks that resource overlaps are valid; i.e. if an overlap occurs between resource A and B during an acquire of B, validation checks that A has been discarded.
#rb graham.wihlidal, luke.thatcher, kenzo.terelst
[CL 16076280 by zach bethel in ue5-main branch]
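The overlap rule checked by RHI validation in the changelist above can be sketched as follows. This is an illustration under stated assumptions, not the RHI validation code: `FTrackedResource`, `RangesOverlap`, and `IsAcquireValid` are hypothetical names, and resources are modelled as byte ranges within a single heap.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of a resource placed in a heap at [Offset, Offset + Size).
struct FTrackedResource
{
    std::uint64_t Offset;
    std::uint64_t Size;
    bool bDiscarded;
};

// True if the two half-open byte ranges share at least one byte.
bool RangesOverlap(const FTrackedResource& A, const FTrackedResource& B)
{
    return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}

// An acquire is valid only if every resource it overlaps has been discarded.
bool IsAcquireValid(const std::vector<FTrackedResource>& Existing,
                    const FTrackedResource& Acquired)
{
    for (const FTrackedResource& Resource : Existing)
    {
        if (RangesOverlap(Resource, Acquired) && !Resource.bDiscarded)
        {
            return false;
        }
    }
    return true;
}
```

Ranges that merely touch (one ends where the other begins) are not treated as overlapping in this sketch.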