- MorphVertexBuffer was not in UAV state before being cleared with RHIClearUAVUint.
Fix RDGImmediate not uploading buffers if no render passes were enqueued.
- Also remove stray debug string compare / debug break
#rb Zach.Bethel
#jira MH-8828
#preflight 64023e2aa726961ed9d598fe
[CL 24507908 by luke thatcher in ue5-main branch]
- Fixed incorrect accumulation of reference counts when RDG culling is disabled but parallel setup is enabled.
- Added an assert to catch bad reference counting in the future, which will be easier to debug than a leak.
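As a rough illustration of why an assert is easier to debug than a leak, the sketch below pairs a reference counter with an underflow check and an end-of-graph balance check. `TrackedRefCount` and `VerifyNoLeak` are hypothetical names for this example only; they are not the RDG types, and the actual fix in this changelist is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter, not an engine type: it fails loudly at the point
// where the count goes wrong instead of silently leaking.
class TrackedRefCount {
public:
    void AddRef(std::uint32_t N = 1) { Count += N; }

    void Release(std::uint32_t N = 1) {
        // Catches over-release at the call site that caused it.
        assert(N <= Count && "released more references than were added");
        Count -= N;
    }

    bool IsBalanced() const { return Count == 0; }
    std::uint32_t Get() const { return Count; }

private:
    std::uint32_t Count = 0;
};

// Hypothetical end-of-graph check: any references accumulated during setup
// must have been released by the time the graph finishes executing.
inline void VerifyNoLeak(const TrackedRefCount& Ref) {
    assert(Ref.IsBalanced() && "reference count did not return to zero");
}
```

If setup code accidentally added a reference twice for the same consumer, `VerifyNoLeak` would fire at the end of the graph rather than leaving a resource alive indefinitely.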
#preflight 63d063d5f2318350a2bd6071
#jira UE-173062
[CL 23841216 by zach bethel in ue5-main branch]
Many files are touched, but the change is largely mechanical: remove the global variable GNumAlternateFrameRenderingGroups and treat all code that uses it as if it were a fixed constant of one. If a conditional becomes always false, the code block is removed. Certain utility functions called only from dead-stripped AFR code are then removed (e.g. RHIBroadcastTemporalEffect). On the D3D11 side, RHIBeginUpdateMultiFrameResource / RHIEndUpdateMultiFrameResource become NOPs (return at the top of the function) when GNumAlternateFrameRenderingGroups is 1, so those are removed across the board.
#jira none
#rb jason.nadro
#preflight 63cea0afb91ac945f5117110
[CL 23820752 by jason hoerner in ue5-main branch]
The RDG builder holds a scope in order to avoid deletion of any resources during graph setup / execution. This removes the need to hold a strong reference during the RDG graph lifetime.
If FlushPendingDeletes is called within one of these scopes, the resources are instead queued onto the immediate command list and flushed at the end of the scope.
This change removes the strong references to uniform buffers held by RHI commands and certain platform contexts. When a flush is encountered, the backends clear all caches, removing any potential dangling references.
Resources can opt out of lifetime extension by calling ->DisableLifetimeExtension(). Subsequent calls to FlushPendingDeletes will release the resource immediately. This is used by a couple of edge cases where resources must be deleted mid-frame (namely, DumpGPU and BVH building).
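The scope and opt-out behaviour described above can be sketched roughly as follows. This is a simplified model, not the engine implementation: `Resource`, `DeletionQueue` and `LifetimeScope` are hypothetical names, and deferred resources are held in a plain vector rather than being queued onto the immediate command list.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an RHI resource.
struct Resource {
    bool bLifetimeExtension = true;
    bool* DestroyedFlag = nullptr;  // lets a caller observe destruction

    void DisableLifetimeExtension() { bLifetimeExtension = false; }
    ~Resource() { if (DestroyedFlag) { *DestroyedFlag = true; } }
};

// Hypothetical deletion queue that defers deletes while a scope is open.
class DeletionQueue {
public:
    void BeginScope() { ++ScopeDepth; }

    void EndScope() {
        assert(ScopeDepth > 0);
        // Deferred resources are released when the outermost scope closes.
        if (--ScopeDepth == 0) { Deferred.clear(); }
    }

    void MarkForDelete(std::unique_ptr<Resource> R) {
        Pending.push_back(std::move(R));
    }

    void FlushPendingDeletes() {
        for (auto& R : Pending) {
            // Inside a scope, extended resources are kept until EndScope.
            if (ScopeDepth > 0 && R->bLifetimeExtension) {
                Deferred.push_back(std::move(R));
            }
        }
        // Anything not moved to Deferred is destroyed here, immediately.
        Pending.clear();
    }

    std::size_t NumDeferred() const { return Deferred.size(); }

private:
    int ScopeDepth = 0;
    std::vector<std::unique_ptr<Resource>> Pending;
    std::vector<std::unique_ptr<Resource>> Deferred;
};

// RAII guard, analogous in spirit to the scope the graph builder holds.
struct LifetimeScope {
    explicit LifetimeScope(DeletionQueue& InQueue) : Queue(InQueue) { Queue.BeginScope(); }
    ~LifetimeScope() { Queue.EndScope(); }
    DeletionQueue& Queue;
};
```

In this model, a resource that called `DisableLifetimeExtension()` is destroyed by the next `FlushPendingDeletes()` even while a `LifetimeScope` is alive, while every other resource survives until the scope ends.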
#rb christopher.waters
#preflight 63c5e5722e714f64ad017cfd
[CL 23734079 by zach bethel in ue5-main branch]
[REVIEW] [at]mickael.gilabert, [at]john.huelin
#ushell-cherrypick of 23297379 by zach.bethel
#localization none
#tests manual
[CL 23300654 by zach bethel in ue5-main branch]
[REVIEW] [at]mihnea.balta, [at]luke.thatcher, [at]ben.woodhouse
#rb Luke.Thatcher
#localization none
#tests reproduced and fixed the issue, and ran a few replays without any other issues. Perf improvements are still there when Nanite parallel translate is enabled again via cvar
[CL 23227981 by kenzo terelst in ue5-main branch]
Move nanite programmable raster command list build to parallel translate tasks to offload RHI thread on PC
[REVIEW] [at]yuriy.odonnell, [at]zach.bethel, [at]luke.thatcher, [at]mihnea.balta
#localization none
#tests ran a local replay and compared perf with it on and off; it saves around 3 to 4 ms on the RHI thread
[CL 23227976 by luke thatcher in ue5-main branch]
- Moved dispatch busy-wait out of the command-list recording task. Allows the render thread to help with jobs once it's done.
- Added workload to pass to help load balancing.
[CL 23025379 by zach bethel in ue5-main branch]
- Added explicit SwitchPipeline calls to async command lists.
- Moved pipeline push / pop calls from platform implementations to RHI command list.
#rb christopher.waters
#preflight 636937204d3c1d9d9264ce7b
[CL 23012810 by zach bethel in ue5-main branch]
* Feature that merges command lists from QueueAsyncCommandListSubmit into a single Payload where possible (r.D3D12.AllowPayloadMerge, default enabled). Saves around 0.5 ms per async command list batch, with total savings from 1 to 4 ms on VP test scenes.
* Remove unnecessary command flush before Present on D3D12. Added a virtual function "NeedFlushBeforeEndDrawing" that can return false if the platform's RHIEndDrawingViewport function flushes commands itself, so the calling function doesn't need to.
* Wrapped three scene rendering flushes, plus BeginFlushResourcesRHI with "RHIIncludeOptionalFlushes()", which returns false on D3D12. These were a net perf loss on D3D12, as they wrap hardly any rendering. I didn't want to change behavior on other platforms, hence the conditional.
* Overall the above changes remove 8 command list flushes in a two view scene, which typically cost in the range of 0.06 ms each, saving 0.48 ms CPU total. Perf win increases with more views, and removing the flushes also helps with GPU bubbles.
* Added a flush in a strategic location at the end of Shadows and Lumen. In a sample scene, saved 1.5 ms GPU eliminating a bubble.
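The conditional-flush pattern from the bullets above can be sketched as below. Only the name `NeedFlushBeforeEndDrawing` and the idea behind `RHIIncludeOptionalFlushes()` come from the description; the class names, the `IncludeOptionalFlushes` virtual, and `RenderFrame` are assumptions made for illustration and do not mirror the real RHI interfaces.

```cpp
#include <cassert>

// Hypothetical platform interface; defaults preserve the old behaviour.
struct PlatformRHI {
    virtual ~PlatformRHI() = default;

    // True when the caller must flush before ending the viewport.
    virtual bool NeedFlushBeforeEndDrawing() const { return true; }

    // Modelled as a virtual here; stands in for RHIIncludeOptionalFlushes().
    virtual bool IncludeOptionalFlushes() const { return true; }

    virtual void EndDrawingViewport() = 0;

    void Flush() { ++NumFlushes; }
    int NumFlushes = 0;
};

// Platform that relies on the caller to flush before Present.
struct GenericRHI : PlatformRHI {
    void EndDrawingViewport() override {}
};

// D3D12-like platform: EndDrawingViewport flushes on its own, and the
// optional mid-frame flushes are skipped.
struct SelfFlushingRHI : PlatformRHI {
    bool NeedFlushBeforeEndDrawing() const override { return false; }
    bool IncludeOptionalFlushes() const override { return false; }
    void EndDrawingViewport() override { Flush(); }
};

// Hypothetical frame loop with a number of optional flush points.
inline void RenderFrame(PlatformRHI& RHI, int NumOptionalFlushPoints) {
    for (int Index = 0; Index < NumOptionalFlushPoints; ++Index) {
        if (RHI.IncludeOptionalFlushes()) {
            RHI.Flush();
        }
    }
    if (RHI.NeedFlushBeforeEndDrawing()) {
        RHI.Flush();
    }
    RHI.EndDrawingViewport();
}
```

Keeping the defaults at `true` means platforms that do not override either function behave as before, which matches the stated intent of not changing behavior outside D3D12.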
#jira UE-167553
#rb luke.thatcher
#rnx
#preflight 635547519e14ee3c790cc14e
#lockdown mihnea.balta
[CL 22728231 by jason hoerner in ue5-main branch]