Platforms that support uniform buffer objects (UBOs) can now provide batched primitive data through a UBO. There is a limit on the UBO range that can be accessed in shaders, so we group instances into batches that fit within this limit. Switch uses 64KB views; other platforms use 16KB views. We allocate 512 bytes per primitive and 256 bytes per instance. Mobile platforms that do not support UBO will use a desktop version of GPUScene.
A few things are still missing: dynamic mesh passes and static lighting.
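The batching arithmetic described above can be sketched as follows. This is a minimal standalone illustration, not the engine's code: the constant names, `RecordsPerView`, `FBatchRange`, and `SplitIntoBatches` are all hypothetical, and only the byte sizes (256 per instance, 512 per primitive, 16KB / 64KB views) come from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical constants, using the sizes quoted in the description above.
constexpr uint32_t kBytesPerInstance  = 256;
constexpr uint32_t kBytesPerPrimitive = 512;

// Number of fixed-size records that fit into a single UBO view.
constexpr uint32_t RecordsPerView(uint32_t ViewSizeBytes, uint32_t RecordSizeBytes)
{
    return ViewSizeBytes / RecordSizeBytes;
}

// Hypothetical description of one batch: a contiguous run of instances.
struct FBatchRange
{
    uint32_t FirstInstance;
    uint32_t NumInstances;
};

// Splits NumInstances into consecutive batches, each small enough that its
// instance data fits into one view of ViewSizeBytes.
std::vector<FBatchRange> SplitIntoBatches(uint32_t NumInstances, uint32_t ViewSizeBytes)
{
    const uint32_t MaxPerBatch = RecordsPerView(ViewSizeBytes, kBytesPerInstance);
    assert(MaxPerBatch > 0 && "view must hold at least one instance");

    std::vector<FBatchRange> Batches;
    for (uint32_t First = 0; First < NumInstances; First += MaxPerBatch)
    {
        Batches.push_back({First, std::min(MaxPerBatch, NumInstances - First)});
    }
    return Batches;
}
```

Under these assumptions a 16KB view holds 64 instances or 32 primitives, and a 64KB view holds 256 instances or 128 primitives, so the larger view needs a quarter as many batches for the same instance count.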
#rb ola.olsson, benjamin.rouveyrol
[CL 26354848 by marc audy in ue5-main branch]
- RHICreate{Vertex, Index, Structured}Buffer
- RHICreate{ShaderResource, UnorderedAccess}View
- RHIUpdateUniformBuffer
- Various initialization / locking methods for helper buffer types in RHIUtilities.h
The goal is to continue to force resource creation through command lists to avoid surprises with moving things off the render thread.
#rb christopher.waters
[CL 26183242 by zach bethel in ue5-main branch]
- Previously there was a callback associated with r.RDG.Debug.FlushGPU that tried to disable AsyncCompute; however, it didn't work because AsyncCompute was re-enabled by CVarRDGAsyncComputeSink.
#rb zach.bethel
#preflight 6465eb87317ee2d9d1e25325
[CL 25522735 by tiago costa in ue5-main branch]
#ushell-cherrypick of 24664569 by zach.bethel
#lockdown michal.valient
#rb graham.wihlidal, yuriy.odonnell
#preflight skip
[CL 24701530 by graham wihlidal in ue5-main branch]
- MorphVertexBuffer was not in UAV state before being cleared with RHIClearUAVUint.
Fix RDGImmediate not uploading buffers if no render passes were enqueued.
- Also remove stray debug string compare / debug break
#rb Zach.Bethel
#jira MH-8828
#preflight 64023e2aa726961ed9d598fe
[CL 24507908 by luke thatcher in ue5-main branch]
- Fixed incorrect accumulation of reference counts when RDG culling is disabled but parallel setup is enabled.
- Added an assert to catch bad reference counting in the future, which will be easier to debug than a leak.
#preflight 63d063d5f2318350a2bd6071
#jira UE-173062
[CL 23841216 by zach bethel in ue5-main branch]
A lot of files are touched, but generally it's a mechanical matter of removing the global variable GNumAlternateFrameRenderingGroups and treating all code using it as if it were a fixed constant of one. If a conditional becomes always false, the code block is removed. Certain utility functions only called from dead-stripped AFR code are then removed (e.g. RHIBroadcastTemporalEffect). On the D3D11 side, RHIBeginUpdateMultiFrameResource / RHIEndUpdateMultiFrameResource become NOPs (return at the top of the function) when GNumAlternateFrameRenderingGroups is 1, so those are removed across the board.
#jira none
#rb jason.nadro
#preflight 63cea0afb91ac945f5117110
[CL 23820752 by jason hoerner in ue5-main branch]
The RDG builder holds a scope in order to avoid deletion of any resources during the graph setup / execution. This resolves the need to hold a strong reference during the RDG graph lifetime.
If FlushPendingDeletes is called within one of these scopes, the resources are instead queued onto the immediate command list and flushed at the end of the scope.
This change removes strong references to uniform buffers held by RHI commands and certain platform contexts. When a flush is encountered, the backends will clear all caches, removing any potential dangling references.
Resources can opt out of lifetime extension by calling ->DisableLifetimeExtension(). Subsequent calls to FlushPendingDeletes will release the resource immediately. This is used by a couple of edge cases where resources must be deleted mid-frame (namely, DumpGPU and BVH building).
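The deferral behaviour described above can be modelled roughly as follows. This is a simplified standalone sketch, not the RHI implementation: `FResource`, `FPendingDeletes`, `BeginScope`, `EndScope`, `Enqueue`, and `NumDeferred` are hypothetical names, and a plain deferred list stands in for the immediate command list. Only `FlushPendingDeletes` and `DisableLifetimeExtension` are names taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an RHI resource.
struct FResource
{
    bool bLifetimeExtension = true;
    void DisableLifetimeExtension() { bLifetimeExtension = false; }
};

// Hypothetical model of the pending-delete queue with lifetime-extension scopes.
class FPendingDeletes
{
public:
    void BeginScope() { ++ScopeDepth; }

    // Leaving the outermost scope releases everything deferred inside it.
    void EndScope()
    {
        assert(ScopeDepth > 0);
        if (--ScopeDepth == 0)
        {
            Deferred.clear();
        }
    }

    void Enqueue(std::unique_ptr<FResource> Resource)
    {
        Pending.push_back(std::move(Resource));
    }

    // Inside a scope, resources are deferred unless they opted out;
    // otherwise they are released right away.
    // Returns how many resources were released immediately.
    std::size_t FlushPendingDeletes()
    {
        std::size_t NumReleased = 0;
        for (std::unique_ptr<FResource>& Resource : Pending)
        {
            if (ScopeDepth > 0 && Resource->bLifetimeExtension)
            {
                Deferred.push_back(std::move(Resource));
            }
            else
            {
                Resource.reset();
                ++NumReleased;
            }
        }
        Pending.clear();
        return NumReleased;
    }

    std::size_t NumDeferred() const { return Deferred.size(); }

private:
    int ScopeDepth = 0;
    std::vector<std::unique_ptr<FResource>> Pending;
    std::vector<std::unique_ptr<FResource>> Deferred;
};
```

In this model, a flush issued while a scope is open releases only the resources that called DisableLifetimeExtension(); the rest survive until the scope closes, which is why the graph builder no longer needs its own strong references.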
#rb christopher.waters
#preflight 63c5e5722e714f64ad017cfd
[CL 23734079 by zach bethel in ue5-main branch]
[REVIEW] [at]mickael.gilabert, [at]john.huelin
#ushell-cherrypick of 23297379 by zach.bethel
#localization none
#tests manual
[CL 23300654 by zach bethel in ue5-main branch]
[REVIEW] [at]mihnea.balta, [at]luke.thatcher, [at]ben.woodhouse
#rb Luke.Thatcher
#localization none
#tests reproduced and fixed the issue, and ran a few replays without any other issues. Perf improvements are still there when Nanite parallel translate is enabled again via cvar
[CL 23227981 by kenzo terelst in ue5-main branch]
Move nanite programmable raster command list build to parallel translate tasks to offload RHI thread on PC
[REVIEW] [at]yuriy.odonnell, [at]zach.bethel, [at]luke.thatcher, [at]mihnea.balta
#localization none
#tests ran a local replay and compared perf with it on and off; it saves around 3 to 4 ms on the RHI thread
[CL 23227976 by luke thatcher in ue5-main branch]