- The TStatId for "stat gpu" stats.
- The FName required by the CSV profiler for GPU stats.
- The source file and line number to allow breadcrumbs shown in Insights to link back to their original source location.
Additional changes:
- Added temporary support for the Insights GPU track. This is guarded by RHI_TEMP_USE_GPU_TRACE until we have a newer, more capable API.
- Simplified FMeshDrawEvent into a standard RHI breadcrumb in FMeshDrawCommand::SubmitDraw().
- Moved "r.GPUCsvStatsEnabled" cvar into GPUProfiler.cpp, so it is accessible to both old and new profilers.
#jira UE-177299
#rb mihnea.balta
[CL 35973862 by luke thatcher in ue5-main branch]
The behavior is opt-in by adding a FRDGAsyncTask member to the lambda args like so:
[...] (FRDGAsyncTask, FRHICommandList& RHICmdList) {}
This API design choice is for two reasons.
1. Keep the tag visually close to the lambda captures. Every capture must remain valid for async access, e.g. captured by value, or referencing memory tied to the RDG lifetime or the scene renderer lifetime (the scene renderer syncs async tasks just as it syncs mesh pass tasks). Ensuring this is the user's responsibility, so it should be obvious at first glance when a pass can run on an async task.
2. Enforce a compile-time trait on the lambda without requiring multiple AddPass function variants. This allows for utility functions to continue to work as is.
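The second point can be illustrated with a minimal, self-contained sketch. It uses hypothetical stand-in types (FAsyncTaskTag, FMockCommandList, AddMockPass), not the engine's FRDGAsyncTask or AddPass, and only shows how a single function template can detect the tag in the lambda signature at compile time:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for illustration only; not the engine's real types.
struct FAsyncTaskTag {};
struct FMockCommandList { int NumCommands = 0; };

// True when the lambda accepts the tag as its first argument.
template <typename LambdaType>
constexpr bool bIsAsyncLambda =
    std::is_invocable_v<LambdaType, FAsyncTaskTag, FMockCommandList&>;

// One entry point for both lambda shapes; the trait selects the call form.
// Returns true when the pass opted in to async execution.
template <typename LambdaType>
bool AddMockPass(FMockCommandList& CmdList, LambdaType&& Lambda)
{
    if constexpr (bIsAsyncLambda<LambdaType>)
    {
        // A real scheduler could run this on a task; the sketch calls it inline.
        std::forward<LambdaType>(Lambda)(FAsyncTaskTag{}, CmdList);
        return true;
    }
    else
    {
        std::forward<LambdaType>(Lambda)(CmdList);
        return false;
    }
}
```

Because the tag is detected from the lambda's signature, a utility function that forwards a lambda to AddMockPass needs no changes to support either form.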
#rb Luke.Thatcher
[CL 35969245 by zach bethel in ue5-main branch]
- Added a TStatId field on breadcrumbs. When set, this indicates the breadcrumb should write its computed GPU duration to the given stat.
- Implemented "stat gpu" alongside the "stat unit" GPU event stream sink. Times in "stat gpu" are now taken as the union of busy time across all GPU queues, in the same way we compute the "stat unit" GPU time.
- RDG support is handled via the new RHI_EVENT_SCOPE_STAT macro, allowing us to tag RDG scopes with GPU stats. When the new GPU profiler is enabled, RDG_GPU_STAT_SCOPE and SCOPED_GPU_STAT are empty and will eventually be deprecated and removed.
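As a rough illustration of "union of busy time across all GPU queues", the sketch below merges overlapping busy intervals so that time during which several queues are busy at once is counted only once. FBusyInterval and ComputeUnionBusyTime are hypothetical names; the actual profiler code may be structured quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical busy range [Begin, End) in GPU timestamp units, from any queue.
struct FBusyInterval
{
    uint64_t Begin;
    uint64_t End;
};

// Returns the total time covered by at least one interval.
uint64_t ComputeUnionBusyTime(std::vector<FBusyInterval> Intervals)
{
    // Sort by start time so overlaps can be resolved in a single sweep.
    std::sort(Intervals.begin(), Intervals.end(),
        [](const FBusyInterval& A, const FBusyInterval& B) { return A.Begin < B.Begin; });

    uint64_t Total = 0;
    uint64_t CoveredUntil = 0; // End of the time range already counted.
    for (const FBusyInterval& Interval : Intervals)
    {
        // Only count the part of this interval that is not already covered.
        const uint64_t Start = std::max(Interval.Begin, CoveredUntil);
        if (Interval.End > Start)
        {
            Total += Interval.End - Start;
            CoveredUntil = Interval.End;
        }
    }
    return Total;
}
```

For example, busy ranges [0, 10) on one queue and [5, 15) on another contribute 15 units rather than 20.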
Cleanup of breadcrumb macros
- Removed unnecessary "Name" arguments.
- The user must now wrap the format string in quotes, rather than the preprocessor stringizing the format arg.
- Removed "F" version of macros. Both string literal and formatted strings can be handled with the varargs macro, since the varargs are simply empty when using only a string literal.
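The single-macro approach can be sketched as follows. MOCK_BREADCRUMB and FormatMockBreadcrumb are hypothetical names used only for illustration; the real breadcrumb macros take different arguments and do not return a string.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a breadcrumb name into a std::string.
template <typename... ArgTypes>
std::string FormatMockBreadcrumb(const char* Format, ArgTypes... Args)
{
    char Buffer[256];
    if constexpr (sizeof...(Args) == 0)
    {
        // String literal only: copy it verbatim, nothing to format.
        std::snprintf(Buffer, sizeof(Buffer), "%s", Format);
    }
    else
    {
        std::snprintf(Buffer, sizeof(Buffer), Format, Args...);
    }
    return Buffer;
}

// One macro covers both cases: the quoted format string is the first
// argument, and any format arguments simply follow it (or are absent).
#define MOCK_BREADCRUMB(...) FormatMockBreadcrumb(__VA_ARGS__)
```

With this shape, MOCK_BREADCRUMB("Shadows") and MOCK_BREADCRUMB("Mip %d", 3) both go through the same macro, so a separate "F" variant is unnecessary.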
Added GetTypeHash function for TStatId
- This didn't exist before, and allows use of TStatId in TMaps etc.
#jira UE-177299
#rb zach.bethel
[CL 35953056 by luke thatcher in ue5-main branch]
[FYI] zach.bethel
Original CL Desc
-----------------------------------------------------------------
Removed legacy dispatch after execute calls before / after async compute.
[CL 35803085 by bob tellez in ue5-main branch]
- Fixed shader validation to handle buffer types for texture metadata.
- RDG external access mode will now leave the resource in its external state / pipelines.
- Modified the RHI validation layer to allow transitioning All -> One pipe without a fence if the NoFence flag is specified. Added fence validation to emit an error if an external fence is not used where its absence might introduce a race condition.
- Fixed VT / GPU Skin cache buffers to use read states on all pipes.
- Batched multiple VT external access calls together to reduce passes in RDG.
- Fixed IES texture manager validation error.
- Fixed sky cube texture map validation error.
- Fixed virtual shadow debug shader that was attempting to read and write using the same texture.
- Fixed render thread CVar access on game thread.
- Fixed up -ForceRHIBypass to not crash.
#rb mihnea.balta
#jira UE-210930
[CL 34043166 by zach bethel in ue5-main branch]
- SRVNonPixel is needed by mobile to insert a barrier between fragment -> vertex texture fetch, but since this is a heavyweight barrier, it is opt-in with SHADER_PARAMETER_RDG_NON_PIXEL_SRV.
- Small refactor to FRDGTextureAccess to allow for arbitrary subresources, as the current model only allows full resource transitions.
#rb mihnea.balta, luke.thatcher, serge.bernier
#jira UE-211883
[CL 33179861 by zach bethel in ue5-main branch]
- RDG uses its own breadcrumb allocator for RDG scopes. When the graph executes, this allocator is attached to the parent one owned by the immediate command list.
- Later, when a pass makes use of the immediate command list, a new breadcrumb is inserted on that command list's allocator, which gets reattached to the RDG allocator, forming a circular reference.
- The circular reference causes a memory leak, and a crash on shutdown when the underlying mempage allocators are destroyed.
- The reference only occurs in bypass mode, and when platform RHIs skip the DispatchToRHIThread inside FRDGBuilder::BeginFlushResourcesRHI().
- Fix is to swap out the immediate command list's allocator for the RDG one when the graph executes. This is now done in FRHICommandListBase::AttachBreadcrumbSubTree(...).
Also added RHI validation to check for circular references in the RHI breadcrumb allocators.
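A check of this kind could look roughly like the sketch below. It assumes, purely for illustration, that each allocator has at most one parent and that the existing chain is acyclic; FMockAllocator and WouldCreateCycle are hypothetical names and do not reflect the engine's actual validation code.

```cpp
#include <cassert>

// Hypothetical allocator node with a single parent link.
struct FMockAllocator
{
    FMockAllocator* Parent = nullptr;
};

// Returns true if attaching Child under NewParent would form a cycle,
// i.e. if Child already appears somewhere in NewParent's ancestor chain.
bool WouldCreateCycle(const FMockAllocator* Child, const FMockAllocator* NewParent)
{
    for (const FMockAllocator* Node = NewParent; Node != nullptr; Node = Node->Parent)
    {
        if (Node == Child)
        {
            return true;
        }
    }
    return false;
}
```

In the scenario described above, the RDG allocator is already parented to the immediate command list's allocator, so attaching the immediate allocator back under the RDG one is reported as a cycle.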
#jira UE-212035
#rb mihnea.balta
[CL 33101846 by luke thatcher in ue5-main branch]
- Extended RHI[Begin|End]OcclusionQueryBatch to support timestamp queries, and renamed to RHI[Begin|End]RenderQueryBatch
- The GPU profiler starts a query batch in BeginFrame, which persists on the immediate command list until EndFrame.
- RDG forwards the current batch to the parallel command lists it creates.
- Platform RHIs use the active batch to group completion events / sync points to reduce overhead. If no batch is active, they fall back to how they worked before, creating individual sync points per query.
#rb christopher.waters
[CL 32763061 by luke thatcher in ue5-main branch]