- GetMoviePlayer() may return nullptr if the player was never initialized. Callers should first check IsMoviePlayerEnabled().
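A minimal caller-side sketch of the check described above. The types and globals below are hypothetical stand-ins so the example compiles on its own; they are not the engine's declarations, and the query method is only an illustration of "doing something with the player".

```cpp
#include <cassert>

// Hypothetical stand-in for the movie player interface.
struct IExampleMoviePlayer
{
    virtual ~IExampleMoviePlayer() = default;
    virtual bool IsMovieCurrentlyPlaying() const = 0;
};

// Hypothetical global: stays null because the player is never initialized here.
static IExampleMoviePlayer* GExampleMoviePlayer = nullptr;

// Stand-ins for the two functions named in the change description.
bool IsMoviePlayerEnabled() { return GExampleMoviePlayer != nullptr; }
IExampleMoviePlayer* GetMoviePlayer() { return GExampleMoviePlayer; }

// Caller pattern: check IsMoviePlayerEnabled() before dereferencing.
bool IsLoadingMoviePlaying()
{
    if (!IsMoviePlayerEnabled())
    {
        return false; // player never initialized; GetMoviePlayer() may be nullptr
    }
    IExampleMoviePlayer* Player = GetMoviePlayer();
    assert(Player != nullptr);
    return Player->IsMovieCurrentlyPlaying();
}
```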
#rb jeannoe.morissette
[CL 36758491 by luke thatcher in 5.5 branch]
- Various legacy game thread code assumes the render thread will never be more than one frame behind. The original change in 36468180 switched from syncing the GT with the RT, to syncing the GT with the RHIT. That left the render thread "floating" in the center of the pipeline, and led to cases where resources are deleted too soon.
- New approach is to always sync the GT with the N-1 RT frame, so the GT is never too far ahead of the RT. This maintains compatibility with the legacy GT code paths. In addition to the GT->RT sync, we also sync with the RHIT to prevent the engine running ahead, which was the original bug that 36468180 was fixing.
- "r.GTSyncType" mode 0 now allows for 1 frame of GT->RT overlap, and 2 frames of GT->RHIT overlap.
- For debugging purposes, "r.GTSyncType" can also be made negative, which increases the number of GT->RHIT overlap frames, e.g. "r.GTSyncType -3" gives 5 frames of GT->RHIT overlap. While this is not overly useful in a shipped title, it can be used to prove the correctness of the rendering pipeline.
- Merged the FDeferredCleanupInterface / FPendingCleanupObjects processing into the FFrameEndSync code path, and made FFrameEndSync a static singleton. This allows us to manage the N frames of overlap between the GT and RHIT in a central place, and correctly clean up deferred resources when the RT fences have passed.
In future, we will need to revisit this as part of the frame pacing initiative. Game thread code should be written to not require render thread fences for correctness / threadsafety.
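The overlap arithmetic above can be sketched as follows. This is an illustration, not the engine code: the function names are hypothetical, and the linear rule for negative values is an assumption extrapolated from the two data points given (mode 0 gives 2 frames, "-3" gives 5 frames).

```cpp
#include <cassert>

// GT->RT overlap is one frame in mode 0 (from the description above).
constexpr int ExampleGTToRTOverlapFrames = 1;

// Hypothetical helper. Assumption: for values <= 0 the GT->RHIT overlap is
// 2 + |value|, which matches 0 -> 2 and -3 -> 5. Positive modes are not modelled.
int ExampleGTToRHITOverlapFrames(int GTSyncType)
{
    assert(GTSyncType <= 0);
    return 2 - GTSyncType;
}

// Frame whose fence the game thread waits on before starting frame N,
// given the allowed overlap (N-1 for the RT, N-2 for the RHIT in mode 0).
long long ExampleFrameToWaitFor(long long FrameN, int OverlapFrames)
{
    return FrameN - OverlapFrames;
}
```

With these assumptions, a game thread starting frame 10 in mode 0 would wait on RT frame 9 and RHIT frame 8.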
#jira UE-223692
#rb zach.bethel
[CL 36758377 by luke thatcher in 5.5 branch]
- The engine still uses legacy syncing behaviour where the game thread syncs with the render thread. Renderer refactors have removed most of the RHI thread flushes that happened per frame, which were the only thing synchronizing the game/render threads with the RHI thread. Without these flushes, and when occlusion queries are disabled, the game thread can run ahead of the RHI thread by several hundred frames, since it is now entirely unsynchronized.
- This fix changes mode 0 of "r.GTSyncType" to sync with the N-2 RHI thread frame when the RHI thread is active, rather than the N-1 render thread frame. The game thread is now always synchronized with the RHI thread to prevent it running ahead.
- Mode 1 of "r.GTSyncType" now works even when vsync is disabled, and syncs the game thread with the N-1 RHI thread frame (same behaviour as before).
#jira UE-223692
#rb dave.barrett
[CL 36746684 by luke thatcher in 5.5 branch]
Previously, the first (arbitrary) editor target receipt with a matching configuration was chosen.
This is necessary to disambiguate a second receipt for an editor target in the engine binaries directory, and matches existing logic used to determine the correct receipt in FEngineLoop::AppInit.
#jira UE-196216
#rb jeremie.roy, will.brown
[CL 36017174 by zach brockway in ue5-main branch]
- The TStatId for "stat gpu" stats.
- The FName required by the CSV profiler for GPU stats.
- The source file and line number to allow breadcrumbs shown in Insights to link back to their original source location.
Additional changes:
- Added temporary support for the Insights GPU track. This is guarded by RHI_TEMP_USE_GPU_TRACE until we have a newer, more capable API.
- Simplified FMeshDrawEvent into a standard RHI breadcrumb in FMeshDrawCommand::SubmitDraw().
- Moved "r.GPUCsvStatsEnabled" cvar into GPUProfiler.cpp, so it is accessible to both old and new profilers.
#jira UE-177299
#rb mihnea.balta
[CL 35973862 by luke thatcher in ue5-main branch]
- Added a TStatId field on breadcrumbs. When set, this indicates the breadcrumb should write its computed GPU duration to the given stat.
- Implemented "stat gpu" alongside the "stat unit" GPU event stream sink. Times in "stat gpu" are now taken as the union of busy time across all GPU queues, in the same way we compute the "stat unit" GPU time.
- RDG support is handled via the new RHI_EVENT_SCOPE_STAT macro, allowing us to tag RDG scopes with GPU stats. When the new GPU profiler is enabled, RDG_GPU_STAT_SCOPE and SCOPED_GPU_STAT are empty and will eventually be deprecated and removed.
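For reference, "union of busy time across all GPU queues" can be computed by merging overlapping intervals and summing the merged lengths. The sketch below is a generic interval-union routine with hypothetical names; it is not the profiler's implementation, and it assumes timestamps are already in a common unit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical busy interval [Begin, End) from any GPU queue.
struct FExampleBusyInterval
{
    std::uint64_t Begin;
    std::uint64_t End;
};

// Total time during which at least one queue was busy.
std::uint64_t ExampleUnionBusyTime(std::vector<FExampleBusyInterval> Intervals)
{
    std::sort(Intervals.begin(), Intervals.end(),
        [](const FExampleBusyInterval& A, const FExampleBusyInterval& B)
        { return A.Begin < B.Begin; });

    std::uint64_t Total = 0;
    std::uint64_t CurBegin = 0;
    std::uint64_t CurEnd = 0;
    bool bHasCurrent = false;

    for (const FExampleBusyInterval& I : Intervals)
    {
        assert(I.Begin <= I.End);
        if (!bHasCurrent)
        {
            CurBegin = I.Begin; CurEnd = I.End; bHasCurrent = true;
        }
        else if (I.Begin <= CurEnd)
        {
            CurEnd = std::max(CurEnd, I.End); // overlaps: extend the current run
        }
        else
        {
            Total += CurEnd - CurBegin;       // gap: close the current run
            CurBegin = I.Begin; CurEnd = I.End;
        }
    }
    if (bHasCurrent)
    {
        Total += CurEnd - CurBegin;
    }
    return Total;
}
```

Overlapping work on two queues is counted once, so the result never exceeds the wall-clock span of the frame.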
Cleanup of breadcrumb macros
- Removed unnecessary "Name" arguments.
- Require the user to wrap the format string in quotes, rather than stringizing the format arg with the preprocessor.
- Removed "F" version of macros. Both string literal and formatted strings can be handled with the varargs macro, since the varargs are simply empty when using only a string literal.
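A small sketch of why one varargs macro covers both cases. The macro and helper names are hypothetical, not the RHI breadcrumb macros; the point is only that the variadic part is empty when a plain string literal is passed, and that the caller supplies the quotes.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper for the literal-only case (no format arguments).
inline std::string ExampleFormatBreadcrumb(const char* Literal)
{
    assert(Literal != nullptr);
    return std::string(Literal);
}

// Hypothetical helper for the formatted case (printf-style arguments).
template <typename... ArgTypes>
std::string ExampleFormatBreadcrumb(const char* Format, ArgTypes... Args)
{
    char Buffer[128];
    std::snprintf(Buffer, sizeof(Buffer), Format, Args...);
    return std::string(Buffer);
}

// One macro for both uses: the caller writes the format string in quotes,
// and the trailing arguments are simply absent for a plain literal.
#define EXAMPLE_BREADCRUMB(...) ExampleFormatBreadcrumb(__VA_ARGS__)
```

Usage under these assumptions: `EXAMPLE_BREADCRUMB("BasePass")` and `EXAMPLE_BREADCRUMB("Mip %d", 3)` both go through the same macro, so a separate "F" variant is not needed.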
Added GetTypeHash function for TStatId
- This didn't exist before, and allows use of TStatId in TMaps etc.
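As a standard-library analogy (an assumption for illustration, since the engine uses a GetTypeHash overload with TMap rather than std::hash), an identifier type becomes usable as a hash-map key once it has equality and a hash function. The type below is a hypothetical stand-in, not TStatId.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical stand-in for an opaque stat identifier.
struct FExampleStatId
{
    const void* Ptr = nullptr;
    bool operator==(const FExampleStatId& Other) const { return Ptr == Other.Ptr; }
};

namespace std
{
    // Hash support: the std analogue of providing a GetTypeHash overload.
    template <>
    struct hash<FExampleStatId>
    {
        size_t operator()(const FExampleStatId& Id) const noexcept
        {
            return hash<const void*>{}(Id.Ptr);
        }
    };
}

using FExampleGpuTimes = std::unordered_map<FExampleStatId, double>;

// Accumulates a GPU duration against the stat it was tagged with.
void ExampleAccumulate(FExampleGpuTimes& Times, FExampleStatId Id, double Milliseconds)
{
    assert(Milliseconds >= 0.0);
    Times[Id] += Milliseconds;
}
```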
#jira UE-177299
#rb zach.bethel
[CL 35953056 by luke thatcher in ue5-main branch]
This change:
* Introduces the concept of "vendor-aware" precaching with the ability to have the precaching system only precache PSOs that are considered "different" by the graphics driver. Every GPU vendor has a different subset of PSO state that can cause a driver cache hit or miss. This change implements this mechanism for NVIDIA and Intel Arc where testing was performed and assumptions verified. While avoiding the compilation of similar PSOs might seem unimportant (since by definition similar PSOs will cause a cache hit and will be fast), there is significant scheduling overhead and contention that can be avoided. According to my tests, this reduces the number of precached PSOs on NVIDIA by ~40% and on Intel by ~25%.
* Adds the ability to keep the precached PSOs in memory instead of deleting them immediately after creation. This can significantly help PSO re-creation performance on NVIDIA, and avoids otherwise-unavoidable small hitches on the critical path when the PSO is used for rendering. Precached PSOs are deleted once they are actually used for rendering, and CVars are provided to control how many are kept in memory at a time (to account for precached PSOs that never end up being used, and to avoid unbounded memory usage). According to my tests on a replay, this reduces the number of >4ms PSO creation hitches from over a thousand to fewer than 50.
* Moves initialization of the precaching structures to be explicitly controlled instead of relying on static initialization since that's too early for CVars to be set.
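The retention behaviour described above (keep precached PSOs alive, release them once used for rendering, cap how many are held) can be sketched as a small bounded set. Everything here is hypothetical: the class name, the 64-bit hash key, and in particular the oldest-first eviction policy are assumptions, since the description only says that CVars bound the count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <unordered_map>

class FExampleRetainedPrecacheSet
{
public:
    explicit FExampleRetainedPrecacheSet(std::size_t InMaxRetained)
        : MaxRetained(InMaxRetained) {}

    // Called when a precache compile finishes: keep the PSO alive.
    void OnPrecached(std::uint64_t PsoHash)
    {
        if (Lookup.count(PsoHash) != 0)
        {
            return; // already retained
        }
        Order.push_back(PsoHash);
        Lookup[PsoHash] = std::prev(Order.end());

        // Assumed policy: drop the oldest entries once over the cap.
        while (Order.size() > MaxRetained)
        {
            Lookup.erase(Order.front());
            Order.pop_front();
        }
        assert(Order.size() == Lookup.size());
    }

    // Called when the PSO is first used for rendering: stop retaining it.
    // Returns true if a retained precached copy was found and released.
    bool OnUsedForRendering(std::uint64_t PsoHash)
    {
        auto It = Lookup.find(PsoHash);
        if (It == Lookup.end())
        {
            return false;
        }
        Order.erase(It->second);
        Lookup.erase(It);
        return true;
    }

    std::size_t NumRetained() const { return Order.size(); }

private:
    std::size_t MaxRetained;
    std::list<std::uint64_t> Order; // oldest at the front
    std::unordered_map<std::uint64_t, std::list<std::uint64_t>::iterator> Lookup;
};
```

The cap is what keeps memory bounded when many precached PSOs are never used for rendering.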
#rb elizabeth.bunner, Kenzo.Terelst, mihnea.balta
[CL 35904165 by daniele vettorel in ue5-main branch]
- We now only have RHIEndFrame to mark the boundary between engine frames for the purpose of stat gathering and RHI cleanup tasks. The next frame begins immediately after the EndFrame command.
- RHIEndFrame is called on the RHI thread after all prior command list submissions. Platform RHIs that need to enqueue RHICmdList work can override RHIEndFrame_RenderThread to add this work either side of the actual EndFrame command.
- The work that platform RHIs performed in RHIBeginFrame has been merged into their implementations of RHIEndFrame.
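The override pattern described above can be sketched as follows. The classes and the string-based command list are hypothetical stand-ins rather than the RHI types; only the shape (a virtual render-thread hook that a platform overrides to enqueue work either side of the EndFrame command) is taken from the description.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical command list: just records command names in submission order.
using FExampleCommandList = std::vector<std::string>;

class FExampleRHI
{
public:
    virtual ~FExampleRHI() = default;

    // Default behaviour: enqueue only the EndFrame command.
    virtual void EndFrame_RenderThread(FExampleCommandList& CmdList)
    {
        CmdList.push_back("EndFrame");
    }
};

class FExamplePlatformRHI : public FExampleRHI
{
public:
    void EndFrame_RenderThread(FExampleCommandList& CmdList) override
    {
        CmdList.push_back("PlatformPreEndFrame");    // still part of the ending frame
        FExampleRHI::EndFrame_RenderThread(CmdList); // the frame boundary itself
        CmdList.push_back("PlatformPostEndFrame");   // already part of the next frame
    }
};

// Records the end-of-frame commands for a given RHI.
FExampleCommandList ExampleRecordEndOfFrame(FExampleRHI& RHI)
{
    FExampleCommandList CmdList;
    RHI.EndFrame_RenderThread(CmdList);
    assert(!CmdList.empty());
    return CmdList;
}
```

Because the next frame begins immediately after the EndFrame command, anything enqueued after the base call belongs to the following frame.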
#rb mihnea.balta
#jira UE-177299
[CL 35106431 by luke thatcher in ue5-main branch]
[FYI] Bryan.Johnson
Original CL Desc
-----------------------------------------------------------------
[Backout] - CL35079176
[FYI] Mieszko.Zielinski
Original CL Desc
-----------------------------------------------------------------
Moved the rest of MassEntity modules over to the Engine's Source/ code.
#jira UE-216267
[CL 35087666 by bryan johnson in ue5-main branch]
[FYI] Mieszko.Zielinski
Original CL Desc
-----------------------------------------------------------------
Moved the rest of MassEntity modules over to the Engine's Source/ code.
#jira UE-216267
[CL 35087231 by bryan johnson in ue5-main branch]
The game thread context uses manual stack scanning mode, and FrankenGC sets the stack as empty during marking.
This enables module startup to use the Verse heap (previously the context was created too late), and ensures that there is always a Verse context available while FrankenGC is enabled.
#rb Tim.Smith
[CL 34469229 by russell johnston in ue5-main branch]