198 Commits

Author SHA1 Message Date
Luke Thatcher
10cdd4a111 Merging //UE5/Dev-ParallelRendering/... (up to CL 30965645) to //UE5/Main/... (base CL 30962637)
Significant refactor of RHI command list management and submission, and RHI breadcrumbs / RenderGraph (RDG) scopes, to allow for parallel translation of most RHI command lists.
See individual changelists in //UE5/Dev-ParallelRendering for details. A summary of the changes is as follows:

This work's primary goal was to allow as many RHI command lists as possible to be parallel translated, to make more efficient use of many-core systems. To achieve this:
 - The submission code paths for the immediate and parallel RHI command lists have been merged into a single function: FRHICommandListExecutor::Submit().
 - A "dispatch thread" (which is simply a series of chained task graph tasks) is used to decide which command lists are batched together in a single parallel translate job.
 - Individual command lists can disable parallel translate, which forces them to be executed on the RHI thread. This happens automatically if an RHI command list performs an operation that is not thread safe (e.g. buffer lock, or low-level resource transition).
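As an illustration of the batching decision described above, here is a minimal sketch. It is not the engine's code: FSketchCmdList, FSketchTranslateJob, BuildTranslateJobs and the MaxPerJob limit are hypothetical, and the real logic behind FRHICommandListExecutor::Submit() is more involved. The sketch assumes that consecutive parallel-capable command lists are grouped into one job, and that a list which disabled parallel translate gets its own job on the RHI thread, preserving submission order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an RHI command list.
struct FSketchCmdList
{
    int  Id;
    bool bAllowParallelTranslate; // false after a non-thread-safe operation
};

// A group of command lists translated together by one job.
struct FSketchTranslateJob
{
    bool bRunOnRHIThread;
    std::vector<int> CmdListIds;
};

// Walks the command lists in submission order. Consecutive parallel-capable
// lists share a job (up to MaxPerJob lists); a list that disabled parallel
// translate is placed alone in a job flagged for the RHI thread.
std::vector<FSketchTranslateJob> BuildTranslateJobs(
    const std::vector<FSketchCmdList>& Lists, std::size_t MaxPerJob)
{
    std::vector<FSketchTranslateJob> Jobs;
    for (const FSketchCmdList& List : Lists)
    {
        const bool bNeedsNewJob =
            Jobs.empty() ||
            !List.bAllowParallelTranslate ||
            Jobs.back().bRunOnRHIThread ||
            Jobs.back().CmdListIds.size() >= MaxPerJob;

        if (bNeedsNewJob)
        {
            FSketchTranslateJob Job;
            Job.bRunOnRHIThread = !List.bAllowParallelTranslate;
            Jobs.push_back(Job);
        }
        Jobs.back().CmdListIds.push_back(List.Id);
    }
    return Jobs;
}
```

With six lists where only the third disabled parallel translate and MaxPerJob set to 2, this produces four jobs: two lists in parallel, one list on the RHI thread, two lists in parallel, then one list in parallel.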

One of the primary blockers for parallel translation was the RHI breadcrumb system, and the way RDG builds scopes. This was also refactored to remove these limitations:
 - RDG could only push/pop events on the immediate command list, which resulted in parallel and immediate work being interleaved, breaking any opportunity for parallelism.
 - Platform RHI implementations of breadcrumbs (e.g. in D3D12 RHI) were not correct across multiple RHI contexts. Push/pop operations aren't necessarily balanced within any one RHI context given that RDG builds "parallel pass sets" containing arbitrary ranges of renderer passes.

A summary of the new RHI breadcrumb system is as follows:
 - A tree of breadcrumb nodes is built by the render thread and RDG. Each node contains the node name, and pointers to the parent and next nodes. When fully built, the nodes form a depth-first linked list which is used for traversing the tree for GPU crash debugging.
 - The memory for breadcrumb nodes is provided by ref-counted allocator objects. These allocators are pipelined through the RHI, allowing the platform RHI implementation to extend their lifetime for GPU crash debugging purposes.
 - RHIPushEvent / RHIPopEvent have been removed, replaced with RHIBeginBreadcrumbGPU / RHIEndBreadcrumbGPU. Platform RHIs implement these functions to perform GPU immediate writes using the unique ID of each node, for tracking GPU progress.
 - Format string arguments are captured by-value to remove the cost of string formatting while building the breadcrumb tree. String formatting only occurs when the actual formatted string is required (e.g. during GPU crash breadcrumb stack traversal, or when calling platform GPU profiling APIs).
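The node layout and deferred formatting described above could look roughly like the following sketch. All names here (FSketchBreadcrumbNode, FSketchBreadcrumbTree, GetBreadcrumbStack) are hypothetical and do not match the engine's types. A single int argument stands in for the by-value captured format arguments, and a std::deque stands in for the ref-counted allocators.

```cpp
#include <cassert>
#include <cstdio>
#include <deque>
#include <string>
#include <vector>

struct FSketchBreadcrumbNode
{
    const char* Format;             // static format string, e.g. "Pass %d"
    int Arg;                        // argument captured by value
    FSketchBreadcrumbNode* Parent;  // enclosing breadcrumb, or nullptr
    FSketchBreadcrumbNode* Next;    // next node in depth-first order
    unsigned Id;                    // value a platform RHI could write from the GPU

    // Formatting is deferred until the name is actually needed.
    std::string GetName() const
    {
        char Buffer[128];
        std::snprintf(Buffer, sizeof(Buffer), Format, Arg);
        return Buffer;
    }
};

class FSketchBreadcrumbTree
{
public:
    // Opens a new breadcrumb under the current one. Nodes are created in
    // pre-order, so linking each node to the previously created node yields
    // the depth-first list.
    FSketchBreadcrumbNode* Begin(const char* Format, int Arg)
    {
        Nodes.push_back(FSketchBreadcrumbNode{ Format, Arg, Current, nullptr, NextId++ });
        FSketchBreadcrumbNode* Node = &Nodes.back();
        if (Last) { Last->Next = Node; }
        Last = Node;
        Current = Node;
        return Node;
    }

    // Closes the current breadcrumb.
    void End()
    {
        if (Current) { Current = Current->Parent; }
    }

private:
    std::deque<FSketchBreadcrumbNode> Nodes; // push_back keeps element addresses stable
    FSketchBreadcrumbNode* Current = nullptr;
    FSketchBreadcrumbNode* Last = nullptr;
    unsigned NextId = 1;
};

// Builds the breadcrumb stack for a node, innermost first, formatting names
// only at this point (e.g. while reporting a GPU crash).
std::vector<std::string> GetBreadcrumbStack(const FSketchBreadcrumbNode* Node)
{
    std::vector<std::string> Stack;
    for (; Node; Node = Node->Parent)
    {
        Stack.push_back(Node->GetName());
    }
    return Stack;
}
```

Building the tree only stores pointers and integers; the snprintf cost is paid when GetBreadcrumbStack is called.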

RenderGraph scopes have been simplified:
 - The separate scope trees / arrays of ops have been combined. There is now a single tree of RDG scopes containing all types.
 - Each RDG pass holds a pointer to the scope it was created under.
 - BeginCPU / EndCPU is called on each RDG scope as the various RDG threads enter / exit them. This allows us to mark-up each worker thread with the relevant Unreal Insights scopes.
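One way to picture the BeginCPU / EndCPU calls is the sketch below, which computes the calls a worker thread would make when it moves from a pass in one scope to a pass in another. This is an assumption about how such a transition could work, not the RDG implementation: FSketchScope, PathFromRoot and TransitionScopes are hypothetical names, and the calls are recorded as strings rather than emitted to a profiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scope node in a single scope tree.
struct FSketchScope
{
    std::string Name;
    const FSketchScope* Parent;
};

// Returns the chain of scopes from the root down to Scope (empty for nullptr).
static std::vector<const FSketchScope*> PathFromRoot(const FSketchScope* Scope)
{
    std::vector<const FSketchScope*> Path;
    for (; Scope; Scope = Scope->Parent)
    {
        Path.insert(Path.begin(), Scope);
    }
    return Path;
}

// Lists the EndCPU / BeginCPU calls needed to move from scope From to scope To:
// exit scopes up to the deepest common ancestor, then enter scopes down to To.
std::vector<std::string> TransitionScopes(const FSketchScope* From, const FSketchScope* To)
{
    const std::vector<const FSketchScope*> A = PathFromRoot(From);
    const std::vector<const FSketchScope*> B = PathFromRoot(To);

    std::size_t Common = 0;
    while (Common < A.size() && Common < B.size() && A[Common] == B[Common])
    {
        ++Common;
    }

    std::vector<std::string> Calls;
    for (std::size_t Index = A.size(); Index > Common; --Index)
    {
        Calls.push_back("EndCPU " + A[Index - 1]->Name);
    }
    for (std::size_t Index = Common; Index < B.size(); ++Index)
    {
        Calls.push_back("BeginCPU " + B[Index]->Name);
    }
    return Calls;
}
```

Because each pass holds a pointer to its scope, a thread only needs the previous and next pass to work out which scopes to close and open.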

Other changes include:
 - Fixes for bugs uncovered when parallel translate was enabled.
 - Adjusted platform affinities necessary due to the new layout of thread tasks in the renderer.
 - Refactored RHI draw call stats to better fit the new pipeline design.

#rb jeannoe.morissette, zach.bethel
#jira UE-139543

[CL 30973133 by Luke Thatcher in ue5-main branch]
2024-01-29 12:47:28 -05:00
zach bethel
bffaf2b599 Implemented recursive render command pipe command support. Render commands that are enqueued while replaying another command execute immediately. This mirrors the behavior of render thread commands. Also added an ensure to catch when a command is enqueued to pipe B while replaying pipe A, as this is not a supported scenario.
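The recursive behavior can be sketched as follows. This is a simplified single-threaded model, not the engine implementation: FSketchPipe and GReplayingPipe are hypothetical names, and a plain assert stands in for the ensure.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

class FSketchPipe;

// Pipe currently being replayed on this thread, or nullptr.
static thread_local const FSketchPipe* GReplayingPipe = nullptr;

class FSketchPipe
{
public:
    void Enqueue(std::function<void()> Command)
    {
        if (GReplayingPipe == this)
        {
            // Enqueued while replaying this pipe: execute immediately.
            Command();
            return;
        }
        // Enqueuing to pipe B while replaying pipe A is not supported.
        assert(GReplayingPipe == nullptr);
        Queue.push(std::move(Command));
    }

    // Executes all recorded commands in order.
    void Replay()
    {
        GReplayingPipe = this;
        while (!Queue.empty())
        {
            std::function<void()> Command = std::move(Queue.front());
            Queue.pop();
            Command();
        }
        GReplayingPipe = nullptr;
    }

private:
    std::queue<std::function<void()>> Queue;
};
```

A command that enqueues another command during replay sees the nested command run before it continues, rather than after the rest of the queue.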
#jira UE-202416

[CL 30774800 by zach bethel in ue5-main branch]
2024-01-22 11:01:32 -05:00
wojciech krywult
97075cca15 FrameSync: Implemented a workaround for FTask::Wait creating multiple wait task and event object instances when called repeatedly while waiting for a long task to finish.
In particular, this would happen when capturing GPU profiles on some platforms, as it causes the sync between the game and render threads to take a very long time.

Resolved by creating a custom wait task manually which ensures that we only have one instance.

#rb danny.couture, graham.wihlidal
#rnx

[CL 30318412 by wojciech krywult in ue5-main branch]
2023-12-14 08:50:48 -05:00
christopher waters
e9661bc768 Preparing for dependency cleanup.
[CL 30244022 by christopher waters in ue5-main branch]
2023-12-11 13:55:22 -05:00
marc audy
399bcf9971 Disable PVS warning V758
Silence V570 false positives for bit field assignments
Silence various other PVS warnings
#rnx

[CL 29706746 by marc audy in ue5-main branch]
2023-11-14 00:29:43 -05:00
zach bethel
c089d9abdf Re-enabled async render command pipes after removal of the scene pipe.
[CL 29092585 by zach bethel in ue5-main branch]
2023-10-25 12:34:01 -04:00
zach bethel
d66ba4c427 Fixed race condition with the StopRecording() sync command and commands inserted into a pipe.
[CL 29047923 by zach bethel in ue5-main branch]
2023-10-24 12:14:56 -04:00
alexis matte
6f863a1e3c Fix issue with Flush Rendering command being called twice
#jira UE-196811
#rb jeanluc.corenthin, zach.bethel
#rnx

[CL 28930401 by alexis matte in ue5-main branch]
2023-10-19 15:38:10 -04:00
graham wihlidal
9b72a85c3f Temporarily disable render command pipes until a race condition can be resolved with ISMs
[FYI] zach.bethel, mihnea.balta, luke.thatcher

[CL 28717524 by graham wihlidal in ue5-main branch]
2023-10-12 13:34:05 -04:00
zach bethel
df8c177da3 Removed PSO cache flush from command list flush method and manually placed it to happen after each scene render.
#rb graham.wihlidal
#jira UE-196266

[CL 28238003 by zach bethel in ue5-main branch]
2023-09-26 13:56:11 -04:00
graham wihlidal
7d5c1a475f Fixes to skin cache Set PSO matching up with Dispatches, and temporarily set r.RenderCommandPipeMode to 1 (instead of 2), until the PSO cache cleanup can be corrected (the refactored thread local compute PSO optimization exposes a race condition in certain cases like GPUSkinCache + render commands).
#jira UE-196266
#rb christopher.waters, zach.bethel
#fyi arciel.rekman

[CL 28215060 by graham wihlidal in ue5-main branch]
2023-09-25 22:31:06 -04:00
zach bethel
c92cf39716 Refactored render command fence implementation to fence render command pipes.
 - Reimplemented render command fence bundler flushing to keep the bundler active during a GC.
 - Added insights regions to track render command pipe recording and fence bundler activity when the render commands channel is active.
 - Cleaned up render command pipe recording so that it works properly with the fence bundler.

#rb christopher.waters
[FYI] Dominic.Couture

[CL 28206922 by zach bethel in ue5-main branch]
2023-09-25 17:29:10 -04:00
zach bethel
14d84c3bb0 Re-enabled render command pipes by default.
[CL 27834049 by zach bethel in ue5-main branch]
2023-09-13 11:51:59 -04:00
zach bethel
b036d46bef Fixed server build break.
[CL 27768599 by zach bethel in ue5-main branch]
2023-09-11 16:10:46 -04:00
zach bethel
63a64c0791 Reworked render command pipes sync scopes to only start recording when a pipe was previously recording.
 - Fixes issue where pipes are stopped, then a sync occurs and pipes are erroneously started up again.
 - Removed start / stop from end of frame updates.

#jira UE-194553

[CL 27767719 by zach bethel in ue5-main branch]
2023-09-11 15:52:47 -04:00
zach bethel
fd5f008daf Disabled render command pipes to investigate crash.
#jira none

[CL 27716051 by zach bethel in ue5-main branch]
2023-09-08 12:47:52 -04:00
zach bethel
715c4c3851 Fixed build break
[CL 27693415 by zach bethel in ue5-main branch]
2023-09-07 18:37:59 -04:00
zach bethel
6473b91df9 Hardened thread safety of render command pipe system and added ability to sync specific pipes.
 - Added additional sync scopes to handle VT standalone path.

#jira UE-194136, FORT-648678, UE-194553, PLAY-12828

[CL 27691097 by zach bethel in ue5-main branch]
2023-09-07 17:51:23 -04:00
zach bethel
1ee43df79f Added validation to issue an ensure when a sync scope is used during end of frame updates.
[CL 27530932 by zach bethel in ue5-main branch]
2023-08-31 14:26:55 -04:00
zach bethel
869dbfa725 Removed the ensure that checked against render commands being enqueued from the rendering thread timeline, as there are some valid cases (e.g. shader compilation).
#jira UE-193448

[CL 27488816 by zach bethel in ue5-main branch]
2023-08-30 13:02:13 -04:00
zach bethel
fb4fed1103 Re-enabled render command pipes and fixed Skinning / Niagara race conditions.
 - Added flags to render command pipe definitions to allow for disabling in code.
 - Release the TFunction after each command to match behavior of the render command tasks.

[CL 27412478 by zach bethel in ue5-main branch]
2023-08-28 11:31:06 -04:00
christopher waters
5ae57ccf85 The majority of FRenderCommandPipeRegistry is not needed in server builds.
#rb marc.audy

[CL 27224087 by christopher waters in ue5-main branch]
2023-08-18 19:12:59 -04:00
wojciech krywult
572fd58df2 RenderCore: Fixed crashes that could occur while using DumpGPU console command if writes to the drive are slow.
#rb David.Harvey
#jira UE-160213
#rnx

[CL 27107689 by wojciech krywult in ue5-main branch]
2023-08-15 12:55:53 -04:00
zach bethel
cfd832379c Temporarily disabled async render command pipes to investigate crash.
[CL 27102857 by zach bethel in ue5-main branch]
2023-08-15 10:50:52 -04:00
zach bethel
b5b17e2ae7 Render Command Pipe Implementation and API
Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC queues which are serviced by the task graph, both for pipes and for the render thread. This reduces the task overhead as commands are no longer 1-to-1 with tasks.
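To illustrate why queuing lambdas reduces task overhead, here is a minimal sketch in which many enqueued commands share one drain task. It is not the engine's implementation: FSketchCommandQueue is a hypothetical name, a mutex-protected vector stands in for the lock-free MPSC queue, and the FLaunchTask callback stands in for the task graph.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

class FSketchCommandQueue
{
public:
    using FCommand = std::function<void()>;
    using FLaunchTask = std::function<void(std::function<void()>)>;

    explicit FSketchCommandQueue(FLaunchTask InLaunch) : Launch(std::move(InLaunch)) {}

    // May be called by many producers. Only the first command enqueued while
    // no drain is scheduled launches a task; later commands join the queue.
    void Enqueue(FCommand Command)
    {
        bool bLaunch = false;
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Pending.push_back(std::move(Command));
            if (!bDrainScheduled)
            {
                bDrainScheduled = true;
                ++NumTasksLaunched;
                bLaunch = true;
            }
        }
        if (bLaunch)
        {
            Launch([this] { Drain(); });
        }
    }

    int GetNumTasksLaunched()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return NumTasksLaunched;
    }

private:
    // Single consumer: runs batches of commands until the queue is empty.
    void Drain()
    {
        for (;;)
        {
            std::vector<FCommand> Batch;
            {
                std::lock_guard<std::mutex> Lock(Mutex);
                if (Pending.empty())
                {
                    bDrainScheduled = false;
                    return;
                }
                Batch.swap(Pending);
            }
            for (FCommand& Command : Batch)
            {
                Command();
            }
        }
    }

    FLaunchTask Launch;
    std::mutex Mutex;
    std::vector<FCommand> Pending;
    bool bDrainScheduled = false;
    int NumTasksLaunched = 0;
};
```

Enqueuing three commands before the drain task runs launches a single task that executes all three, instead of three separate tasks.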

Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
 0 - Legacy render thread tasks,
 1 - Render thread MPSC queue,
 2 - Render thread and async pipe MPSC queues.

To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.

Enqueue a command into the pipe like so:

ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {});

Omitting a pipe will fall back to the 'general' pipe, which is the render thread.

Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.
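The sync point semantics can be modelled with a small RAII sketch. This is a toy model, not UE::RenderCommandPipe::FSyncScope itself: FSketchRecordingPipe and FSketchSyncScope are hypothetical names, the model is single-threaded, and running a command inline stands in for executing it serially on the render thread.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class FSketchRecordingPipe
{
public:
    void Enqueue(std::function<void()> Command)
    {
        if (bRecording)
        {
            // Recorded for later (asynchronous in a real system).
            Recorded.push_back(std::move(Command));
        }
        else
        {
            // Not recording: run serially, in place.
            Command();
        }
    }

    // Runs everything recorded so far, including commands recorded while flushing.
    void Flush()
    {
        while (!Recorded.empty())
        {
            std::vector<std::function<void()>> Batch;
            Batch.swap(Recorded);
            for (std::function<void()>& Command : Batch)
            {
                Command();
            }
        }
    }

    void StopRecording()  { Flush(); bRecording = false; }
    void StartRecording() { bRecording = true; }
    bool IsRecording() const { return bRecording; }

private:
    bool bRecording = true;
    std::vector<std::function<void()>> Recorded;
};

// On entry: waits for prior pipe commands and stops recording.
// On exit: restarts recording.
class FSketchSyncScope
{
public:
    explicit FSketchSyncScope(FSketchRecordingPipe& InPipe) : Pipe(InPipe) { Pipe.StopRecording(); }
    ~FSketchSyncScope() { Pipe.StartRecording(); }

    FSketchSyncScope(const FSketchSyncScope&) = delete;
    FSketchSyncScope& operator=(const FSketchSyncScope&) = delete;

private:
    FSketchRecordingPipe& Pipe;
};
```

Commands enqueued before the scope have completed by the time the scope body runs, commands enqueued inside the scope run serially, and commands enqueued after the scope are recorded again.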

#rb christopher.waters, luke.thatcher

[CL 27074956 by zach bethel in ue5-main branch]
2023-08-14 12:52:45 -04:00