- Fixes issue where pipes are stopped, then a sync occurs and pipes are erroneously started up again.
- Removed start / stop from end of frame updates.
#jira UE-194553
[CL 27767719 by zach bethel in ue5-main branch]
- Added flags to render command pipe definitions to allow for disabling in code.
- Release the TFunction after each command to match behavior of the render command tasks.
[CL 27412478 by zach bethel in ue5-main branch]
Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC (multi-producer, single-consumer) queues that are serviced by the task graph, both for pipes and for the render thread. This reduces task overhead because commands are no longer 1-to-1 with tasks.
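To illustrate why queuing lambdas decouples the command count from the task count, here is a minimal standalone sketch. It is not the engine implementation: `FCommandQueueSketch` and its members are hypothetical names, and a mutex-guarded vector stands in for the real MPSC queue. The point is only that a drain task is requested when the queue goes from idle to busy, so many commands share one task.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a command queue; not engine API.
class FCommandQueueSketch
{
public:
    using FCommand = std::function<void()>;

    // Callable from any producer thread. Returns true only when the caller
    // should launch a drain task (no drain is currently scheduled).
    bool Enqueue(FCommand Command)
    {
        assert(Command && "empty commands are not expected");
        std::lock_guard<std::mutex> Lock(Mutex);
        Pending.push_back(std::move(Command));
        const bool bNeedsTask = !bDrainScheduled;
        bDrainScheduled = true;
        return bNeedsTask;
    }

    // Run by the single consumer task. Executes everything queued so far,
    // including commands added while draining, and returns the count.
    std::size_t Drain()
    {
        std::size_t NumExecuted = 0;
        for (;;)
        {
            std::vector<FCommand> Batch;
            {
                std::lock_guard<std::mutex> Lock(Mutex);
                if (Pending.empty())
                {
                    bDrainScheduled = false; // the next Enqueue requests a new task
                    return NumExecuted;
                }
                Batch.swap(Pending);
            }
            for (FCommand& Command : Batch)
            {
                Command();
                Command = nullptr; // release captured state right after the command runs
                ++NumExecuted;
            }
        }
    }

private:
    std::mutex Mutex;
    std::vector<FCommand> Pending;
    bool bDrainScheduled = false;
};
```

In this sketch, four back-to-back `Enqueue` calls return `true` only once, so a single task would service all four commands.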
Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
0 - Legacy render thread tasks,
1 - Render thread MPSC queue,
2 - Render thread and async pipe MPSC queues.
To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.
Enqueue a command into the pipe like so:
ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {}).
Omitting a pipe falls back to the 'general' pipe, which is the render thread.
Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.
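The sync-point behavior described above can be modeled with a small RAII sketch. This is a single-threaded toy, not `UE::RenderCommandPipe::FSyncScope` itself: `FPipeModelSketch` and `FSyncScopeSketch` are hypothetical names, and routing commands to the general queue while recording is disabled is an assumption made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-threaded model; names are illustrative, not engine API.
struct FPipeModelSketch
{
    using FCommand = std::function<void()>;

    std::vector<FCommand> PipeCommands;    // stands in for an async pipe
    std::vector<FCommand> GeneralCommands; // stands in for the render thread
    bool bRecording = true;

    void Enqueue(FCommand Command)
    {
        // Assumption: while pipe recording is disabled, commands fall back
        // to the general queue.
        (bRecording ? PipeCommands : GeneralCommands).push_back(std::move(Command));
    }

    static void Flush(std::vector<FCommand>& Commands)
    {
        std::vector<FCommand> Batch;
        Batch.swap(Commands);
        for (FCommand& Command : Batch)
        {
            Command();
        }
    }
};

class FSyncScopeSketch
{
public:
    explicit FSyncScopeSketch(FPipeModelSketch& InModel) : Model(InModel)
    {
        assert(Model.bRecording && "nested scopes are not modeled");
        FPipeModelSketch::Flush(Model.PipeCommands); // wait for work issued before the scope
        Model.bRecording = false;                    // disable pipe recording
    }

    ~FSyncScopeSketch()
    {
        FPipeModelSketch::Flush(Model.GeneralCommands); // finish serial work from inside the scope
        Model.bRecording = true;                        // restart pipe recording
    }

    FSyncScopeSketch(const FSyncScopeSketch&) = delete;
    FSyncScopeSketch& operator=(const FSyncScopeSketch&) = delete;

private:
    FPipeModelSketch& Model;
};
```

Flushing the general queue in the destructor is a simplification of "commands issued after the scope cannot start until prior commands finish"; a real implementation would express that ordering as a task dependency rather than a blocking flush.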
#rb christopher.waters, luke.thatcher
[CL 27074956 by zach bethel in ue5-main branch]
[FYI] zach.bethel
Original CL Desc
-----------------------------------------------------------------
Render Command Pipe Implementation and API
Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC (multi-producer, single-consumer) queues that are serviced by the task graph, both for pipes and for the render thread. This reduces task overhead because commands are no longer 1-to-1 with tasks.
Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
0 - Legacy render thread tasks,
1 - Render thread MPSC queue,
2 - Render thread and async pipe MPSC queues.
To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.
Enqueue a command into the pipe like so:
ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {}).
Omitting a pipe falls back to the 'general' pipe, which is the render thread.
Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.
#rb christopher.waters, luke.thatcher
[CL 27054009 by bob tellez in ue5-main branch]
Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC (multi-producer, single-consumer) queues that are serviced by the task graph, both for pipes and for the render thread. This reduces task overhead because commands are no longer 1-to-1 with tasks.
Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
0 - Legacy render thread tasks,
1 - Render thread MPSC queue,
2 - Render thread and async pipe MPSC queues.
To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.
Enqueue a command into the pipe like so:
ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {}).
Omitting a pipe falls back to the 'general' pipe, which is the render thread.
Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.
#rb christopher.waters, luke.thatcher
[CL 27042459 by zach bethel in ue5-main branch]
On shutdown, AppPreExit() is wrapped in StartRenderCommandFenceBundler()/StopRenderCommandFenceBundler() to minimise the number of RT tasks, as there are many render fences in this region. Waiting on any fence (e.g. via FlushRenderingCommands()) stops bundling. Because this happens early in AppPreExit(), it effectively disables bundling for the rest of the region, and we get a large number of RT tasks that drive memory usage up.
This CL adds FlushRenderCommandFenceBundler() and uses it in FRenderCommandFence::Wait() instead of StopRenderCommandFenceBundler(), so that bundling stays enabled.
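The difference between stopping and flushing on a wait can be shown with a toy counter. This is a sketch under stated assumptions, not the engine code: `FFenceBundlerSketch` and `CountTasksSketch` are hypothetical, and the model assumes each unbundled fence costs one RT task while each non-empty bundle costs one task in total.

```cpp
// Hypothetical model of fence bundling; not engine API.
struct FFenceBundlerSketch
{
    bool bBundling = false;
    int NumPendingFences = 0;
    int NumTasksIssued = 0;

    void Start() { bBundling = true; }

    void AddFence()
    {
        if (bBundling)
        {
            ++NumPendingFences; // folded into the current bundle
        }
        else
        {
            ++NumTasksIssued;   // assumption: one task per unbundled fence
        }
    }

    // Submit the current bundle as a single task and keep bundling enabled.
    void Flush()
    {
        if (NumPendingFences > 0)
        {
            ++NumTasksIssued;
            NumPendingFences = 0;
        }
    }

    // Submit the current bundle and disable bundling.
    void Stop()
    {
        Flush();
        bBundling = false;
    }
};

// Counts tasks for a region with a fence wait in the middle.
// bWaitFlushes selects the new behavior (flush) over the old one (stop).
int CountTasksSketch(bool bWaitFlushes, int NumFencesPerPhase)
{
    FFenceBundlerSketch Bundler;
    Bundler.Start();
    for (int Index = 0; Index < NumFencesPerPhase; ++Index)
    {
        Bundler.AddFence();
    }
    if (bWaitFlushes)
    {
        Bundler.Flush();
    }
    else
    {
        Bundler.Stop();
    }
    for (int Index = 0; Index < NumFencesPerPhase; ++Index)
    {
        Bundler.AddFence();
    }
    Bundler.Stop();
    return Bundler.NumTasksIssued;
}
```

With three fences on each side of the wait, the stop-on-wait path issues four tasks in this model, while the flush-on-wait path issues two; the gap grows with the number of fences after the first wait.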
#rb danny.couture
#preflight 647a11ea8417d79259a9944b
[CL 25764394 by Andriy Tylychko in ue5-main branch]
Since FRHICommandListImmediate::ImmediateFlush() was calling FlushPendingDeletes() after executing the current list of commands, the lambda enqueued by that function was actually running as part of the next command list. Besides unnecessarily extending resource lifetimes, this also meant that the new command list started by flushing the deferred deletion queue, so anything added directly in there (instead of via another enqueued command) would be deleted before the commands were executed on the GPU. This was very easy to reproduce with -norhithread on DX12, because texture unlock operations add the staging buffer to the deferred deletion queue immediately, instead of enqueuing a command to do it, so the staging buffers were gone by the time the GPU tried to copy from them.
This changelist adds a boolean to RHISubmitCommandLists() which is true when we're flushing the immediate command list with the FlushRHIThreadFlushResources mode, so that the RHI can process the deletion queue internally after submission, instead of doing it in RHIPerFrameRHIFlushComplete(). I didn't want to move RHIPerFrameRHIFlushComplete() itself to another point in the timeline, because old RHIs (D3D11 and OpenGL) use that for other purposes, and it seems unwise to alter their behavior (e.g. D3D11 wants to resolve timing queries in there with a blocking wait, and running that at the end of the current command list would introduce a CPU/GPU sync point).
Also:
* deprecated FRHIResource::FlushPendingDeletes(), since all it does is call FlushPendingDeletes() on the command list being passed in, and code outside of the RHI really shouldn't be doing that.
* made FlushPendingDeleteRHIResources_RenderThread() flush the immediate command list instead of calling FlushPendingDeletes() directly
#jira UE-184426
#rnx
#preflight https://horde.devtools.epicgames.com/job/6455194d023fe5d3ad8faa64
#rb Luke.Thatcher
[CL 25423894 by mihnea balta in ue5-main branch]
An excess FGraphEvent instance is still used in the `bSyncToRHIAndGPU` branch and could be improved similarly. However, this would require massive changes and would not bring any measurable perf improvement, as it happens only once a frame.
#preflight 6453d0234d593c0b428d7888
#rb danny.couture, luke.thatcher
[CL 25336380 by Andriy Tylychko in ue5-main branch]