Commit Graph

67 Commits

Author SHA1 Message Date
zach bethel
6473b91df9 Hardened thread safety of render command pipe system and added ability to sync specific pipes.
- Added additional sync scopes to handle VT standalone path.

#jira UE-194136, FORT-648678, UE-194553, PLAY-12828

[CL 27691097 by zach bethel in ue5-main branch]
2023-09-07 17:51:23 -04:00
zach bethel
38b28c32cb Remove ensures to null check compute graph worker.
#jira UE-194002

[CL 27434195 by zach bethel in ue5-main branch]
2023-08-28 22:05:37 -04:00
zach bethel
2b7dcff3c6 Added support for aborting compute graph work when releasing resources.
#rb jeremy.moore
#jira UE-193341

[CL 27389375 by zach bethel in ue5-main branch]
2023-08-25 15:47:15 -04:00
zach bethel
b5b17e2ae7 Render Command Pipe Implementation and API
Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC queues which are serviced by the task graph, both for pipes and for the render thread. This reduces the task overhead as commands are no longer 1-to-1 with tasks.
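
The batching idea in the paragraph above can be sketched outside the engine. This is a minimal illustration under stated assumptions, not the engine's implementation: `CommandQueue`, `Enqueue`, and `Drain` are hypothetical names, and a mutex-guarded vector stands in for a real MPSC queue. It only shows that many recorded lambdas can be serviced by a single drain task.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a command pipe. A mutex-guarded vector is used
// here for brevity; a real MPSC queue would typically be lock-free.
class CommandQueue {
public:
    using Command = std::function<void()>;

    // Callable from any producer thread. Returns true when the queue was
    // empty, i.e. when the caller would need to launch one task to drain it.
    bool Enqueue(Command cmd) {
        std::lock_guard<std::mutex> lock(mutex_);
        const bool wasEmpty = pending_.empty();
        pending_.push_back(std::move(cmd));
        return wasEmpty;
    }

    // Called by the single consumer. Takes the whole pending batch, runs it
    // in FIFO order outside the lock, and returns how many commands ran.
    std::size_t Drain() {
        std::vector<Command> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        for (Command& cmd : batch) {
            cmd();
        }
        return batch.size();
    }

private:
    std::mutex mutex_;
    std::vector<Command> pending_;
};
```

In this sketch only the first `Enqueue` on an empty queue reports that a task is needed, so three commands cost one task rather than three.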

Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
 0 - Legacy render thread tasks,
 1 - Render thread MPSC queue,
 2 - Render thread and async pipe MPSC queues.

To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.

Enqueue a command into the pipe like so:

ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {}).

Omitting the pipe falls back to the 'general' pipe, which is the render thread.

Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.
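
The sync point behavior described above can be modeled with a small RAII sketch. Everything here is hypothetical and single-threaded: `Pipes` and `SyncScope` are illustrative names rather than the engine's `FSyncScope`, flushing a vector stands in for waiting on a pipe, and routing pipe commands to the general queue while recording is disabled is an assumption made for the example.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

using Command = std::function<void()>;

// Hypothetical model: one general (render thread) queue and one async pipe.
struct Pipes {
    std::vector<Command> general;
    std::vector<Command> async;
    bool asyncRecording = true;

    // Assumption: while recording is disabled, commands aimed at the async
    // pipe are recorded into the general queue instead.
    void EnqueueAsync(Command cmd) {
        (asyncRecording ? async : general).push_back(std::move(cmd));
    }

    // Runs and clears everything currently queued; models "wait for queue".
    static void Flush(std::vector<Command>& queue) {
        std::vector<Command> batch;
        batch.swap(queue);
        for (Command& cmd : batch) {
            cmd();
        }
    }
};

// Entering the scope waits for prior pipe work and stops pipe recording.
// Leaving it finishes the serial work and restarts pipe recording.
class SyncScope {
public:
    explicit SyncScope(Pipes& pipes) : pipes_(pipes) {
        Pipes::Flush(pipes_.async);
        pipes_.asyncRecording = false;
    }
    ~SyncScope() {
        // Simplification: flush synchronously where a real system would
        // only order later pipe work after the general queue.
        Pipes::Flush(pipes_.general);
        pipes_.asyncRecording = true;
    }
    SyncScope(const SyncScope&) = delete;
    SyncScope& operator=(const SyncScope&) = delete;

private:
    Pipes& pipes_;
};
```

Commands enqueued before the scope have run by the time the constructor returns, and commands enqueued after the scope closes are recorded into the async pipe again.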

#rb christopher.waters, luke.thatcher

[CL 27074956 by zach bethel in ue5-main branch]
2023-08-14 12:52:45 -04:00
bob tellez
afd943db61 [Backout] - CL27042396 and 27048615
[FYI] zach.bethel
Original CL Desc
-----------------------------------------------------------------
Render Command Pipe Implementation and API

Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC queues which are serviced by the task graph, both for pipes and for the render thread. This reduces the task overhead as commands are no longer 1-to-1 with tasks.

Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
 0 - Legacy render thread tasks,
 1 - Render thread MPSC queue,
 2 - Render thread and async pipe MPSC queues.

To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.

Enqueue a command into the pipe like so:

ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {}).

Omitting the pipe falls back to the 'general' pipe, which is the render thread.

Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.

#rb christopher.waters, luke.thatcher

[CL 27054009 by bob tellez in ue5-main branch]
2023-08-11 20:05:11 -04:00
zach bethel
2d143afc83 Render Command Pipe Implementation and API
Render Command Pipes are dedicated asynchronous task pipes for render commands. Users can easily define new pipes and enqueue commands into them. Pipes can be synchronized using a scope to run serial render commands on the render thread, but initially pipes cannot be synchronized individually with each other. Render command overhead is reduced by recording command lambdas into MPSC queues which are serviced by the task graph, both for pipes and for the render thread. This reduces the task overhead as commands are no longer 1-to-1 with tasks.

Pipe behavior is controlled with new CVars. `r.RenderCommandPipeMode` controls overall behavior:
 0 - Legacy render thread tasks,
 1 - Render thread MPSC queue,
 2 - Render thread and async pipe MPSC queues.

To define a Render Command Pipe, use DEFINE_RENDER_COMMAND_PIPE(MyPipe), or DECLARE_RENDER_COMMAND_PIPE(MyPipe, MODULE_API) to declare an extern reference.

Enqueue a command into the pipe like so:

ENQUEUE_RENDER_COMMAND(MyCommand)(UE::RenderCommandPipe::MyPipe, [] (FRHICommandList&) {}).

Omitting the pipe falls back to the 'general' pipe, which is the render thread.

Eventually pipes need to be synced back to the general pipe for scene renders and other GPU work. On the game thread timeline, use UE::RenderCommandPipe::FSyncScope to synchronize the pipes. This waits for pipes and disables recording of new pipe commands until the scope completes, at which point pipe recording is restarted. This creates a 'sync point', so render commands issued prior to a sync scope will be waited on at the start of the scope, and render commands issued after the scope ends will not be able to start until the render thread finishes processing prior commands.

#rb christopher.waters, luke.thatcher

[CL 27042459 by zach bethel in ue5-main branch]
2023-08-11 15:51:26 -04:00
jack cai
4eb1d9b326 ComputeFramework: corrected a typo when determining dispatch type for data providers bound to a kernel.
#jira UE-192354
#rb Jeremy.Moore

[CL 26921186 by jack cai in ue5-main branch]
2023-08-08 13:02:15 -04:00
jeremy moore
62a2303cdb DeformerGraph: Persistent buffers use a UAV per kernel in the graph. This allows us to use SkipBarrier on the UAV to allow concurrent subinvocations, and still use correct barriers between kernels.
This fixes some flickering seen in graphs that recompute normals using the scatter technique.

[CL 25851483 by jeremy moore in ue5-main branch]
2023-06-07 13:58:30 -04:00
bryan sefcik
da92084a12 Optimized out more private modules includes and dependencies.
#preflight 64627c382965f6ea8ea83bd6

[CL 25479683 by bryan sefcik in ue5-main branch]
2023-05-15 16:26:12 -04:00
jack cai
fe330282cc Optimus: better validation of the compute graph to deal with ambiguity caused by non-unified vs unified dispatch
+ pins in the primary group of a kernel determine the dispatch type of the kernel

+ Added additional validation code for resource expressions. Expressions that evaluate to different results for unified/non-unified now invalidate the deformer.
   TODO: perform this validation during compile instead of runtime and throw a compile error

+ compile error if a data interface with no unified dispatch support is connected to secondary group

+ compile error if a kernel does not have execution data interface

+ support a higher dispatch group count, implying that in the long run we probably only support 1D dispatch of thread groups

#jira none
#rb halfdan.ingvarsson, Jeremy.Moore
#preflight https://horde.devtools.epicgames.com/job/6439dd22ec219759f540f526

[CL 25055283 by jack cai in ue5-main branch]
2023-04-14 20:03:04 -04:00
arciel rekman
80c58ff63e Track (first) owners of FShaderMapResource_SharedCode to improve leak investigation.
#rb Jason.Nadro, Rob.Krajcarski, Jeremy.Moore, Eric.Renaud-Houde
#preflight 6402182a59017a559b7df6bd
[REVIEW] [at]Jason.Nadro, [at]Rob.Krajcarski, [at]Jeremy.Moore, [at]Eric.Renaudhoude
#rnx

[CL 24507827 by arciel rekman in ue5-main branch]
2023-03-03 17:02:58 -05:00
jeremy moore
ca0bbb5de5 #jira MH-8602
CPU performance improvements for using half edge buffers in deformer graph data interface.
The half edge buffer was being uploaded per frame. Now it is stored in a resource owned by the data provider and uploaded once.
This is still less than ideal. We want the resource to be owned by the skel mesh, so that it is cooked instead of created at runtime. But that is work for another day.
Also test if we have any compute worker jobs before kicking off RDG graph. This allows for a fairer CPU comparison between skin cache and deformer graph. (We don't want to add RDG overhead to skin cache only work).
Also added support for reordering the compute work for optimal GPU execution. Ordering by kernel index instead of graph index allows greater overlap of work on GPU when multiple compute graphs are running.
Added a per graph sort priority so that work sorting doesn't cause any setup graphs to run later than execution graphs.
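
The ordering described above might look roughly like the following. This is a sketch under assumptions: `WorkItem`, its fields, and `SortForOverlap` are hypothetical names, and treating a lower priority value as "runs earlier" is a choice made for the example rather than something the commit states.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical description of one kernel dispatch belonging to a graph.
struct WorkItem {
    int graphSortPriority;  // assumption: lower values run first
    int kernelIndex;        // position of the kernel within its graph
    int graphIndex;         // which compute graph the kernel belongs to
};

// Orders by priority first so setup graphs stay ahead of execution graphs,
// then by kernel index so the same stage of different graphs is adjacent.
// stable_sort preserves submission order among equal keys.
inline void SortForOverlap(std::vector<WorkItem>& work) {
    std::stable_sort(work.begin(), work.end(),
        [](const WorkItem& a, const WorkItem& b) {
            return std::tie(a.graphSortPriority, a.kernelIndex) <
                   std::tie(b.graphSortPriority, b.kernelIndex);
        });
}
```

With two execution graphs of two kernels each, this interleaves kernel 0 of both graphs before kernel 1 of either, while a lower-priority-value setup graph still runs first.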
#preflight 63f40694977ceed915769bfe

[CL 24343458 by jeremy moore in ue5-main branch]
2023-02-21 13:06:29 -05:00
jeremy moore
f00268c32b #jira MH-8602
Fix bug where dispatch sizes were being added in all dimensions, causing way too many threads to launch.
Fix is fine for all current execution data interfaces which launch 1d thread groups. Will need refining in the future for 2d or 3d thread groups.
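
A small arithmetic sketch shows why summing every dimension overshoots. The types and function names below are hypothetical, and the exact shape of the original bug is an assumption based only on the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical thread group count in three dimensions.
struct GroupCount {
    std::uint32_t x, y, z;
};

// Total number of thread groups a dispatch of this size launches.
inline std::uint64_t TotalGroups(GroupCount g) {
    return static_cast<std::uint64_t>(g.x) * g.y * g.z;
}

// Assumed shape of the bug: every dimension is accumulated, so Y and Z grow
// with the number of parts and the product multiplies up.
inline GroupCount SumAllDimensions(const std::vector<GroupCount>& parts) {
    GroupCount total{0, 0, 0};
    for (const GroupCount& p : parts) {
        total.x += p.x;
        total.y += p.y;
        total.z += p.z;
    }
    return total;
}

// 1D accumulation: only X grows, Y and Z stay at 1. Valid only while every
// part is a 1D dispatch, which the assert checks.
inline GroupCount SumFirstDimension(const std::vector<GroupCount>& parts) {
    GroupCount total{0, 1, 1};
    for (const GroupCount& p : parts) {
        assert(p.y == 1 && p.z == 1);
        total.x += p.x;
    }
    return total;
}
```

For parts of 64, 32, and 16 groups, the all-dimension sum is (112, 3, 3), which is 1008 groups, whereas the 1D sum is (112, 1, 1), which is the intended 112.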
#preflight 63ed944b0a06073fef033645
#rbx

[CL 24253413 by jeremy moore in ue5-main branch]
2023-02-16 04:17:15 -05:00
jeremy moore
350e28810b #jira MH-8602
ComputeFramework: Support for unified dispatches.
When a kernel supports unified dispatch it combines any subinvocations into a single dispatch invocation.
This can only happen if all data interfaces support unified dispatch, and all subinvocation shaders are the same.
Some data interfaces, such as skeleton, don't support this because each section keeps its own bone buffer to keep bone count to a minimum.
In future we should look to change the underlying structures to support unified dispatch as much as possible, since it will be more performant.
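
The eligibility rule stated above can be written as a simple predicate. This is only a sketch: `DataInterfaceInfo`, `CanUseUnifiedDispatch`, and the use of integer shader identifiers are hypothetical and do not reflect the ComputeFramework types.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical per-data-interface capability flag.
struct DataInterfaceInfo {
    bool supportsUnifiedDispatch;
};

// Unified dispatch is possible only when every bound data interface supports
// it and every subinvocation would run the same shader.
inline bool CanUseUnifiedDispatch(const std::vector<DataInterfaceInfo>& interfaces,
                                  const std::vector<int>& subInvocationShaderIds) {
    const bool allSupport = std::all_of(
        interfaces.begin(), interfaces.end(),
        [](const DataInterfaceInfo& di) { return di.supportsUnifiedDispatch; });

    // adjacent_find with not_equal_to locates the first neighboring pair that
    // differs; reaching end() means all shader ids are identical.
    const bool sameShader =
        std::adjacent_find(subInvocationShaderIds.begin(),
                           subInvocationShaderIds.end(),
                           std::not_equal_to<int>()) == subInvocationShaderIds.end();

    return allSupport && sameShader;
}
```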
#preflight 63ed39c7514832b242bce7a7
#rb jack.cai

[CL 24251753 by jeremy moore in ue5-main branch]
2023-02-16 01:59:01 -05:00
Jeremy Moore
80ab78c906 Optimus: Fix UObject generated shader source paths to fix some path validation warnings.
Change OptimusSource files to set virtual path through virtual file name instead of through (less robust) line directives.
#preflight 63cc5a8d574ab9cae49dc676

[CL 23803742 by Jeremy Moore in ue5-main branch]
2023-01-21 16:53:46 -05:00
Jeremy Moore
f6acd25a55 ComputeFramework: Put DataInterface code inside virtual generated files for shader compilation.
This allows us to jump to compile errors in source for data interfaces that have simple source templates.
#preflight 63cc5a0bd83c1837b1b02af6

[CL 23803741 by Jeremy Moore in ue5-main branch]
2023-01-21 16:51:58 -05:00
Jeremy Moore
90186ec0d0 ComputeFramework: Fix compile error links for UObject shader paths.
Tidied logging during shader compilation.
Optimus: Add support for opening a file in Visual Studio from shader compile errors in the Editor or VS log.
#preflight 63c8c39f8168e8b2526e1c66

[CL 23772051 by Jeremy Moore in ue5-main branch]
2023-01-19 08:13:15 -05:00
Jeremy Moore
f39adc465f Remove one unnecessary layer of file indirection in the compute framework shader compilation.
#preflight 63c6c65602024f93d85fc7fb

[CL 23739003 by Jeremy Moore in ue5-main branch]
2023-01-17 11:18:42 -05:00
Jeremy Moore
5fc0b7cc85 ComputeFramework: Don't compile compute graphs on PostLoad.
Defer compilation to first use.
#preflight 63c03df12e714f64ad968ad2

[CL 23665189 by Jeremy Moore in ue5-main branch]
2023-01-12 12:23:03 -05:00
Jeremy Moore
3e9126d0cb ComputeFramework: Move management of shader compilation to the core shader manager.
Existing system was single threaded. With this change we can make use of shader workers.
#preflight 63c01769d862fdd347e46cc0

[CL 23662505 by Jeremy Moore in ue5-main branch]
2023-01-12 09:45:31 -05:00
Jeremy Moore
8bc0c96b2b ComputeFramework: Virtual path changes for compute kernel and source library includes.
Move them to ComputeFramework virtual path by default and use valid paths with .ush suffix.
This is in preparation for migrating to the core shared compiler manager which validates these things.
Compile error path->object resolution for source libraries will be broken, but fixed in a second pass.
#preflight 63c0123e577437afe639b2f5

[CL 23662136 by Jeremy Moore in ue5-main branch]
2023-01-12 09:16:25 -05:00
Jeremy Moore
8eae23ec33 ComputeFramework: Use generic on-disk shader file as source for compute framework kernels.
Source is included from there in a generically named generated file.
#preflight 63bf6109de27f9bc450b69b1

[CL 23658267 by Jeremy Moore in ue5-main branch]
2023-01-11 20:38:05 -05:00
Jeremy Moore
e9fe39d75a ComputeFramework: We no longer need to check full shadermap for each graph kernel before enqueuing work.
#preflight 63be0492c45a2c81e033db2d

[CL 23636436 by Jeremy Moore in ue5-main branch]
2023-01-10 19:46:46 -05:00
Jeremy Moore
95d088af44 Optimus: Deprecate UComputeDataProvider::IsValid() and remove concrete implementations that are no longer called.
#preflight 63bddfe1bf54fa7b369fff00

[CL 23634453 by Jeremy Moore in ue5-main branch]
2023-01-10 17:13:11 -05:00
Jeremy Moore
63fa7c1393 ComputeFramework: Add support for fallback delegate to be called whenever EnqueueWork fails.
Failure can happen immediately or later in the pipe (for example when we try to fetch shaders).
Optimus: Use fallback for mesh deformers. Fallback is to simply reset the gpu passthrough vertex factory so that we see the bind pose. This is the same behavior as before, but now can happen at top or bottom of the pipe.
#preflight 63bda31aaf3ebedd992348eb

[CL 23629300 by Jeremy Moore in ue5-main branch]
2023-01-10 12:57:49 -05:00