Commit Graph

180 Commits

Author SHA1 Message Date
charles derousiers
810cb7dac2 Add RDG Upload variant which takes a lambda function for freeing the CPU memory once the data are uploaded.
#rb zach.bethel
#preflight 612f0d3779d62b0001b43362


#ROBOMERGE-SOURCE: CL 17383153
#ROBOMERGE-BOT: (v865-17346139)

[CL 17383415 by charles derousiers in ue5-main branch]
2021-09-01 03:08:48 -04:00
zach bethel
ee081a525d Reworked acquire / discard transitions in RDG to use split barriers to improve overlap.
#jira none

[CL 17182512 by zach bethel in ue5-main branch]
2021-08-16 17:30:45 -04:00
zach bethel
ad2a2a8cdb Fixed bug with RDG drain and async compute. Fixed transition bug with UAV workaround in upload buffers. Added some command line arguments. Optimized checks to disable RDG validation if parallel.
#rb none

[CL 16997303 by zach bethel in ue5-main branch]
2021-07-29 13:00:55 -04:00
zach bethel
9570248047 Fixed state merging when r.RDG.Drain is enabled.
#rb trivial

[CL 16962979 by zach bethel in ue5-main branch]
2021-07-26 19:49:07 -04:00
zach bethel
1a09a4ae77 Fixed GPU scene and Nanite resources to support cross-pipeline SRV access.
Re-enabled async compute for Nanite. Fixed submission bug with async compute and parallel RDG where the async compute command list wasn't being submitted correctly in order.

#rb jamie.hayes, luke.thatcher
#fyi graham.wihlidal
#jira UE-114775

[CL 16937065 by zach bethel in ue5-main branch]
2021-07-23 10:16:59 -04:00
Jian Ru
d372d878e2 Conditional GPU copies for RDG buffer uploads. This fixes broken GPU captures on some platforms when CPU-initialized, GPU-modified buffers are used.
#jira UE-118792

[CL 16933927 by Jian Ru in ue5-main branch]
2021-07-22 22:30:37 -04:00
zach bethel
de0fac09c7 Parallel RDG execution improvements.
- Added ERDGBuilderFlags::AllowParallelExecute to tag specific builders to attempt parallel execution. This avoids cases where small graphs fork tasks and end up causing contention. Only the main scene render graphs are tagged.
- Moved RHI transition creation to an async task.
- Moved parallel execute setup and dispatch to an async task.
- Fixed RDG draining asserts using a short-term workaround by tagging relevant scene textures as non-transient.
- Deprecated RDG AddPass utilities without names and fixed up last remnants.
- Enabled parallel RDG execution by default.

#fyi christopher.waters

[CL 16925941 by zach bethel in ue5-main branch]
2021-07-22 12:42:14 -04:00
zach bethel
cfdf2b7700 Remove split transition for swap chain textures. This avoids cases where certain platforms will stall at the beginning of graph execution.
#rb christopher.waters
#jira UE-119087

[CL 16850825 by zach bethel in ue5-main branch]
2021-07-14 11:50:54 -04:00
zach bethel
f311bbc7a1 RDG Parallel Execution (disabled by default)
- Refactored RDG to support free-threaded execution of passes.
- Refactored renderer to use specific RHI command list variants in pass lambda. Immediate command list passes are forced to stay on the render thread, while other variants can be parallelized.

#rb christopher.waters

[CL 16838717 by zach bethel in ue5-main branch]
2021-07-13 12:38:27 -04:00
christopher waters
d5f57e697f RDGBuffer objects actually need destructing with their new TFunction members. This fixes a slow leak when using Buffer creation callbacks.
#jira none
#rb jian.ru, zach.bethel
#preflight 60d617e1caf05900010655a8

[CL 16787641 by christopher waters in ue5-main branch]
2021-06-25 15:34:06 -04:00
Ola Olsson
e9276fc6db Combining GPU-Scene instance culling and the id-list generation into one step, and the same for VSM
- Uses the InstanceCullingLoadBalancer to pre-distribute the work on the CPU to ensure even load.
- Make instance culling use the instance data offset in MDC instead of translating primitive IDs.
- Track single-instance draws separately from instanced to optimize handling (disable culling for single-instance primitives).

#rb Graham.wihlidal,andrew.lauritzen
#fyi dmitriy.dyomin
#preflight 60d0eafa2ab2180001269160

[CL 16733827 by Ola Olsson in ue5-main branch]
2021-06-21 16:51:39 -04:00
Jian Ru
84573e5672 XB1 transient allocator implementation that works with RDG. Credit to Kenzo for most of the implementation.
#jira UE-117189
#rb ben.woodhouse, kenzo.terelst
#fyi ben.woodhouse, kenzo.terelst, zach.bethel

[CL 16682921 by Jian Ru in ue5-main branch]
2021-06-15 20:55:44 -04:00
Jian Ru
13d2d6e79e Batch BuildRenderingCommands from major mesh passes inside the main render function
#jira UE-117281
#rb ola.olsson, zach.bethel

[CL 16572282 by Jian Ru in ue5-main branch]
2021-06-07 12:19:06 -04:00
zach bethel
f30bf425f3 Fix for assert when readback resource is being used in a copy destination state.
#rb none

[CL 16523954 by zach bethel in ue5-main branch]
2021-06-01 17:52:32 -04:00
zach bethel
db4d19f641 Fix for culled RDG uniform buffer being created.
#rb andrew.lauritzen

[CL 16523561 by zach bethel in ue5-main branch]
2021-06-01 17:25:18 -04:00
zach bethel
2ab8c34577 Replaced FRDGBufferUploader with internal RDG upload. Removes several transitions and passes and removes immediate command list passes from the graph.
#rb jian.ru

[CL 16459303 by zach bethel in ue5-main branch]
2021-05-25 20:46:17 -04:00
zach bethel
b843dabf06 Minor RDG fixes and support for read-only system textures.

[CL 16456650 by zach bethel in ue5-main branch]
2021-05-25 17:11:34 -04:00
kenzo terelst
17719fe54c Force IndirectArg buffers to not use transient allocator and make them committed resources in d3d12
#jira UE-115982
#rb Yuriy.ODonnell, Graham.Wihlidal

#ROBOMERGE-OWNER: kenzo.terelst
#ROBOMERGE-AUTHOR: kenzo.terelst
#ROBOMERGE-SOURCE: CL 16425509 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v811-16416015)
#ROBOMERGE-CONFLICT from-shelf

[CL 16425600 by kenzo terelst in ue5-main branch]
2021-05-21 17:42:34 -04:00
zach bethel
90bc2efada RDG refactor to enable draining of work after issuing occlusion queries.
- New Drain() method on FRDGBuilder; will flush all pending work.
- Drained passes are not culled; resource lifetimes are extended; async compute fences are optimized as best as possible but fence joining may occur after the drain.
- Batch up and pre-build all resource transitions. This is a prerequisite for parallel command lists.
- Removed ServiceLocalQueue passes with built-in RDG AddDispatchHint().

#jira UE-114622

[CL 16393495 by zach bethel in ue5-main branch]
2021-05-19 17:54:58 -04:00
zach bethel
8d8b93086a Reworked RDG pooled buffer size alignment so that it no longer affects the external-facing descriptor.
#rb none
#jira FROST-2514
#fyi charles.derousiers

[CL 16222555 by zach bethel in ue5-main branch]
2021-05-06 12:36:40 -04:00
zach bethel
804d775535 Minor RDG optimizations.
- Simplified texture subresource tracking.
- Removed map lookup for each resource in SetupPass.
- Improved Compile / CollectPassResources to reduce cache misses.
- Added some container reservations to reduce reallocation costs.
- Added snapping of buffers to page boundaries to improve re-use.

#rb none

[CL 16208311 by zach bethel in ue5-main branch]
2021-05-05 11:58:15 -04:00
Jian Ru
e4b1ea48eb Join the last async compute pass to its first consumer pass instead of the graph epilogue
#rb zach.bethel

[CL 16118371 by Jian Ru in ue5-main branch]
2021-04-26 16:29:05 -04:00
zach bethel
b67b0d2dda Added FRHI{Texture, Buffer}ViewCache to clean up FRDGPooledX / FRHITransientX duplicate code. Added support for UAVs with a format.
#rb christopher.waters
#preflight 6086ee7c1046fb00018cec87

[CL 16116698 by zach bethel in ue5-main branch]
2021-04-26 14:12:08 -04:00
zach bethel
47cf1f4458 Rewrite of RHI transient resource system.
- Views are cached on RHI transient resources; view renames are no longer necessary.
- RHI Transient resources utilize a single cache per heap keyed off of the descriptor + offset. Resource caches and heaps are garbage collected.
- CPU performance is effectively equivalent to the existing pooled resource method.
- Added common RHI transient resource allocator implementation in RHI core; significantly reduces the amount of platform code.
- Resource aliasing overlaps are tracked by the RHI and submitted through an acquire operation.
- Fixed D3D12 implementation to support multi-GPU.
- Removed condition that excluded small (<64k) buffers in the transient allocator.
- RHI validation now checks that resource overlaps are valid; i.e. if an overlap occurs between resource A and B during an acquire of B, validation checks that A has been discarded.

#rb graham.wihlidal, luke.thatcher, kenzo.terelst

[CL 16076280 by zach bethel in ue5-main branch]
2021-04-21 13:03:28 -04:00
wei liu
fd8ea2fa94 Support a pass with the ability to explicitly skip pass merging.
#jira 113084

#rb Zach.Bethel, Dmitriy.Dyomin, Jack.Porter, Mi.Wang

#lockdown ben.marsh

#ROBOMERGE-SOURCE: CL 15966669 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v787-15839533)

[CL 15981946 by wei liu in ue5-main branch]
2021-04-12 15:36:14 -04:00