Commit Graph

239 Commits

Author SHA1 Message Date
zach bethel
3b9b0f2d52 Resubmit of 22872901.
Added RDG_EVENT_SCOPE_FINAL variant that silences child scopes / events.
 - Added r.RDG.Events CVar to control GPU event behavior.
      - 0 disables GPU events; 1 enables GPU events and FINAL scopes suppress child scopes; 2 enables all GPU events.
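
The three r.RDG.Events values can be illustrated with a small standalone model. This is only a sketch of the behavior described above, not the engine code: EEventMode, FEventRecorder and RecordSampleScopes are hypothetical names, and the real RDG_EVENT_SCOPE_FINAL macro is not used here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the three r.RDG.Events values described above.
enum class EEventMode { Disabled = 0, FinalSuppressesChildren = 1, All = 2 };

// Hypothetical recorder that tracks which scope names would become GPU events.
class FEventRecorder
{
public:
    explicit FEventRecorder(EEventMode InMode) : Mode(InMode) {}

    // Opens a scope; bFinal marks it as a FINAL scope.
    void Push(const std::string& Name, bool bFinal = false)
    {
        const bool bSuppressed = IsSuppressed();
        if (!bSuppressed)
        {
            Emitted.push_back(Name);
        }
        // Children stay silent if an ancestor silenced them, or if this is
        // a FINAL scope while mode 1 is active.
        const bool bSilenceChildren =
            bSuppressed || (bFinal && Mode == EEventMode::FinalSuppressesChildren);
        Stack.push_back(bSilenceChildren);
    }

    // Closes the most recently opened scope.
    void Pop() { Stack.pop_back(); }

    const std::vector<std::string>& GetEmitted() const { return Emitted; }

private:
    bool IsSuppressed() const
    {
        if (Mode == EEventMode::Disabled)
        {
            return true;
        }
        return !Stack.empty() && Stack.back();
    }

    EEventMode Mode;
    std::vector<bool> Stack;
    std::vector<std::string> Emitted;
};

// Records a FINAL scope with one child, followed by a regular scope.
std::vector<std::string> RecordSampleScopes(EEventMode Mode)
{
    FEventRecorder Recorder(Mode);
    Recorder.Push("Lumen", /*bFinal=*/true);
    Recorder.Push("TraceRays");
    Recorder.Pop();
    Recorder.Pop();
    Recorder.Push("PostProcess");
    Recorder.Pop();
    return Recorder.GetEmitted();
}
```

In this model, mode 1 keeps "Lumen" and "PostProcess" but drops the child "TraceRays", while mode 2 keeps all three and mode 0 keeps none.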

#preflight 63614e16397c7af896701cae

[CL 22917968 by zach bethel in ue5-main branch]
2022-11-02 11:46:28 -04:00
nat parkinson
442a36cfd4 [Backout] - CL22872901 as it seems to have caused compile errors
[FYI] zach.bethel
Original CL Desc
-----------------------------------------------------------------
Added RDG_EVENT_SCOPE_FINAL variant that silences child scopes / events.
 - Added r.RDG.Events CVar to control GPU event behavior.
      - 0 disables GPU events; 1 enables GPU events and FINAL scopes suppress child scopes; 2 enables all GPU events.

#preflight 6360117d117bb4ce9da40ef0
#rb krzysztof.narkowicz, yuriy.odonnell, daniel.wright

[CL 22879513 by nat parkinson in ue5-main branch]
2022-11-01 07:02:14 -04:00
zach bethel
738376cef5 Added RDG_EVENT_SCOPE_FINAL variant that silences child scopes / events.
- Added r.RDG.Events CVar to control GPU event behavior.
      - 0 disables GPU events; 1 enables GPU events and FINAL scopes suppress child scopes; 2 enables all GPU events.

#preflight 6360117d117bb4ce9da40ef0
#rb krzysztof.narkowicz, yuriy.odonnell, daniel.wright

[CL 22876153 by zach bethel in ue5-main branch]
2022-10-31 20:56:49 -04:00
jason hoerner
ee5373b706 Virtual Production: Optimizations to help with D3D12 CPU and GPU performance regressions following Parallel Rendering integration:
* Feature that merges command lists from QueueAsyncCommandListSubmit into a single Payload where possible (r.D3D12.AllowPayloadMerge, default enabled).  Saves around 0.5 ms per async command list batch, with total savings from 1 to 4 ms on VP test scenes.
* Remove unnecessary command flush before Present on D3D12.  Added a virtual function "NeedFlushBeforeEndDrawing" that can return false if the platform's RHIEndDrawingViewport function flushes commands, and the calling function doesn't need to.
* Wrapped three scene rendering flushes, plus BeginFlushResourcesRHI with "RHIIncludeOptionalFlushes()", which returns false on D3D12.  These were a net perf loss on D3D12, as they wrap hardly any rendering.  I didn't want to change behavior on other platforms, hence the conditional.
* Overall the above changes remove 8 command list flushes in a two view scene, which typically cost in the range of 0.06 ms each, saving 0.48 ms CPU total.  Perf win increases with more views, and removing the flushes also helps with GPU bubbles.
* Added a flush in a strategic location at the end of Shadows and Lumen.  In a sample scene, saved 1.5 ms GPU eliminating a bubble.
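
The conditional-flush pattern from the bullets above could look roughly like the following. This is a minimal sketch under assumptions: FPlatformRHI, FD3D12LikeRHI, IncludeOptionalFlushes and CountFrameFlushes are hypothetical stand-ins; only the names NeedFlushBeforeEndDrawing and RHIIncludeOptionalFlushes come from the change description, and their real signatures may differ.

```cpp
#include <cassert>

// Hypothetical platform interface. The defaults keep the old behavior,
// so platforms that do not override anything still flush everywhere.
struct FPlatformRHI
{
    virtual ~FPlatformRHI() = default;

    // True when the caller must flush before ending drawing.
    virtual bool NeedFlushBeforeEndDrawing() const { return true; }

    // Stand-in for RHIIncludeOptionalFlushes().
    virtual bool IncludeOptionalFlushes() const { return true; }
};

// Hypothetical D3D12-like platform that opts out of both kinds of flush.
struct FD3D12LikeRHI : FPlatformRHI
{
    bool NeedFlushBeforeEndDrawing() const override { return false; }
    bool IncludeOptionalFlushes() const override { return false; }
};

// Counts the flushes one frame would issue, given the number of
// optional flush sites wrapped by the conditional.
int CountFrameFlushes(const FPlatformRHI& RHI, int NumOptionalFlushSites)
{
    int Flushes = 0;
    for (int Site = 0; Site < NumOptionalFlushSites; ++Site)
    {
        if (RHI.IncludeOptionalFlushes())
        {
            ++Flushes; // a real renderer would submit commands here
        }
    }
    if (RHI.NeedFlushBeforeEndDrawing())
    {
        ++Flushes; // flush before Present
    }
    return Flushes;
}
```

Keeping the defaults at true is what leaves other platforms unchanged: only a platform that overrides the two queries loses the flushes.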

#jira UE-167553
#rb luke.thatcher
#rnx
#preflight 635547519e14ee3c790cc14e
#lockdown mihnea.balta

[CL 22728231 by jason hoerner in ue5-main branch]
2022-10-24 11:23:51 -04:00
tiantian xie
0a5d859dad Rendering Engine change to support the following two kinds of path tracing denoiser plugins at the same time:
  1. Spatial denoising only plugin.
  2. Spatial and temporal denoising plugin.

When the user enables both plugins:
  1. The spatial denoising only plugin + builtin temporal denoiser can be selected as (default):

        `r.PathTracing.SpatialDenoiser.Type 0`
  2. The spatial and temporal denoiser plugin can be selected as:

        `r.PathTracing.SpatialDenoiser.Type 1`
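
The selection rule above can be written down as a small lookup. This is only an illustrative sketch: FDenoiserSelection and SelectDenoisers are hypothetical names, and treating every value other than 1 as the default is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of which denoiser handles each stage.
struct FDenoiserSelection
{
    std::string Spatial;
    std::string Temporal;
};

// Maps an r.PathTracing.SpatialDenoiser.Type value to a selection.
// Assumption: any value other than 1 falls back to the default (type 0).
FDenoiserSelection SelectDenoisers(int SpatialDenoiserType)
{
    if (SpatialDenoiserType == 1)
    {
        // One plugin performs both spatial and temporal denoising.
        return {"spatial and temporal plugin", "spatial and temporal plugin"};
    }
    // Default: spatial-only plugin combined with the builtin temporal denoiser.
    return {"spatial only plugin", "builtin temporal denoiser"};
}
```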

Note:
  * FRDGBuilder::IsTransient() is modified to return false for TexCreate_Shared resources so that they are created as committed resources. Otherwise, the interop fails with E_INVALIDARG (0x80070057) when trying to get the device.

#jira UE-158838
#preflight 633f20f72a0a2c1ead3a6184
#rb Juan.Canada, Chris.Kulla
#ushell-cherrypick of 21960393 by Tiantian.Xie

[CL 22405452 by tiantian xie in ue5-main branch]
2022-10-07 14:17:00 -04:00
zach bethel
16e557e814 Fixed flickering shadow artifact on some platforms. Fixed crash when r.RDG.CullPasses is 0.
#preflight 633f50d82a0a2c1ead464181
#jira UE-165698, UE-166172
#lockdown Mihnea.Balta
#rb Luke.Thatcher
#rnx

[CL 22398000 by zach bethel in ue5-main branch]
2022-10-07 05:51:53 -04:00
luke thatcher
0c5c2b3e27 Merging //UE5/Dev-ParallelRendering (up to CL 22203289) to //UE5/Release-5.1
This change includes significant refactor work performed in //UE5/Dev-ParallelRendering. A brief summary of the work is as follows:

Refactored RHI command lists
 - Removal of the "immediate" async compute command list
 - Introduced an "active pipe" on each command list, allowing RHICmdLists to record work for either graphics or async compute. Pipes can be selected using the SwitchPipeline() function, or the FRHICommandListScopedPipeline helper.
 - New explicit command list submission RHI API (RHIFinalizeContext, RHISubmitCommandLists). The IRHICommandContextContainer type has been removed.
 - Explicit GPU submission is automatically appended to the immediate command list when it is dispatched to the RHI thread.
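
The "active pipe" idea and the scoped helper can be sketched with a simplified command list. This is a standalone model under assumptions, not the RHI API: FCommandList, FScopedPipeline and RecordMixedWork are hypothetical, and the real SwitchPipeline() and FRHICommandListScopedPipeline signatures are not shown in this log.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class EPipeline { Graphics, AsyncCompute };

// Hypothetical, simplified command list with a single active pipe.
class FCommandList
{
public:
    // Selects the pipe for subsequently recorded work; returns the previous pipe.
    EPipeline SwitchPipeline(EPipeline NewPipe)
    {
        const EPipeline Old = ActivePipe;
        ActivePipe = NewPipe;
        return Old;
    }

    // Records a command on whichever pipe is currently active.
    void Record(const std::string& Name) { Commands.push_back({ActivePipe, Name}); }

    EPipeline GetPipeline() const { return ActivePipe; }

    // Number of commands recorded for the given pipe.
    int CountOn(EPipeline Pipe) const
    {
        int Count = 0;
        for (const FRecorded& Command : Commands)
        {
            if (Command.Pipe == Pipe)
            {
                ++Count;
            }
        }
        return Count;
    }

private:
    struct FRecorded
    {
        EPipeline Pipe;
        std::string Name;
    };

    EPipeline ActivePipe = EPipeline::Graphics;
    std::vector<FRecorded> Commands;
};

// RAII helper: switches the pipe on construction and restores it on destruction.
class FScopedPipeline
{
public:
    FScopedPipeline(FCommandList& InList, EPipeline Pipe)
        : List(InList), Previous(InList.SwitchPipeline(Pipe)) {}
    ~FScopedPipeline() { List.SwitchPipeline(Previous); }

    FScopedPipeline(const FScopedPipeline&) = delete;
    FScopedPipeline& operator=(const FScopedPipeline&) = delete;

private:
    FCommandList& List;
    EPipeline Previous;
};

// Records graphics work, a scoped block of async compute work, then graphics again.
FCommandList RecordMixedWork()
{
    FCommandList List;
    List.Record("DrawScene");
    {
        FScopedPipeline Scope(List, EPipeline::AsyncCompute);
        List.Record("DispatchCull");
    }
    List.Record("DrawPost");
    return List;
}
```

The scoped form restores the previous pipe even on early exit from the block, which is the usual reason to prefer it over paired manual switches.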

Platform RHI implementations
 - The new submission API has been implemented across all platforms. Some platforms required a significant refactor.

#rb Mihnea.Balta,Kenzo.Terelst
#jira UE-139550
#preflight 6332e3641003050806d802ef

[CL 22239063 by luke thatcher in ue5-main branch]
2022-09-28 21:40:05 -04:00
zach bethel
b1b031829f Fix for ensure when using RDG clobber resources.
#jira UE-156315
#preflight trivial

[CL 21924352 by zach bethel in ue5-main branch]
2022-09-09 13:51:29 -04:00
zach bethel
ebbe30f548 Removed RDG pass CPU trace to reduce memory pressure on insights.
#fyi Ionut.Matasaru
#preflight trivial

[CL 21854680 by zach bethel in ue5-main branch]
2022-09-07 13:05:54 -04:00
zach bethel
487a34d598 Added parallel setup for RDG passes
- Passes are added to a queue that is consumed by a task to perform pass setup actions.
 - Moved CompilePassBarriers to overlap with resource collection.
 - Converted most tasks to new task system.

#preflight 630e5e2e660db81edb9a0562

[CL 21710710 by zach bethel in ue5-main branch]
2022-08-30 17:48:59 -04:00
zach bethel
f4bc813400 Disable full string events in RDG in test builds, even when profile gpu is enabled, and disable per-pass tracing as well.
[CL 21696542 by zach bethel in ue5-main branch]
2022-08-30 02:23:50 -04:00
zach bethel
8eac902180 Fixed parallel translate latency. With lock / unlock supported in queued async command lists, an RHI thread fence was being added conservatively with every async command list submission. When RDG parallel execution is enabled, the parallel translate jobs cannot launch until they are supposed to be submitted, which introduces massive stalls on the RHI thread and extends parallel translation into the next frame, clashing with InitViews. This change fixes the issue by introducing a tiny job that launches when all dependent command lists are finished recording (which happens quickly in RDG) and then scans for the latest RHI thread fence on any of the command lists.
#rb ben.woodhouse, christopher.waters

[CL 21499911 by zach bethel in ue5-main branch]
2022-08-22 21:21:08 -04:00
serge bernier
882beb5553 [Backout] - CL20968686
[FYI] zach.bethel
Original CL Desc
-----------------------------------------------------------------
Move RDG buffer uploads off the render thread.

May be related to gpu crash in EMT (UE-157850)

#preflight 62b9f1ec4209c7c579df8fc2

#ushell-cherrypick of 20839110 by zach.bethel

#ROBOMERGE-OWNER: ben.woodhouse
#ROBOMERGE-AUTHOR: serge.bernier
#ROBOMERGE-COMMAND: _robomerge ue5-main
#ROBOMERGE-SOURCE: CL 21010071 via CL 21010165 via CL 21010189
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v972-20964824)

[CL 21020108 by serge bernier in ue5-main branch]
2022-07-09 07:04:47 -04:00
serge bernier
63ff734598 Move the execution of the RDG GPU scopes before the Prologue to include the BeginRenderPass. This makes clears done in the BeginRenderPass appear in their corresponding render pass and not in the previous one. Also changed the order of marker execution for the Epilogue for consistency.
#rb [at]zachary.bethel

#ROBOMERGE-OWNER: serge.bernier
#ROBOMERGE-AUTHOR: serge.bernier
#ROBOMERGE-SOURCE: CL 20987740 via CL 20988066 via CL 20988088
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v972-20964824)

[CL 20991417 by serge bernier in ue5-main branch]
2022-07-07 16:03:43 -04:00
zach bethel
a604eb2cf9 Fixed async RDG clear to delete allocator after the FRDGBuilder destructor executes.
#preflight 62c362e57358826af82295d0

[CL 20937688 by zach bethel in ue5-main branch]
2022-07-04 18:20:07 -04:00
ben woodhouse
2d39d994c3 Put RDG::ExecutePass into its own RDG_Execute CSV stat to make it easier to measure (instead of using RenderOther)
[FYI] zachary.bethel

#ROBOMERGE-AUTHOR: ben.woodhouse
#ROBOMERGE-SOURCE: CL 20930230 via CL 20930231 via CL 20930232
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v971-20777995)

[CL 20931176 by ben woodhouse in ue5-main branch]
2022-07-03 22:00:23 -04:00
zach bethel
06fbb3a3e1 Moved RDG builder destruction off the render thread.
#preflight 62bf7a3cc438da7f09eb21d5

[CL 20926623 by zach bethel in ue5-main branch]
2022-07-02 12:20:33 -04:00
zach bethel
34623f0321 Replaced thread-local MemStack with ConcurrentLinearAllocator across the renderer.
- Removed scene render mem-mark among others. MemStack usage is now restricted to local scopes with known marks.
 - Render resources with destructors are allocated using the FSceneRenderingBulkObjectAllocator on FSceneRenderer, which is deleted when the scene render is.
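
The lifetime rule for objects with destructors can be sketched as follows. This is a simplified model under assumptions: FBulkObjectAllocator, FTracked and CountDestructorsRun are hypothetical, and the sketch uses plain heap allocation per object rather than a linear allocator. It only shows that destructors run when the owning allocator is destroyed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical bulk allocator: owns every object it creates and destroys
// them all together when the allocator itself is destroyed.
class FBulkObjectAllocator
{
public:
    FBulkObjectAllocator() = default;
    FBulkObjectAllocator(const FBulkObjectAllocator&) = delete;
    FBulkObjectAllocator& operator=(const FBulkObjectAllocator&) = delete;

    ~FBulkObjectAllocator()
    {
        // Destroy in reverse creation order, mirroring stack unwinding.
        for (auto It = Objects.rbegin(); It != Objects.rend(); ++It)
        {
            It->Destroy(It->Object);
        }
    }

    // Creates an object whose destructor runs when the allocator dies.
    template <typename T, typename... TArgs>
    T* Create(TArgs&&... Args)
    {
        T* Object = new T(std::forward<TArgs>(Args)...);
        // The captureless lambda converts to a plain function pointer
        // that remembers the concrete type T.
        Objects.push_back({Object, [](void* Ptr) { delete static_cast<T*>(Ptr); }});
        return Object;
    }

private:
    struct FEntry
    {
        void* Object;
        void (*Destroy)(void*);
    };

    std::vector<FEntry> Objects;
};

// Test type that counts how many times its destructor has run.
struct FTracked
{
    explicit FTracked(int* InCounter) : Counter(InCounter) {}
    ~FTracked() { ++*Counter; }
    int* Counter;
};

// Returns the number of destructors run once the allocator goes out of
// scope, or -1 if any ran while the allocator was still alive.
int CountDestructorsRun(int NumObjects)
{
    int Destroyed = 0;
    {
        FBulkObjectAllocator Allocator;
        for (int Index = 0; Index < NumObjects; ++Index)
        {
            Allocator.Create<FTracked>(&Destroyed);
        }
        if (Destroyed != 0)
        {
            return -1;
        }
    }
    return Destroyed;
}
```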

#preflight 62b266e20d4d6228de97babe
#rb mihnea.balta, yuriy.odonnell

[CL 20907647 by zach bethel in ue5-main branch]
2022-06-30 19:55:24 -04:00
zach bethel
ddea412b12 Move RDG buffer uploads off the render thread.
#preflight 62b9f1ec4209c7c579df8fc2

[CL 20839110 by zach bethel in ue5-main branch]
2022-06-27 16:45:50 -04:00
zach bethel
f04a19a658 Fixed RHI validation error in cloud rendering.
#preflight 62b37fcac603be614811adcd
#jira UE-156470

[CL 20784050 by zach bethel in ue5-main branch]
2022-06-22 18:47:11 -04:00
zach bethel
8a9fd982e6 Sync RDG setup events prior to the PSO cache coalescing sync point.
#fyi dave.barrett
#preflight 62a95ee1293ff41d4957783a

[CL 20665487 by zach bethel in ue5-main branch]
2022-06-15 00:28:49 -04:00
zach bethel
eee5dbab99 Fix for engine test failure in DumpGPU on Vulkan
#preflight 62a60279b94c57409e81884f

[CL 20619061 by zach bethel in ue5-main branch]
2022-06-12 11:28:44 -04:00
mihnea balta
b6bd0ea0ac Add missing task tag when FRDGBuilder::CreateUniformBuffers runs async.
#fyi Zach.Bethel
#preflight skip
#jira none
#rnx

[CL 20594531 by mihnea balta in ue5-main branch]
2022-06-10 09:39:51 -04:00
zach bethel
67d0028aab Moved RDG uniform buffer creation off the render thread. Added AddSetupTask helper function for async tasks that are automatically waited on prior to execution.
#preflight 62a0d919fc5ffe569a6774c0

[CL 20560693 by zach bethel in ue5-main branch]
2022-06-08 14:00:10 -04:00
Guillaume Abadie
0734868f53 Implements DynamicRenderScaling API with GPU timing measurement integrated within RDG
This allows defining a new dynamic scaling budget in the renderer with little boilerplate:

DynamicRenderScaling::FHeuristicSettings GetDynamicTranslucencyResolutionSettings()
{
	DynamicRenderScaling::FHeuristicSettings BucketSetting;
	BucketSetting.Model = DynamicRenderScaling::EHeuristicModel::Quadratic;
	BucketSetting.bModelScalesWithPrimaryScreenPercentage = true;
	BucketSetting.MinResolutionFraction = ...
	...
	return BucketSetting;
}

DynamicRenderScaling::FBudget GDynamicTranslucencyResolution(TEXT("DynamicTranslucencyResolution"), &GetDynamicTranslucencyResolutionSettings);


And then simply define a scope to measure the GPU timing as such:

{
	DynamicRenderScaling::FRDGScope DynamicTranslucencyResolutionScope(GraphBuilder, GDynamicTranslucencyResolution);

	// add passes to GraphBuilder
}

#rb zach.bethel
#jira UE-152561
#preflight 628f1219bb14235aa38c904c

[CL 20376428 by Guillaume Abadie in ue5-main branch]
2022-05-26 01:58:36 -04:00