123 Commits

Author SHA1 Message Date
Guillaume Abadie
0734868f53 Implements DynamicRenderScaling API with GPU timing measurement integrated within RDG
This allows defining a new dynamic scaling budget in the renderer with little boilerplate:

DynamicRenderScaling::FHeuristicSettings GetDynamicTranslucencyResolutionSettings()
{
	DynamicRenderScaling::FHeuristicSettings BucketSetting;
	BucketSetting.Model = DynamicRenderScaling::EHeuristicModel::Quadratic;
	BucketSetting.bModelScalesWithPrimaryScreenPercentage = true;
	BucketSetting.MinResolutionFraction = ...
	...
	return BucketSetting;
}

DynamicRenderScaling::FBudget GDynamicTranslucencyResolution(TEXT("DynamicTranslucencyResolution"), &GetDynamicTranslucencyResolutionSettings);


Then define a scope to measure the GPU timing:

{
	DynamicRenderScaling::FRDGScope DynamicTranslucencyResolutionScope(GraphBuilder, GDynamicTranslucencyResolution);

	// add passes to GraphBuilder
}
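
As a rough, engine-independent illustration of what a quadratic heuristic model could do with the measured GPU time, the sketch below assumes that GPU cost grows with the square of the resolution fraction (that is, with pixel count). `FHeuristicSketch`, `BudgetMs`, and `ComputeNextResolutionFraction` are hypothetical names for illustration only and are not part of the DynamicRenderScaling API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for heuristic settings; not the engine's FHeuristicSettings.
struct FHeuristicSketch
{
    float MinResolutionFraction = 0.5f;
    float MaxResolutionFraction = 1.0f;
    float BudgetMs = 2.0f; // assumed target GPU time for the measured scope
};

// Assumes GPU time is proportional to ResolutionFraction^2, so meeting the
// budget means scaling the fraction by sqrt(Budget / Measured), then clamping.
inline float ComputeNextResolutionFraction(const FHeuristicSketch& Settings, float CurrentFraction, float MeasuredMs)
{
    assert(MeasuredMs > 0.0f && CurrentFraction > 0.0f);
    const float Ideal = CurrentFraction * std::sqrt(Settings.BudgetMs / MeasuredMs);
    return std::max(Settings.MinResolutionFraction, std::min(Settings.MaxResolutionFraction, Ideal));
}
```

Under these assumptions, a scope that measures four times its budget at full resolution would drop to half resolution on the next update, subject to the configured minimum.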

#rb zach.bethel
#jira UE-152561
#preflight 628f1219bb14235aa38c904c

[CL 20376428 by Guillaume Abadie in ue5-main branch]
2022-05-26 01:58:36 -04:00
zach bethel
fdc795f5b9 Implemented support for fused CopySrc | CopyDest transitions for Vulkan in RDG.
- Simplified RDG transition logic to only use subresource transitions.
- Don't allow unmergeable states to be merged during pass setup.

#preflight 627ebfadca3b90fc14fb9b85
#fyi jeannoe.morissette

[CL 20224647 by zach bethel in ue5-main branch]
2022-05-16 11:02:52 -04:00
zach bethel
e37a08177c Submit async compute work prior to a fence from graphics -> async compute so that prior work isn't grouped with the submission. This works around an issue in D3D12, which currently brute-forces a fence from graphics to async compute any time graphics work is submitted.
#preflight 627c359a9f7ad2a14b8388d0
#fyi sebastien.hillaire

[CL 20152556 by zach bethel in ue5-main branch]
2022-05-11 18:40:35 -04:00
zach bethel
ac11396729 Fixed test and validation regressions due to RDG changes.
#preflight 6272a1f12f6d177be3c60a53
#jira UE-150908

[CL 20043535 by zach bethel in ue5-main branch]
2022-05-04 12:41:19 -04:00
zach bethel
4eedc02f37 Fixes to external access RDG feature to handle async compute without validation failures.
#preflight 6270551191629533ec2b4bc9

[CL 20017395 by zach bethel in ue5-main branch]
2022-05-02 18:31:37 -04:00
zach bethel
10131e1285 Refactored RDG in preparation for UnifiedBuffer conversions.
- Refactored 'Finalized Access' feature into a more flexible 'External' vs. 'Internal' access mode per resource toggle.
  - Resources can transition between modes multiple times within the graph.
  - Supports async compute pipeline.
  - Supports queueing of requests to avoid back-to-back helper passes.
  - This feature is needed to support conversion of GPU scene buffers.
- Deprecated the ReadOnly and ForceTracking resource flags and added a 'SkipTracking' flag instead.
  - Previous semantics were confusing and error prone.
  - New model requires a manual flag to tell RDG never to transition a resource.
  - This flag is used for read-only dummy resources as an optimization.
- Renamed some of the auxiliary 'FinalizedResource' utilities since the name no longer matches the semantics.

#preflight 6266cc6d0634d0904ce4ba46

[CL 19904734 by zach bethel in ue5-main branch]
2022-04-25 13:00:12 -04:00
zach bethel
3864629f00 Minor RDG improvements in preparation for UnifiedBuffer conversion.
- Added resource pool counters and events.
- Added AllocatePooledBuffer method and refactored pool to no longer take a command list.
- Refactored swap chain barrier logic to be a bit cleaner.
- Added helper methods to cast between views.
- Added power of two alignment option to buffer pool.
- Added GetTypeHash implementations for RDG SRV | UAV descriptors.
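
For readers unfamiliar with the power of two alignment option mentioned above, one plausible reading is that requested buffer sizes are rounded up to the next power of two so that similarly sized requests can share pooled allocations. The helper below is a minimal sketch under that assumption; `RoundUpToPowerOfTwo` is a hypothetical name, not the buffer pool's actual interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds a byte size up to the nearest power of two.
// Sizes that are already a power of two are returned unchanged.
inline std::uint64_t RoundUpToPowerOfTwo(std::uint64_t Size)
{
    // Guard against zero and against sizes whose next power of two would overflow.
    assert(Size > 0 && Size <= (std::uint64_t(1) << 63));
    std::uint64_t Result = 1;
    while (Result < Size)
    {
        Result <<= 1;
    }
    return Result;
}
```

The trade-off in such a scheme is some wasted memory per allocation in exchange for a higher chance of reuse across frames.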

#preflight 62631046006fa20b683d130f

[CL 19873407 by zach bethel in ue5-main branch]
2022-04-22 17:11:57 -04:00
zach bethel
f457a69101 Added RHI tracked access API to remove Unknown transitions.
- New RHI command list SetTrackedAccess method for the user to supply a current whole-resource state.
- New RHI command context GetTrackedAccess method for querying the tracked access in RHIBeginTransitions / RHIEndTransitions on the RHI thread.
- Hooked RHICmdList.Transition and FRHICommandListExecutor::Transition to assign tracked state automatically.
- Refactored RDG and resource pools to use new RHI tracking.
  - FRDGPooledBuffer / FRDGPooledTexture no longer contain tracked state. RDG temp-allocates state through the graph allocator instead.
  - All prologue transitions are 'Unknown', and all epilogue transitions coalesce into a whole resource state.
- Implemented platform support for patching the 'before' state with the tracked state.
- Implemented various RHI validation checks:
  - Asserts that the user assigned tracked state matches RHI validation tracked state, for all subresources.
  - Asserts that tracked state is not assigned or queried from a parallel translation context.
- Added FRHIViewableResource and FRHIView base classes to RHI. FRHIView contains a pointer to an FRHIViewableResource. This is currently a raw pointer, but should be extended to a full reference in a later CL.

NOTE on RHI thread constraint:

Transition evaluation is now restricted to the RHI thread (i.e. no parallel translation contexts). Transitions aren't performed in parallel translate contexts anyway, so this is not a problem. If, however, we decide to refactor parallel translation to be more general, this implementation could be extended to track the state per context and update from the 'dispatch' thread.
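
The idea of patching an 'Unknown' before state with tracked state can be sketched outside the engine as follows. This is a simplified illustration, assuming whole-resource tracking keyed by an integer id on a single thread; `EAccessSketch`, `FTransitionSketch`, and `FTrackedAccessSketch` are hypothetical types, and only the method names SetTrackedAccess and GetTrackedAccess are borrowed from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical access states; the engine's access enum has many more values.
enum class EAccessSketch : std::uint8_t { Unknown, SRV, UAV, CopySrc, CopyDest };

struct FTransitionSketch
{
    int ResourceId;
    EAccessSketch Before;
    EAccessSketch After;
};

// Single-threaded tracker, mirroring the constraint that transitions are
// evaluated on one thread only.
class FTrackedAccessSketch
{
public:
    void SetTrackedAccess(int ResourceId, EAccessSketch Access)
    {
        Tracked[ResourceId] = Access;
    }

    EAccessSketch GetTrackedAccess(int ResourceId) const
    {
        const auto It = Tracked.find(ResourceId);
        return It != Tracked.end() ? It->second : EAccessSketch::Unknown;
    }

    // Replaces an Unknown 'before' state with the tracked state, then records
    // the 'after' state as the new tracked state for the resource.
    FTransitionSketch Resolve(FTransitionSketch Transition)
    {
        assert(Transition.After != EAccessSketch::Unknown);
        if (Transition.Before == EAccessSketch::Unknown)
        {
            Transition.Before = GetTrackedAccess(Transition.ResourceId);
        }
        SetTrackedAccess(Transition.ResourceId, Transition.After);
        return Transition;
    }

private:
    std::unordered_map<int, EAccessSketch> Tracked;
};
```

In this sketch a resource that was never assigned a tracked state still resolves to Unknown, which is the case the validation checks listed above are meant to catch.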

#preflight 6233b4396666d7e753a16aaf
#rb kenzo.terelst

[CL 19513316 by zach bethel in ue5-main branch]
2022-03-25 11:19:10 -04:00
guillaume abadie
7b9b4fd2a9 Implements DumpGPUServices plugin to have DumpGPU -upload
#rb juan.canada
#preflight 623ca9fed078aec3e42ee738

#ROBOMERGE-AUTHOR: guillaume.abadie
#ROBOMERGE-SOURCE: CL 19499603 via CL 19500414 via CL 19500427
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v936-19480137)

[CL 19502702 by guillaume abadie in ue5-main branch]
2022-03-24 17:10:00 -04:00
graham wihlidal
32ff4eefc9 Merged Nanite HW rasterizer passes into a single RDG pass, and also SW rasterizer passes into a single RDG pass. All passes update various buffers with atomics, so they are now always marked with SkipBarrier to disable synchronization between passes and allow for overlap. Based on work by Zach Bethel.
VSM in AncientGame campfire went from 4.64ms -> 3.41ms, Primary raster went from 1.34ms -> 1.07ms. Lumen raster 0.20ms -> 0.18ms. Much higher gains expected in content with high numbers of rasterizer bins (more overhead to remove using this optimization)

MedievalGame is even better: Primary raster 1.82ms -> 0.92ms, VSM 2.99ms -> 2.07ms, Lumen 0.43ms -> 0.19ms

#rb zach.bethel
#fyi ola.olsson, brian.karis, rune.stubbe
#preflight skip
#robomerge FNNC

[CL 19432065 by graham wihlidal in ue5-main branch]
2022-03-18 01:01:33 -04:00
richard wallis
c13b410b24 Fix DumpGPU on macOS. The resource dump begins in FSlateApplication::Get().Tick(ESlateTickType::PlatformAndInput), which runs after GEngine->Tick(FApp::GetDeltaTime(), bIdleMode). Make the deferred dump command request a GPU dump for the next frame by adding an InitDump function so it's clear for any platform when to begin. Adding 1 when setting DumpingFrameCounter_GameThread is an alternative, but that seems brittle across platforms.
Allow more flexibility of the Metal RHICopyToResolveTarget to include compatible texture view pixel formats. Fixes validation error when resolving between sRGB and RGB formats.

FMetalContext: don't assert then go into the weeds on macOS when there are no render targets.

#jira UE-140658,  UE-120222
#preflight  61fd124b2839dd07cb98d771
[REVIEW] [at]will.damon,  [at]Guillaume.Abadie
#rb will.damon,  Guillaume.Abadie
#lockdown cristina.riveron
#rnx

#ROBOMERGE-AUTHOR: richard.wallis
#ROBOMERGE-SOURCE: CL 18864871 in //UE5/Release-5.0/... via CL 18864881 via CL 18865081
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v910-18824042)

[CL 18865102 by richard wallis in ue5-main branch]
2022-02-04 12:01:46 -05:00
zach bethel
9996233f7a Removed unused legacy MSAA multi-texture support from IPooledRenderTarget.
- Deprecated legacy members from FPooledRenderTargetDesc.
- Deprecated ETextureRenderTarget and removed from RDG.
- TargetableTexture always equals ShaderResourceTexture.
- Simplified render target pool FindFreeElement.
- Create pooled buffers and textures with a known state.

#rb graham.wihlidal
#preflight 61f8488568795b2f45852274

#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18796880 in //UE5/Release-5.0/... via CL 18797840 via CL 18799070
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)

[CL 18799188 by zach bethel in ue5-main branch]
2022-01-31 17:22:31 -05:00
zach bethel
2230e96167 Removal of RDG drain experiment for UE5 release. It is currently unused and hasn't proven itself necessary yet. It would be relatively straightforward to re-add if the need arises in the future.
#rb mihnea.balta
#preflight 61f82d9b3e13556eb9c3eb34

#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18794948 in //UE5/Release-5.0/... via CL 18795422 via CL 18796381
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)

[CL 18796735 by zach bethel in ue5-main branch]
2022-01-31 15:55:31 -05:00
zach bethel
25f20493c4 Backing out changes to remove unknown states in pooled resources. Unported code is still transitioning pooled resources while not updating the tracked state. This will result in incorrect before states.
#rb none
#preflight 61f1920ef8088a3d298fb3a9

#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18740193 in //UE5/Release-5.0/... via CL 18740756 via CL 18741541
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)

[CL 18742250 by zach bethel in ue5-main branch]
2022-01-26 15:04:14 -05:00
zach bethel
14085dec09 RDG now reuses tracked state for external pooled textures and buffers. Added a variant of RegisterExternal{Texture, Buffer} that allows the user to specify the actual state if transitioned outside of RDG.
#preflight 61e84c67276892ce107685a0
#rb kenzo.terelst

#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18662047 in //UE5/Release-5.0/... via CL 18662058 via CL 18662082
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v900-18638592)

[CL 18662103 by zach bethel in ue5-main branch]
2022-01-19 14:08:44 -05:00
zach bethel
3b9e24d8b2 Replaced legacy XB1 transient page allocator with RDG page table mapped transient allocator. Added support for ESRAM page pool.
#rb ben.woodhouse
#preflight 61df2577ff67b6fe7ac00eae

#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18587728 in //UE5/Release-5.0/... via CL 18587798 via CL 18587831
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Test -> Main) (v899-18417669)

[CL 18587909 by zach bethel in ue5-main branch]
2022-01-12 14:29:07 -05:00
dmitriy dyomin
667034ee1e Fixed: DumpGPU command on mobile platforms
#jira UE-135663
#rb Guillaume.Abadie, Jack.Porter
#preflight 61dd77e18d72a407aabd881b

#ROBOMERGE-AUTHOR: dmitriy.dyomin
#ROBOMERGE-SOURCE: CL 18571322 in //UE5/Release-5.0/... via CL 18571337
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)

[CL 18571345 by dmitriy dyomin in ue5-release-engine-test branch]
2022-01-11 09:24:45 -05:00
zach bethel
d5b21eab6b Major refactor of transient allocator to support page table mapping implementation and extracted transient resources.
- Implemented common transient page allocator in RHICore.
- Implemented XBox specific GPU page table mapping allocator.
- Extended RDG insights to support viewing heap visualization or page pool visualization.

#preflight 61d356682e0e436c725818bf

#ushell-cherrypick of 18504626 by zach.bethel

#ROBOMERGE-AUTHOR: zach.bethel
#ROBOMERGE-SOURCE: CL 18565702 in //UE5/Release-5.0/... via CL 18565762
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)

[CL 18565807 by zach bethel in ue5-release-engine-test branch]
2022-01-10 17:00:33 -05:00
andrew davidson
0715ebc996 Type truncation fixes - Renderer
#rb arne.schober
#preflight 61d85ab0932a02483ce13e7d

#ROBOMERGE-AUTHOR: andrew.davidson
#ROBOMERGE-SOURCE: CL 18544411 in //UE5/Release-5.0/... via CL 18544434
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)

[CL 18544466 by andrew davidson in ue5-release-engine-test branch]
2022-01-07 10:39:08 -05:00
jason hoerner
5600dd7c35 UE5_RELEASE: MGPU, numerous fixes to get EngineTest AFR, and Virtual Production City map to run with multiple GPUs. Ultimately there were 3 crash sources (RayTracing, Nanite, Distance Field streaming), each of which required a couple fixes, plus infrastructure to support those fixes...
There remain significant visual artifacts in the Virtual Production City map.  Lumen has serious issues with multiple views in both single and multi-GPU modes -- I think Lumen data needs to be split per view family to solve this.  There is some corrupt geometry in the second view, which may be Nanite or instance rendering related (or something else entirely).  To narrow down these issues, I think I'm going to need to extend the DumpGPU feature to be able to do more effective MGPU graphical debugging, since none of PIX, RenderDoc, or NSight work.  But at least it doesn't crash now...

Full list of changes:
* CVAR (DC.MultiGPUMode) to override multi-GPU mode for Display Cluster, debug feature copied over from 4.27.
* Barrier and synchronization fixes for RHITransferTextures copied over from 4.27.  Future work will make RDG handle multi-GPU transitions more seamlessly...
* CVAR (DC.ForceCrossGPUCopy) to force expensive full synchronization and copy of resources cross GPU at the end of each view family render (for debugging).  RHITransferTextures upgraded to support copying things besides 2D textures, including other texture resources and buffers.
* AFR temporal fixes from a previous CL (which I moved from my single GPU to multi GPU PC), now improved to avoid some validation asserts in Debug builds (pass inputs not declared, GetParent()->GetRHI() not working because parent not declared to pass).
* Ray tracing (hang):  acceleration buffers are branched per GPU, as GPU virtual addresses for resources internally referenced by these buffers may vary per GPU.  Needed to add infrastructure to support buffers that duplicate memory per GPU, rather than using driver aliasing of the underlying resource.
* Ray tracing (hang):  some buffer bindings weren't using a proper GPU index.
* Nanite (hang):  Force initial clear of Nanite.MainAndPostNodesAndClusterBatchesBuffer to run on all GPUs.  Solves GPU hang in shadow rendering the first frame (due to shadow rendering running across all GPUs), and later random hangs in view rendering.
* Distance field streaming (assert):  GPU readback staging buffers need to be branched per GPU, as the underlying class is single device.  GPU readback buffers and textures properly take into account the GPU they were last written on when locking and unlocking.  Includes handling an edge case where a write can be queued when a lock is active, due to the deferred way commands are played back in the render graph.
* Distance field streaming (assert):  UAV clear wasn't taking into account GPU index.
* GPU scene update needs to run across all GPUs.
* Fix for "DumpGPU" command to avoid assert with MGPU -- arbitrarily pick a GPU (last index) when the GPU mask contains multiple bits.  Hope to improve this in the future, but it works.
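
The last item, picking the last GPU index when the mask contains multiple bits, amounts to finding the highest set bit. A minimal sketch, assuming the GPU mask is a plain 32-bit bitfield with one bit per GPU; `PickLastGPUIndex` is a hypothetical helper, not the engine's GPU mask interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the index of the highest set bit in a GPU mask,
// i.e. the "last" GPU when several GPUs are selected.
inline std::uint32_t PickLastGPUIndex(std::uint32_t GPUMask)
{
    assert(GPUMask != 0); // an empty mask selects no GPU
    std::uint32_t Index = 0;
    while (GPUMask >>= 1)
    {
        ++Index;
    }
    return Index;
}
```

A mask with a single bit set returns that GPU's index unchanged, so the single-GPU path behaves as before.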

#rnx
#rb mihnea.balta juan.canada tiago.costa kenzo.terelst
#jira none
#preflight 61ba7edbdc58e54b3318fdf5

#ROBOMERGE-AUTHOR: jason.hoerner
#ROBOMERGE-SOURCE: CL 18472819 in //UE5/Release-5.0/... via CL 18473380
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)

[CL 18473412 by jason hoerner in ue5-release-engine-test branch]
2021-12-15 23:12:04 -05:00
guillaume abadie
dd1c4774d5 Improves DumpGPU command
Dumps improvements:
1) Bring up for consoles
2) Better out of memory resiliency during the dumping process
3) Dumps console variables in CSV
4) Dumps process' log after completion
5) Dumps mip chains through FDumpTextureCS compute shader
6) Dumps depth & stencil texture formats through the FDumpTextureCS compute shader
7) Dumps at draw granularity with FRDGBuilder::DumpDraw(); (experimental)
8) Dumps final png screenshot to the dump directory
9) Adds & Dumps the FRDGBufferDesc::Metadata for viewer to decode buffer binary automatically
10) Dumps the PassParameters with structure metadata to decode shader parameters automatically
11) Adds CTRL+SHIFT+/ shortcut

Viewer improvements:
1) Displays tips on load to spread some knowledge to the user
2) Support for opening any pass/resource in a new web browser tab
3) Emulates 16 and 32 bit UINT texture visualization with multiple WebGL 8 bit UINT textures
4) Fixes the webpage's tab going out of memory after visualizing many large resources.
5) Fixes the webpage's tab going out of memory after loading a large buffer.
6) Adds support for more texture formats with RGB channel reswizzling
7) Implements UI color scheme based on UE5's editor theme
8) Implements texel color picker capable of decoding every pixel format.
9) Implements texture viewer zooming with the mouse wheel
10) Implements r.DumpGPU.Viewer.Visualize to open a specific RDG output resource when opening the viewer

#rb juan.canada
#preflight 619bb638fa0b360c406c42c5
[FYI] juan.canada, zach.bethel

#ROBOMERGE-AUTHOR: guillaume.abadie
#ROBOMERGE-SOURCE: CL 18260079 via CL 18372399 via CL 18372914
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18373039 by guillaume abadie in ue5-release-engine-test branch]
2021-12-03 16:04:00 -05:00
guillaume abadie
e0202048e8 Implements r.DumpGPU command
#rb yuriy.odonnell
#lockdown michal.valient
#preflight 615ace99e69d8c00011a309f

#ROBOMERGE-AUTHOR: guillaume.abadie
#ROBOMERGE-SOURCE: CL 17706889 via CL 17969938 via CL 18366598 via CL 18366692
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18366749 by guillaume abadie in ue5-release-engine-test branch]
2021-12-03 02:41:52 -05:00
aurel cordonnier
a12d56ff31 Merge from Release-Engine-Staging @ 17791557 to Release-Engine-Test
This represents UE4/Main @17774255, Release-5.0 @17791557 and Dev-PerfTest @17789485

[CL 17794212 by aurel cordonnier in ue5-release-engine-test branch]
2021-10-12 21:21:22 -04:00
charles derousiers
e17a01e51a Add RDG Upload variant which takes a lambda function for freeing the CPU memory once the data is uploaded.
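
The pattern described here, a deferred upload that invokes a caller-supplied lambda once the source memory is no longer needed, can be sketched in plain C++ as below. This is an illustration under stated assumptions rather than the RDG interface: `FUploadQueueSketch`, `QueueUpload`, and `Execute` are hypothetical names, and a `std::vector` stands in for the GPU buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical deferred upload queue; the copy happens in Execute(), not at queue time.
class FUploadQueueSketch
{
public:
    // The caller keeps Data alive until FreeCallback is invoked.
    void QueueUpload(std::vector<unsigned char>& Dest, const void* Data, std::size_t Size, std::function<void()> FreeCallback)
    {
        assert(Data != nullptr || Size == 0);
        Pending.push_back({&Dest, Data, Size, std::move(FreeCallback)});
    }

    // Performs each queued copy, then lets the caller release its CPU memory.
    void Execute()
    {
        for (FPending& Upload : Pending)
        {
            Upload.Dest->resize(Upload.Size);
            if (Upload.Size > 0)
            {
                std::memcpy(Upload.Dest->data(), Upload.Data, Upload.Size);
            }
            if (Upload.FreeCallback)
            {
                Upload.FreeCallback();
            }
        }
        Pending.clear();
    }

private:
    struct FPending
    {
        std::vector<unsigned char>* Dest;
        const void* Data;
        std::size_t Size;
        std::function<void()> FreeCallback;
    };

    std::vector<FPending> Pending;
};
```

The point of the callback is ownership: because the copy is deferred, the caller cannot free the source immediately after queueing, and the lambda gives it a well-defined moment to do so.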
#rb zach.bethel
#preflight 612f0d3779d62b0001b43362

#ROBOMERGE-SOURCE: CL 17383153 via CL 17383415
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17383421 by charles derousiers in ue5-release-engine-test branch]
2021-09-01 03:09:25 -04:00
zach bethel
aed2812399 Reworked acquire / discard transitions in RDG to use split barriers to improve overlap.
#jira none

#ROBOMERGE-SOURCE: CL 17182512 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v855-17104924)

[CL 17182558 by zach bethel in ue5-release-engine-test branch]
2021-08-16 17:32:38 -04:00