- Moved all FD3D12Viewport members to be together.
- Merged the three FD3D12Viewport viewport texture arrays and the mGPU index array into a single array.
- CalculateSwapChainDepth always took in the same count that was given on viewport construction, so it was renamed to InitializeBackBufferArrays, and NumBackBuffers is now constant per platform.
- Merged the D3D11 and D3D12 GetRenderTargetFormat implementations, which were identical, into UE::DXGIUtilities::GetSwapChainFormat.
- Renamed D3D12_USE_DUMMY_BACKBUFFER to D3D12RHI_USE_DUMMY_BACKBUFFER since it's not a d3d12.h define.
- FD3D12BackBufferReferenceTexture2D is now only used when D3D12RHI_USE_DUMMY_BACKBUFFER is defined.
- Added D3D12RHI_USE_SDR_BACKBUFFER to prevent uses of SDR texture pointers when we don't need them.
- Added D3D12RHI_SUPPORTS_UAV_BACKBUFFER to replace uses of RHISupportsSwapchainUAVs(GMaxRHIShaderPlatform).
#rb Luke.Thatcher, zach.bethel
[CL 32810092 by christopher waters in ue5-main branch]
- Generating mips is now always done via the FGenerateMips helper class in RenderCore, which uses either a pixel or compute shader.
- OpenGL cannot use these shaders due to lack of support for SRVs that target single mips of a texture resource. As such, the RHIGenerateMips implementation for OpenGL has been kept, but moved to the IOpenGLDynamicRHI interface so that it can be removed from the base RHI contexts, which FGenerateMips makes use of.
Deprecate TexCreate_GenerateMipCapable
- This flag was only to make D3D11 RHI set the D3D11_RESOURCE_MISC_GENERATE_MIPS flag on texture creation. Since RHIGenerateMips is no longer implemented in D3D11 RHI, the flag is not required.
- Textures should be created with TexCreate_UAV or TexCreate_RenderTargetable to make them compatible with FGenerateMips.
- Added checks to FGenerateMips to catch incompatible textures.
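The compatibility rule above can be sketched as a simple flag test. This is a minimal, standalone illustration and not the engine's code: `ESketchTexFlags`, `IsCompatibleWithGenerateMips` and `CheckGenerateMipsInput` are hypothetical stand-ins, and the actual checks added to FGenerateMips are not shown in this changelist description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the engine's texture creation flags.
enum class ESketchTexFlags : uint32_t
{
    None             = 0,
    RenderTargetable = 1u << 0,
    UAV              = 1u << 1,
    ShaderResource   = 1u << 2,
};

constexpr ESketchTexFlags operator|(ESketchTexFlags A, ESketchTexFlags B)
{
    return static_cast<ESketchTexFlags>(static_cast<uint32_t>(A) | static_cast<uint32_t>(B));
}

constexpr bool HasAnyFlags(ESketchTexFlags Flags, ESketchTexFlags Mask)
{
    return (static_cast<uint32_t>(Flags) & static_cast<uint32_t>(Mask)) != 0;
}

// A texture is usable for shader-based mip generation if it can be written
// either as a UAV or as a render target.
constexpr bool IsCompatibleWithGenerateMips(ESketchTexFlags Flags)
{
    return HasAnyFlags(Flags, ESketchTexFlags::UAV | ESketchTexFlags::RenderTargetable);
}

// Assumed shape of a debug-time check that catches incompatible textures early.
inline void CheckGenerateMipsInput(ESketchTexFlags Flags)
{
    assert(IsCompatibleWithGenerateMips(Flags) && "Texture needs UAV or RenderTargetable");
}
```

A texture created with only a shader-resource flag fails the check, which mirrors the case the deprecated TexCreate_GenerateMipCapable flag used to cover.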
#rb christopher.waters
[CL 32792595 by luke thatcher in ue5-main branch]
* In case of multiple GPUs, use the memory information from the one with the largest memory budget.
* Do not crash if the driver does not support querying memory information.
* Only write CSV stats for Windows, where there's a high chance of this information being useful (with a discrete card having dedicated VRAM).
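A minimal sketch of the first two points, assuming each GPU's memory query either yields a result or fails. `FGpuMemoryInfo`, `FQueryResults` and `SelectLargestBudget` are hypothetical names chosen for illustration; the real RHI code and the driver query it wraps are not shown in this description.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for per-GPU memory information; not an engine type.
struct FGpuMemoryInfo
{
    uint64_t BudgetBytes = 0;
    uint64_t UsageBytes = 0;
};

// One entry per GPU; std::nullopt means the driver could not report
// memory information for that GPU.
using FQueryResults = std::vector<std::optional<FGpuMemoryInfo>>;

// Returns the info of the GPU with the largest budget, or std::nullopt when
// no GPU reported anything, so the caller can skip stats instead of crashing.
inline std::optional<FGpuMemoryInfo> SelectLargestBudget(const FQueryResults& Results)
{
    std::optional<FGpuMemoryInfo> Best;
    for (const std::optional<FGpuMemoryInfo>& Info : Results)
    {
        if (Info && (!Best || Info->BudgetBytes > Best->BudgetBytes))
        {
            Best = Info;
        }
    }
    return Best;
}

// Usage: the second GPU fails to report; the third has the largest budget.
inline void ExampleSelectLargestBudget()
{
    const FQueryResults Results = { FGpuMemoryInfo{4, 1}, std::nullopt, FGpuMemoryInfo{8, 2} };
    const std::optional<FGpuMemoryInfo> Best = SelectLargestBudget(Results);
    assert(Best && Best->BudgetBytes == 8);
}
```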
#rb christopher.waters
[CL 32763588 by daniele vettorel in ue5-main branch]
[FYI] William.Belcher
Original CL Desc
-----------------------------------------------------------------
Fix: Move creation of FWindowsVideoRecordingSystem away from FWindowsPlatformFeaturesModule constructor
#rb Aidan.Possemiers
[CL 32237646 by bob tellez in ue5-main branch]
[FYI] William.Belcher
Original CL Desc
-----------------------------------------------------------------
QOL: Deprecate AVEncoder (for removal) and its dependencies (to be moved to plugins)
#rb Luke.Bermingham
#jira UE-174651
[CL 32237625 by will brown in ue5-main branch]
- This was mainly used for bindless descriptor updates, where updates need to be applied to all GPU pipelines.
- Switching pipeline can change the current active breadcrumb on the new pipeline. Doing this at the bottom-of-pipe is not possible as the start/end breadcrumbs for each command list must be known at dispatch time (before execution / translation).
- Added EnqueueLambdaMultiPipe which passes an array of RHI contexts to the lambda. This generally replaces the FRHICommandListBase& which is handed down through the platform RHI.
- EnqueueLambdaMultiPipe may only be called at the top-of-pipe.
Replace RHITransfer[...]UnderlyingResource with RHIReplaceResources on FDynamicRHI / platform implementations
- Old function was always bottom-of-pipe, so couldn't call EnqueueLambdaMultiPipe. New function takes the RHICmdList and is called at top-of-pipe.
- All resource types are merged into the same function (currently buffers and raytracing geometry).
Remove use of RHILockBuffer and RHIUnlockBuffer at the bottom-of-pipe
- Since platform RHIs need to use EnqueueLambdaMultiPipe for buffer locks/unlocks, it is no longer possible to call RHILock/UnlockBuffer at the bottom-of-pipe.
- Also, buffers locked on parallel translating command lists are broken. Lock/unlock calls RHIThreadFence(true), which flags the command list for single-thread translate. However, calling this at the bottom-of-pipe is too late, as the decision to dispatch the command list in parallel has already been made.
- Added checks in these functions to catch future use.
#rb zach.bethel
#jira UE-208823
[FYI] christopher.waters
[CL 32220227 by luke thatcher in ue5-main branch]
- This data is required by uniform binding code, and was copy/pasted across platform RHIs. Moving it to the base RHI type will allow the RHI validation layer to enumerate resources in uniform buffers, which have been opaque up until now.
- FShaderResourceTable has moved from RenderCore to RHI.
- UE::RHICore::ApplyStaticUniformBuffers and UE::RHICore::SetResourcesFromTables now only need the shader, since the binding info is stored within it.
- The serializer functions in shader format modules and platform RHIs have been fixed up to handle the SRT being in the base type. The data format has not changed, so no shader versions need bumping.
- Removed unnecessary operator == and GetTypeHash functions in some places.
Includes all platforms / RHIs except OpenGL, which will follow in another CL.
#rb Kenzo.Terelst
[CL 31871815 by luke thatcher in ue5-main branch]
- FRHITexture2D
- FRHITexture2DArray
- FRHITexture3D
- FRHITextureCube
- FTexture2DRHIRef
- FTexture2DArrayRHIRef
- FTexture3DRHIRef
- FTextureCubeRHIRef
Replaced with FRHITexture and FTextureRHIRef
These types were unified in UE 5.1 and have been defined via "using" statements to the same underlying texture type for several engine releases.
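Because the removed names were plain aliases, migrating is a rename with no change in behaviour. The sketch below illustrates this with a stand-in `FRHITexture` class and a hypothetical `CountMips` helper; it is not the engine's declaration, only a self-contained model of how "using" aliases resolve to one type.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the unified texture class; the real one lives in the RHI module.
class FRHITexture
{
public:
    explicit FRHITexture(int InNumMips) : NumMips(InNumMips) {}
    int GetNumMips() const { return NumMips; }

private:
    int NumMips;
};

// How the legacy names behaved before removal: plain aliases of one type.
using FRHITexture2D   = FRHITexture;
using FRHITextureCube = FRHITexture;

static_assert(std::is_same<FRHITexture2D, FRHITextureCube>::value,
              "both legacy names refer to the same type");

// Before: int CountMips(const FRHITexture2D* Texture)
// After:  the parameter names the unified type directly.
inline int CountMips(const FRHITexture* Texture)
{
    assert(Texture != nullptr);
    return Texture->GetNumMips();
}
```

Call sites that still hold a pointer declared with a legacy alias keep compiling against the updated signature for as long as the alias exists, so declarations can be renamed incrementally.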
#rb christopher.waters
[CL 31724002 by luke thatcher in ue5-main branch]
- Only in places where it is trivially proven the call is only made on the render thread, due to an existing check(IsInRenderingThread()) assert somewhere in the function.
- FRHICommandListImmediate::Get() itself contains a check(IsInRenderingThread()), so this enforces correct threading, and removes the need for extra checks at the call sites.
- Remaining uses of FRHICommandListExecutor::GetImmediateCommandList() need investigation. Some may be bugs.
- Also some changes to make use of the passed-in RHICmdList where possible (e.g. render commands that are given the immediate command list, but call the global getter rather than using the argument they were given).
#rb zach.bethel
[CL 31699633 by luke thatcher in ue5-main branch]
Slate switches to Windowed Fullscreen when we lose focus, but doesn't explicitly resize the window, so we never call SetWindowPos or ResizeTarget. If the fullscreen swapchain had a lower resolution than the monitor's native resolution, or if the swapchain had been windowed before, we were stretching the backbuffer to that smaller resolution inside the borderless window, instead of covering all of it. ResizeBuffers is not sufficient because it only affects the buffers, not the window. We need to also call ResizeTarget explicitly when dropping out of exclusive fullscreen, to make sure the window is correctly set up.
Vulkan doesn't have this problem because it doesn't support exclusive fullscreen.
#jira UE-203916
#rnx
#rb benjamin.rouveyrol
[CL 31290487 by mihnea balta in ue5-main branch]
Significant refactor of RHI command list management and submission, and RHI breadcrumbs / RenderGraph (RDG) scopes, to allow for parallel translation of most RHI command lists.
See individual changelists in //UE5/Dev-ParallelRendering for details. A summary of the changes is as follows:
This work's primary goal was to allow as many RHI command lists as possible to be parallel translated, to make more efficient use of many-core systems. To achieve this:
- The submission code paths for the immediate and parallel RHI command lists have been merged into a single function: FRHICommandListExecutor::Submit().
- A "dispatch thread" (which is simply a series of chained task graph tasks) is used to decide which command lists are batched together in a single parallel translate job.
- Individual command lists can disable parallel translate, which forces them to be executed on the RHI thread. This happens automatically if an RHI command list performs an operation that is not thread safe (e.g. buffer lock, or low-level resource transition).
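The batching decision described above can be modelled roughly as follows. This is a simplified sketch under stated assumptions: `FCmdList`, `FTranslateBatch` and `BuildBatches` are hypothetical, and the only policy modelled is "consecutive command lists with the same parallel-translate setting share a batch". The heuristics the real dispatch thread uses are not described here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified command list: only the property that matters here.
struct FCmdList
{
    int  Id;
    bool bAllowParallelTranslate; // false after e.g. a buffer lock
};

// A run of consecutive command lists that are translated together.
struct FTranslateBatch
{
    bool bParallel = false;       // false => translated on the RHI thread
    std::vector<int> CmdListIds;
};

// Groups consecutive command lists by their parallel-translate setting while
// preserving submission order, so GPU work is still recorded in order.
inline std::vector<FTranslateBatch> BuildBatches(const std::vector<FCmdList>& Submitted)
{
    std::vector<FTranslateBatch> Batches;
    for (const FCmdList& CmdList : Submitted)
    {
        if (Batches.empty() || Batches.back().bParallel != CmdList.bAllowParallelTranslate)
        {
            FTranslateBatch Batch;
            Batch.bParallel = CmdList.bAllowParallelTranslate;
            Batches.push_back(Batch);
        }
        Batches.back().CmdListIds.push_back(CmdList.Id);
    }
    return Batches;
}

// Usage: command list 3 disabled parallel translate, splitting the batches.
inline void ExampleBuildBatches()
{
    const std::vector<FCmdList> Submitted = { {1, true}, {2, true}, {3, false}, {4, true} };
    const std::vector<FTranslateBatch> Batches = BuildBatches(Submitted);
    assert(Batches.size() == 3);
}
```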
One of the primary blockers for parallel translation was the RHI breadcrumb system, and the way RDG builds scopes. This was also refactored to remove these limitations:
- RDG could only push/pop events on the immediate command list, which resulted in parallel and immediate work being interleaved, breaking any opportunity for parallelism.
- Platform RHI implementations of breadcrumbs (e.g. in D3D12 RHI) were not correct across multiple RHI contexts. Push/pop operations aren't necessarily balanced within any one RHI context given that RDG builds "parallel pass sets" containing arbitrary ranges of renderer passes.
A summary of the new RHI breadcrumb system is as follows:
- A tree of breadcrumb nodes is built by the render thread and RDG. Each node contains the node name, and pointers to the parent and next nodes. When fully built, the nodes form a depth-first linked list which is used for traversing the tree for GPU crash debugging.
- The memory for breadcrumb nodes is provided by ref-counted allocator objects. These allocators are pipelined through the RHI, allowing the platform RHI implementation to extend their lifetime for GPU crash debugging purposes.
- RHIPushEvent / RHIPopEvent have been removed, replaced with RHIBeginBreadcrumbGPU / RHIEndBreadcrumbGPU. Platform RHIs implement these functions to perform GPU immediate writes using the unique ID of each node, for tracking GPU progress.
- Format string arguments are captured by-value to remove the cost of string formatting while building the breadcrumb tree. String formatting only occurs when the actual formatted string is required (e.g. during GPU crash breadcrumb stack traversal, or when calling platform GPU profiling APIs).
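The node layout and deferred formatting described above can be sketched as below. Everything here is a hypothetical simplification: `FBreadcrumbNode`, `FBreadcrumbTree` and `BuildCrashStack` are illustrative names, each node captures a single `int` argument, and a `std::deque` stands in for the ref-counted allocators. The engine's real types differ.

```cpp
#include <cassert>
#include <cstdio>
#include <deque>
#include <string>
#include <vector>

// Hypothetical, simplified breadcrumb node.
struct FBreadcrumbNode
{
    const char* Format;      // static format string, e.g. "Pass %d"
    int Arg;                 // captured by value; formatted only on demand
    unsigned Id;             // unique ID a GPU write would use to mark progress
    FBreadcrumbNode* Parent; // enclosing scope, or nullptr for the root
    FBreadcrumbNode* Next;   // next node in depth-first order

    // String formatting is deferred until the name is actually needed.
    std::string GetName() const
    {
        char Buffer[64];
        std::snprintf(Buffer, sizeof(Buffer), Format, Arg);
        return Buffer;
    }
};

class FBreadcrumbTree
{
public:
    // Opens a scope under the current one. Nodes are linked in creation
    // order, which is a depth-first pre-order traversal of the tree.
    FBreadcrumbNode* Begin(const char* Format, int Arg)
    {
        const unsigned Id = static_cast<unsigned>(Nodes.size() + 1);
        Nodes.push_back(FBreadcrumbNode{Format, Arg, Id, Current, nullptr});
        FBreadcrumbNode* Node = &Nodes.back();
        if (Last != nullptr)
        {
            Last->Next = Node;
        }
        Last = Node;
        Current = Node;
        return Node;
    }

    // Closes the current scope.
    void End()
    {
        assert(Current != nullptr);
        Current = Current->Parent;
    }

private:
    std::deque<FBreadcrumbNode> Nodes; // push_back keeps element pointers valid
    FBreadcrumbNode* Current = nullptr;
    FBreadcrumbNode* Last = nullptr;
};

// Crash-time stack: walk the parent links from the last node the GPU reached.
inline std::vector<std::string> BuildCrashStack(const FBreadcrumbNode* Node)
{
    std::vector<std::string> Stack;
    for (; Node != nullptr; Node = Node->Parent)
    {
        Stack.push_back(Node->GetName());
    }
    return Stack;
}
```

Building the tree only stores a pointer and an integer per node; the `snprintf` cost is paid in `GetName`, which in this sketch runs only when a crash stack is requested.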
RenderGraph scopes have been simplified:
- The separate scope trees / arrays of ops have been combined. There is now a single tree of RDG scopes containing all types.
- Each RDG pass holds a pointer to the scope it was created under.
- BeginCPU / EndCPU is called on each RDG scope as the various RDG threads enter / exit them. This allows us to mark-up each worker thread with the relevant Unreal Insights scopes.
Other changes include:
- Fixes for bugs uncovered when parallel translate was enabled.
- Adjusted platform affinities as required by the new layout of thread tasks in the renderer.
- Refactored RHI draw call stats to better fit the new pipeline design.
#rb jeannoe.morissette, zach.bethel
#jira UE-139543
[CL 30973133 by Luke Thatcher in ue5-main branch]