This represents UE4/Main @18073326, Release-5.0 @18081140 and Dev-PerfTest @18045971
[CL 18081471 by aurel cordonnier in ue5-release-engine-test branch]
1. Reduce unnecessary UAV clears with property velocity projection. Only the subpass-one UAV is cleared, and only when separable tiles are detected.
2. Remove subsurface view copy and use tile based recombine.
3. Mipmaps are generated only when there are many Burley tiles (r.SSS.Burley.MinGenerateMipsTileCount, default 4000).
[Minor] Rename pass and texture names, and use proper descriptors.
#jira UE-131573
#rb jian.ru
#ushell-cherrypick of 17979880 by Tiantian.Xie
#preflight 617c0c6b5dbdbc00016ea885
#ROBOMERGE-AUTHOR: tiantian.xie
#ROBOMERGE-SOURCE: CL 17983475 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v885-17909292)
[CL 17983481 by tiantian xie in ue5-release-engine-test branch]
This represents UE4/Main @17911760, Release-5.0 @17915875 and Dev-PerfTest @17914035
[CL 17918595 by aurel cordonnier in ue5-release-engine-test branch]
- FRayTracingScene::Create() kicks off the instance data upload on a parallel thread, along with an RDG pass that builds the platform-specific instance buffer.
- Implemented helper functions FillInstanceUploadBuffer and BuildRayTracingInstanceBuffer
- AutoInstanceBuffer is deprecated but still supported (for one release) when FRayTracingSceneInitializer is used.
- Temporarily remove support for Niagara meshes in TLAS.
#jira UE-129652
#preflight 617177237a83f30001e1407e
#rb Yuriy.ODonnell
#ROBOMERGE-AUTHOR: tiago.costa
#ROBOMERGE-SOURCE: CL 17885359 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v883-17842818)
[CL 17885365 by tiago costa in ue5-release-engine-test branch]
3 fixes:
* Fixes the GRHISupportsVariableRateShading logic to ensure that bool doesn't get nuked if the device supports FragmentDensityMap but not ShadingRateAttachment.
* Ensures Scene Textures are created as an array when multiview is enabled.
* Ensures the RenderPass2 impl supports the Multiview Mask.
#jira UE-131846
#rb steve.smith jeannoe.morissette
#ROBOMERGE-AUTHOR: steve.smith
#ROBOMERGE-SOURCE: CL 17864739 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v883-17842818)
[CL 17864759 by tuxerr in ue5-release-engine-test branch]
- Bindless rendering is not enabled anywhere yet.
- Adding bSupportsBindless to DDSPI.
- Moving ED3D12DescriptorHeapType into RHI and renaming to ERHIDescriptorHeapType.
- Adding FRHIDescriptorHandle for public usage of bindless indices.
- Adding GetBindlessHandle() to RHI SamplerState, SRV, UAV.
- Adding GetDefaultShaderResourceView() to RHI Textures.
- Adding bindless Sampler heap support to D3D12.
- Cleaned up some common shader code in prep for future bindless work.
#jira none
#rb mihnea.balta
#preflight 6169f8fafeab330001591bd0
#ROBOMERGE-AUTHOR: christopher.waters
#ROBOMERGE-SOURCE: CL 17846249 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v883-17842818)
[CL 17846289 by christopher waters in ue5-release-engine-test branch]
This represents UE4/Main @17774255, Release-5.0 @17791557 and Dev-PerfTest @17789485
[CL 17794212 by aurel cordonnier in ue5-release-engine-test branch]
The basic approach is to add HLSL types FLWCScalar, FLWCMatrix, FLWCVector, etc. Inside shaders, absolute world space position values should be represented as FLWCVector3. Matrices that transform *into* absolute world space become FLWCMatrix. Matrices that transform *from* world space become FLWCInverseMatrix.
Generally, LWC values work by extending the regular 'float' value with an additional tile coordinate. Final tile size will be a trade-off between scale and accuracy; I'm using 256k for now, but this may need to be adjusted. The value represented by a FLWCVector thus becomes V.Tile * TileSize + V.Offset. Most operations can be performed directly on LWC values. There are HLSL functions like LWCAdd, LWCSub, LWCMultiply, LWCDivide (operator overloading would be really nice here). The goal is to stay with LWC values for as long as needed, then convert to regular float values when possible.
One thing that comes up a lot is working in translated (rather than absolute) world space. WorldSpace + View.PrevPreViewTranslation = TranslatedWorldspace. Except 'View.PrevPreViewTranslation' is now a FLWCVector3, and WorldSpace quantities should be as well. So that becomes LWCAdd(WorldSpace, View.PrevPreViewTranslation) = TranslatedWorldspace. Assuming that we're talking about a position that's "reasonably close" to the camera, it should be safe to convert the translated WS value to float. The 'tile' coordinates of the 2 LWC values should cancel out when added together in this case. I've done some work throughout the shader code to do this.
Materials fully support LWC values as well. Projective texturing and vertex animation materials that I've tested work correctly even when positioned "far away" from the origin.
Lots of work remains to fully convert all of our shader code. There's a function LWCHackToFloat(), which is a simple wrapper for LWCToFloat(). The idea of HackToFloat is to mark places that need further attention, where I'm simply converting absolute WS positions to float, to get shaders to compile. Shaders converted in this way should continue to work for all existing content (without LWC-scale values), but they will break if positions get too large.
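The tile/offset representation and the translated-world-space conversion described above can be sketched on the CPU in C++. This is only an illustration, not the shader code from LargeWorldCoordinates.ush: the struct layout, the scalar-only LWCScalar type, the LWCFromDouble helper and the kTileSize value (reading "256k" as 262144) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Assumed tile size: "256k" interpreted as 2^18. Hypothetical constant name.
constexpr float kTileSize = 262144.0f;

// Hypothetical scalar stand-in for the FLWC types.
// Represented value = Tile * TileSize + Offset.
struct LWCScalar
{
    float Tile;   // whole number of tiles
    float Offset; // remainder inside the tile
};

// Helper for the example only: split a double into tile and offset,
// with the offset in [0, TileSize).
inline LWCScalar LWCFromDouble(double Value)
{
    const double Tile = std::floor(Value / kTileSize);
    return { static_cast<float>(Tile), static_cast<float>(Value - Tile * kTileSize) };
}

// Component-wise add: tiles add to tiles, offsets add to offsets.
inline LWCScalar LWCAdd(LWCScalar A, LWCScalar B)
{
    return { A.Tile + B.Tile, A.Offset + B.Offset };
}

// Collapse to a regular float. Only safe once the value is small,
// e.g. a translated world space position near the camera.
inline float LWCToFloat(LWCScalar V)
{
    return V.Tile * kTileSize + V.Offset;
}

// Translated world space = absolute world space + pre-view translation,
// where the pre-view translation is the negated camera position.
inline float TranslatedFromAbsolute(double WorldPos, double CameraPos)
{
    const LWCScalar World = LWCFromDouble(WorldPos);
    const LWCScalar PreViewTranslation = LWCFromDouble(-CameraPos);
    return LWCToFloat(LWCAdd(World, PreViewTranslation));
}
```

For a position at 1e9 + 0.125 and a camera at 1e9 - 3.5, TranslatedFromAbsolute returns 3.625, because the large tile terms cancel to within one tile before the conversion to float. Converting both absolute values to float first and then adding them gives 0, since single precision cannot resolve sub-unit differences at that magnitude.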
General overview of changed files:
LargeWorldCoordinates.ush - This defines the FLWC types and operations
GPUScene.cpp, SceneData.ush - Primitives add an extra 'float3' tile coordinate. Instance data is unchanged, so instances need to stay within single-precision range of the primitive origin. We could potentially split instances behind the scenes (I think) if we don't want this limitation.
HLSLMaterialDerivativeAutogen.cpp, HLSLMaterialTranslator.cpp, Preshader.cpp - Translated materials to use LWC values
SceneView.cpp, SceneRelativeViewMatrices.cpp, ShaderCompiler.cpp, InstancedStereo.ush - View uniform buffer includes LWC values where appropriate
#jira UE-117101
#rb arne.schober, Michael.Galetzka
#ROBOMERGE-AUTHOR: ben.ingram
#ROBOMERGE-SOURCE: CL 17787435 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v881-17767770)
[CL 17787478 by ben ingram in ue5-release-engine-test branch]
- The benefit isn't worth having to deal with multiple fallouts from this.
#rb none
#jira UE-127489
#ushell-cherrypick of 17776551 by Arciel.Rekman
#ROBOMERGE-AUTHOR: arciel.rekman
#ROBOMERGE-SOURCE: CL 17779200 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v881-17767770)
[CL 17779205 by arciel rekman in ue5-release-engine-test branch]
(cherry pick of 17507631 including later followup fixes in 17526511 and 17542272)
- Disable RDG parallel execution on D3D11.
- Precreate ClearReplacement shaders
- Also add a check to catch other possible issues before it's too late.
#rb Chris.Waters (in Dev-EMT)
#jira UE-125050
#ushell-cherrypick of 17507631 by Arciel.Rekman
#ROBOMERGE-AUTHOR: arciel.rekman
#ROBOMERGE-SOURCE: CL 17586312 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v871-17566257)
[CL 17586354 by arciel rekman in ue5-release-engine-test branch]
Power of two rounding to mitigate fragmentation now happens in bytes, instead of elements, so we get fewer unique buffer sizes.
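A minimal sketch of why rounding in bytes yields fewer unique sizes than rounding in elements. The function names are hypothetical and are not taken from the engine; the sketch assumes sizes stay well below 2^63 so the shift cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two >= V (returns 1 for V == 0).
// Assumes V <= 2^63 so the loop terminates.
constexpr std::uint64_t RoundUpToPowerOfTwo(std::uint64_t V)
{
    std::uint64_t P = 1;
    while (P < V)
    {
        P <<= 1;
    }
    return P;
}

// Rounding the element count: the final byte size still depends on the stride,
// so different strides produce different allocation sizes.
constexpr std::uint64_t AllocBytesRoundingElements(std::uint64_t NumElements, std::uint64_t BytesPerElement)
{
    return RoundUpToPowerOfTwo(NumElements) * BytesPerElement;
}

// Rounding the byte size: every allocation lands on a power-of-two byte count,
// regardless of stride.
constexpr std::uint64_t AllocBytesRoundingBytes(std::uint64_t NumElements, std::uint64_t BytesPerElement)
{
    return RoundUpToPowerOfTwo(NumElements * BytesPerElement);
}
```

With 1000 elements, a 12-byte stride and a 16-byte stride give 12288 and 16384 bytes when the element count is rounded, but both give 16384 bytes when the byte size is rounded.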
#rb ola.olsson
#preflight 613f61e53bbb4800011187f1
#ROBOMERGE-AUTHOR: rune.stubbe
#ROBOMERGE-SOURCE: CL 17493448 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v870-17433530)
[CL 17493511 by rune stubbe in ue5-release-engine-test branch]
This should be useful if we want to vary the feedback trade-offs for different modes.
Compared the shader ISA, and ALU cost is unchanged.
#jira none
#ushell-cherrypick of 17483512 by Jeremy.Moore
#ROBOMERGE-AUTHOR: jeremy.moore
#ROBOMERGE-SOURCE: CL 17491509 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v870-17433530)
[CL 17491521 by jeremy moore in ue5-release-engine-test branch]