- Focused around moving GlobalBeginCompileShader and friends.
- ModifyCompilationEnvironment and ValidateCompiledResult are now only compiled in Editor builds.
- Measured 0.5MB to 1.0MB elf size reduction depending on platform.
#jira none
#rb jason.nadro, arciel.rekman, florin.pascu
#preflight 63613f992b5338aceb442902
[CL 22890964 by christopher waters in ue5-main branch]
- Old code relied on atomic increments in the RHI_DRAW_CALL_INC / RHI_DRAW_CALL_STATS macros, which are expensive, particularly on platforms with poor atomic performance.
- New system replaces the atomic writes with a context-specific stats structure, which is accumulated by the RHI thread into the global structure.
- Contexts write stat data through a "Stats" pointer on the IRHIComputeContext, which is set automatically by the command list management code. The pointer is replaced whenever a new "draw call category" is pushed, to redirect the counts.
- Also moved some macro definitions around so more of the system can be removed when HAS_GPU_STATS is 0. Removed dependencies on CSV_PROFILER.
Moved Begin/EndFrame, Begin/EndScene, Begin/EndDrawingViewport into the immediate RHICmdList
- They were already immediate-only functions due to a check() they contain.
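A minimal sketch of the per-context stats idea described above, not the engine code. All names here (DrawStats, Context, SetCategory, FlushInto, kNumCategories) are hypothetical. Each context writes plain, non-atomic counters through a pointer that is redirected whenever the draw call category changes, and a single owning thread later folds the local counts into the global totals.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical counter block; the real engine structure is not shown in this log.
struct DrawStats {
    uint32_t DrawCalls = 0;
    uint32_t Primitives = 0;

    void Accumulate(const DrawStats& Other) {
        DrawCalls += Other.DrawCalls;
        Primitives += Other.Primitives;
    }
};

constexpr std::size_t kNumCategories = 4;  // assumed category count
using StatsArray = std::array<DrawStats, kNumCategories>;

class Context {
public:
    Context() { SetCategory(0); }

    // The Stats pointer refers to a member, so copying is disabled.
    Context(const Context&) = delete;
    Context& operator=(const Context&) = delete;

    // Redirect subsequent counts to another category's slot.
    void SetCategory(std::size_t Category) { Stats = &Local[Category]; }

    // Plain increments: the context is only written by one thread at a time.
    void Draw(uint32_t NumPrimitives) {
        Stats->DrawCalls += 1;
        Stats->Primitives += NumPrimitives;
    }

    // Called by the one thread that owns the global totals.
    void FlushInto(StatsArray& Global) {
        for (std::size_t i = 0; i < kNumCategories; ++i) {
            Global[i].Accumulate(Local[i]);
            Local[i] = DrawStats{};
        }
    }

private:
    StatsArray Local{};
    DrawStats* Stats = nullptr;
};
```

The hot path touches only context-local memory, so no atomic operation is needed until the single accumulation step.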
#rb Zach.Bethel
#preflight 635fed4af97758810b50cb06
[CL 22875455 by luke thatcher in ue5-main branch]
A critical section is still used on cache miss to prevent redundant calculations, while the cache remains accessible for readers.
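A minimal sketch of that locking pattern, assuming a read-mostly map guarded by a reader/writer lock plus a separate mutex that serializes cache misses. The class and member names are hypothetical and standard C++17 primitives stand in for the engine's own lock types.

```cpp
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical cache; key and value types are chosen only for illustration.
class ReadMostlyCache {
public:
    int FindOrCompute(const std::string& Key,
                      const std::function<int(const std::string&)>& Compute) {
        // Fast path: many readers can hold the shared lock at once.
        if (const int* Found = Find(Key)) {
            return *Found;
        }

        // Slow path: only one thread at a time handles misses.
        std::lock_guard<std::mutex> Miss(MissLock);

        // Another thread may have filled the entry while we waited.
        if (const int* Found = Find(Key)) {
            return *Found;
        }

        // Compute without holding MapLock, so readers are not blocked.
        const int Value = Compute(Key);
        {
            std::unique_lock<std::shared_mutex> Write(MapLock);
            Map.emplace(Key, Value);
        }
        return Value;
    }

private:
    // Returns a pointer to the cached value or nullptr. Entries are never
    // erased, and unordered_map keeps element addresses stable on insert,
    // so the pointer stays valid after the shared lock is released.
    const int* Find(const std::string& Key) {
        std::shared_lock<std::shared_mutex> Read(MapLock);
        auto It = Map.find(Key);
        return It != Map.end() ? &It->second : nullptr;
    }

    std::shared_mutex MapLock;  // protects Map
    std::mutex MissLock;        // serializes miss handling
    std::unordered_map<std::string, int> Map;
};
```

The second lookup under MissLock is what prevents redundant calculations when several threads miss on the same key at the same time.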
#preflight 635b0a399c65b7958608bb01
#rb jason.nadro
[CL 22862505 by yuriy odonnell in ue5-main branch]
*** This change will incur a full shader invalidation across all platforms ***
Issues:
- Some platforms require async compute dispatch indirect arguments to not cross specific memory boundaries
- This places restrictions on the valid sizes for a dispatch indirect argument set. We were not conforming to these restrictions, which could result in GPU crashes on these async passes
Fixes:
- FRHIDispatchIndirectParameters is padded out to meet per-platform memory boundary restrictions
- This is driven via new per-platform preprocessor define PLATFORM_DISPATCH_INDIRECT_ARGUMENT_BOUNDARY_SIZE
- Some platforms require FRHIDispatchIndirectParameters to align with their internal structure, hence we cannot universally size it to meet every platform's requirements
- Introduce new FRHIDispatchIndirectParametersNoPadding for uses when we explicitly do not want the padding and otherwise avoid the memory boundary restrictions
- Revise and expand indirect argument validation code to catch further such issues in the future
- Update shaders which write to dispatch indirect argument buffers to account for optional per-platform padding
- New utility function WriteDispatchIndirectArgs introduced to facilitate this
- Platforms which require other than the default nonpadded dispatch indirect arguments must define DISPATCH_INDIRECT_UINT_COUNT and their own WriteDispatchIndirectArgs in their CommonPlatform.ush
- Move creation of DispatchIndirectGraphicsCommandSignature command signature to be per-platform
- DispatchIndirectGraphicsCommandSignature and DispatchIndirectComputeCommandSignature stride changed to account for additional padding on impacted platforms
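To illustrate why padding helps, here is a minimal sketch under stated assumptions: a 64-byte boundary (the real per-platform value is not given here) and one possible padding rule, rounding the struct size up to a power of two that divides the boundary. The type and function names are hypothetical and do not reproduce the engine's FRHIDispatchIndirectParameters.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed boundary for illustration only.
constexpr std::size_t kBoundarySize = 64;

// Three thread group counts, 12 bytes, as in a typical dispatch argument set.
struct DispatchArgsNoPadding {
    uint32_t ThreadGroupCountX;
    uint32_t ThreadGroupCountY;
    uint32_t ThreadGroupCountZ;
};

constexpr std::size_t NextPowerOfTwo(std::size_t Value) {
    std::size_t Result = 1;
    while (Result < Value) {
        Result <<= 1;
    }
    return Result;
}

constexpr std::size_t kPaddedSize = NextPowerOfTwo(sizeof(DispatchArgsNoPadding));
constexpr std::size_t kPaddingUints =
    (kPaddedSize - sizeof(DispatchArgsNoPadding)) / sizeof(uint32_t);

// Padded variant: 16 bytes, so consecutive elements tile a 64-byte block exactly.
struct DispatchArgsPadded {
    uint32_t ThreadGroupCountX;
    uint32_t ThreadGroupCountY;
    uint32_t ThreadGroupCountZ;
    uint32_t Padding[kPaddingUints];
};

static_assert(sizeof(DispatchArgsPadded) == 16, "expected 16-byte padded args");
static_assert(kBoundarySize % sizeof(DispatchArgsPadded) == 0,
              "padded size must divide the boundary");

// True if the byte range [Offset, Offset + Size) straddles a boundary.
constexpr bool CrossesBoundary(std::size_t Offset, std::size_t Size,
                               std::size_t Boundary) {
    return Offset / Boundary != (Offset + Size - 1) / Boundary;
}
```

With 12-byte elements packed back to back, the element at byte offset 60 straddles the 64-byte boundary. With 16-byte elements in a boundary-aligned buffer, no element can.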
Testing:
- ran Lyra with and without async compute Lumen on impacted platforms as well as Win64
- ran FN replay on impacted platforms
#rb Krzysztof.Narkowicz, Ben.Woodhouse, Benjamin.Rouveyrol
#jira UE-167950
#preflight 6359563b2e6690262a11bc06
[CL 22862498 by eric mcdaniel in ue5-main branch]
Make SceneDepthAux a platform constant configuration; it can't depend on runtime vars
#rb none
#preflight 635e4bed1b41d36d48d26061
[CL 22854614 by Dmitriy Dyomin in ue5-main branch]
This is meant to help avoid inconsistencies between GetRayTracingPayloadType() and the shader code.
As an example of the usage, convert RayTracingDebug related shaders to only enable the payload conditionally. Fix the current incorrect mixing of payload uses in the same shader by introducing a permutation for the material version of the debug shader vs. the one that uses the debug payload (this was previously a runtime shader parameter only which implied two different payloads could theoretically be compiled into one RTPSO).
Also conditionally enable some of the simpler payload types like the Niagara and decal ones that are only used in a few places. Other payloads will be handled in future refactors.
Split RayTracingDebug.usf into smaller files that only have one raytracing entry in them to reduce the amount of conditional logic needed around payloads.
#rb Yuriy.ODonnell
#jira none
#preflight 635c49e9ae6840072d4df82f
[CL 22849513 by chris kulla in ue5-main branch]
None of the shader compiling code runs in non-Editor mode, so don't compile it for those targets.
#jira none
#rb dan.elksnitis, jason.nadro
#preflight 635adca94710dd6af8673141
[CL 22839932 by christopher waters in ue5-main branch]
Moved MMV exit code to its own function
#rb robert.srinivasiah, Arciel.Rekman #preflight 22486450 #jira UE-168613
[CL 22820432 by juliet dipietro in ue5-main branch]
This allows the raytracing payload type to vary according to shader permutation parameters, which is more flexible and simplifies the code for complex cases like Lumen or deferred reflections
#rb Yuriy.ODonnell
#jira none
#preflight 635a8ba96a2a692f5d720eb3
[CL 22809506 by chris kulla in ue5-main branch]
Originally reviewed in review 22330833.
#rb andrew.lauritzen, Ola.Olsson, Sebastien.Hillaire
#jira FORT-515227
#rnx
#preflight 6356c6700313c24974e8f7dd
[CL 22749579 by tim doerries in ue5-main branch]
Every shader now has the ability to declare which payload it uses. HitGroup, Miss and Callable shaders may only specify a single type, whereas Raygen shaders may provide more than one.
The global raytracing shader libraries have been categorized according to their payload type, which helps facilitate the creation of minimal ray-tracing pipelines.
Avoid adding callable shaders to the RTPSO if they are not being used.
All RHI shader compilers and shader loaders that support ray tracing have been modified to track an extra uint representing the payload type flagged in the source (at time of compilation).
Add the RayTracingPayloadType to the shadermap DDC key so that we can properly detect when a shader's payload type is invalidated. This is required because we serialize the payload type into the shaders from the C++ side.
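The single-versus-multiple payload rule above can be sketched as a bitmask check. This is an illustration only: the enum values, the function names, and the assumption that every shader declares at least one payload are hypothetical, and the actual payload type encoding is not shown in this description.

```cpp
#include <cstdint>

enum class ShaderKind { RayGen, HitGroup, Miss, Callable };

// Hypothetical payload bits, one bit per payload type.
enum PayloadBits : uint32_t {
    Payload_Minimal  = 1u << 0,
    Payload_Material = 1u << 1,
    Payload_Debug    = 1u << 2,
};

// True when exactly one bit is set.
constexpr bool IsSinglePayload(uint32_t Mask) {
    return Mask != 0 && (Mask & (Mask - 1)) == 0;
}

// Raygen shaders may declare several payloads; all other kinds exactly one.
// Assumption: an empty declaration is treated as invalid.
constexpr bool IsValidPayloadDeclaration(ShaderKind Kind, uint32_t Mask) {
    if (Mask == 0) {
        return false;
    }
    return Kind == ShaderKind::RayGen || IsSinglePayload(Mask);
}

// A hit group, miss, or callable shader belongs in a pipeline only if the
// raygen shader's payload set contains its single payload.
constexpr bool IsCompatibleWithRayGen(uint32_t RayGenMask, uint32_t OtherMask) {
    return IsSinglePayload(OtherMask) && (RayGenMask & OtherMask) != 0;
}
```

A compatibility check of this kind is one way a minimal pipeline could be assembled: shaders whose payload is not in the raygen set are simply left out.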
#rb Yuriy.ODonnell
#jira none
#preflight 635715cae6096564af4dd28e
[CL 22742893 by chris kulla in ue5-main branch]
* Feature that merges command lists from QueueAsyncCommandListSubmit into a single Payload where possible (r.D3D12.AllowPayloadMerge, default enabled). Saves around 0.5 ms per async command list batch, with total savings from 1 to 4 ms on VP test scenes.
* Remove unnecessary command flush before Present on D3D12. Added a virtual function "NeedFlushBeforeEndDrawing" that can return false if the platform's RHIEndDrawingViewport function already flushes commands, so the calling function doesn't need to.
* Wrapped three scene rendering flushes, plus BeginFlushResourcesRHI with "RHIIncludeOptionalFlushes()", which returns false on D3D12. These were a net perf loss on D3D12, as they wrap hardly any rendering. I didn't want to change behavior on other platforms, hence the conditional.
* Overall the above changes remove 8 command list flushes in a two view scene, which typically cost in the range of 0.06 ms each, saving 0.48 ms CPU total. Perf win increases with more views, and removing the flushes also helps with GPU bubbles.
* Added a flush in a strategic location at the end of Shadows and Lumen. In a sample scene, saved 1.5 ms GPU eliminating a bubble.
#jira UE-167553
#rb luke.thatcher
#rnx
#preflight 635547519e14ee3c790cc14e
#lockdown mihnea.balta
[CL 22728231 by jason hoerner in ue5-main branch]