1308 Commits

Author SHA1 Message Date
christopher waters
0621d20368 More shader compiling code wrapped with WITH_EDITOR checks
- Focused around moving GlobalBeginCompileShader and friends.
- ModifyCompilationEnvironment and ValidateCompiledResult now only compiled in Editor builds.
- Measured 0.5MB to 1.0MB elf size reduction depending on platform.

#jira none
#rb jason.nadro, arciel.rekman, florin.pascu
#preflight 63613f992b5338aceb442902

[CL 22890964 by christopher waters in ue5-main branch]
2022-11-01 16:31:27 -04:00
henrik karlsson
8f895fef35 Added includes needed after removing includes in headers
#preflight 6360b63e41625be270a6e464
#rb none

[CL 22888775 by henrik karlsson in ue5-main branch]
2022-11-01 15:14:15 -04:00
nat parkinson
442a36cfd4 [Backout] - CL22872901 as it seems to have caused compile errors
[FYI] zach.bethel
Original CL Desc
-----------------------------------------------------------------
Added RDG_EVENT_SCOPE_FINAL variant that silences child scopes / events.
 - Added r.RDG.Events CVar to control GPU event behavior.
      - 0 disables GPU events; 1 enables GPU events and FINAL scopes suppress child scopes; 2 enables all GPU events.

#preflight 6360117d117bb4ce9da40ef0
#rb krzysztof.narkowicz, yuriy.odonnell, daniel.wright

[CL 22879513 by nat parkinson in ue5-main branch]
2022-11-01 07:02:14 -04:00
zach bethel
738376cef5 Added RDG_EVENT_SCOPE_FINAL variant that silences child scopes / events.
- Added r.RDG.Events CVar to control GPU event behavior.
      - 0 disables GPU events; 1 enables GPU events and FINAL scopes suppress child scopes; 2 enables all GPU events.
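The three r.RDG.Events modes described above can be sketched as follows. This is a minimal illustration, not the engine's RDG code: `EventRecorder`, `BeginScope` and `EndScope` are hypothetical names, and the sketch assumes that in mode 1 a FINAL scope is itself emitted while everything nested inside it is suppressed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical recorder; Mode mirrors the 0/1/2 values of r.RDG.Events.
class EventRecorder {
public:
    explicit EventRecorder(int Mode) : Mode_(Mode) {}

    void BeginScope(const std::string& Name, bool bFinal = false) {
        // Mode 0: emit nothing. Mode 1: skip scopes nested inside a FINAL scope.
        // Mode 2: emit everything.
        const bool bEmit = Mode_ != 0 && !(Mode_ == 1 && FinalDepth_ > 0);
        if (bEmit) {
            Emitted_.push_back(Name);
        }
        Stack_.push_back(bFinal);
        if (bFinal) {
            ++FinalDepth_;
        }
    }

    void EndScope() {
        if (Stack_.back()) {
            --FinalDepth_;
        }
        Stack_.pop_back();
    }

    const std::vector<std::string>& Emitted() const { return Emitted_; }

private:
    int Mode_;
    int FinalDepth_ = 0;               // number of open FINAL scopes
    std::vector<bool> Stack_;          // bFinal flag of each open scope
    std::vector<std::string> Emitted_; // names that would reach the GPU profiler
};
```

With mode 1, opening a FINAL scope "Lumen", a child scope "Trace" inside it, and then a sibling scope "Post" records only "Lumen" and "Post".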

#preflight 6360117d117bb4ce9da40ef0
#rb krzysztof.narkowicz, yuriy.odonnell, daniel.wright

[CL 22876153 by zach bethel in ue5-main branch]
2022-10-31 20:56:49 -04:00
luke thatcher
0a443d68c5 Improve performance of RHI GPU draw call / num primitive stats
 - Old code relied on atomic increments in the RHI_DRAW_CALL_INC / RHI_DRAW_CALL_STATS macros, which are expensive, particularly on platforms with poor atomic performance.
 - New system replaces the atomic writes with a context-specific stats structure, which is accumulated by the RHI thread into the global structure.
 - Contexts write stat data through a "Stats" pointer on the IRHIComputeContext, which is set automatically by the command list management code. The pointer is replaced whenever a new "draw call category" is pushed, to redirect the counts.
 - Also moved some macro definitions around so more of the system can be removed when HAS_GPU_STATS is 0. Removed dependencies on CSV_PROFILER.
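The idea in the list above (plain writes into a per-context structure, later folded into the global totals by one thread) can be sketched roughly like this. All names here (`DrawStats`, `ComputeContext`, `FlushStats`, `kNumCategories`) are hypothetical stand-ins, not the engine's types, and the sketch assumes each context is recorded by a single thread at a time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counters; one instance per draw call category.
struct DrawStats {
    std::uint64_t DrawCalls = 0;
    std::uint64_t Primitives = 0;

    void Accumulate(const DrawStats& Other) {
        DrawCalls += Other.DrawCalls;
        Primitives += Other.Primitives;
    }
};

constexpr int kNumCategories = 2; // assumed category count for the sketch

// Stand-in for a command context that owns its own stats storage.
struct ComputeContext {
    DrawStats PerCategory[kNumCategories];
    DrawStats* Stats = &PerCategory[0]; // redirected when a category is pushed

    void PushCategory(int Category) { Stats = &PerCategory[Category]; }

    void Draw(std::uint64_t NumPrimitives) {
        // Plain, non-atomic writes: only the recording thread touches these.
        Stats->DrawCalls += 1;
        Stats->Primitives += NumPrimitives;
    }
};

// Called from one thread (the RHI thread in the description above) after
// the context has finished recording; folds its counts into the totals.
inline void FlushStats(ComputeContext& Context, DrawStats (&Global)[kNumCategories]) {
    for (int i = 0; i < kNumCategories; ++i) {
        Global[i].Accumulate(Context.PerCategory[i]);
        Context.PerCategory[i] = DrawStats{};
    }
}
```

Because only the flush touches the global structure, the hot path in `Draw` needs no atomics or locks.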

Moved Begin/EndFrame, Begin/EndScene, Begin/EndDrawingViewport into the immediate RHICmdList
 - They were already immediate-only functions due to a check() they contain.

#rb Zach.Bethel
#preflight 635fed4af97758810b50cb06

[CL 22875455 by luke thatcher in ue5-main branch]
2022-10-31 20:02:27 -04:00
Florin Pascu
b056e9a78c ShaderMap now serializes the ShaderPlatform name instead of ShaderPlatform enum
#jira UE-167922
#preflight 635c2a0d0053ddfa53401a82

[CL 22863617 by Florin Pascu in ue5-main branch]
2022-10-31 11:06:18 -04:00
yuriy odonnell
88303eba42 Use a RW lock to protect GShaderHashCache instead of a critical section to reduce lock contention
A critical section is still used on cache miss to prevent redundant calculations, while the cache remains accessible for readers.
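A rough sketch of this locking scheme, using standard C++17 primitives in place of the engine's lock types: readers take a shared lock, and a separate mutex serializes the miss path so a value is computed once while lookups stay unblocked. `HashCache` and its members are hypothetical names, and the string-to-string map is only an assumption about the cache's shape.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

class HashCache {
public:
    explicit HashCache(std::function<std::string(const std::string&)> Compute)
        : Compute_(std::move(Compute)) {}

    std::string Get(const std::string& Key) {
        std::string Value;
        if (TryFind(Key, Value)) {
            return Value; // fast path: shared (read) lock only
        }

        // Miss path: one thread at a time, so the same miss is not computed twice.
        std::lock_guard<std::mutex> MissGuard(MissMutex_);
        if (TryFind(Key, Value)) {
            return Value; // another thread filled the entry while we waited
        }

        Value = Compute_(Key); // expensive work; readers are not blocked here
        {
            std::unique_lock<std::shared_mutex> Write(MapLock_);
            Map_.emplace(Key, Value);
        }
        return Value;
    }

private:
    bool TryFind(const std::string& Key, std::string& Out) const {
        std::shared_lock<std::shared_mutex> Read(MapLock_);
        const auto It = Map_.find(Key);
        if (It == Map_.end()) {
            return false;
        }
        Out = It->second;
        return true;
    }

    mutable std::shared_mutex MapLock_; // guards Map_
    std::mutex MissMutex_;              // serializes computation on miss
    std::unordered_map<std::string, std::string> Map_;
    std::function<std::string(const std::string&)> Compute_;
};
```

The exclusive lock is held only for the short insert, which is where the reduction in contention would come from in this arrangement.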

#preflight 635b0a399c65b7958608bb01
#rb jason.nadro

[CL 22862505 by yuriy odonnell in ue5-main branch]
2022-10-31 10:15:22 -04:00
eric mcdaniel
502749c59a Fix for async compute on platforms with memory boundary restrictions on async compute dispatch indirect arguments
*** This change will incur a full shader invalidation across all platforms ***

Issues:
  - Some platforms require async compute dispatch indirect arguments to not cross specific memory boundaries
    - This places restrictions on the valid sizes for a dispatch indirect argument set. We were not conforming to these restrictions, which could result in GPU crashes on these async passes

Fixes:
  - FRHIDispatchIndirectParameters is padded out to meet per-platform memory boundary restrictions
    - This is driven via new per-platform preprocessor define PLATFORM_DISPATCH_INDIRECT_ARGUMENT_BOUNDARY_SIZE
    - Some platforms require FRHIDispatchIndirectParameters to align with their internal structure, hence we cannot universally size it to meet all platforms' requirements

  - Introduce new FRHIDispatchIndirectParametersNoPadding for uses when we explicitly do not want the padding and otherwise avoid the memory boundary restrictions

  - Revise and expand indirect argument validation code to catch further such issues in the future

  - Update shaders which write to dispatch indirect argument buffers to account for optional per-platform padding
    - New utility function WriteDispatchIndirectArgs introduced to facilitate this
    - Platforms which require anything other than the default non-padded dispatch indirect arguments must define DISPATCH_INDIRECT_UINT_COUNT and their own WriteDispatchIndirectArgs in their CommonPlatform.ush

  - Move creation of the DispatchIndirectGraphicsCommandSignature command signature to be per-platform
    - DispatchIndirectGraphicsCommandSignature and DispatchIndirectComputeCommandSignature stride changed to account for additional padding on impacted platforms
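To illustrate why padding helps, here is a small sketch under assumed numbers: a 64-byte boundary (a made-up value, not any platform's real requirement) and the usual three 32-bit thread group counts. Packed 12-byte entries eventually straddle a boundary, while entries padded to a stride that divides the boundary do not. `kBoundarySize`, `PaddedUintCount`, `CrossesBoundary` and `WriteArgs` are hypothetical names; `WriteArgs` is only a CPU-side analogue of the shader helper mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBoundarySize = 64; // assumed boundary, in bytes
constexpr std::size_t kBaseUintCount = 3; // X, Y, Z thread group counts

// Smallest uint count >= 3 whose byte size divides the boundary
// (or reaches it); returns 3 when there is no restriction (0).
constexpr std::size_t PaddedUintCount(std::size_t BoundarySize) {
    std::size_t Count = kBaseUintCount;
    while (Count * sizeof(std::uint32_t) < BoundarySize &&
           BoundarySize % (Count * sizeof(std::uint32_t)) != 0) {
        ++Count;
    }
    return Count;
}

constexpr std::size_t kUintCount = PaddedUintCount(kBoundarySize);
static_assert(kUintCount == 4, "12 bytes pads to 16 for a 64-byte boundary");

// True if the byte range [ByteOffset, ByteOffset + ByteSize) spans a boundary.
constexpr bool CrossesBoundary(std::size_t ByteOffset, std::size_t ByteSize,
                               std::size_t BoundarySize) {
    return BoundarySize != 0 &&
           ByteOffset / BoundarySize != (ByteOffset + ByteSize - 1) / BoundarySize;
}

// Writes one argument set at the padded stride; padding uints are untouched.
inline void WriteArgs(std::vector<std::uint32_t>& Buffer, std::size_t Index,
                      std::uint32_t X, std::uint32_t Y, std::uint32_t Z) {
    const std::size_t Base = Index * kUintCount;
    Buffer[Base + 0] = X;
    Buffer[Base + 1] = Y;
    Buffer[Base + 2] = Z;
}
```

In the packed layout, the sixth entry starts at byte 60 and ends at byte 71, so it crosses the 64-byte boundary; with the 16-byte stride every entry stays inside one 64-byte block.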

Testing:
  - ran Lyra with and without async compute Lumen on impacted platforms as well as Win64
  - ran FN replay on impacted platforms

#rb Krzysztof.Narkowicz, Ben.Woodhouse, Benjamin.Rouveyrol
#jira UE-167950
#preflight 6359563b2e6690262a11bc06

[CL 22862498 by eric mcdaniel in ue5-main branch]
2022-10-31 10:15:11 -04:00
yuriy odonnell
bb6b77dcb3 Use a RW lock to protect GShaderFileCache instead of a critical section to reduce lock contention
#preflight 635aaa7b0b08a07d8a47009d
#rb jason.nadro

[CL 22861304 by yuriy odonnell in ue5-main branch]
2022-10-31 09:14:11 -04:00
Dmitriy Dyomin
f7dffe9f57 Fixed: A few issues with depth access in a mobile renderer
Make SceneDepthAux a platform constant configuration; it can't depend on runtime vars
#rb none
#preflight 635e4bed1b41d36d48d26061

[CL 22854614 by Dmitriy Dyomin in ue5-main branch]
2022-10-30 08:26:11 -04:00
chris kulla
60ddb1867f Hide unused payload structs from shaders which did not declare them
This is meant to help avoid inconsistencies between GetRayTracingPayloadType() and the shader code.

As an example of the usage, convert RayTracingDebug related shaders to only enable the payload conditionally. Fix the current incorrect mixing of payload uses in the same shader by introducing a permutation for the material version of the debug shader vs. the one that uses the debug payload (this was previously a runtime shader parameter only which implied two different payloads could theoretically be compiled into one RTPSO).

Also conditionally enable some of the simpler payload types, like the Niagara and decal ones, that are only used in a few places. Other payloads will be handled in future refactors.

Split RayTracingDebug.usf into smaller files that only have one raytracing entry in them to reduce the amount of conditional logic needed around payloads.

#rb Yuriy.ODonnell
#jira none
#preflight 635c49e9ae6840072d4df82f

[CL 22849513 by chris kulla in ue5-main branch]
2022-10-28 23:28:11 -04:00
christopher waters
ed40ee1428 Global, Material and MeshMaterial shader compiling WITH_EDITOR cleanup.
None of the shader compiling code runs in non-Editor mode, so don't compile it for those targets.

#jira none
#rb dan.elksnitis, jason.nadro
#preflight 635adca94710dd6af8673141

[CL 22839932 by christopher waters in ue5-main branch]
2022-10-28 18:04:50 -04:00
Thomas Engel
79e0b609f0 Enabling non-PQ >8 bit material for H265 in Electra
#preflight 635c3255ae6840072d43eac7
#fyi jens.petersam
#jira none

[CL 22836941 by Thomas Engel in ue5-main branch]
2022-10-28 16:09:37 -04:00
juliet dipietro
b712f039fe Adding ISR & Multiview Logging at runtime initialization
Moved MMV exit code to its own function
#rb robert.srinivasiah, Arciel.Rekman
#preflight 22486450
#jira UE-168613

[CL 22820432 by juliet dipietro in ue5-main branch]
2022-10-27 19:22:17 -04:00
chris kulla
5da8820c19 Redesign the raytracing payload type tagging to use a function instead of a hardcoded field
This allows the raytracing payload type to vary according to shader permutation parameters, which is more flexible and simplifies the code for complex cases like Lumen or deferred reflections.

#rb Yuriy.ODonnell
#jira none
#preflight 635a8ba96a2a692f5d720eb3

[CL 22809506 by chris kulla in ue5-main branch]
2022-10-27 10:21:14 -04:00
graham wihlidal
a759d430ca Implemented 'stat renderscaling' to show current status of the dynamic resolution heuristics
#rb guillaume.abadie
[FYI] zach.bethel, ben.woodhouse
#preflight skip

[CL 22803487 by graham wihlidal in ue5-main branch]
2022-10-26 23:18:21 -04:00
tiago costa
b034d9e7c0 Implemented DrawClearQuadAlpha to clear only the alpha channel of a render target.
#rb Sebastien.Hillaire, wouter.dek
#preflight skip

[CL 22803446 by tiago costa in ue5-main branch]
2022-10-26 23:15:54 -04:00
arciel rekman
4dca43d6f1 Exclude more non-rendering apps from RHI runtime caps check (UE-168344).
#rb Jules.Blok
[REVIEW] [at]Jules.Blok
#jira UE-168344
#preflight 63593ba1764df4711e6ecbe4

[CL 22802884 by arciel rekman in ue5-main branch]
2022-10-26 22:20:37 -04:00
guillaume abadie
efedd65122 Implements r.DumpGPU.Delay
#rb none
#preflight 635859ac2e6690262ac2a16e

[CL 22798566 by guillaume abadie in ue5-main branch]
2022-10-26 19:11:42 -04:00
arciel rekman
befab85d25 Fix Hololens rendering (UE-167020).
- Fixed Mobile MultiView (fallback to ISR) stereo rendering on D3D12 (UE-167020).

#rb Rob.Srinivasiah
[REVIEW] [at]Robert.Srinivasiah
#jira UE-167020
#preflight 6357e8230313c249743758ab

[CL 22788314 by arciel rekman in ue5-main branch]
2022-10-26 16:21:13 -04:00
Yuriy ODonnell
1467e1ef82 Remove deprecated RHIRayTraceOcclusion and RHIRayTraceIntersection methods. They are replaced with equivalent new high-level functions in RayTracingBasicShaders.cpp.
#jira UE-167534
#preflight 63584e6f765b435dddecbb0c
#rb Chris.Kulla

[CL 22764325 by Yuriy ODonnell in ue5-main branch]
2022-10-25 17:55:27 -04:00
tim doerries
1efb722a91 Reworked CL for adding support for filtered virtual shadow maps on SingleLayerWater. Implemented by applying the VSM shadow mask to the SeparateMainDirLight texture that is written out by the main water pass in the presence of distance field shadows (or VSM filtering) and then later composing it back into the scene. Also removed EncodeLightAttenuationFromMask() function from VirtualShadowMapMaskBitsCommon.ush.
Originally reviewed in review 22330833.

#rb andrew.lauritzen, Ola.Olsson, Sebastien.Hillaire
#jira FORT-515227
#rnx
#preflight 6356c6700313c24974e8f7dd

[CL 22749579 by tim doerries in ue5-main branch]
2022-10-25 09:19:25 -04:00
chris kulla
b00b7d70d0 Organize Ray Tracing shaders by their payload
Every shader now has the ability to declare which payload it uses. HitGroup, Miss and Callable shaders may only specify a single type, whereas Raygen shaders may provide more than one.

The global raytracing shader libraries have been categorized according to their payload type, which helps facilitate the creation of minimal ray-tracing pipelines.

Avoid adding callable shaders to the RTPSO if they are not being used.

All RHI shader compilers and shader loaders that support ray tracing have been modified to track an extra uint representing the payload type flagged in the source (at time of compilation).

Add the RayTracingPayloadType to the shadermap DDC key so that we can properly detect when a shader's payload type is invalidated. This is required because we serialize the payload type into the shaders from the C++ side.
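The single-versus-multiple payload rule described above could be checked along these lines. This is a sketch only: it assumes the payload type is stored as a bitmask in one uint, and the `ShaderKind` enum, the payload constants and both functions are hypothetical names chosen for illustration. Treating an empty mask as invalid is also an assumption.

```cpp
#include <cassert>
#include <cstdint>

enum class ShaderKind { RayGen, HitGroup, Miss, Callable };

// Hypothetical payload bits; the real set of payload types is not shown here.
constexpr std::uint32_t kPayloadMinimal  = 1u << 0;
constexpr std::uint32_t kPayloadMaterial = 1u << 1;
constexpr std::uint32_t kPayloadDebug    = 1u << 2;

// Raygen shaders may declare several payloads; HitGroup, Miss and
// Callable shaders must declare exactly one (a single set bit).
constexpr bool IsValidPayloadMask(ShaderKind Kind, std::uint32_t Mask) {
    if (Mask == 0) {
        return false; // assumption: every ray tracing shader declares a payload
    }
    const bool bSingleBit = (Mask & (Mask - 1)) == 0;
    return Kind == ShaderKind::RayGen || bSingleBit;
}

// A shader is a candidate for a pipeline only if its payload is one
// the pipeline's raygen shader actually uses.
constexpr bool SharesPayload(std::uint32_t RayGenMask, std::uint32_t ShaderMask) {
    return (RayGenMask & ShaderMask) != 0;
}
```

Filtering with something like `SharesPayload` is one way a minimal pipeline could leave out shaders whose payload the raygen shader never traces.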

#rb Yuriy.ODonnell
#jira none
#preflight 635715cae6096564af4dd28e

[CL 22742893 by chris kulla in ue5-main branch]
2022-10-24 20:11:02 -04:00
jason hoerner
ee5373b706 Virtual Production: Optimizations to help with D3D12 CPU and GPU performance regressions following Parallel Rendering integration:
* Feature that merges command lists from QueueAsyncCommandListSubmit into a single Payload where possible (r.D3D12.AllowPayloadMerge, default enabled).  Saves around 0.5 ms per async command list batch, with total savings from 1 to 4 ms on VP test scenes.
* Remove unnecessary command flush before Present on D3D12.  Added a virtual function "NeedFlushBeforeEndDrawing" that can return false if the platform's RHIEndDrawingViewport function flushes commands, so the calling function doesn't need to.
* Wrapped three scene rendering flushes, plus BeginFlushResourcesRHI with "RHIIncludeOptionalFlushes()", which returns false on D3D12.  These were a net perf loss on D3D12, as they wrap hardly any rendering.  I didn't want to change behavior on other platforms, hence the conditional.
* Overall the above changes remove 8 command list flushes in a two view scene, which typically cost in the range of 0.06 ms each, saving 0.48 ms CPU total.  Perf win increases with more views, and removing the flushes also helps with GPU bubbles.
* Added a flush in a strategic location at the end of Shadows and Lumen.  In a sample scene, saved 1.5 ms GPU eliminating a bubble.

#jira UE-167553
#rb luke.thatcher
#rnx
#preflight 635547519e14ee3c790cc14e
#lockdown mihnea.balta

[CL 22728231 by jason hoerner in ue5-main branch]
2022-10-24 11:23:51 -04:00
christopher waters
6aa817d9ab Removing PLATFORM_SUPPORTS_SRV_UB since it should be set to the same value for all shader platforms.
#jira none
#rb stu.mckenna, jason.nadro
#preflight 6351afa9ae33b04ec1b98459

[CL 22690373 by christopher waters in ue5-main branch]
2022-10-21 11:20:44 -04:00