363 Commits

Author SHA1 Message Date
dan elksnitis
55df76e600 [shaders] fix metal shader archive population, was still implementing a deprecated API
#rb Jason.Nadro

[CL 36757780 by dan elksnitis in 5.5 branch]
2024-10-01 19:32:51 -04:00
dan elksnitis
5bcff15345 [shaders] modify FShaderCode finalize to create an FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces the total amount of LLM-tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; the impact is likely much larger for cooks of larger projects.
resubmit with following fixes:
- static analysis error: a >= 0 check on a uint64 (always true) which should have been > 0
- fix for an inverted guard on multiprocess cook sending bytecode to director (was only sending code across if empty instead of non-empty)
- fix for uninitialized padding in the FShaderCodeResource::FHeader struct causing nondeterministic puts
- fix for incorrect size passed to job cache hashing on receiving buffers from DDC
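
Two of the fixes listed above (the unsigned comparison and the uninitialized struct padding) are easy to reproduce in isolation. The following is a minimal sketch and not the engine code: `FExampleHeader`, `ToBytes`, and `HasPayload` are hypothetical names, and the field layout is an assumption chosen only to show how explicit, zeroed padding keeps a serialized header byte-for-byte reproducible.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a serialized header; NOT the real
// FShaderCodeResource::FHeader layout.
struct FExampleHeader
{
    std::uint32_t UncompressedSize = 0;
    std::uint8_t  Frequency = 0;
    // Explicit, zero-initialized padding. Without this member the compiler
    // would insert three bytes of implicit padding with unspecified contents,
    // so copying the raw struct bytes could differ from run to run.
    std::uint8_t  Padding[3] = {0, 0, 0};
};
static_assert(sizeof(FExampleHeader) == 8, "no implicit padding expected");

using FHeaderBytes = std::array<unsigned char, sizeof(FExampleHeader)>;

// Copies the raw object representation, as a cache "put" of the header would.
inline FHeaderBytes ToBytes(const FExampleHeader& Header)
{
    FHeaderBytes Bytes{};
    std::memcpy(Bytes.data(), &Header, sizeof(Header));
    return Bytes;
}

// A uint64 is never negative, so `Size >= 0` would always be true;
// the non-empty test has to be `Size > 0`.
inline bool HasPayload(std::uint64_t Size)
{
    return Size > 0;
}
```

With every byte of the struct owned by a named, initialized member, two headers holding the same field values always serialize to identical bytes, which is what a content-addressed cache needs.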

#rb Devin.Doucette, Laura.Hermanns, Zousar.Shaker
#lockdown Marc.Audy

[CL 36754792 by dan elksnitis in 5.5 branch]
2024-10-01 19:02:22 -04:00
carl lloyd
ed0247785b Metal RHI
- Update to Metal Shader Converter 2.0 Beta 4 which added support for quad derivatives in Compute
- Reduced max number of shaders per library due to increased RAM usage in cooks

#rb zack.neyland
#jira UE-223492

[CL 36754144 by carl lloyd in 5.5 branch]
2024-10-01 18:56:54 -04:00
dan elksnitis
fd01802612 [Backout] - CL36470025
[FYI] dan.elksnitis
Original CL Desc
-----------------------------------------------------------------
[shaders] modify FShaderCode finalize to create an FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces the total amount of LLM-tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; the impact is likely much larger for cooks of larger projects.

#rb Devin.Doucette, Zousar.Shaker
#lockdown Marc.Audy
resubmit with SA+MP cook fix

[CL 36747522 by dan elksnitis in 5.5 branch]
2024-10-01 17:45:02 -04:00
dan elksnitis
8c666d2108 [shaders] modify FShaderCode finalize to create an FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces the total amount of LLM-tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; the impact is likely much larger for cooks of larger projects.
#rb Devin.Doucette, Zousar.Shaker
#lockdown Marc.Audy

resubmit with SA+MP cook fix

[CL 36746984 by dan elksnitis in 5.5 branch]
2024-10-01 17:40:02 -04:00
dan elksnitis
c02d3f0517 [Backout] - CL36437712
[FYI] dan.elksnitis
Original CL Desc
-----------------------------------------------------------------
[shaders] modify FShaderCode finalize to create an FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces the total amount of LLM-tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; the impact is likely much larger for cooks of larger projects.

#rb Zousar.Shaker
#lockdown marc.audy

[CL 36440265 by dan elksnitis in 5.5 branch]
2024-09-19 13:16:02 -04:00
dan elksnitis
c7dfb5d9b6 [shaders] modify FShaderCode finalize to create an FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces the total amount of LLM-tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; the impact is likely much larger for cooks of larger projects.
#rb Zousar.Shaker
#lockdown marc.audy

[CL 36437741 by dan elksnitis in 5.5 branch]
2024-09-19 12:21:59 -04:00
carl lloyd
0f5c2bec9a Metal RHI - Bindless/SM6 Update
- Updated MSC to latest version 2.0 beta 3
- Removed MTLBufferPtr to make deallocations more explicit
- Re-wrote MetalTempAllocator to be a simple buffer allocator as the heap allocator had a huge perf overhead when used with Bindless
- Fixed use after free in deferred delete
- Limited SM6 to macOS 15

Changes in collaboration with Apple:

- Reworked residency management
- Replace manual resource binding/pre-draw steps with IRRuntime helpers
- Added vertex layout hashing support for MSC vertex descriptors
- Replaced VertexBuffers cache struct with MSC IRRuntimeVertexBuffer
- Fixed texture reference update by adding a texture override in SRVs (this way the texture reference SRVs don't revert to the default resource when the view is invalidated).
- Fixed some page faults by replacing the side table allocation with temporary allocations

#jira UE-223489
#rb Luke.Thatcher

[CL 36227379 by carl lloyd in 5.5 branch]
2024-09-12 10:32:09 -04:00
laura hermanns
5f560f423b [Shaders] Set USING_VERTEX_SHADER_LAYER depending on DDPI values, not hard coded in shader backend. Reintroduced after backout 35892687.
- This also reverts the DDPI information from CL 35396864, since D3D does in fact support vertex shader layers (except for D3D11.2 or older).
- Don't set USING_VERTEX_SHADER_LAYER when geometry shaders are available; this resulted in ShaderMinifier not being able to find the entry point.

#jira UE-221358
#rnx
#rb Arciel.Rekman, Sebastien.Hillaire
[FYI] Christopher.Waters, Graham.Wihlidal, Erica.Stella

[CL 35971246 by laura hermanns in ue5-main branch]
2024-09-03 12:41:06 -04:00
tiago costa
b2fac9e0f5 [Backout] - CL35880869
[FYI] Laura.Hermanns
Original CL Desc
-----------------------------------------------------------------
[Shaders] Set USING_VERTEX_SHADER_LAYER depending on DDPI values, not hard coded in shader backend.

This also reverts the DDPI information from CL 35396864, since D3D does in fact support vertex shader layers (except for D3D11.2 or older).

#jira UE-221358
#rnx
#rb Arciel.Rekman
[FYI] Christopher.Waters, Erica.Stella

[CL 35892689 by tiago costa in ue5-main branch]
2024-08-29 06:14:57 -04:00
laura hermanns
f2146abf9d [Shaders] Set USING_VERTEX_SHADER_LAYER depending on DDPI values, not hard coded in shader backend.
This also reverts the DDPI information from CL 35396864, since D3D does in fact support vertex shader layers (except for D3D11.2 or older).

#jira UE-221358
#rnx
#rb Arciel.Rekman
[FYI] Christopher.Waters, Erica.Stella

[CL 35880876 by laura hermanns in ue5-main branch]
2024-08-28 17:09:14 -04:00
florian penzkofer
8edbf2e4f6 Fix half precision support for iOS
#rb carl.lloyd
#jira UE-216022

[CL 35834122 by florian penzkofer in ue5-main branch]
2024-08-27 15:17:04 -04:00
carl lloyd
056d57f3e7 Metal RHI Context Refactor
- Removed MetalContext and MetalRenderPass.
- Removed all code to restart renderpasses
- Added support in the RHI for a new UploadContext which allows uploads to execute before the submission of contexts using them.
- Most functionality is now within MetalRHIContext, with dependencies removed so that multiple MetalRHIContexts can execute in parallel.
- MetalDeviceContext has been removed and replaced with MetalDevice.
- Removed the previous FrameAllocator and replaced with a temporary heap based allocator.
- Metal no longer uses SubmitCommandsHint and now builds/submits command buffers through RHIFinalizeContext/RHISubmitCommandLists
- Added initial support for MetalRHI parallel encoding, can be tested with -rhiparallel.
- Removed additional temporary allocations when uploading to buffers

#rb Luke.Thatcher
#jira UE-212349

[CL 35450226 by carl lloyd in ue5-main branch]
2024-08-12 08:41:13 -04:00
laura hermanns
99ffa68e64 [Shaders] Only store optional shader source when CFLAG_ExtraShaderData is specified, not for CFLAG_Archive.
Storing the whole metal shader source text in the optional data field was introduced with CL 3341849 for debugging purposes.
This should only be required when CFLAG_ExtraShaderData is specified (enabled via CVar r.Shaders.ExtraData).
The only place this optional data is needed outside the shader compiler is TMetalBaseShader::Init(), which uses it to initialize the field "GlslCodeNSString", documented as debuggable text.

Also invalidate all Metal shaders to trigger a recompilation with a much smaller memory footprint of the output shader binaries.

#rnx
#rb Arciel.Rekman, Florin.Pascu, Jason.Nadro
#lockdown Michal.Valient

[CL 35284217 by laura hermanns in ue5-main branch]
2024-08-02 15:43:31 -04:00
dan elksnitis
0320312a2b [Backout] - CL35053495
[FYI] dan.elksnitis
Original CL Desc
-----------------------------------------------------------------
[shaders]
- move population of output.target field into core code so each backend doesn't need to do it manually (do so before calling the compile function so any existing code expecting the output field to be set at any point during the compile process is unaffected)
- add a check in FShaderCompileJob::SerializeOutput and FShaderCompileJob::SerializeWorkerOutput that the FShaderTarget for a job output matches that of its input. This should catch cases of shaders with the wrong frequency being associated with jobs; further the callstack should indicate where this incorrect association is coming from (since SerializeOutput is called in different places for each cache path: duplicate in-flight jobs, jobs which hit in the in-memory job cache, and jobs which hit in the DDC cache, and SerializeWorkerOutput is only called when the job is read back from SCW output).

#rb Laura.Hermanns

[CL 35080902 by dan elksnitis in ue5-main branch]
2024-07-25 10:12:01 -04:00
dan elksnitis
ddbc683598 [shaders]
- move population of output.target field into core code so each backend doesn't need to do it manually (do so before calling the compile function so any existing code expecting the output field to be set at any point during the compile process is unaffected)
- add a check in FShaderCompileJob::SerializeOutput and FShaderCompileJob::SerializeWorkerOutput that the FShaderTarget for a job output matches that of its input. This should catch cases of shaders with the wrong frequency being associated with jobs; further the callstack should indicate where this incorrect association is coming from (since SerializeOutput is called in different places for each cache path: duplicate in-flight jobs, jobs which hit in the in-memory job cache, and jobs which hit in the DDC cache, and SerializeWorkerOutput is only called when the job is read back from SCW output).

#rb Laura.Hermanns

[CL 35053511 by dan elksnitis in ue5-main branch]
2024-07-24 10:09:55 -04:00
christopher waters
cfc6f343df Uniform Buffer improvements
- Moving various UB booleans into a flags enum.
- UB booleans could not be reasonably deprecated without incurring memory overhead, so this will break custom code that uses them.
- Adding UB flag to force the shader compilers to generate reflection for the UB members which are normally excluded from reflection.
- Adding UB flag that tells MeshCommands that a UB will be bound during pass drawing and that it doesn't need to be set via MDCs.
- New flags are not used in this CL, they are prerequisites for subsequent, larger changes.

#rb jeannoe.morissette

[CL 34356503 by christopher waters in ue5-main branch]
2024-06-13 17:59:13 -04:00
florin pascu
6bca8ef20e Change the CHECK_METAL_COMPILER_TOOLCHAIN_SETUP define to a cvar Metal.CheckCompilerToolChainSetup
#rb carl.lloyd

[CL 34100249 by florin pascu in ue5-main branch]
2024-06-04 13:57:09 -04:00
florin pascu
549e2d4a2e Add MetalOptimizeBySize setting for MetalShaderFormat
#rb carl.lloyd

[CL 34096192 by florin pascu in ue5-main branch]
2024-06-04 12:10:32 -04:00
laura hermanns
7a0927103c [Shaders] Remove last remaining use case of DXC rewriter in MetalCompileShaderMSC and deprecate RewriteHlsl.
#rnx
#rb carl.lloyd, dan.elksnitis

[CL 33279539 by laura hermanns in ue5-main branch]
2024-04-26 17:10:10 -04:00
dmitriy dyomin
1afc70a95a Metal: reduce shader metadata size by ~20% for most cases
#rb carl.lloyd

[CL 33259903 by dmitriy dyomin in ue5-main branch]
2024-04-26 06:17:17 -04:00
zach bethel
0674d30d69 Added SRVNonPixel, SHADER_PARAMETER_RDG_NON_PIXEL_SRV, and modified RDG_TEXTURE_ACCESS to support texture subresources.
- SRVNonPixel is needed by mobile to insert a barrier between fragment -> vertex texture fetch, but since this is a heavyweight barrier, it is opt-in with SHADER_PARAMETER_RDG_NON_PIXEL_SRV.
- Small refactor to FRDGTextureAccess to allow for arbitrary subresources, as the current model only allows full resource transitions.

#rb mihnea.balta, luke.thatcher, serge.bernier
#jira UE-211883

[CL 33179861 by zach bethel in ue5-main branch]
2024-04-23 17:02:48 -04:00
dan elksnitis
d64ae2b124 [shaders] add error code to spirv reflect failure assert
#rb Jason.Nadro

[CL 33000052 by dan elksnitis in ue5-main branch]
2024-04-16 10:08:24 -04:00
dan elksnitis
05dbf5d214 [shaders] strip unnecessary source file hashing from metal compiler. this was dead code and the result wasn't being used anywhere.
#rb Laura.Hermanns
#jira UE-211354

[CL 32732231 by dan elksnitis in ue5-main branch]
2024-04-04 12:20:22 -04:00
laura hermanns
67217ad5ad [Shaders] Replace DXC rewriter with new SPIRV-Tools pass to pack "$Globals" cbuffer.
- Adds StructPackingPass to SPIRV-Tools which re-assigns all struct member offsets of the global cbuffer ("type.$Globals" when translated in DXC/SPIR-V) according to the std140 memory layout rules.
- Remove DXC rewriter from shader backends as the shader minifier can already handle the majority of dead code removal.
- Rebuilt DXC for Win64, Mac, Linux.
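
For readers unfamiliar with the std140 rules mentioned in the list above, the offset re-assignment can be illustrated with a simplified sketch. This is not the StructPackingPass implementation: `FMemberType`, `AlignUp`, and `AssignOffsets` are hypothetical names, and only 32-bit scalars and vectors are covered (arrays, matrices, and nested structs have additional std140 rules that are omitted here).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one cbuffer member; illustrative only.
struct FMemberType
{
    std::uint32_t Size;      // bytes occupied by the member
    std::uint32_t Alignment; // std140 base alignment in bytes
};

// std140 base alignments for 32-bit scalar and vector types.
// A vec3 occupies 12 bytes but is aligned like a vec4.
constexpr FMemberType Float{4, 4};
constexpr FMemberType Vec2{8, 8};
constexpr FMemberType Vec3{12, 16};
constexpr FMemberType Vec4{16, 16};

// Rounds Value up to the next multiple of Alignment.
inline std::uint32_t AlignUp(std::uint32_t Value, std::uint32_t Alignment)
{
    return (Value + Alignment - 1) / Alignment * Alignment;
}

// Walks the members in declaration order and assigns each one the first
// offset at or after the running cursor that satisfies its base alignment.
inline std::vector<std::uint32_t> AssignOffsets(const std::vector<FMemberType>& Members)
{
    std::vector<std::uint32_t> Offsets;
    Offsets.reserve(Members.size());
    std::uint32_t Cursor = 0;
    for (const FMemberType& Member : Members)
    {
        Cursor = AlignUp(Cursor, Member.Alignment);
        Offsets.push_back(Cursor);
        Cursor += Member.Size;
    }
    return Offsets;
}
```

For the member sequence float, vec3, float, vec2 this yields offsets 0, 16, 28, 32: the vec3 is pushed to the next 16-byte boundary, and the following float fits into the 4 bytes left in that slot.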

#jira UE-207703
#rnx
#rb Yuriy.ODonnell
[FYI] Dan.Elksnitis, JeanNoe.Morissette, Serge.Bernier, Florin.Pascu

[CL 32646612 by laura hermanns in ue5-main branch]
2024-04-01 14:46:17 -04:00