resubmit with following fixes:
- fix for a static analysis error: a >=0 check on a uint64 (always true) which should have been >0
- fix for an inverted guard on multiprocess cook sending bytecode to director (was only sending code across if empty instead of non-empty)
- fix for uninitialized padding in the FShaderCodeResource::FHeader struct causing nondeterministic puts
- fix for incorrect size passed to job cache hashing on receiving buffers from DDC
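The padding and unsigned-check fixes above can be illustrated with a minimal sketch. `FExampleHeader`, `WriteHeader` and `HasCode` are hypothetical stand-ins, not the real FShaderCodeResource::FHeader layout or engine functions, and the sketch assumes a typical ABI where the compiler inserts padding after the first member.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical header layout (NOT the real FShaderCodeResource::FHeader).
// On typical ABIs padding bytes follow Frequency so that UncompressedSize
// is 8-byte aligned.
struct FExampleHeader
{
    uint8_t  Frequency;
    uint64_t UncompressedSize;
};

// Fills a header in place. Zeroing the whole object first gives the padding
// bytes a known value, so hashing or storing the raw bytes is repeatable.
inline void WriteHeader(FExampleHeader* Out, uint8_t Frequency, uint64_t Size)
{
    assert(Out != nullptr);
    std::memset(Out, 0, sizeof(FExampleHeader));
    Out->Frequency = Frequency;
    Out->UncompressedSize = Size;
}

// An unsigned size is never negative, so "Size >= 0" is always true;
// "Size > 0" is the check that actually detects non-empty code.
inline bool HasCode(uint64_t Size)
{
    return Size > 0;
}
```

Without the `memset`, two headers with identical fields could still differ byte-for-byte in their padding, which is the kind of difference that would make a content-addressed put nondeterministic.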
#rb Devin.Doucette, Laura.Hermanns, Zousar.Shaker
#lockdown Marc.Audy
[CL 36754792 by dan elksnitis in 5.5 branch]
[FYI] dan.elksnitis
Original CL Desc
-----------------------------------------------------------------
[shaders] modify FShaderCode finalize to create a FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces total amount of LLM tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; impact likely much larger for cooks of larger projects.
#rb Devin.Doucette, Zousar.Shaker
#lockdown Marc.Audy
resubmit with static analysis (SA) and multiprocess (MP) cook fixes
[CL 36747522 by dan elksnitis in 5.5 branch]
[FYI] dan.elksnitis
Original CL Desc
-----------------------------------------------------------------
[shaders] modify FShaderCode finalize to create a FSharedBuffer object, and modify all downstream uses of shader code to re-use this buffer (job cache, pushes to DDC, shader maps, and shader library). This reduces total amount of LLM tracked memory allocated at the end of a cold Lyra PS4 cook by about 350MB; impact likely much larger for cooks of larger projects.
#rb Zousar.Shaker
#lockdown marc.audy
[CL 36440265 by dan elksnitis in 5.5 branch]
- add strings as an option for shader statistics, and a flags member/option to hide specific shader statistics from UI (so we can use statistics for other internal usage)
- add debug output for shader platform hashes (use to track down cases of shader job output which has identical bytecode but differing metadata), using the above to record a stat containing the shader hash computed by the compiler. currently only implemented for a subset of platforms
- fix a bug in shader diagnostic remapping for cache hits - the value stored in the cache/DDC for a job should contain non-remapped diagnostics (so cache hits can properly remap them to their version of the unstripped shader code). this also slightly improves deduplication within the job cache/DDC. fixed by storing the job output in the cache/DDC before running the job oncomplete callback (which performs the remapping)
- cleanup dxc hash retrieval code to use actual type instead of hacking around with an offset into the opaque blob
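The ordering fix for diagnostic remapping (store the job output before running the completion callback) can be sketched as follows. All names here (`FExampleJobOutput`, `FExampleCache`, `CompleteJob`, `RemapLines`) are hypothetical, and a plain `std::unordered_map` stands in for the job cache/DDC.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified job output: bytecode plus diagnostics that refer
// to line numbers in the stripped source.
struct FExampleJobOutput
{
    std::string Bytecode;
    std::vector<int> DiagnosticLines;
};

using FExampleCache = std::unordered_map<std::string, FExampleJobOutput>;

// Completes a job: the output is copied into the cache first, and only then
// is the per-job callback run, so the cached copy keeps the non-remapped
// diagnostics while the caller's copy gets remapped.
inline void CompleteJob(FExampleCache& Cache,
                        const std::string& Key,
                        FExampleJobOutput& Output,
                        const std::function<void(FExampleJobOutput&)>& OnComplete)
{
    assert(!Key.empty());
    Cache[Key] = Output;   // store before remapping
    OnComplete(Output);    // remaps diagnostics for this particular job
}

// Example remap: shifts every diagnostic line by a job-specific offset.
inline void RemapLines(FExampleJobOutput& Output, int Offset)
{
    for (int& Line : Output.DiagnosticLines)
    {
        Line += Offset;
    }
}
```

Because the cached entry is independent of any one job's remapping, a later cache hit can apply its own remap to the same stored diagnostics.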
#rb Jason.Nadro
[CL 34864220 by dan elksnitis in ue5-main branch]
- Moving various UB booleans into a flags enum.
- UB booleans could not be reasonably deprecated without incurring memory overhead, so this will break custom code that uses them.
- Adding UB flag to force the shader compilers to generate reflection for the UB members which are normally excluded from reflection.
- Adding UB flag that tells MeshCommands that a UB will be bound during pass drawing and that it doesn't need to be set via MDCs.
- New flags are not used in this CL, they are prerequisites for subsequent, larger changes.
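As a rough illustration of moving separate booleans into a flags enum, the sketch below uses hypothetical flag names (`ForceReflection`, `BoundDuringPass`); the actual uniform buffer flag names are not given in this description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag names, chosen only to mirror the two new flags described.
enum class EExampleUBFlags : uint8_t
{
    None            = 0,
    ForceReflection = 1 << 0,  // generate reflection for normally excluded members
    BoundDuringPass = 1 << 1,  // bound during pass drawing, not via mesh draw commands
};

constexpr EExampleUBFlags operator|(EExampleUBFlags A, EExampleUBFlags B)
{
    return static_cast<EExampleUBFlags>(static_cast<uint8_t>(A) | static_cast<uint8_t>(B));
}

constexpr bool HasAnyFlags(EExampleUBFlags Value, EExampleUBFlags Test)
{
    return (static_cast<uint8_t>(Value) & static_cast<uint8_t>(Test)) != 0;
}

// Separate booleans take one byte each; the flags enum packs them into one.
struct FExampleWithBools { bool bA; bool bB; bool bC; };
struct FExampleWithFlags { EExampleUBFlags Flags; };
static_assert(sizeof(FExampleWithFlags) < sizeof(FExampleWithBools),
              "flags are more compact than separate booleans");
```

Keeping deprecated boolean members alongside such an enum would add back the bytes the enum saves, which is consistent with the note above that the booleans could not be deprecated without memory overhead.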
#rb jeannoe.morissette
[CL 34356503 by christopher waters in ue5-main branch]
- TCHAR* arguments were being used to find elements in a TMap<FString,..> which creates temporary memory allocations.
- We can use FStringView with FindByHash/RemoveByHash to prevent temporary allocations.
- Adding FShaderParameterMap::FindAndRemoveParameterAllocation as a shortcut for Find+Remove (which Uniform Buffer member handling does).
- Fixing a few locations that were going FString->TCHAR*->FString.
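The allocation-free lookup idea can be sketched with a standard-library analogy. This is not the Unreal API: `std::map` with `std::less<>` and `std::string_view` stand in for TMap<FString,..> and FStringView, and `FindAndRemove` is a hypothetical counterpart to the combined helper mentioned above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// std::less<> enables heterogeneous lookup, so find() accepts a
// std::string_view without building a temporary std::string key.
using FExampleParameterMap = std::map<std::string, int, std::less<>>;

// Returns the stored value and erases the entry in one call, or returns
// std::nullopt if the name is absent.
inline std::optional<int> FindAndRemove(FExampleParameterMap& Map, std::string_view Name)
{
    auto It = Map.find(Name);   // no std::string temporary is created here
    if (It == Map.end())
    {
        return std::nullopt;
    }
    const int Value = It->second;
    Map.erase(It);              // erase by iterator avoids a second lookup
    return Value;
}
```

A caller holding only a character pointer can pass it directly, for example `FindAndRemove(Map, "View")`, and the key is compared in place.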
#rb Laura.Hermanns
[CL 34065428 by christopher waters in ue5-main branch]
- SRVNonPixel is needed by mobile to insert a barrier between fragment -> vertex texture fetch, but since this is a heavyweight barrier, it is opt-in with SHADER_PARAMETER_RDG_NON_PIXEL_SRV.
- Small refactor to FRDGTextureAccess to allow for arbitrary subresources, as the current model only allows full resource transitions.
#rb mihnea.balta, luke.thatcher, serge.bernier
#jira UE-211883
[CL 33179861 by zach bethel in ue5-main branch]
Split Work Graph shaders into multiple frequencies.
This is partly in anticipation of graphics nodes, but it also replaces the use of CFLAG_WorkgraphLocalNodes to differentiate nodes with a local or global root signature.
#rb Yuriy.ODonnell
[CL 33146442 by jeremy moore in ue5-main branch]
- strip old job cache path (constructing an input hash based on all inputs); the cache key based on preprocessed source is now the One True Job Cache
- strip parallelfor-based job submission path (gamethread-blocking); the task path has been enabled for a long while now without issues
- remove the "compile job inputs" debug dump; this is no longer relevant since it's based on inputs to the now-stripped compile job path (similar functionality will be provided by the new form of debug usf once completed)
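A cache key derived from preprocessed source, as opposed to one built from every job input, might look like the sketch below. `FExampleJob` and `MakeCacheKey` are hypothetical, and FNV-1a is used only as a small self-contained stand-in; the hash the engine actually uses is not stated here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// 64-bit FNV-1a over a byte range (illustrative stand-in hash).
inline uint64_t HashBytes(std::string_view Data)
{
    uint64_t Hash = 0xcbf29ce484222325ull;   // FNV offset basis
    for (unsigned char C : Data)
    {
        Hash ^= C;
        Hash *= 0x100000001b3ull;            // FNV prime
    }
    return Hash;
}

// Hypothetical job description: only the preprocessed source feeds the key.
struct FExampleJob
{
    std::string SourcePath;
    std::string PreprocessedSource;
};

// Jobs whose inputs differ but whose preprocessed source is identical map to
// the same key, so they can share one cached result.
inline uint64_t MakeCacheKey(const FExampleJob& Job)
{
    return HashBytes(Job.PreprocessedSource);
}
```

With this scheme, two jobs that differ only in inputs the preprocessor discards would hit the same cache entry, which an input-based hash would treat as distinct.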
#rb Laura.Hermanns
[CL 32825320 by dan elksnitis in ue5-main branch]
- modify the input hash debug dump mechanism to output an empty "debughash_<hash>" file instead of a txt file with the hash in contents, and always dump these files for the instance of the job that actually compiled
- the existing cvar will now just make it so these files are also dumped for jobs which hit in DDC or the job cache; we don't do this by default so there's only a single match for the debug hash for any given shader normally and it is inside the folder containing the full debug info, including those artifacts which are only output as a side effect of the compile step
- add the same hash as the first line in the stripped source code, so "debughash_<hash>" can be used as a search term in Everything to quickly find debug info associated with a shader (e.g. when looking at a capture in RenderDoc or similar)
note: this is a resubmit with fixes for mac/linux issues (avoiding printfs and using string builders instead; since wchar_t and TCHAR are not the same size on these platforms using %ls does not work), and a further fix for problems encountered with source compression when using wide chars in preprocessing instead of ansi chars.
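The "debughash_<hash>" naming and the string-builder approach from the note above can be sketched in portable C++. `MakeDebugHashName` and `PrependDebugHash` are hypothetical helpers; the 16-digit hexadecimal format and the `//` comment prefix are assumptions, since the description does not specify how the hash is rendered.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Builds the search token with a stream-based builder rather than printf,
// which avoids format specifiers whose meaning depends on the platform's
// wide-character size.
inline std::string MakeDebugHashName(uint64_t Hash)
{
    std::ostringstream Builder;
    Builder << "debughash_" << std::hex << std::setw(16) << std::setfill('0') << Hash;
    return Builder.str();
}

// Puts the same token on the first line of the stripped source as a comment,
// so one search term matches both the marker file and the source.
inline std::string PrependDebugHash(const std::string& StrippedSource, uint64_t Hash)
{
    return "// " + MakeDebugHashName(Hash) + "\n" + StrippedSource;
}
```

An empty file named with the result of `MakeDebugHashName` would then sit next to the debug artifacts, and the same token would appear in any capture that shows the shader source.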
#rb Laura.Hermanns
#jira UE-209753
[CL 32542773 by dan elksnitis in ue5-main branch]
[FYI] dan.elksnitis
Original CL Desc
-----------------------------------------------------------------
[shaders]
- modify the input hash debug dump mechanism to output an empty "debughash_<hash>" file instead of a txt file with the hash in contents, and always dump these files for the instance of the job that actually compiled
- the existing cvar will now just make it so these files are also dumped for jobs which hit in DDC or the job cache; we don't do this by default so there's only a single match for the debug hash for any given shader normally and it is inside the folder containing the full debug info, including those artifacts which are only output as a side effect of the compile step
- add the same hash as the first line in the stripped source code, so "debughash_<hash>" can be used as a search term in Everything to quickly find debug info associated with a shader (e.g. when looking at a capture in RenderDoc or similar)
#rb Laura.Hermanns
#jira UE-209753
[CL 32448284 by dan elksnitis in ue5-main branch]
- modify the input hash debug dump mechanism to output an empty "debughash_<hash>" file instead of a txt file with the hash in contents, and always dump these files for the instance of the job that actually compiled
- the existing cvar will now just make it so these files are also dumped for jobs which hit in DDC or the job cache; we don't do this by default so there's only a single match for the debug hash for any given shader normally and it is inside the folder containing the full debug info, including those artifacts which are only output as a side effect of the compile step
- add the same hash as the first line in the stripped source code, so "debughash_<hash>" can be used as a search term in Everything to quickly find debug info associated with a shader (e.g. when looking at a capture in RenderDoc or similar)
#rb Laura.Hermanns
#jira UE-209753
[CL 32436259 by dan elksnitis in ue5-main branch]
Add basic DX12 Work Graph support.
For this first pass there is no exposed RHI functionality for directly dispatching a work graph. Instead shader bundles have been extended to support a work graph based implementation.
Nanite compute materials now can use work graph shader bundles on D3D12 when r.Nanite.AllowWorkGraphMaterials and r.Nanite.Bundle.Shading are both set. Both of these default to off at the moment.
Also, DataDrivenPlatformInfo now exposes bSupportsWorkGraphs. This is false everywhere, but will be enabled for D3D12_SM6 as soon as we have the latest DXC shader compiler with lib_6_8 support submitted.
#rb Kenzo.Terelst, Yuriy.ODonnell
[CL 32196717 by jeremy moore in ue5-main branch]
- modify which version of source is dumped as a debug artifact by default to be the final preprocessed source instead of the stripped version; this is more useful when debugging in directcompile mode
- re-add the dump of defines as commented code to this artifact, which was inadvertently removed in a previous CL (bad merge)
- make the "detailed" source dump add the stripped version, and the version modified by the compile step (if it exists)
- only add the additional debug data to the stripped version (which is still something that SCW can process successfully) but remove it from the "modified" version (which typically will cause errors when running in directcompile mode)
#rb Laura.Hermanns
[CL 31354417 by dan elksnitis in ue5-main branch]