- Focused around moving GlobalBeginCompileShader and friends.
- ModifyCompilationEnvironment and ValidateCompiledResult are now only compiled in Editor builds.
- Measured a 0.5 MB to 1.0 MB ELF size reduction, depending on platform.
#jira none
#rb jason.nadro, arciel.rekman, florin.pascu
#preflight 63613f992b5338aceb442902
[CL 22890964 by christopher waters in ue5-main branch]
- each platform compilation request is now wrapped in __try/__except rather than a single exception handling block at the top level of the worker; this allows us to log an exception (with callstack) as a compilation error and continue the batch
- remove SEH code from ShaderConductorContext, the above makes this redundant (and it didn't provide any actionable information)
- strip down SEH code in D3DShaderCompiler; now only used for the purposes of pre-compiling with DXC in the case of an FXC crash. dumping preprocessed source will be handled in a different manner in a forthcoming CL
- minor change in the DXC precompile path to not log an unnecessary warning when performing an explicitly-requested DXC precompile
#rb Jason.Nadro
#rb Laura.Hermanns
#rb Yuriy.ODonnell
#preflight 63514c798176062ea73acb41
#jira FORT-524383
[CL 22654436 by dan elksnitis in ue5-main branch]
This can be enabled by modifying `UnrealEngine\Engine\Saved\UnrealBuildTool\BuildConfiguration.xml` like so:
<?xml version="1.0" encoding="utf-8" ?>
<Configuration xmlns="https://www.unrealengine.com/BuildConfiguration">
    <BuildConfiguration>
        <bShaderCompilerWorkerTrace>true</bShaderCompilerWorkerTrace>
    </BuildConfiguration>
</Configuration>
- Added a build configuration xml value, `bShaderCompilerWorkerTrace`.
- Turning this on will set USE_SHADER_COMPILER_WORKER_TRACE=1
- Move the `-nothreading` parameter so it is set when we launch the process, instead of being appended internally as an extra command-line argument.
- Unreal Insights uses a separate thread to send events, so the program needs threading support. When USE_SHADER_COMPILER_WORKER_TRACE is enabled we must not pass `-nothreading`.
- When USE_SHADER_COMPILER_WORKER_TRACE is enabled we pass in `-trace=default` to get CPU event markers.
- The SCW program needs to turn on the following defines to be able to perform CPU and memory traces:
ENABLE_LOW_LEVEL_MEM_TRACKER=1
UE_MEMORY_TAGS_TRACE_ENABLED=1
UE_TRACE_ENABLED=1
- Instrument Shader Compiler Worker with TRACE_CPUPROFILER_EVENT_SCOPE. These are no-ops when tracing is turned off.
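
The launch-argument rule above could be sketched roughly as follows. `BuildWorkerTraceArgs` is a hypothetical helper, not the engine's actual function; only the two flags and the define name come from this change description.

```cpp
#include <string>

// Build-time switch; per this change, bShaderCompilerWorkerTrace sets it to 1.
#ifndef USE_SHADER_COMPILER_WORKER_TRACE
#define USE_SHADER_COMPILER_WORKER_TRACE 0
#endif

// Hypothetical helper: returns the extra argument passed when launching a worker.
std::string BuildWorkerTraceArgs(bool bTraceEnabled)
{
    if (bTraceEnabled)
    {
        // Tracing sends events from a separate thread, so -nothreading is left out.
        return "-trace=default";
    }
    return "-nothreading";
}

// Convenience overload driven by the build-time define.
std::string BuildWorkerTraceArgs()
{
    return BuildWorkerTraceArgs(USE_SHADER_COMPILER_WORKER_TRACE != 0);
}
```

Keeping the choice at the launch site means the worker itself never has to reconcile the two conflicting flags.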
todo: Make the shader compiler worker inherit the trace args from the main process it was launched from.
#rb Yuriy.ODonnell
#jira none
#preflight 634ef80269246074db9637c2
[CL 22625183 by Jason Nadro in ue5-main branch]
Changed ShaderArchive, GlobalShaderCache, ShaderDebugInfo and Autogen to use ShaderPlatformName and not ShaderFormat when naming their output files.
#rb Jack.Porter, Chris.Waters, Mihnea.Balta, Jason.Nadro
#jira UE-120561
#preflight 62c31f6fc9410537282296c6
[CL 20937870 by Florin Pascu in ue5-main branch]
- Non-engine modules/targets will have to specify the "version" of includes via IncludeOrderVersion in TargetRules or ModuleRules.
- This setting will control the value of UE_ENABLE_INCLUDE_ORDER_DEPRECATED_IN_XXX where XXX is the version of the engine.
- When moving types out of a header, users will need to include the new location of the type in the header it was removed from but only if UE_ENABLE_INCLUDE_ORDER_DEPRECATED_IN_XXX is set.
- If a target does not change its IncludeOrderVersion to the latest version, UBT will print out a message telling users how to upgrade.
- This change introduces a new set of SharedPCH permutations to make sure modules with older versions get the PCH with UE_ENABLE_INCLUDE_ORDER_DEPRECATED_IN_XXX set correctly.
#jira none
#rb jonathan.adamczewski, joe.kirchoff
#preflight 623e1d3d196f3ae80b4c37ee
[CL 19518359 by christopher waters in ue5-main branch]
- Provides about 8% runtime memory savings (in local tests).
- Also, adds more compression types for shaders.
- Impact on the shader compilation (in SCWs) seems to be negligible in local tests.
#rb Devin.Doucette, Charles.Bloom
[REVIEW] [at]Devin.Doucette, [at]Charles.Bloom, [at]Jason.Nadro
#jira UE-136845
#ROBOMERGE-AUTHOR: arciel.rekman
#ROBOMERGE-SOURCE: CL 18502862 via CL 18503105 via CL 18503112 via CL 18505939 via CL 18505950
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)
[CL 18505961 by arciel rekman in ue5-release-engine-test branch]
This represents UE4/Main @18073326, Release-5.0 @18081140 and Dev-PerfTest @18045971
[CL 18081471 by aurel cordonnier in ue5-release-engine-test branch]
Local shader compiler:
- Not making 1 job batches (for High prio jobs) on startup
- Randomizing pending job selection to reduce chance that multiple 60+ sec jobs get into a single batch
(the above have the most effect on speed up, below is optional/misc)
- Pushing the completed jobs back to manager before, and not after, pulling new ones to reduce idle time
- Processing jobs in FIFO rather than LIFO order (change to LIFO seems like an ODSC regression? Hard to say definitively if it's a regression from the numbers, but seems odd to have the earliest jobs processed last)
- Parallel processing of input and output files (starts sequential by default to reduce the CPU overhead, but is enabled if a read or write ever takes too long)
- More insights scopes
Distributed:
- Not avoiding local machine for XGE on startup
Both:
- Input file compression (disabled by default, need to better figure out when it's beneficial in a general case. Mostly for people with really slow I/O and XGE over VPN).
- More stats, also more dense stat output. Removed an unimportant one, added stats about the batches.
#rb Jason.Nadro, Ben.Ingram
#[review] [at]Jason.Nadro, [at]Ben.Ingram
#preflight 6132ec79bf137d0001ae91ee
#jira UE-125101
#ushell-cherrypick of 17448576 by Arciel.Rekman
#ROBOMERGE-AUTHOR: arciel.rekman
#ROBOMERGE-SOURCE: CL 17448989 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v870-17433530)
[CL 17448996 by arciel rekman in ue5-release-engine-test branch]
- Increased the number of jobs per XGE worker to reduce the number of times we start them. Since XGE SCWs aren't used for latency-critical jobs, this is expected to not cause user-visible delays.
- Added the ability to detect a hung distributed controller (though the build would currently still be broken, as the jobs aren't reissued).
- Added a check for XGE system service running to avoid attempting to launch XGE on machines without it.
- Removed the code that attempted to launch the XGE XML interface.
- Also added some more logs about what's happening, and reduced the job cache verbosity.
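
Hung-controller detection might be approached with a progress watchdog like the sketch below. The change description gives no mechanism, so the timeout policy and the `FControllerWatchdog` name are assumptions made purely for illustration.

```cpp
#include <chrono>

// Hypothetical watchdog: the controller is treated as hung when no job has
// completed within Timeout. Time points are passed in so the logic is testable.
class FControllerWatchdog
{
public:
    using FClock = std::chrono::steady_clock;

    FControllerWatchdog(FClock::time_point Start, FClock::duration InTimeout)
        : LastProgress(Start), Timeout(InTimeout) {}

    // Call whenever the controller hands back a completed job.
    void NotifyProgress(FClock::time_point Now) { LastProgress = Now; }

    // True when more than Timeout has elapsed since the last completed job.
    bool IsHung(FClock::time_point Now) const { return Now - LastProgress > Timeout; }

private:
    FClock::time_point LastProgress;
    FClock::duration Timeout;
};
```

Detection alone does not recover the build; as the note above says, the outstanding jobs would still need to be reissued.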
#rb Jason.Nadro, Ben.Ingram, Danny.Couture
#jira none
#ROBOMERGE-OWNER: Arciel.Rekman
#ROBOMERGE-AUTHOR: arciel.rekman
#ROBOMERGE-SOURCE: CL 15878601 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v786-15839533)
#ROBOMERGE-CONFLICT from-shelf
[CL 15878696 by Arciel Rekman in ue5-main branch]
- This helps with overall system responsiveness when many shader workers are spawned at the same time
#rb Arciel.Rekman
[CL 15580762 by danny couture in ue5-main branch]