- Several GetTypeHash implementations produced only very narrow hash distributions, so these have been modified.
- FShaderDrawKey::GetTypeHash was particularly slow (around 4% of sampled time in Mac UT in Instruments), almost all of it in MemCrc32, so this has been replaced with a much simpler hash routine that brings it down to ~1% of sampled time.
- FShaderDrawKey::operator== comparison order changed to fail earlier by testing from most-to-least varying attributes.
- Changed the way sampler-state & texture look-up works to avoid an unnecessary map lookup.
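The hashing and comparison changes above can be sketched roughly as follows. This is only an illustration: FDrawKeySketch, its members and HashCombineSketch are hypothetical stand-ins rather than the real FShaderDrawKey layout or the routine that was checked in, and which fields vary most is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a draw key; the real FShaderDrawKey has different members.
struct FDrawKeySketch
{
    uint32_t Textures[4];      // assumed to vary most between draws
    uint32_t SamplerStates[4];
    uint32_t BlendState;       // assumed to vary least
    uint32_t DepthState;
};

// Cheap mixing step (the widely used "hash combine" formula), used here
// instead of running a CRC over the whole struct.
inline uint32_t HashCombineSketch(uint32_t Seed, uint32_t Value)
{
    return Seed ^ (Value + 0x9e3779b9u + (Seed << 6) + (Seed >> 2));
}

inline uint32_t GetTypeHash(const FDrawKeySketch& Key)
{
    uint32_t Hash = 0;
    for (uint32_t Texture : Key.Textures)
    {
        Hash = HashCombineSketch(Hash, Texture);
    }
    for (uint32_t Sampler : Key.SamplerStates)
    {
        Hash = HashCombineSketch(Hash, Sampler);
    }
    Hash = HashCombineSketch(Hash, Key.BlendState);
    Hash = HashCombineSketch(Hash, Key.DepthState);
    return Hash;
}

inline bool operator==(const FDrawKeySketch& A, const FDrawKeySketch& B)
{
    // Compare the fields assumed to differ most often first, so that
    // unequal keys are rejected after as few comparisons as possible.
    for (std::size_t Index = 0; Index < 4; ++Index)
    {
        if (A.Textures[Index] != B.Textures[Index])
        {
            return false;
        }
    }
    for (std::size_t Index = 0; Index < 4; ++Index)
    {
        if (A.SamplerStates[Index] != B.SamplerStates[Index])
        {
            return false;
        }
    }
    return A.BlendState == B.BlendState && A.DepthState == B.DepthState;
}
```

The ordering only pays off if the assumption about which attributes vary most actually holds for the workload, so it is worth profiling before reordering.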
#codereview michael.trepka
[CL 2493570 by Mark Satterthwaite in Main branch]
- This is especially important on OS X OpenGL where these hitches can be equivalent to several frames, so it is enabled by default for this platform.
- The logging of draw calls is turned on/off with r.UseShaderDrawLog and incurs a fixed overhead as all RHI state is tracked & recorded into the shader cache.
- The actual predrawing is turned on/off with r.UseShaderPredraw and requires an existing cache to be useful. Predrawing takes place at frame-end and can be controlled with r.PredrawBatchTime, which is the time in ms to spend predrawing each frame until complete, or -1 to do it in one go (currently the default).
- The shader cache is now also versioned with the game able to supply a version distinct from the engine, so that old caches may be invalidated if/when required.
- The shader cache is now compatible with OpenGL SM4 & SM5 on Windows & probably Linux too.
- It may not work correctly with desktop ES2 emulation as that is still untested.
- To make it easier to add to other RHIs, the API has been made simpler to avoid lots of exposed branches.
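The per-frame time budget described for r.PredrawBatchTime could look something like the sketch below. RunPredrawBatchSketch and the job queue are hypothetical names for illustration, not the engine's API; the sketch also assumes at least one job is run per call so that a very small budget still makes progress.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Runs queued predraw jobs at frame-end until the time budget is used up.
// BudgetMs < 0 means "no limit", mirroring the -1 setting described above.
// Returns the number of jobs executed by this call.
inline std::size_t RunPredrawBatchSketch(std::deque<std::function<void()>>& Pending, double BudgetMs)
{
    using FClock = std::chrono::steady_clock;
    const FClock::time_point Start = FClock::now();
    std::size_t Executed = 0;

    while (!Pending.empty())
    {
        if (BudgetMs >= 0.0 && Executed > 0)
        {
            const std::chrono::duration<double, std::milli> Elapsed = FClock::now() - Start;
            if (Elapsed.count() >= BudgetMs)
            {
                break; // out of budget; the remaining jobs wait for the next frame
            }
        }

        std::function<void()> Job = std::move(Pending.front());
        Pending.pop_front();
        Job();
        ++Executed;
    }
    return Executed;
}
```

Calling this once per frame with a positive budget spreads the predraw cost over several frames, while -1 drains the whole queue in a single call.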
#codereview michael.trepka, dmitry.rekman
[CL 2460109 by Mark Satterthwaite in Main branch]
- The console variable "r.UseShaderCaching" controls whether the cache should be used.
- The cache has to be initialised and shut down by the RHI.
- All RHI shader types now contain the SHA hash of the compiled source when the cache is in use so that they can be identified efficiently.
- All shaders are fetched from the cache rather than being created on-demand.
- On shader deserialisation the new shader is sent to the render thread for creation & caching.
- After each shader is cached all complete bound-shader-states that include that shader are constructed and cached.
- Wrapped OpenGL glCreateShader & glCreateProgram to use name-caching as these operations can synchronise the entire OpenGL pipeline on OS X.
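As a rough sketch of the hash-keyed caching flow described above: shaders are stored under their hash once created, and any known bound-shader-state whose shaders are all present is then constructed. FShaderCacheSketch, the integer handles and the vertex/pixel-only pairing are simplifying assumptions for illustration, not the actual cache interface.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using FHashSketch = std::array<uint8_t, 20>;                      // SHA-1 sized digest
using FBoundStateKeySketch = std::pair<FHashSketch, FHashSketch>; // (vertex, pixel)

class FShaderCacheSketch
{
public:
    // Records a vertex/pixel pairing known from an earlier run.
    void AddKnownPairing(const FHashSketch& Vertex, const FHashSketch& Pixel)
    {
        KnownPairings.emplace_back(Vertex, Pixel);
    }

    // Stores a newly created shader (Handle stands in for the RHI object), then
    // constructs every known bound state that includes it and is now complete.
    // Returns how many bound states were constructed.
    int CacheShader(const FHashSketch& Hash, int Handle)
    {
        Shaders[Hash] = Handle;
        int Built = 0;
        for (const FBoundStateKeySketch& Pairing : KnownPairings)
        {
            const bool bInvolvesShader = Pairing.first == Hash || Pairing.second == Hash;
            const bool bComplete = Shaders.count(Pairing.first) != 0 && Shaders.count(Pairing.second) != 0;
            if (bInvolvesShader && bComplete && BoundStates.count(Pairing) == 0)
            {
                BoundStates[Pairing] = std::make_pair(Shaders[Pairing.first], Shaders[Pairing.second]);
                ++Built;
            }
        }
        return Built;
    }

    // Looks a shader up by hash; returns -1 when it has not been cached.
    int FindShader(const FHashSketch& Hash) const
    {
        const auto It = Shaders.find(Hash);
        return It != Shaders.end() ? It->second : -1;
    }

    bool HasBoundState(const FHashSketch& Vertex, const FHashSketch& Pixel) const
    {
        return BoundStates.count(std::make_pair(Vertex, Pixel)) != 0;
    }

private:
    std::map<FHashSketch, int> Shaders;
    std::vector<FBoundStateKeySketch> KnownPairings;
    std::map<FBoundStateKeySketch, std::pair<int, int>> BoundStates;
};
```

Keying on the hash means identical compiled shaders map to a single cache entry, and the linear scan over pairings is kept only for brevity.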
[CL 2423996 by Mark Satterthwaite in Main branch]