The page allocation part is a fairly minor benefit in most scenes but can occasionally make a big difference.
Projection early-out is a pretty uniform benefit all the time, especially with SMRT.
#preflight 612d1b836a14cc000118f03a
#rb ola.olsson
#ROBOMERGE-SOURCE: CL 17388425 via CL 17389280
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)
[CL 17389330 by andrew lauritzen in ue5-release-engine-test branch]
Add cvar for including non-nanite geometry in coarse pages (so that one can disable for performance reasons). Default on.
Some general unused flags cleanup.
#rb rune.stubbe
#preflight 610ccc0e6c6eb0000196b59e
#ROBOMERGE-SOURCE: CL 17085350 via CL 17095081
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v853-17066230)
[CL 17095354 by andrew lauritzen in ue5-release-engine-test branch]
Run a compute job that packs the most commonly used instance data (LocalToWorld matrix and some other bits, 80 bytes) into a per-instance vertex buffer. The vertex shader does not have access to GPUScene and instead loads instance data from the per-instance vertex buffer. If it needs more primitive/instance data than is available there, it loads it from the Primitive UB, which binds a unique uniform buffer and breaks auto-instancing. The pixel shader has full access to GPUScene.
There are 3 ways FSceneDataIntermediates gets populated:
1. PrimitiveId + GPUScene (Desktop)
2. Per-Instance data + Primitive UB (Mobile)
3. Primitive UB (auto-instancing disabled)
Details for GPUScene-specific vertex inputs and access to FSceneDataIntermediates are hidden behind macros:
VF_GPUSCENE_DECLARE_INPUT_BLOCK
VF_GPUSCENE_GET_INTERMEDIATES
FSceneDataIntermediates is now stored in FVertexFactoryIntermediates and FMaterialVertexParameters. Added a few GetPrimitiveData() overloads that allow you to access PrimitiveData depending on the current context. Removed most of the cases where GetPrimitiveData() gets used with PrimitiveId.
#rb Ola.Ollson
#ROBOMERGE-SOURCE: CL 17093848 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v853-17066230)
[CL 17093856 by dmitriy dyomin in ue5-release-engine-test branch]
- Eliminates the previous frame texture (~128MB in default config) and copy
- Cached pages are kept in place, with a quick pass to find and allocate free pages following
- Experimented with a persistent free list but for the small numbers of pages we are talking about (2k-4k ish standard), recomputing the free list each frame was cheap and robust
- Main cost is synchronization/barriers between the new passes. Fairly low measured overhead compared to the much larger wins, but can revisit later if need be.
- Removed a few things that added complexity but were not making a big difference to performance, especially with the changes:
- No longer support "panning" the directional light cascade cached pages. With the changes to the snapping and depth range invalidations this was no longer making much of a difference anyways.
- No longer do the hierarchical "eliminate parent if all 4 mip children are marked" for local lights. Tested a bit and it was not making a measurable difference in most cases with caching. Can be revived if useful later.
- Moved ownership of the now persistent physical pool to CacheManager
- A bunch of related cleanup and fixes
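The "recompute the free list each frame" approach described above can be sketched roughly as follows. This is a CPU-side illustration only, not the engine's GPU passes: `PhysicalPage`, `BuildFreeList` and `AllocatePages` are hypothetical names, and the single `bCached` flag is an assumed stand-in for whatever per-page state the real pool tracks.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-page state; not an engine type.
struct PhysicalPage {
    bool bCached = false; // page holds data that stays in place this frame
};

// Rebuild the free list from scratch: every page not held by the cache is free.
// With only a few thousand pages, a linear scan per frame is cheap.
std::vector<uint32_t> BuildFreeList(const std::vector<PhysicalPage>& Pool) {
    std::vector<uint32_t> Free;
    Free.reserve(Pool.size());
    for (uint32_t Index = 0; Index < Pool.size(); ++Index) {
        if (!Pool[Index].bCached) {
            Free.push_back(Index);
        }
    }
    return Free;
}

// Hand out free pages to new requests and mark them as in use.
// Returns the number of pages actually allocated (may be fewer than requested).
uint32_t AllocatePages(std::vector<PhysicalPage>& Pool,
                       std::vector<uint32_t>& Free,
                       uint32_t NumRequested,
                       std::vector<uint32_t>& OutAllocated) {
    uint32_t Count = 0;
    while (Count < NumRequested && !Free.empty()) {
        const uint32_t Index = Free.back();
        Free.pop_back();
        assert(!Pool[Index].bCached); // a free page must not hold cached data
        Pool[Index].bCached = true;
        OutAllocated.push_back(Index);
        ++Count;
    }
    return Count;
}
```

Because the list is rebuilt from page state every frame, there is no persistent structure that can drift out of sync with the cache, which is the robustness argument made above.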
#rb ola.olsson
#preflight 61008a655938f90001f04260
#ROBOMERGE-OWNER: andrew.lauritzen
#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 16985509 via CL 16987414
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v838-16927207)
[CL 16987415 by andrew lauritzen in ue5-release-engine-test branch]
#ROBOMERGE-SOURCE: CL 16917622 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)
[CL 16917631 by andrew lauritzen in ue5-release-engine-test branch]
- Add VirtualShadowMapId to forward light data, removing the per-view remapping table. Should fix a few multi-view/split-screen bugs.
- Minor cleanup of PCF subsurface path; remove dead/broken code.
- Fix up blending of light attenuation into screen shadow mask; disable CSM fading when VSMs are enabled.
- Fix OnePassProjection flag when VSMs are disabled
#rb none
[FYI] ola.olsson, brian.karis
#ROBOMERGE-SOURCE: CL 16852407 via CL 16852415
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)
[CL 16852422 by andrew lauritzen in ue5-release-engine-test branch]
- Uses the InstanceCullingLoadBalancer to pre-distribute the work on the CPU to ensure even load.
- Make instance culling use the instance data offset in MDC instead of translating primitive IDs.
- Track single-instance draws separately from instanced to optimize handling (disable culling for single-instance primitives).
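As a rough illustration of pre-distributing instance culling work on the CPU, the sketch below splits runs of instances into work items of bounded size so that each GPU thread group gets a similar amount of work. It is a simplified assumption of the idea, not the actual InstanceCullingLoadBalancer; `InstanceRun`, `WorkItem`, `DistributeRuns` and the `MaxPerItem` parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical input: a contiguous run of instances, addressed directly by
// instance data offset rather than by primitive ID.
struct InstanceRun {
    uint32_t Offset;
    uint32_t Count;
};

// Hypothetical output: one bounded chunk of work for a group of threads.
struct WorkItem {
    uint32_t InstanceOffset;
    uint32_t NumInstances;
};

// Split every run into items of at most MaxPerItem instances, so that a draw
// with many instances does not land on a single group.
std::vector<WorkItem> DistributeRuns(const std::vector<InstanceRun>& Runs,
                                     uint32_t MaxPerItem) {
    assert(MaxPerItem > 0);
    std::vector<WorkItem> Items;
    for (const InstanceRun& Run : Runs) {
        for (uint32_t Done = 0; Done < Run.Count; Done += MaxPerItem) {
            Items.push_back({Run.Offset + Done,
                             std::min(MaxPerItem, Run.Count - Done)});
        }
    }
    return Items;
}
```

For example, runs of 130 and 10 instances with `MaxPerItem = 64` yield four items of 64, 64, 2 and 10 instances.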
#rb Graham.wihlidal,andrew.lauritzen
[FYI] dmitriy.dyomin
#preflight 60d0eafa2ab2180001269160
#ROBOMERGE-SOURCE: CL 16733827 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v835-16672529)
[CL 16733837 by ola olsson in ue5-release-engine-test branch]
- Enables overlapping batched and non-batched instance culling (needed for batching work).
- Removes some explicit transitions & minor cleanup.
- Added tracking the required number of instances (fixes non-nanite VSM for large ISMs)
#rb graham.wihlidal,jian.ru,yujiang.wang,zach.bethel
#preflight 60b73f38107dc600017d931b
[CL 16544217 by Ola Olsson in ue5-main branch]
- Instead, create a reference depth range with a guard band that we reuse frame to frame until we get too close to the edge, at which point we invalidate that clipmap level's cached pages (if any)
- Does not significantly affect invalidations, even when moving quickly near surfaces, since those cases would already get invalidated when those pixels moved to a different clipmap level
- Sets up for not re-copying cached VSM pages every frame
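The guard-band scheme above can be sketched as below. This is a minimal illustration under stated assumptions: `DepthRange`, `CachedDepthRange` and `UpdateReferenceRange` are hypothetical names, and "too close to the edge" is simplified here to "the required range no longer fits inside the reference range".

```cpp
#include <cassert>

// Hypothetical types for illustration; not engine types.
struct DepthRange {
    float Min;
    float Max;
};

struct CachedDepthRange {
    DepthRange Reference{0.0f, 0.0f};
    bool bValid = false;
};

// Reuse the stored reference range while the required range still fits in it.
// Otherwise build a new reference range padded by GuardBand on both sides.
// Returns true when an existing reference range was replaced, meaning the
// clipmap level's cached pages must be invalidated.
bool UpdateReferenceRange(CachedDepthRange& Cache, DepthRange Required,
                          float GuardBand) {
    assert(GuardBand >= 0.0f && Required.Min <= Required.Max);
    const bool bFits = Cache.bValid &&
                       Required.Min >= Cache.Reference.Min &&
                       Required.Max <= Cache.Reference.Max;
    if (bFits) {
        return false; // same reference range, cached pages remain valid
    }
    const bool bHadRange = Cache.bValid;
    Cache.Reference = {Required.Min - GuardBand, Required.Max + GuardBand};
    Cache.bValid = true;
    return bHadRange; // nothing to invalidate on the first frame
}
```

A wider guard band trades depth precision for fewer invalidations, since the reference range survives more camera movement before it has to be rebuilt.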
#rb ola.olsson
[FYI] brian.karis
#ROBOMERGE-SOURCE: CL 16492432 in //UE5/Private-Frosty/...
#ROBOMERGE-BOT: STARSHIP (Private-Frosty -> Main) (v823-16466674)
[CL 16492445 by andrew lauritzen in ue5-main branch]
Use actor label for light identification/selection in editor; fall back on component name (UUID) temporarily until this data is made available outside the editor as well
Collect debug data in RayState for SMRT and return the last valid sample data
- Alternative is to always generate debug data based on just a single VSM sample even when SMRT is enabled, but it's useful to have a debug output path for the true SMRT data, even if a bit more complex
Move some common visualization utilities to /Visualization.ush
Implement but disable backface culling for shadow evaluation pending change that handles various SHADINGMODEL's properly
#rb graham.wihlidal, ola.olsson
#ROBOMERGE-SOURCE: CL 16407102 in //UE5/Private-Frosty/...
#ROBOMERGE-BOT: STARSHIP (Private-Frosty -> Main) (v804-16311228)
[CL 16412421 by andrew lauritzen in ue5-main branch]
- Require 64bit atomic support for Nanite by default now (r.Nanite.RequireAtomic64Support=1)
- Cleaned up a bunch of cvar component reattach callbacks to use lambdas instead of redundant global functions
- Added UStaticMesh::HasValidNaniteData() as a helper to check if an asset has valid Nanite data
- Improved the Nanite-unsupported on-screen error message so that we don't ask the user to update GPU drivers on a console :)
#rb nick.whiting
#jira UETOP-1646
[FYI] brian.karis, michal.valient
#ROBOMERGE-OWNER: graham.wihlidal
#ROBOMERGE-AUTHOR: graham.wihlidal
#ROBOMERGE-SOURCE: CL 16363554 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v804-16311228)
#ROBOMERGE-CONFLICT from-shelf
[CL 16363648 by graham wihlidal in ue5-main branch]
PS already has a permutation (bAtomicWrites) so this primarily affects offline compilation
Leaving the function of NonNaniteVSM the same for now to minimize the surface area of the change, but the shader and other logic can be simplified in the future
NonNaniteVSM can become a dynamic debug cvar in the future once we are happy this is working
#jira UETOP-1088
#lockdown graham.wihlidal
#rb ola.olsson, krzysztof.narkowicz
[FYI] nick.whiting
#ROBOMERGE-SOURCE: CL 16057407 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v789-15992632)
[CL 16061153 by andrew lauritzen in ue5-main branch]
Add virtual clipmap levels near the camera to cover extreme close-ups
Add Nanite view compaction debug output
Make VSM resolution lod bias consistent for directional and local lights
Remove some unnecessary code from ProjectionCommon.ush which gets included in material shaders
#jira UETOP-1088
#rb graham.wihlidal, ola.olsson
[FYI] brian.karis
#lockdown graham.wihlidal
#ROBOMERGE-SOURCE: CL 15994600 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v789-15992632)
[CL 15994621 by andrew lauritzen in ue5-main branch]