- Added RangeBasedCullingDistance to Nanite view & NANITE_VIEW_FLAG_DISTANCE_CULL to indicate if used
- Added light radius to VSM projection data and added culling invalidations based on range also.
- Added unrelated flag & flag field to VSM to reduce shader header change churn (VSM_PROJ_FLAG_CURRENT_DISTANT_LIGHT).
- Added light range test in page marking shader.
Simplify Nanite LOD calculations by performing them in world space instead of local space
- This also allows us no longer precalculate ViewPosScaledLocal and ViewForwardScaledLocal in DynamicData
#rb rune.stubbe
#preflight 62beb9303f0d6beee2da98db
#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 20913913 via CL 20913992 via CL 20913998
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v971-20777995)
[CL 20916199 by ola olsson in ue5-main branch]
- Removed scene render mem-mark among others. MemStack usage is now restricted to local scopes with known marks.
- Render resources with destructors are allocated using the FSceneRenderingBulkObjectAllocator on FSceneRenderer, which is deleted when the scene render is.
#preflight 62b266e20d4d6228de97babe
#rb mihnea.balta, yuriy.odonnell
[CL 20907647 by zach bethel in ue5-main branch]
- Improvements to cube map filtering to eliminate various leaking artifacts at cube edges/corners
- Implement texel dither and optimal bias for local lights
- Remove some half stuff that was causing some sub-optimal compilation
- Fix VSM visualization alignment
- Add a general purpose permutation parameter that can be set for A/B testing projection performance that affects register allocation
- General cleanup
#rb ola.olsson
#preflight 62b350da767c9aaf349e3243
#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 20779131 via CL 20780276 via CL 20780946
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v970-20704180)
[CL 20782054 by andrew lauritzen in ue5-main branch]
The refactor to get rid of the FSceneTextures and FSceneTexturesConfig global singletons has also been preserved. The FViewFamilyInfo is used to hold those structures, so client facing API functions that lack a scene renderer reference can still pull the FSceneTextures[Config] from the view family pointer. All other scene renderer state has been moved from FViewFamilyInfo back into FSceneRenderer.
#jira none
#rnx
#rb ola.olsson krzysztof.narkowicz zach.bethel
#preflight 629f770e233ae0a8f8fb7f2e
[CL 20540730 by jason hoerner in ue5-main branch]
Clients outside Renderer should use the default view, or the self managed FShaderPrintData (improvements to that incoming).
#preflight 629a384c6879a2ac678a889e
[CL 20488523 by Jeremy Moore in ue5-main branch]
* Existing calls to CreateSceneTextureShaderParameters and similar functions use "GetSceneTexturesChecked", which allows for the possibility that they are reached in a code path where scene textures haven't been initialized, and nullptr is returned instead of asserting. The shader parameter setup functions then fill in dummy defaults for that case. The goal was to precisely match the original behavior, which queried the RDG blackboard, and gracefully handled null if scene textures weren't there. This definitely appears to occur in FNiagaraGpuComputeDispatch::ProcessPendingTicksFlush, which can be called with a dummy scene with no scene textures. In the future, I may change this so dummy defaults are filled in for FSceneTextures at construction time, so the structure is never in an uninitialized state, but I would like to set up a test case for the Niagara code path before doing that, and the checks aren't harmful in the meantime.
* I marked as deprecated global functions which query values from FSceneTexturesConfig, but they'll still work with the caveat that if you use multi-view-family rendering, the results will be indeterminate (whatever view family rendered last). There was only one case outside the scene renderer that accessed the globals (depth clear value), which I removed, noting that there is nowhere in the code where we modify the depth clear value from its global default. I would like to permanently deprecate or remove these at some point. Display Cluster is the only code that's currently using the multi-view-family code path, and as a new (still incomplete) feature, third party code can't be using it, and won't be affected.
#jira NONE
#rb chris.kulla zach.bethel mihnea.balta
#preflight 6261aca76119a1a496bd2644
[CL 19873983 by jason hoerner in ue5-main branch]
* Added "BeginRenderingViewFamilies" render interface call that accepts multiple view families. Original "BeginRenderingViewFamily" falls through to this.
* FSceneRenderer modified to include an array of view families, plus an active view family and the Views for that family.
* Swap ViewFamily to ActiveViewFamily.
* Swap Views array from TArray<FViewInfo> to TArrayView<FViewInfo>, including where the Views array is passed to functions.
* FSceneRenderer iterates over the view families, rendering each one at a time, as separate render graph executions.
* Some frame setup and cleanup logic outside the render graph runs once.
* Moved stateful FSceneRenderer members to FViewFamilyInfo, to preserve existing one-at-a-time view family rendering behavior.
* Display Cluster (Virtual Production) uses new API.
Next step will push everything into one render graph, which requires handling per-family external resources and cleaning up singletons (like FSceneTextures and FSceneTexturesConfig). Once that's done, we'll be in a position to further interleave rendering, properly handle once per frame work, and solve artifacts in various systems.
#jira none
#rnx
#rb zach.bethel
#preflight 625df821b21bb49791d377c9
[CL 19813996 by jason hoerner in ue5-main branch]
Add cvar to force full HZB update for debugging
Force invalidation of static pages for any non-movable objects, even if they currently "draw velocity" due to editor movement
- Fixes shadows staying in the original location when dragging static objects in editor
- This is also step 1 to having a more heuristic static/dynamic per-prim split where things can go from one bucket to the other
#preflight 62585cb86e2c50550f07a394
#rb jamie.hayes
[CL 19766198 by andrew lauritzen in ue5-main branch]
- Avoid gotchas with max texture size when static separate enabled
- Simplify addressing logic in a number of places
- Avoid allocating extra HZB that we never use
Details:
- Support rendering/sampling to 2D depth texture array in Nanite and virtual shadow map pass
- Remove some unnecessary HZB-related cvars
- Remove unused permutations from VSM HW raster
#preflight 624f4e5611261bc7b2171208
#rb jamie.hayes
#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 19679616 via CL 19679656 via CL 19679706
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v938-19570697)
[CL 19680680 by andrew lauritzen in ue5-main branch]
- Refactor Nanite instance / cluster culling to accommodate VSM HZB tests without blowing out code size.
- Consolidate VSM HZB testing code to one flexible path.
- Always tests against static cached (when static separate is enabled) as this provides the best coverage.
- Currently performs a full HZB rebuild, which is expensive.
#rb rune.stubbe
#fyi andrew.lauritzen
#preflight 62309454e65a7e65d6855209
#robomerge fnnc
[CL 19384824 by Ola Olsson in ue5-main branch]
Reductions *per material*:
SM5
--
FHWRasterizeVS: 832 -> 21
FHWRasterizePS: 104 -> 39
SM6
--
FHWRasterizeVS: 320 -> 9
FHWRasterizeMS: 640 -> 9
FHWRasterizePS: 120 -> 30
Vulkan
--
FHWRasterizeVS: 320 -> 9
FHWRasterizePS: 40 -> 15
Other platforms redacted =)
-- Details
* CLUSTER_PER_PAGE has been fully removed (since we no longer ever run CLUSTER_PER_PAGE=0), which now makes it mutually inclusive with VIRTUAL_TEXTURE_TARGET
* HAS_RASTER_BIN has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* ADD_CLUSTER_OFFSET has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* HAS_PREV_DRAW_DATA has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* NEAR_CLIP (only change to significantly affect codegen) has been turned into a dynamic branch based on FNaniteView - this lets us merge depth clip/clamp rasterizer calls in VSM together instead of relying on HAS_PREV_DRAW_DATA, and a future optimization can now be done to merge local and directional light full Nanite pipeline calls together.
* VISUALIZE permutation removed from VS/MS since it only loaded unform values that passed down per-vertex into fragment stage as nointerpolation parameters. Pixel shader now constructs this uint2 directly under the VISUALIZE permutation
* NANITE_MESH_SHADER_INTERP removed by default but still left in the code, since it is a work in progress potential optimization for DX12 mesh shaders
* Removed explicit Lumen and VSM usage of NANITE_RENDER_FLAG_HAVE_PREV_DRAW_DATA (now the dynamic branch path is only taken if CullRasterizeMultiPass implicitly breaks the rasterization into multiple calls due to NANITE_MAX_VIEWS_PER_CULL_RASTERIZE_PASS overflow)
Performance was tested on a 2080Ti in AncientGame, and the delta is effectively noise (tested cached and uncached VSM). Further testing on other platforms will occur, but important to get this change in for all the benefits and easy to tweak things later if needed.
#rb rune.stubbe
#fyi brian.karis, ola.olsson, andrew.lauritzen, jamie.hayes, daniel.wright, krzysztof.narkowicz
#preflight 622e684c7e2e35638c96a16a
#robomerge FNNC
[CL 19370372 by graham wihlidal in ue5-main branch]
- also fix construction of top part of VSM HZB.
#rb rune.stubbe
#robomerge FNNC
#fyi andrew.lauritzen
#preflight 6220a20ad059c6be6c8ca719
[CL 19241653 by Ola Olsson in ue5-main branch]
This avoids to define STRATA_ENABLED as 'env. define' for all global shaders needing a Strata specific path.
#rb none
#jira none
#fyi sebastien.hillaire
#preflight 62151b25797dbbeb472ae2eb
[CL 19074999 by Charles deRousiers in ue5-main branch]