Commit Graph

14 Commits

Author SHA1 Message Date
ola olsson
4adb667865 Moved VSM stats to shared header and implemented piping all of them through to csv (if the stat category is enabled).
#rb andrew.lauritzen

[CL 33823518 by ola olsson in ue5-main branch]
2024-05-22 06:06:43 -04:00
andrew lauritzen
ce2d5721b7 Move some of the invalidation tracking back to per-primitive and on the CPU since we ended up doing WPO stuff slightly differently.
This lets us get rid of the GPU static->dynamic pass that was non-trivially expensive on scenes with many instances, and is currently unnecessary.
Behavior should be the same as before, just with the logic moved back to the CPU and per-primitive (which that part of it functionally still was).

Swap invalidation dispatch back to single per group and go back to expanding on the CPU to avoid redundantly hitting cases that were already CPU culled to specific lights. More of these will be present with Ola's recent changes.

#rb jamie.hayes

[CL 30676139 by andrew lauritzen in ue5-main branch]
2024-01-17 19:35:33 -05:00
andrew lauritzen
0ba82e077c Support nanite overdraw visualization in VSMs
#rb graham.wihlidal

[CL 27798209 by andrew lauritzen in ue5-main branch]
2023-09-12 13:19:05 -04:00
andrew lauritzen
2a901165d1 Make VSM cache more persistent and robust against multiple renders in a single "frame" (scene captures, etc).
Remove multiple cache managers that were implemented to work around issues with the previous naive implementation
Fixes some invalidation issues in cases where multiple managers existed previously

NOTE: This change brings pages from scene captures and ndisplay and similar systems back into the main physical page pool, so it may be necessary in some of these cases to increase the page pool size.

#rb ola.olsson jason.hoerner

[CL 26352061 by andrew lauritzen in ue5-main branch]
2023-06-30 15:15:06 -04:00
andrew lauritzen
090b64dcc2 Significant changes and refactoring to VSM physical page allocation to support more persistent allocations:
- Support keeping pages that are not requested in a given frame, disabled to start but will be enabled soon. r.Shadow.Virtual.Cache.KeepOnlyRequestedPages
- Support LRU allocation of new pages. Only particularly meaningful with persistent allocations. Also disabled to start. r.Shadow.Virtual.Cache.AllocateViaLRU
- Lots of cleanup and refactoring of page allocation passes and shaders.
- Move page marking shaders into their own file as they are relatively independent from the allocation/caching and have dependencies on other systems (hair, water, etc)
- Clean up various cases of cache enabled/available/etc.
- Some minor cleanup of invalidations in prep for future work

#rb ola.olsson
#jira UE-147454
#preflight 642cad58da7f9583709d3172

[CL 24932187 by andrew lauritzen in ue5-main branch]
2023-04-05 13:48:25 -04:00
jamie hayes
ee7a64c5b3 - Implement "WPO Disable Distance" for non-Nanite static meshes, and unify FMaterialVertexParameters.bEvaluateWorldPositionOffset for all vertex factories
- Add instance draw distance and global clip plane culling to GPU instance culling and VSM culling shaders to prevent VS invocations for invisible instances.
- Additional clean-up to GPU Scene data and culling code.

#rb ola.olsson
#preflight 640a5bf67e654e2e6543a98c

[CL 24584209 by jamie hayes in ue5-main branch]
2023-03-09 17:51:49 -05:00
jamie hayes
e54c43ad9b [Backout] - CL24579289 and CL24579327
Original CL Desc
-----------------------------------------------------------------
- Implement "WPO Disable Distance" for non-Nanite static meshes, and unify `FMaterialVertexParameters.bEvaluateWorldPositionOffset` for all vertex factories
- Add instance draw distance and global clip plane culling to GPU instance culling and VSM culling shaders to prevent VS invocations for invisible instances.
- Additional clean-up to GPU Scene data and culling code.

#rb ola.olsson
#preflight 6407c662ba12ba6416b8245b

[CL 24581209 by jamie hayes in ue5-main branch]
2023-03-09 15:30:06 -05:00
jamie hayes
ea4847a60c Fix compile errors due to missing file in previous changelist.
#preflight skip

[CL 24579327 by jamie hayes in ue5-main branch]
2023-03-09 13:38:30 -05:00
ola olsson
d077efdf50 Dirty-page marking skips dynamic pages and merges only when static is dirty; also initialize dynamic pages from cached static pages, when available.
#preflight 6321b8948838676d103d3e75
#rb andrew.lauritzen
#preflight 632ae18af87253e02153cd56

[CL 22201428 by ola olsson in ue5-main branch]
2022-09-27 02:14:18 -04:00
ola olsson
a9b46eb202 Dirty page tracking for non-nanite implementation & WPO-invalidations from Nanite
- bMaterialMayModifyPosition -> bMaterialUsesWorldPositionOffset for non-nanite, we don't want to invalidate due to PDO
- Nanite instance culling records static primitives that invalidate the VSM (so they get cached as dynamic)
- Dirty page flags now store the invalidation as well (static & dynamic) so 3x in size
- Nanite VSM instance/cluster culling uses PRIMITIVE_SCENE_DATA_FLAG_EVALUATE_WORLD_POSITION_OFFSET to drive bHasMoved and invalidation.
- Non-nanite instance culling outputs dirty page flags for invalidating instances, with in-group load balancing for large footprints
- Store invalidation flags in physical page metadata flags (removed the cumulative dirty flags buffer).
- Added bAnyMaterialHasWorldPositionOffset (accessor AnyMaterialHasWorldPositionOffset()) to FPrimitiveSceneProxy.
- Driving PRIMITIVE_SCENE_DATA_FLAG_EVALUATE_WORLD_POSITION_OFFSET from bAnyMaterialHasWorldPositionOffset in addition to EvaluateWorldPositionOffset.
- Removed near clip permutation for non-nanite VSM culling shader.
- Non-nanite vsm raster passes are all batched in a single RDG pass to better allow overlap between draws and lower pass overhead.
- Removed old GPU->GPU invalidation logic.
- Removed dynamic caster flags and update the physical page metadata directly
- Renamed PRIMITIVE_SCENE_DATA_FLAG_SHOULD_CACHE_SHADOW -> PRIMITIVE_SCENE_DATA_FLAG_SHOULD_CACHE_SHADOW

#rb andrew.lauritzen,jamie.hayes
#jira UE-147061
#preflight 631b0a18a60c539c98cf1308

[CL 21973831 by ola olsson in ue5-main branch]
2022-09-12 18:06:57 -04:00
ola olsson
6c563cf263 Batch of improvements to page management, that help improve passes of page allocation.
Implemented single-page page table support for distant lights
- store only a single page table entry for distant lights
- modify page lookup logic in various places to handle this

Implemented override behavior to render everything to dynamic pages for a light that always invalidates using r.Shadow.Virtual.Cache.ForceInvalidateClipmaps (behave as uncached, despite caching being enabled).
This brings performance to par with uncached rendering by removing various overheads that are not achieving anything for this case.
- Added a new flag to the nanite view to indicate if it is uncached VSM_PROJ_FLAG_UNCACHED, currently driven by the cvar r.Shadow.Virtual.Cache.ForceInvalidateClipmaps
- If this flag is set on a view ShouldCacheInstanceAsStatic, which now takes a nanite view, returns false, causing all rendering to go to the dynamic pages.
- To preserve HZB functionality, the HZB build is modified to load from the dynamic depth pages (normally it uses the static)
- The page initializer (that clears depth) skips static pages for uncached views as they will not be used.
- Finally the page merge pass that combines static & dynamic depth into the dynamic page also skips pages from uncached views.

Optimized page allocation pass by storing the actual pages needing allocation from the cache init pass.
Optimized hierarchical page flag generation by dispatching over physical pages instead of virtual.
Fixed dynamic primitive cache invalidation logic.

#jira UE-122102
#rb andrew.lauritzen
#preflight 63087c2592620e5ec3aa3f2f

[CL 21590421 by ola olsson in ue5-main branch]
2022-08-26 11:06:04 -04:00
ola olsson
0b7b093519 Added distance based instance/cluster culling for views in Nanite to improve local light rendering (measured ~20% perf improvement for local lights - highly content dependent).
- Added RangeBasedCullingDistance to Nanite view & NANITE_VIEW_FLAG_DISTANCE_CULL to indicate if used
- Added light radius to VSM projection data and added culling invalidations based on range also.
- Added unrelated flag & flag field to VSM to reduce shader header change churn (VSM_PROJ_FLAG_CURRENT_DISTANT_LIGHT).
- Added light range test in page marking shader.

Simplify Nanite LOD calculations by performing them in world space instead of local space
- This also allows us to no longer precalculate ViewPosScaledLocal and ViewForwardScaledLocal in DynamicData

#rb rune.stubbe
#preflight 62beb9303f0d6beee2da98db

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 20913913 via CL 20913992 via CL 20913998
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v971-20777995)

[CL 20916199 by ola olsson in ue5-main branch]
2022-07-01 12:47:13 -04:00
andrew lauritzen
bd803d7e43 SMRT improvements:
- Add slope-based depth extrapolation which improves the quality of penumbras on angled receivers. Costs ~10% performance in some cases so maintaining a permutation/cvar (default on) for scalability.
- Change screen ray trace to be a simple "space skipping" ray that terminates as soon as it goes behind geometry and continue VSM trace from that distance. This avoids various contact-shadow-like artifacts and undesirable/inconsistent contact shadows from things that aren't in the VSM. In certain cases if regular contact shadows are desired on top of VSM the engine contact shadows can be enabled, as it is with CSMs.
- Remove a bunch of use of "halfs" in the shaders as they cause some extra ALU on some platforms and don't appear to really be helping with occupancy anymore
- Small bump to minimum normal bias clamp (only affects things very close to the camera)

#rb brian.karis
[FYI] ola.olsson

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 19411300 via CL 19411627
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v928-19376421)

[CL 19413239 by andrew lauritzen in ue5-main branch]
2022-03-16 17:30:21 -04:00
andrew lauritzen
a936f96e04 Add clipmap address space visualization and hook up to (advanced) visualization stuff.
Move visualize enum/defines to shared header file.
Rename a few cvars for consistency.

#rb graham.wihlidal
[FYI] ola.olsson
#preflight 61f9d8921d7ca8ed2d628ccd

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18819821 in //UE5/Release-5.0/... via CL 18819825 via CL 18822904
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v910-18824042)

[CL 18825054 by andrew lauritzen in ue5-main branch]
2022-02-02 08:19:08 -05:00