64 Commits

Author SHA1 Message Date
jason hoerner
b19bb6be2f UE5_MAIN: Multi-view-family scene renderer refactor, part 1. Major structural change to allow scene renderer to accept multiple view families, with otherwise negligible changes in internal behavior.
* Added "BeginRenderingViewFamilies" render interface call that accepts multiple view families.  Original "BeginRenderingViewFamily" falls through to this.
* FSceneRenderer modified to include an array of view families, plus an active view family and the Views for that family.
* Swap ViewFamily to ActiveViewFamily.
* Swap Views array from TArray<FViewInfo> to TArrayView<FViewInfo>, including where the Views array is passed to functions.
* FSceneRenderer iterates over the view families, rendering each one at a time, as separate render graph executions.
* Some frame setup and cleanup logic outside the render graph runs once.
* Moved stateful FSceneRenderer members to FViewFamilyInfo, to preserve existing one-at-a-time view family rendering behavior.
* Display Cluster (Virtual Production) uses new API.
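
Illustrative sketch only (not engine code): the structure described in the bullets above might look roughly like the following. `SceneRenderer`, `ViewFamilyInfo`, `ViewInfo`, `ArrayView` and `FrameSetupCount` are hypothetical stand-ins chosen for this example. The real FSceneRenderer, FViewFamilyInfo and TArrayView differ in many details. The point is only that the renderer owns storage for all views, and that `Views` becomes a non-owning window onto the active family's slice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the engine's real types.
struct ViewInfo { int Id; };

// Minimal non-owning view over contiguous elements (the role TArrayView plays).
template <typename T>
struct ArrayView {
    T* Data;
    std::size_t Num;
    ArrayView() : Data(nullptr), Num(0) {}
    ArrayView(T* InData, std::size_t InNum) : Data(InData), Num(InNum) {}
    T* begin() const { return Data; }
    T* end() const { return Data + Num; }
};

// Per-family state lives here instead of on the renderer itself.
struct ViewFamilyInfo {
    std::size_t FirstView = 0;  // offset of this family's views in AllViews
    std::size_t NumViews = 0;
};

class SceneRenderer {
public:
    // Appends a family and its views to the renderer's storage.
    void AddFamily(const std::vector<ViewInfo>& FamilyViews) {
        ViewFamilyInfo Family;
        Family.FirstView = AllViews.size();
        Family.NumViews = FamilyViews.size();
        Families.push_back(Family);
        AllViews.insert(AllViews.end(), FamilyViews.begin(), FamilyViews.end());
    }

    // Renders the families one at a time; returns view ids in render order.
    std::vector<int> Render() {
        std::vector<int> RenderedIds;
        ++FrameSetupCount;  // frame setup runs once, not once per family
        for (ViewFamilyInfo& Family : Families) {
            ActiveViewFamily = &Family;
            assert(Family.FirstView + Family.NumViews <= AllViews.size());
            // Views is re-pointed at the active family's slice; nothing is copied.
            Views = ArrayView<ViewInfo>(AllViews.data() + Family.FirstView, Family.NumViews);
            for (const ViewInfo& View : Views) {
                RenderedIds.push_back(View.Id);  // stand-in for one family's render work
            }
        }
        ActiveViewFamily = nullptr;
        Views = ArrayView<ViewInfo>();
        return RenderedIds;
    }

    int FrameSetupCount = 0;

private:
    std::vector<ViewFamilyInfo> Families;
    std::vector<ViewInfo> AllViews;
    ViewFamilyInfo* ActiveViewFamily = nullptr;
    ArrayView<ViewInfo> Views;
};
```

The view is assigned only at render time, after all families have been added, so growth of the backing vector cannot invalidate it.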

Next step will push everything into one render graph, which requires handling per-family external resources and cleaning up singletons (like FSceneTextures and FSceneTexturesConfig).  Once that's done, we'll be in a position to further interleave rendering, properly handle once per frame work, and solve artifacts in various systems.

#jira none
#rnx
#rb zach.bethel
#preflight 625df821b21bb49791d377c9

[CL 19813996 by jason hoerner in ue5-main branch]
2022-04-19 14:45:26 -04:00
Arciel Rekman
700c39e0e7 Fix VSM debug view being the same for the right eye.
#rb Andrew.Lauritzen
#review @Andrew.Lauritzen, @Robert.Srinivasiah, @Jules.Blok
#jira none
#preflight 625642c45f20a0a34d8fbaf7

[CL 19734289 by Arciel Rekman in ue5-main branch]
2022-04-12 23:44:41 -04:00
andrew lauritzen
70a2837739 Move static separate cache to second texture array slice rather than "below" in UV space:
- Avoid gotchas with max texture size when static separate enabled
- Simplify addressing logic in a number of places
- Avoid allocating extra HZB that we never use

Details:
- Support rendering/sampling to 2D depth texture array in Nanite and virtual shadow map pass
- Remove some unnecessary HZB-related cvars
- Remove unused permutations from VSM HW raster

#preflight 624f4e5611261bc7b2171208
#rb jamie.hayes

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 19679616 via CL 19679656 via CL 19679706
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v938-19570697)

[CL 19680680 by andrew lauritzen in ue5-main branch]
2022-04-07 18:36:13 -04:00
ola olsson
b86b9f1d70 Add tracking dirty VSM pages during Nanite rendering to reduce HZB build cost
- Make VSM HZB persistent, reallocated on pool size change.
- Enable two-pass HZB by default for Nanite VSM (r.Shadow.Virtual.UseHZB == 2)

#rb andrew.lauritzen
#preflight 623b1c6088538cd45e0ce824

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 19478487 via CL 19481406 via CL 19481553
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v936-19480137)

[CL 19484029 by ola olsson in ue5-main branch]
2022-03-23 15:54:41 -04:00
Ola Olsson
245fe2702a Implement two-pass HZB occlusion culling for VSM (Nanite path) enabled through r.Shadow.Virtual.UseHZB = 2
- Refactor Nanite instance / cluster culling to accommodate VSM HZB tests without blowing out code size.
- Consolidate VSM HZB testing code to one flexible path.
- Always tests against static cached (when static separate is enabled) as this provides the best coverage.
- Currently performs a full HZB rebuild, which is expensive.
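
Illustrative sketch only: the general two-pass HZB occlusion idea can be shown with a CPU toy in one dimension. This is not the Nanite/VSM implementation, which runs on the GPU and works per page. All names below (`Instance`, `BuildHzb`, `IsOccluded`, `CullTwoPass`) are hypothetical, larger depth is assumed to mean farther away, and the buffer width is assumed to be a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D toy. An instance covers texels [X0, X1] at a single depth.
struct Instance { int X0; int X1; float Depth; };

// Hzb[mip][texel] stores the farthest depth of the texels beneath it.
using Hzb = std::vector<std::vector<float>>;

// Full rebuild of the mip chain from a depth buffer (width must be a power of two).
Hzb BuildHzb(const std::vector<float>& Depth) {
    assert(!Depth.empty() && (Depth.size() & (Depth.size() - 1)) == 0);
    Hzb Mips;
    Mips.push_back(Depth);
    while (Mips.back().size() > 1) {
        const std::size_t HalfWidth = Mips.back().size() / 2;
        std::vector<float> Next(HalfWidth);
        for (std::size_t I = 0; I < HalfWidth; ++I) {
            Next[I] = std::max(Mips.back()[2 * I], Mips.back()[2 * I + 1]);
        }
        Mips.push_back(Next);
    }
    return Mips;
}

// Conservative test: pick a mip where the footprint spans at most two texels,
// then compare the instance depth with the farthest occluder depth found there.
bool IsOccluded(const Hzb& Mips, const Instance& Inst) {
    std::size_t Mip = 0;
    int X0 = Inst.X0;
    int X1 = Inst.X1;
    while (X1 - X0 > 1 && Mip + 1 < Mips.size()) {
        X0 >>= 1;
        X1 >>= 1;
        ++Mip;
    }
    const float Farthest = std::max(Mips[Mip][X0], Mips[Mip][X1]);
    return Inst.Depth > Farthest;
}

// Keeps the nearest depth per texel.
void Draw(std::vector<float>& Depth, const Instance& Inst) {
    for (int X = Inst.X0; X <= Inst.X1; ++X) {
        Depth[X] = std::min(Depth[X], Inst.Depth);
    }
}

// Pass 1 tests against the previous HZB; pass 2 retests the rejects against an
// HZB rebuilt from what pass 1 drew. Returns instance indices in draw order.
std::vector<std::size_t> CullTwoPass(const std::vector<Instance>& Instances,
                                     const Hzb& PreviousHzb,
                                     std::vector<float>& Depth) {
    std::vector<std::size_t> Drawn;
    std::vector<std::size_t> Retest;
    for (std::size_t I = 0; I < Instances.size(); ++I) {
        if (IsOccluded(PreviousHzb, Instances[I])) {
            Retest.push_back(I);
        } else {
            Draw(Depth, Instances[I]);
            Drawn.push_back(I);
        }
    }
    const Hzb CurrentHzb = BuildHzb(Depth);
    for (std::size_t I : Retest) {
        if (!IsOccluded(CurrentHzb, Instances[I])) {
            Draw(Depth, Instances[I]);
            Drawn.push_back(I);
        }
    }
    return Drawn;
}
```

The second pass is what keeps stale occlusion data from hiding geometry that has become visible: anything rejected by the old HZB gets a second chance against depth that was actually rendered this frame.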

#rb rune.stubbe
#fyi andrew.lauritzen
#preflight 62309454e65a7e65d6855209
#robomerge fnnc

[CL 19384824 by Ola Olsson in ue5-main branch]
2022-03-15 10:05:21 -04:00
graham wihlidal
25f8e29153 Removed a massive number of Nanite rasterizer shader permutations across all platforms/shaderdbs, significantly improving iteration times for the editor and cooker, especially when these numbers get multiplied by the number of materials that utilize programmable features in addition to the default material "fixed function" path.
Reductions *per material*:

SM5
--
FHWRasterizeVS: 832 -> 21
FHWRasterizePS: 104 -> 39

SM6
--
FHWRasterizeVS: 320 -> 9
FHWRasterizeMS: 640 -> 9
FHWRasterizePS: 120 -> 30

Vulkan
--
FHWRasterizeVS: 320 -> 9
FHWRasterizePS: 40 -> 15

Other platforms redacted =)

-- Details

* CLUSTER_PER_PAGE has been fully removed (since we no longer ever run CLUSTER_PER_PAGE=0), which now makes it mutually inclusive with VIRTUAL_TEXTURE_TARGET
* HAS_RASTER_BIN has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* ADD_CLUSTER_OFFSET has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* HAS_PREV_DRAW_DATA has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* NEAR_CLIP (only change to significantly affect codegen) has been turned into a dynamic branch based on FNaniteView - this lets us merge depth clip/clamp rasterizer calls in VSM together instead of relying on HAS_PREV_DRAW_DATA, and a future optimization can now be done to merge local and directional light full Nanite pipeline calls together.
* VISUALIZE permutation removed from VS/MS since it only loaded uniform values that were passed down per-vertex into the fragment stage as nointerpolation parameters. The pixel shader now constructs this uint2 directly under the VISUALIZE permutation
* NANITE_MESH_SHADER_INTERP removed by default but still left in the code, since it is a work in progress potential optimization for DX12 mesh shaders
* Removed explicit Lumen and VSM usage of NANITE_RENDER_FLAG_HAVE_PREV_DRAW_DATA (now the dynamic branch path is only taken if CullRasterizeMultiPass implicitly breaks the rasterization into multiple calls due to NANITE_MAX_VIEWS_PER_CULL_RASTERIZE_PASS overflow)

Performance was tested on a 2080Ti in AncientGame, and the delta is effectively noise (tested cached and uncached VSM). Further testing on other platforms will occur, but it is important to get this change in for all the benefits, and it is easy to tweak things later if needed.

#rb rune.stubbe
#fyi brian.karis, ola.olsson, andrew.lauritzen, jamie.hayes, daniel.wright, krzysztof.narkowicz
#preflight 622e684c7e2e35638c96a16a
#robomerge FNNC

[CL 19370372 by graham wihlidal in ue5-main branch]
2022-03-13 23:18:25 -04:00
andrew lauritzen
c79193c915 Many LWC fixes for virtual shadow maps:
- Shadow PreViewTranslation and ClipmapOrigin become full LWC tile/offset values on the GPU
- In most cases, the camera's and shadow's PreViewTranslations can be subtracted on the GPU to produce a regular-range value to transform from PrimaryView.TranslatedWorld to ShadowView.TranslatedWorld
- Minor cleanup and improvements to SMRT trace loop
- Remove special case for ortho matrices disabling PreViewTranslation in FViewMatrices
- Remove broken static function local and associated cvar r.PreViewTranslation

#preflight 6205dd571404d0fef964d721
#jira UE-139824
#rb graham.wihlidal
#lockdown juan.canada

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18956693 in //UE5/Release-5.0/... via CL 18956877 via CL 18957087
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v917-18934589)

[CL 18958948 by andrew lauritzen in ue5-main branch]
2022-02-11 14:57:27 -05:00
andrew lauritzen
a936f96e04 Add clipmap address space visualization and hook up to (advanced) visualization stuff.
Move visualize enum/defines to shared header file.
Rename a few cvars for consistency.

#rb graham.wihlidal
[FYI] ola.olsson
#preflight 61f9d8921d7ca8ed2d628ccd

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18819821 in //UE5/Release-5.0/... via CL 18819825 via CL 18822904
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v910-18824042)

[CL 18825054 by andrew lauritzen in ue5-main branch]
2022-02-02 08:19:08 -05:00
andrew lauritzen
482b3e6acf Reimplement clipmap panning to reduce full-level cache invalidations during camera movement
#rb graham.wihlidal
[FYI] ola.olsson
#jira UE-140434
#preflight 61f885a8a6632a34f3603419

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18805354 in //UE5/Release-5.0/... via CL 18807961 via CL 18821749
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)

[CL 18822110 by andrew lauritzen in ue5-main branch]
2022-02-02 02:18:54 -05:00
andrew lauritzen
e241229c79 Fix compile issue
#rb trivial
#preflight trivial

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18764046 in //UE5/Release-5.0/... via CL 18764257 via CL 18764430
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)

[CL 18764518 by andrew lauritzen in ue5-main branch]
2022-01-27 17:52:44 -05:00
andrew lauritzen
ef2fd132f7 Add VSM projection visualizations to the editor menu
#rb graham.wihlidal
[FYI] ola.olsson
#preflight 61f1a689fc74f46b5645b225

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18757572 in //UE5/Release-5.0/... via CL 18759665 via CL 18760682
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)

[CL 18760914 by andrew lauritzen in ue5-main branch]
2022-01-27 15:49:31 -05:00
ola olsson
e833c86840 Add max distance threshold cvar (r.Shadow.Virtual.MaxMaterialPositionInvalidationRange) for stopping VSM invalidations from WPO (and PDO)
#rb andrew.lauritzen
[FYI] matt.oztalay
#preflight 61e93101cc5594132e061db9

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 18672635 in //UE5/Release-5.0/... via CL 18672638 via CL 18672641
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v901-18665521)

[CL 18672642 by ola olsson in ue5-main branch]
2022-01-20 04:59:27 -05:00
andrew lauritzen
7960520813 Encode hierarchical page flags inside bits of page flags mip structure.
- Eliminates the additional HPageFlags buffer and associated scalar array indexing in constant buffer
- Unifies addressing logic and helpers (effectively now the addressing is just MipLevel + HMipLevel)
- Small reduction in memory
Move PageFlags and PageRectBounds into the VSM uniform buffer - similar to the page table - to avoid needing to individually funnel them through various interfaces that need to check page overlap
Rename nanitestats VSM_Perspective to VSM_Local for consistency with other cvars
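
Illustrative sketch only: one plausible reading of "hierarchical flags inside bits of the page flags mip structure" is that each entry keeps its own flags in the low bits and an OR-reduction of everything beneath it in a second bit field, so no separate buffer is needed. The bit layout, the 1D mip chain and the names `kFlagMask`, `kHierarchicalShift`, `PropagateHierarchicalFlags` and `AnyFlagAtOrBelow` are assumptions made for this example, not the engine's actual encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: bits 0-3 hold the entry's own flags,
// bits 4-7 hold the OR of all flags found in finer mips beneath it.
constexpr std::uint32_t kFlagMask = 0xFu;
constexpr std::uint32_t kHierarchicalShift = 4;

// Mips[0] is the finest level and each level is half as wide as the one before.
// On input only the low (own) flag bits are expected to be set.
void PropagateHierarchicalFlags(std::vector<std::vector<std::uint32_t>>& Mips) {
    for (std::size_t Mip = 1; Mip < Mips.size(); ++Mip) {
        assert(Mips[Mip].size() * 2 == Mips[Mip - 1].size());
        for (std::size_t X = 0; X < Mips[Mip].size(); ++X) {
            std::uint32_t Below = 0;
            for (std::size_t Child = 2 * X; Child < 2 * X + 2; ++Child) {
                const std::uint32_t Entry = Mips[Mip - 1][Child];
                // A child contributes its own flags plus whatever lies beneath it.
                Below |= (Entry & kFlagMask) | ((Entry >> kHierarchicalShift) & kFlagMask);
            }
            Mips[Mip][X] |= Below << kHierarchicalShift;
        }
    }
}

// True if the flag is set on this entry or on anything beneath it.
bool AnyFlagAtOrBelow(std::uint32_t Entry, std::uint32_t Flag) {
    return ((Entry | (Entry >> kHierarchicalShift)) & Flag & kFlagMask) != 0;
}
```

With a layout like this, a coarse lookup can reject a whole region with a single read, and the same address computation serves both the plain and the hierarchical query.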

#rb ola.olsson
#preflight 61e5e57ea2616066f68f3453

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18642391 in //UE5/Release-5.0/... via CL 18642432 via CL 18642483
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v900-18638592)

[CL 18642585 by andrew lauritzen in ue5-main branch]
2022-01-18 13:05:54 -05:00
ola olsson
e856b953f3 Implement batching of instance culling for non-nanite VSM, all local lights are processed as one dispatch, also:
- Factor out the batching setup & merging from the deferred instance culling
- Refactor the instance culling load balancer to be able to share types between different allocators
- Make batching logic use scene rendering allocator
- Rename RenderVirtualShadowMapsHw to RenderVirtualShadowMapsNonNanite (and similar names)
- Refactor view setup logic for Nanite & Non into common functions

#rb andrew.lauritzen

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 18293379 via CL 18373760 via CL 18373855
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18373879 by ola olsson in ue5-release-engine-test branch]
2021-12-03 16:38:33 -05:00
andrew lauritzen
1dfa763b1f - Generalize one pass projection shadow mask to support up to 32 lights/pixel with 4bpp quantization
- Add some dither noise to both the SMRT result and the shadow mask lookup to minimize banding
- Fall back to a single sample VSM lookup (with a generous static bias) when overflowing the number of lights in one pass projection path
- Fix clamping issue with page dilation that was setting extraneous pages with point lights
- Fix SMRT issue with local lights jammed right next to geometry viewed at a distance
- Separate settings for page dilation for local and directional lights
- Add simple debug output for # lights in one pass projection
- Remove some dead code/parameters
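
Illustrative sketch only: 32 lights at 4 bits each is 128 bits per pixel, which fits in four 32-bit words. The packing below shows that arithmetic on the CPU. The names (`PackedShadowMask`, `Quantize4`, `StoreShadow`, `LoadShadow`) and the choice of round-to-nearest are assumptions for the example. The commit's dither noise and the single-sample fallback for overflow are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int kMaxLightsPerPixel = 32;
constexpr int kBitsPerLight = 4;
static_assert(kMaxLightsPerPixel * kBitsPerLight == 128, "mask is 128 bits per pixel");

// Four 32-bit words hold all 32 four-bit shadow values for one pixel.
using PackedShadowMask = std::array<std::uint32_t, 4>;

// Maps a shadow factor in [0, 1] onto 16 levels (0..15).
std::uint32_t Quantize4(float Shadow) {
    const float Clamped = std::fmin(std::fmax(Shadow, 0.0f), 1.0f);
    return static_cast<std::uint32_t>(std::lround(Clamped * 15.0f));
}

// Overwrites the 4-bit slot that belongs to the given light index.
void StoreShadow(PackedShadowMask& Mask, int Light, float Shadow) {
    assert(Light >= 0 && Light < kMaxLightsPerPixel);
    const int Bit = Light * kBitsPerLight;
    const int Word = Bit / 32;
    const int Shift = Bit % 32;
    Mask[Word] = (Mask[Word] & ~(0xFu << Shift)) | (Quantize4(Shadow) << Shift);
}

// Reads the slot back as a factor in [0, 1].
float LoadShadow(const PackedShadowMask& Mask, int Light) {
    assert(Light >= 0 && Light < kMaxLightsPerPixel);
    const int Bit = Light * kBitsPerLight;
    const std::uint32_t Value = (Mask[Bit / 32] >> (Bit % 32)) & 0xFu;
    return static_cast<float>(Value) / 15.0f;
}
```

Sixteen levels per light is coarse, which is consistent with the commit adding dither noise to hide banding.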

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18279117 via CL 18373418 via CL 18373449
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18373485 by andrew lauritzen in ue5-release-engine-test branch]
2021-12-03 16:23:04 -05:00
andrew lauritzen
2a232eb1f8 Fix indirect HZB generation for static pages
Minor cleanup to logic to try and avoid similar issues

#rb graham.wihlidal
[FYI] ola.olsson
#preflight 6156462e9dc4c50001365202
#lockdown michal.valient

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 17685329 via CL 17967455 via CL 18366106 via CL 18366202
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18366310 by andrew lauritzen in ue5-release-engine-test branch]
2021-12-03 01:39:26 -05:00
ola olsson
1744646e6d Add building VSM HZB for only pages that are allocated, using indirect-dispatch, saves around 0.4ms in typical views.
#rb andrew.lauritzen
#preflight 615468dcf4d2a400010d35c1
#lockdown michal.valient

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 17675060 via CL 17966250 via CL 18365947 via CL 18366073
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18366176 by ola olsson in ue5-release-engine-test branch]
2021-12-03 01:31:24 -05:00
ola olsson
053ef3d05f Make VSM physical page initialization and merge use indirect dispatch to only operate on needed pages.
#rb Andrew.Lauritzen
#preflight 614392d1b5a4fa000169fd47
#lockdown michal.valient

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 17550967 via CL 17945639 via CL 18363730 via CL 18363958
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18364052 by ola olsson in ue5-release-engine-test branch]
2021-12-02 23:09:13 -05:00
jon nabozny
87abd3f7ad Initial implementation of separate static/dynamic VSM page caching to lower cost of dynamic invalidations
- Disabled by default pending some additional optimization, but showing promising initial results

#rb ola.olsson
[FYI] brian.karis, rune.stubbe
#preflight 614262b39bba9a0001a9ee58
#lockdown michal.valient

#ROBOMERGE-OWNER: jon.nabozny
#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 17528599 via CL 17943875 via CL 18363706 via CL 18363947
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18364037 by jon nabozny in ue5-release-engine-test branch]
2021-12-02 23:08:02 -05:00
jon nabozny
a6136b2a60 Occlusion cull instances drawing into non-nanite VSM (controlled by r.Shadow.Virtual.NonNanite.UseHZB, default mode 2)
- Added stats for non-nanite VSM instance culling (moved VSM stats functionality into own file).
- r.Shadow.Virtual.NonNanite.UseHZB == 2 (default) uses the current-frame Nanite VSM HZB as this enables correct culling for camera cuts & light movement and contains most of the occluding geometry.

#rb Andrew.Lauritzen
#preflight 6138a6582d09b90001568819

#ROBOMERGE-OWNER: jon.nabozny
#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 17470225 via CL 17923040 via CL 18360986 via CL 18361244
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18361409 by jon nabozny in ue5-release-engine-test branch]
2021-12-02 18:25:13 -05:00
aurel cordonnier
fc542f6cfd Merge from Release-Engine-Staging @ 18081189 to Release-Engine-Test
This represents UE4/Main @18073326, Release-5.0 @18081140 and Dev-PerfTest @18045971

[CL 18081471 by aurel cordonnier in ue5-release-engine-test branch]
2021-11-07 23:43:01 -05:00
jeannoe morissette
17b84d83db Fix all cases of single scalar in shader parameter arrays to respect 16 byte alignment for Vulkan.
Add static_assert to prevent the creation of new ones moving forward.
Used SHADER_PARAMETER_SCALAR_ARRAY/GET_SCALAR_ARRAY_ELEMENT for single parameters, or packed them with surrounding parameters when possible.
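
Illustrative sketch only: under std140-style uniform buffer layout rules, every array element is padded to a 16-byte stride, so an array of single scalars wastes three quarters of its space unless four scalars are packed into each 16-byte element. The C++ analog below shows the packing and the index math (`Index / 4` selects the element, `Index % 4` selects the component). `Float4`, `PaddedScalar` and `PackedScalarArray` are hypothetical names and are not the engine macros.

```cpp
#include <cassert>
#include <cstddef>

// One 16-byte element, the minimum array stride assumed here.
struct alignas(16) Float4 { float V[4]; };
static_assert(sizeof(Float4) == 16, "Float4 must be exactly 16 bytes");

// What a lone scalar costs as an array element once padded to 16 bytes.
struct alignas(16) PaddedScalar { float Value; };
static_assert(sizeof(PaddedScalar) == 16, "a padded scalar occupies a full 16-byte slot");

// N scalars stored four per 16-byte element.
template <std::size_t N>
struct PackedScalarArray {
    Float4 Data[(N + 3) / 4];

    float& operator[](std::size_t Index) {
        assert(Index < N);
        return Data[Index >> 2].V[Index & 3];
    }
};

// Six scalars: 32 bytes when packed, 96 bytes with one scalar per padded slot.
static_assert(sizeof(PackedScalarArray<6>) == 32, "packed size");
static_assert(sizeof(PaddedScalar[6]) == 96, "padded size");
```

The compile-time size checks mirror the commit's approach of adding a static_assert so that new unpacked scalar arrays fail to build.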

#rb Guillaume.Abadie,Daniel.Wright,Charles.deRousiers
#preflight 61577bf15631d900011d59a1

#ROBOMERGE-AUTHOR: jeannoe.morissette
#ROBOMERGE-SOURCE: CL 17707027 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v879-17706426)
#ROBOMERGE[STARSHIP]: UE5-Release-Engine-Staging Release-5.0

[CL 17707037 by jeannoe morissette in ue5-release-engine-test branch]
2021-10-04 09:14:58 -04:00
andrew davidson
57beb335f2 Merging //UE5/Dev-LargeWorldCoordinates [at] 17581892 to //UE5/Main
#ROBOMERGE-AUTHOR: andrew.davidson
#ROBOMERGE-SOURCE: CL 17595295 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v871-17566257)

[CL 17595306 by andrew davidson in ue5-release-engine-test branch]
2021-09-22 10:01:48 -04:00
ola olsson
c78ae71b1f Replace side-effect invalidation during instance/cluster cull with a buffer appended to at instance cull.
- preparatory step to enable HZB culling of invalidations in a uniform way.
- also add FComputeShaderUtils helper to set up an indirect dispatch.

#rb andrew.lauritzen
#preflight 6130818017a8610001b0cfc7

#ROBOMERGE-SOURCE: CL 17400532 via CL 17400838
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17400862 by ola olsson in ue5-release-engine-test branch]
2021-09-02 07:13:18 -04:00
andrew lauritzen
ead3650569 Virtual shadow maps: early out on pixels backfacing the light in both projection and page allocation
Page allocation part is a fairly minor benefit in most scenes but can occasionally make a big difference.
Projection early-out is a pretty uniform benefit all the time, especially with SMRT.
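
Illustrative sketch only: the early-out amounts to checking the sign of the dot product between the surface normal and the direction to the light before doing any expensive shadow work. The function below is a CPU stand-in with hypothetical names (`Vec3`, `Dot`, `ProjectShadow`). The exact threshold and the value returned for backfacing pixels in the real shaders are assumptions.

```cpp
#include <cassert>
#include <functional>

struct Vec3 { float X; float Y; float Z; };

float Dot(const Vec3& A, const Vec3& B) {
    return A.X * B.X + A.Y * B.Y + A.Z * B.Z;
}

// Returns a light visibility factor in [0, 1]. Pixels facing away from the
// light receive none of it, so the costly lookup (for example a multi-sample
// ray trace through the shadow map) is skipped for them entirely.
float ProjectShadow(const Vec3& Normal,
                    const Vec3& DirectionToLight,
                    const std::function<float()>& ExpensiveShadowLookup) {
    if (Dot(Normal, DirectionToLight) <= 0.0f) {
        return 0.0f;  // backfacing: early out (assumed threshold of zero)
    }
    return ExpensiveShadowLookup();
}
```

The same test can gate page allocation: a backfacing pixel never samples the shadow map, so it has no reason to request a page.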

#preflight 612d1b836a14cc000118f03a
#rb ola.olsson

#ROBOMERGE-SOURCE: CL 17388425 via CL 17389280
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17389330 by andrew lauritzen in ue5-release-engine-test branch]
2021-09-01 13:13:01 -04:00