138 Commits

Author SHA1 Message Date
andrew lauritzen
4bdb2cf471 Add debug visualization for VSM caching bin and some minor cleanup
#rb graham.wihlidal
#preflight 614909d4116f2a00017632c4
[FYI] ola.olsson
#lockdown michal.valient

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 17578872 via CL 17947144 via CL 18363889 via CL 18364001
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18364100 by andrew lauritzen in ue5-release-engine-test branch]
2021-12-02 23:12:09 -05:00
andrew lauritzen
a9a04f3a20 Enable indirect clear/merge by default
#rb ola.olsson
#lockdown Michal.Valient
#preflight 6145187dbf494a0001a5d993

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 17559918 via CL 17946265 via CL 18363853 via CL 18363979
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18364075 by andrew lauritzen in ue5-release-engine-test branch]
2021-12-02 23:10:49 -05:00
ola olsson
053ef3d05f Make VSM physical page initialization and merge use indirect dispatch to only operate on needed pages.
#rb Andrew.Lauritzen
#preflight 614392d1b5a4fa000169fd47
#lockdown michal.valient

#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 17550967 via CL 17945639 via CL 18363730 via CL 18363958
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18364052 by ola olsson in ue5-release-engine-test branch]
2021-12-02 23:09:13 -05:00
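
Note on the commit above: the general idea of "indirect dispatch over needed pages only" can be sketched on the CPU as compacting the pages that need work into a list and deriving the thread-group counts from its length. This is a minimal sketch under stated assumptions, not the engine's implementation: `DispatchIndirectArgs`, `PageWork`, `BuildPageWork` and `PagesPerGroup` are hypothetical names, and in practice the list and the arguments would be written by a GPU pass.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Three group counts, the layout that indirect dispatch arguments use.
struct DispatchIndirectArgs {
    std::uint32_t X, Y, Z;
};

// Hypothetical result: compacted page list plus the matching dispatch size.
struct PageWork {
    std::vector<std::uint32_t> PageIndices;
    DispatchIndirectArgs Args;
};

// Collect only the pages flagged as needing work and size the dispatch
// so that each group handles up to PagesPerGroup entries of the list.
PageWork BuildPageWork(const std::vector<bool>& bNeedsWork, std::uint32_t PagesPerGroup) {
    assert(PagesPerGroup > 0);
    PageWork Work{};
    for (std::uint32_t Index = 0; Index < bNeedsWork.size(); ++Index) {
        if (bNeedsWork[Index]) {
            Work.PageIndices.push_back(Index);
        }
    }
    const std::uint32_t Count = static_cast<std::uint32_t>(Work.PageIndices.size());
    // Round up so a partially filled last group is still dispatched.
    Work.Args = {(Count + PagesPerGroup - 1) / PagesPerGroup, 1, 1};
    return Work;
}
```

With no flagged pages the X count is zero, so the dispatch does no work at all, which is the point of sizing it from the list rather than from the pool.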
jon nabozny
87abd3f7ad Initial implementation of separate static/dynamic VSM page caching to lower cost of dynamic invalidations
- Disabled by default pending some additional optimization, but showing promising initial results

#rb ola.olsson
[FYI] brian.karis, rune.stubbe
#preflight 614262b39bba9a0001a9ee58
#lockdown michal.valient

#ROBOMERGE-OWNER: jon.nabozny
#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 17528599 via CL 17943875 via CL 18363706 via CL 18363947
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18364037 by jon nabozny in ue5-release-engine-test branch]
2021-12-02 23:08:02 -05:00
jon nabozny
a6136b2a60 Occlusion cull instances drawing into non-nanite VSM (controlled by r.Shadow.Virtual.NonNanite.UseHZB, default mode 2)
- Added stats for non-nanite VSM instance culling (moved VSM stats functionality into own file).
- r.Shadow.Virtual.NonNanite.UseHZB == 2 (default) uses the current-frame Nanite VSM HZB as this enables correct culling for camera cuts & light movement and contains most of the occluding geometry.

#rb Andrew.Lauritzen
#preflight 6138a6582d09b90001568819

#ROBOMERGE-OWNER: jon.nabozny
#ROBOMERGE-AUTHOR: ola.olsson
#ROBOMERGE-SOURCE: CL 17470225 via CL 17923040 via CL 18360986 via CL 18361244
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18361409 by jon nabozny in ue5-release-engine-test branch]
2021-12-02 18:25:13 -05:00
jeannoe morissette
17b84d83db Fix all cases of single scalar in shader parameter arrays to respect 16 byte alignment for Vulkan.
Add static_assert to prevent the creation of new ones moving forward.
Used SHADER_PARAMETER_SCALAR_ARRAY/GET_SCALAR_ARRAY_ELEMENT for single parameters, or packed them with surrounding parameters when possible.

#rb Guillaume.Abadie,Daniel.Wright,Charles.deRousiers
#preflight 61577bf15631d900011d59a1

#ROBOMERGE-AUTHOR: jeannoe.morissette
#ROBOMERGE-SOURCE: CL 17707027 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v879-17706426)
#ROBOMERGE[STARSHIP]: UE5-Release-Engine-Staging Release-5.0

[CL 17707037 by jeannoe morissette in ue5-release-engine-test branch]
2021-10-04 09:14:58 -04:00
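
Note on the commit above: a rough illustration of why single scalars in parameter arrays are a problem. Under std140-style uniform buffer layout, each array element occupies a 16-byte slot, so a tightly packed CPU-side array of floats does not match what the shader reads. One common remedy is to pack four scalars into each 16-byte element and index into them. The sketch below shows that idea only; `Float4`, `PackedScalarArray` and its members are hypothetical names and do not reproduce the engine macros mentioned in the commit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 16-byte element standing in for a float4-style vector.
struct alignas(16) Float4 {
    float V[4];
};
static_assert(sizeof(Float4) == 16, "each array element must fill one 16-byte slot");

// Hypothetical container: N scalars stored in ceil(N / 4) 16-byte elements.
template <std::size_t N>
struct PackedScalarArray {
    static constexpr std::size_t NumVectors = (N + 3) / 4;
    std::array<Float4, NumVectors> Data{};

    // Scalar Index lives in component (Index % 4) of vector (Index / 4).
    float& operator[](std::size_t Index) {
        assert(Index < N);
        return Data[Index / 4].V[Index % 4];
    }
};

// Compile-time guard in the spirit of the static_assert the commit mentions.
static_assert(sizeof(PackedScalarArray<5>) % 16 == 0, "array size must be a multiple of 16 bytes");
```

Five scalars therefore take two 16-byte elements (32 bytes) instead of five, and the layout stays identical on both sides of the CPU/GPU boundary.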
andrew davidson
57beb335f2 Merging //UE5/Dev-LargeWorldCoordinates @ 17581892 to //UE5/Main
#ROBOMERGE-AUTHOR: andrew.davidson
#ROBOMERGE-SOURCE: CL 17595295 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v871-17566257)

[CL 17595306 by andrew davidson in ue5-release-engine-test branch]
2021-09-22 10:01:48 -04:00
aurel cordonnier
7f517562d5 Merge from Release-Engine-Staging @ 17438845 to Release-Engine-Test
This represents UE4/Main @17430120 and Dev-PerfTest @17437669

[CL 17439044 by aurel cordonnier in ue5-release-engine-test branch]
2021-09-06 12:23:53 -04:00
ola olsson
c78ae71b1f Replace side-effect invalidation during instance/cluster cull with a buffer appended to at instance cull.
- preparatory step to enable HZB culling of invalidations in a uniform way.
- also add FComputeShaderUtils helper to set up an indirect dispatch.

#rb andrew.lauritzen
#preflight 6130818017a8610001b0cfc7

#ROBOMERGE-SOURCE: CL 17400532 via CL 17400838
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17400862 by ola olsson in ue5-release-engine-test branch]
2021-09-02 07:13:18 -04:00
charles derousiers
8baac25269 Add full/partial/all tile classification for finer dispatch.
Add generation for indirect arg buffer for raytracing indirect dispatch.

#rb none
#preflight 61307b511a52e20001966113

#ROBOMERGE-OWNER: charles.derousiers
#ROBOMERGE-AUTHOR: charles.derousiers
#ROBOMERGE-SOURCE: CL 17399381 via CL 17400082
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17400113 by charles derousiers in ue5-release-engine-test branch]
2021-09-02 05:43:40 -04:00
andrew lauritzen
ead3650569 Virtual shadow maps: early out on pixels backfacing the light in both projection and page allocation
Page allocation part is a fairly minor benefit in most scenes but can occasionally make a big difference.
Projection early-out is a pretty uniform benefit all the time, especially with SMRT.

#preflight 612d1b836a14cc000118f03a
#rb ola.olsson

#ROBOMERGE-SOURCE: CL 17388425 via CL 17389280
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17389330 by andrew lauritzen in ue5-release-engine-test branch]
2021-09-01 13:13:01 -04:00
charles derousiers
a91bbfa21e Change and clean up hair tile resources to share buffers, in preparation for adding full/partially covered tile support later.
#rb none
#preflight 612f604d677f0e0001866d50

#ROBOMERGE-OWNER: charles.derousiers
#ROBOMERGE-AUTHOR: charles.derousiers
#ROBOMERGE-SOURCE: CL 17385390 via CL 17387630
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v865-17346139)

[CL 17387665 by charles derousiers in ue5-release-engine-test branch]
2021-09-01 11:46:13 -04:00
graham wihlidal
29b4ac53b7 Implemented a new payload data allocator and buffer in GPU Scene that will be used to allow instances to dynamically allocate optional side-car data rather than always paying a fixed high watermark cost in the scene data layout (often features are never used by most instances).
#rb brian.karis
[FYI] christopher.waters, ola.olsson, krzysztof.narkowicz
#preflight 6123e56d8ff55400011df401
[FYI] Dmitriy.Dyomin

#ROBOMERGE-OWNER: graham.wihlidal
#ROBOMERGE-AUTHOR: graham.wihlidal
#ROBOMERGE-SOURCE: CL 17275101 via CL 17276332
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v858-17259218)

[CL 17276338 by graham wihlidal in ue5-release-engine-test branch]
2021-08-23 16:47:16 -04:00
andrew lauritzen
eebbe18a26 Add flag to mark page requests/allocated pages as excluding non-nanite geometry.
Add cvar for including non-nanite geometry in coarse pages (so that one can disable for performance reasons). Default on.
Some general unused flags cleanup.

#rb rune.stubbe
#preflight 610ccc0e6c6eb0000196b59e

#ROBOMERGE-SOURCE: CL 17085350 via CL 17095081
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v853-17066230)

[CL 17095354 by andrew lauritzen in ue5-release-engine-test branch]
2021-08-07 15:19:34 -04:00
dmitriy dyomin
b0828d6beb Mobile specific implementation for auto-instancing. (disabled by default atm)
Run a compute job that packs the most commonly used instance data (LocalToWorld matrix and some other bits, 80 bytes) into a per-instance vertex buffer. The vertex shader does not have access to GPUScene and instead loads instance data from the per-instance vertex buffer. If it needs more primitive/instance data than is available, it loads it from the Primitive UB, which binds a unique uniform buffer and breaks auto-instancing. The pixel shader has full access to GPUScene.
There are 3 ways FSceneDataIntermediates gets populated:
 1. PrimitiveId + GPUScene (Desktop)
 2. Per-Instance data + Primitive UB (Mobile)
 3. Primitive UB (auto-instancing disabled)
Details for GPUScene specific vertex inputs and access to FSceneDataIntermediates are hidden behind a macro:
VF_GPUSCENE_DECLARE_INPUT_BLOCK
VF_GPUSCENE_GET_INTERMEDIATES
FSceneDataIntermediates is now stored in FVertexFactoryIntermediates and FMaterialVertexParameters. Added a few GetPrimitiveData() overloads that allow access to PrimitiveData depending on the current context. Removed most of the cases where GetPrimitiveData() is used with PrimitiveId.
#rb Ola.Ollson

#ROBOMERGE-SOURCE: CL 17093848 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v853-17066230)

[CL 17093856 by dmitriy dyomin in ue5-release-engine-test branch]
2021-08-07 07:20:52 -04:00
ola olsson
18895c0987 Add clear passes to fix capture replay issues on certain consoles.
#jira UE-118792
#rb andrew.lauritzen
#preflight 6101ad5f4cd7930001962d6e

#ROBOMERGE-SOURCE: CL 16994350 via CL 16994353
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v838-16927207)

[CL 16994355 by ola olsson in ue5-release-engine-test branch]
2021-07-29 07:49:11 -04:00
andrew lauritzen
33ca74fba3 Implement persistent shadow physical page pool
- Eliminates the previous frame texture (~128MB in default config) and copy
- Cached pages are kept in place, with a quick pass to find and allocate free pages following
  - Experimented with a persistent free list but for the small numbers of pages we are talking about (2k-4k ish standard), recomputing the free list each frame was cheap and robust
  - Main cost is synchronization/barriers between the new passes. Fairly low measured overhead compared to the much larger wins, but can revisit later if need be.
- Removed a few things that added complexity but were not making a big difference to performance, especially with the changes:
  - No longer support "panning" the directional light cascade cached pages. With the changes to the snapping and depth range invalidations this was no longer making much of a difference anyways.
  - No longer do the hierarchical "eliminate parent if all 4 mip children are marked" for local lights. Tested a bit and it was not making a measurable difference in most cases with caching. Can be revived if useful later.
- Moved ownership of the now persistent physical pool to CacheManager
- A bunch of related cleanup and fixes

#rb ola.olsson
#preflight 61008a655938f90001f04260

#ROBOMERGE-OWNER: andrew.lauritzen
#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 16985509 via CL 16987414
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v838-16927207)

[CL 16987415 by andrew lauritzen in ue5-release-engine-test branch]
2021-07-28 17:00:37 -04:00
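
Note on the commit above: the "recompute the free list each frame" approach can be sketched on the CPU as follows. Cached pages stay where they are, everything else is gathered into a fresh free list, and new requests are served from that list. This is a minimal sketch under stated assumptions: `PhysicalPage`, `BuildFreeList` and `AllocatePages` are hypothetical names, and the actual passes run on the GPU.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-page state for a persistent physical page pool.
struct PhysicalPage {
    bool bCached = false; // page still holds valid data from an earlier frame
};

// Rebuild the free list from scratch: every page that is not cached is free.
std::vector<std::uint32_t> BuildFreeList(const std::vector<PhysicalPage>& Pool) {
    std::vector<std::uint32_t> FreeList;
    for (std::uint32_t Index = 0; Index < Pool.size(); ++Index) {
        if (!Pool[Index].bCached) {
            FreeList.push_back(Index);
        }
    }
    return FreeList;
}

// Serve up to NumRequested pages from the free list.
// Returns the allocated page indices; fewer than requested if the pool is full.
std::vector<std::uint32_t> AllocatePages(std::vector<PhysicalPage>& Pool,
                                         std::vector<std::uint32_t>& FreeList,
                                         std::uint32_t NumRequested) {
    std::vector<std::uint32_t> Allocated;
    while (Allocated.size() < NumRequested && !FreeList.empty()) {
        const std::uint32_t Page = FreeList.back();
        FreeList.pop_back();
        Pool[Page].bCached = true; // the page now holds data for this frame onward
        Allocated.push_back(Page);
    }
    return Allocated;
}
```

Because the list is derived from the page states every frame, there is no persistent free-list state that could drift out of sync with the pool, which is one reading of the "cheap and robust" remark for pools of a few thousand pages.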
andrew lauritzen
9e9c6bfa2c Remove unused/no longer working VSM paths
#ROBOMERGE-SOURCE: CL 16917622 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)

[CL 16917631 by andrew lauritzen in ue5-release-engine-test branch]
2021-07-21 19:04:20 -04:00
andrew lauritzen
37ebf8ceb4 Implement simple subsurface shading model for virtual shadow maps and related fixes and improvements:
- Add VirtualShadowMapId to forward light data, removing the per-view remapping table. Should fix a few multi-view/split-screen bugs.
- Minor cleanup of PCF subsurface path; remove dead/broken code.
- Fix up blending of light attenuation into screen shadow mask; disable CSM fading when VSMs are enabled.
- Fix OnePassProjection flag when VSMs are disabled

#rb none
[FYI] ola.olsson, brian.karis

#ROBOMERGE-SOURCE: CL 16852407 via CL 16852415
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)

[CL 16852422 by andrew lauritzen in ue5-release-engine-test branch]
2021-07-14 13:45:31 -04:00
ola olsson
30b717b963 Combine GPU-Scene instance culling and the id-list generation into one step, and do the same for VSM
- Uses the InstanceCullingLoadBalancer to pre-distribute the work on the CPU to ensure even load.
- Make instance culling use the instance data offset in MDC instead of translating primitive IDs.
- Track single-instance draws separately from instanced to optimize handling (disable culling for single-instance primitives).

#rb Graham.wihlidal,andrew.lauritzen
[FYI] dmitriy.dyomin
#preflight 60d0eafa2ab2180001269160

#ROBOMERGE-SOURCE: CL 16733827 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v835-16672529)

[CL 16733837 by ola olsson in ue5-release-engine-test branch]
2021-06-21 16:52:03 -04:00
andrew lauritzen
35f696abf5 Use conservative CPU cull volume for VSM clipmaps to avoid missing non-nanite geometry in cached pages.
Better labeling for some GPU profiling regions.
Fix to frustum vertex buffer usage flag.

#rb ola.olsson

#ROBOMERGE-SOURCE: CL 16675237 via CL 16675244
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v835-16672529)

[CL 16675251 by andrew lauritzen in ue5-release-engine-test branch]
2021-06-15 12:47:24 -04:00
graham wihlidal
f602dcf242 Big cleanup/refactor of InstanceData* (renamed to InstanceSceneData* to match many other places already calling it that, and to disambiguate upcoming changes that add another instance data buffer to GPU Scene for arbitrary data payloads). This change also removes the virtuals on FPrimitiveSceneProxy for the instance list along with lots of copy paste code for all the derived types, and instead makes it a built-in feature of the base proxy (since nearly everything supports GPU Scene instancing now).
#rb ola.olsson
[FYI] brian.karis
#preflight 60c4d5c586ce760001377f2a

#ROBOMERGE-OWNER: graham.wihlidal
#ROBOMERGE-AUTHOR: graham.wihlidal
#ROBOMERGE-SOURCE: CL 16660135 via CL 16660883
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v834-16658389)

[CL 16660909 by graham wihlidal in ue5-release-engine-test branch]
2021-06-14 13:43:26 -04:00
Ola Olsson
91ed2ab3ba Move instance ID buffer to RDG and out of the View uniform buffer and into own UB.
- Enables overlapping batched and non-batched instance culling (needed for batching work).
- Removes some explicit transitions & minor cleanup.
- Added tracking the required number of instances (fixes non-nanite VSM for large ISMs)

#rb graham.wihlidal,jian.ru,yujiang.wang,zach.bethel
#preflight 60b73f38107dc600017d931b

[CL 16544217 by Ola Olsson in ue5-main branch]
2021-06-03 02:19:28 -04:00
andrew lauritzen
e9c2267b53 Fix CPU-side structure to align with padding added to GPU structure.
#rb Charles.deRousiers

#ROBOMERGE-SOURCE: CL 16510863 in //UE5/Private-Frosty/...
#ROBOMERGE-BOT: STARSHIP (Private-Frosty -> Main) (v826-16501804)

[CL 16510864 by andrew lauritzen in ue5-main branch]
2021-05-31 16:52:31 -04:00
andrew lauritzen
c373a4445f Fix depth range precision and offset/clamping issues with VSM cached pages
- Instead, create a reference depth range with a guard band that we reuse frame to frame until we get too close to the edge, at which point we invalidate that clipmap level's cached pages (if any)
- Does not significantly affect invalidations, even when moving quickly near surfaces, since those cases would already get invalidated when those pixels moved to a different clipmap level
- Sets up for not re-copying cached VSM pages every frame

#rb ola.olsson
[FYI] brian.karis

#ROBOMERGE-SOURCE: CL 16492432 in //UE5/Private-Frosty/...
#ROBOMERGE-BOT: STARSHIP (Private-Frosty -> Main) (v823-16466674)

[CL 16492445 by andrew lauritzen in ue5-main branch]
2021-05-27 17:36:27 -04:00
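
Note on the commit above: the guard-band scheme it describes can be sketched as keeping a reference depth center per clipmap level and reusing it while the view stays within a tolerance of it. Once the view drifts past the guard band, the reference is re-centered and that level's cached pages are invalidated. This is a minimal sketch under stated assumptions: `ClipmapDepthCache`, `UpdateDepthReference` and the single-scalar depth model are hypothetical simplifications, not the engine's code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-clipmap-level cache state.
struct ClipmapDepthCache {
    bool bHasReference = false;
    float ReferenceCenter = 0.0f; // depth center reused from frame to frame
};

// Returns true when the cached pages of this level must be invalidated,
// i.e. when the view has moved further than GuardBand from the reference.
bool UpdateDepthReference(ClipmapDepthCache& Cache, float ViewDepthCenter, float GuardBand) {
    assert(GuardBand >= 0.0f);
    const bool bWithinGuardBand =
        Cache.bHasReference &&
        std::fabs(ViewDepthCenter - Cache.ReferenceCenter) <= GuardBand;
    if (bWithinGuardBand) {
        return false; // keep the reference, so cached depths stay comparable
    }
    // Too close to the edge (or no reference yet): re-center and invalidate.
    Cache.bHasReference = true;
    Cache.ReferenceCenter = ViewDepthCenter;
    return true;
}
```

Keeping the reference fixed between invalidations means depths stored in cached pages never need to be offset or re-clamped, which is the precision problem the commit sets out to avoid.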