Commit Graph

81 Commits

Author SHA1 Message Date
Sebastien Hillaire
1ffe7c958d Strata - Refactored indirect dispatches to source from the same indirect draw buffer, thus avoiding running out of UAV slots on DX11.
#rb none
#preflight https://horde.devtools.epicgames.com/job/6230946ce65a7e65d6855346
#fyi charles.derousiers

[CL 19384753 by Sebastien Hillaire in ue5-main branch]
2022-03-15 09:55:16 -04:00
Charles deRousiers
9090280376 Change thin layer to act as a Strata operator.
This simplifies the Strata Slab UI and makes it easier to thin-film coat many BSDFs at the same time.

#rb none
#jira none
#preflight 622fa337c51b66df4c248174
#fyi sebastien.hillaire

[CL 19378172 by Charles deRousiers in ue5-main branch]
2022-03-14 16:46:59 -04:00
graham wihlidal
25f8e29153 Removed a massive number of Nanite rasterizer shader permutations across all platforms/shaderdbs, significantly improving iteration times for the editor and cooker, especially when these numbers get multiplied by the number of materials that utilize programmable features in addition to the default material "fixed function" path.
Reductions *per material*:

SM5
--
FHWRasterizeVS: 832 -> 21
FHWRasterizePS: 104 -> 39

SM6
--
FHWRasterizeVS: 320 -> 9
FHWRasterizeMS: 640 -> 9
FHWRasterizePS: 120 -> 30

Vulkan
--
FHWRasterizeVS: 320 -> 9
FHWRasterizePS: 40 -> 15

Other platforms redacted =)

-- Details

* CLUSTER_PER_PAGE has been fully removed (since we no longer ever run CLUSTER_PER_PAGE=0), which now makes it mutually inclusive with VIRTUAL_TEXTURE_TARGET
* HAS_RASTER_BIN has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* ADD_CLUSTER_OFFSET has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* HAS_PREV_DRAW_DATA has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* NEAR_CLIP (only change to significantly affect codegen) has been turned into a dynamic branch based on FNaniteView - this lets us merge depth clip/clamp rasterizer calls in VSM together instead of relying on HAS_PREV_DRAW_DATA, and a future optimization can now be done to merge local and directional light full Nanite pipeline calls together.
* VISUALIZE permutation removed from VS/MS since it only loaded uniform values that were passed down per-vertex into the fragment stage as nointerpolation parameters. The pixel shader now constructs this uint2 directly under the VISUALIZE permutation
* NANITE_MESH_SHADER_INTERP removed by default but still left in the code, since it is a work in progress potential optimization for DX12 mesh shaders
* Removed explicit Lumen and VSM usage of NANITE_RENDER_FLAG_HAVE_PREV_DRAW_DATA (now the dynamic branch path is only taken if CullRasterizeMultiPass implicitly breaks the rasterization into multiple calls due to NANITE_MAX_VIEWS_PER_CULL_RASTERIZE_PASS overflow)
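
To illustrate why turning compile-time switches into dynamic branches shrinks the counts so sharply, here is a minimal Python sketch. It assumes every switch is an independent boolean permutation dimension, which is a simplification: the real shader types also prune invalid combinations, so this does not reproduce the figures listed above. The switch names are taken from this message; the helper `count_permutations` is hypothetical.

```python
from itertools import product

def count_permutations(dimensions):
    """Number of shader variants if every switch is an independent boolean."""
    return sum(1 for _ in product((False, True), repeat=len(dimensions)))

# Illustrative only: the names come from the commit message, but the real
# permutation domains and their pruning rules are not shown there.
before = ["HAS_RASTER_BIN", "ADD_CLUSTER_OFFSET", "HAS_PREV_DRAW_DATA",
          "NEAR_CLIP", "VISUALIZE"]
# Assumption: all five switches become dynamic branches or move elsewhere.
after = []

print(count_permutations(before))  # 32
print(count_permutations(after))   # 1
```

Each boolean switch that is removed halves the count, and that saving is multiplied again by every material that compiles its own copy of the shader.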

Performance was tested on a 2080Ti in AncientGame, and the delta is effectively noise (tested cached and uncached VSM). Further testing on other platforms will occur, but it is important to get this change in for all the benefits, and it is easy to tweak things later if needed.

#rb rune.stubbe
#fyi brian.karis, ola.olsson, andrew.lauritzen, jamie.hayes, daniel.wright, krzysztof.narkowicz
#preflight 622e684c7e2e35638c96a16a
#robomerge FNNC

[CL 19370372 by graham wihlidal in ue5-main branch]
2022-03-13 23:18:25 -04:00
krzysztof narkowicz
e067e0fe2a Lumen - fixed HLOD1 surface cache. HLOD1 meshes, hidden in main view, weren't rasterized by Nanite into Lumen's surface cache.
#jira UE-144367
#preflight 621f0a9fb20446f11c8295a4
#lockdown Juan.Canada
#rb Daniel.Wright


#ROBOMERGE-OWNER: krzysztof.narkowicz
#ROBOMERGE-AUTHOR: krzysztof.narkowicz
#ROBOMERGE-SOURCE: CL 19222428 via CL 19222851 via CL 19222926 via CL 19222934 via CL 19225794
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v921-19075845)

[CL 19229576 by krzysztof narkowicz in ue5-main branch]
2022-03-02 16:22:29 -05:00
aleksander netzel
d513d2de4e Adding NaniteRayTrace.ush which implements functions needed for Nanite geometry intersection for ray tracing:
* Moller-Trumbore and Watertight triangle intersections in RayTriangleIntersect.h
* AABB intersection
* Nanite cluster intersection
* Nanite hierarchy intersection using a stack.
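
For readers unfamiliar with the first two items, the following is a minimal Python sketch of a Moller-Trumbore ray/triangle test and a slab-based ray/AABB test. It is not the code from NaniteRayTrace.ush or RayTriangleIntersect.h; the function names, the tuple-based vectors, and the epsilon value are assumptions made for illustration.

```python
import math

EPSILON = 1e-9  # assumed tolerance for the parallel-ray check

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle_moller_trumbore(origin, direction, v0, v1, v2):
    """Return the hit distance t along the ray, or None on a miss."""
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < EPSILON:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det   # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t >= 0.0 else None

def ray_aabb_slab(origin, direction, box_min, box_max):
    """Slab test: return (t_near, t_far) for a hit, or None on a miss."""
    t_near, t_far = 0.0, math.inf
    for axis in range(3):
        if direction[axis] == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if origin[axis] < box_min[axis] or origin[axis] > box_max[axis]:
                return None
            continue
        inv = 1.0 / direction[axis]
        t0 = (box_min[axis] - origin[axis]) * inv
        t1 = (box_max[axis] - origin[axis]) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return (t_near, t_far)
```

A stack-based hierarchy traversal would typically call the AABB test on each node popped from the stack and the triangle test only at the leaves.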

#rb Tiago.Costa
#preflight none

[CL 19196324 by aleksander netzel in ue5-main branch]
2022-03-01 05:14:21 -05:00
sebastien lussier
e119711dad #jira UE-138073
Scene proxies having the bRayTracingFarField flag set would never be visible in the editor

Removed the VisibleInRaster logic which caused the HLOD1 meshes (having bRayTracingFarField=true) to be hidden.

#rb krzysztof.narkowicz
#preflight 6215a7db0f71e491cce866fd


#ROBOMERGE-OWNER: sebastien.lussier
#ROBOMERGE-AUTHOR: sebastien.lussier
#ROBOMERGE-SOURCE: CL 19091265 via CL 19092640 via CL 19094403 via CL 19095886 via CL 19105082
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v921-19075845)
#ROBOMERGE-CONFLICT from-shelf

[CL 19146451 by sebastien lussier in ue5-main branch]
2022-02-25 09:35:14 -05:00
rune stubbe
7c08b952b6 Fix for Nanite geometry streaming never settling with nDisplay
#jira UE-142432
#preflight 62154355c06cac272dddcc04
#rb graham.wihlidal


#ROBOMERGE-AUTHOR: rune.stubbe
#ROBOMERGE-SOURCE: CL 19085087 via CL 19092521 via CL 19092615 via CL 19093316 via CL 19101776
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v921-19075845)

[CL 19141615 by rune stubbe in ue5-main branch]
2022-02-24 23:58:13 -05:00
graham wihlidal
4171f6f73f Big rewrite/optimization of Nanite programmable raster pipeline registration with GPUScene on the CPU, added raster bin visualization plus duplicate define code cleanup, added initial WIP two sided material support (will optimize the math shortly), and fixed a number of programmable raster VSM related bugs.
#rb brian.karis
#fyi rune.stubbe, jamie.hayes
#preflight 6216c0b0f8704e8ca91c7dbd
#robomerge FNNC

[CL 19104646 by graham wihlidal in ue5-main branch]
2022-02-23 18:42:05 -05:00
chris kulla
b46d29e962 Implement Sky Atmosphere and Exponential Height Fog support in the Path Tracer
This is the start of volumetric support in the path tracer, so a lot of basic infrastructure had to be put into place, making this changelist fairly large. Some shuffling of light parameters had to be done to make room for the volumetric scattering multiplier.

The integration strategy is to use null tracking (with Spectral MIS) for choosing a random scatter point along the ray. This point is chosen similarly to transparent hits, so surface and volume shading are unified. However, these volume hits are chosen proportionally to transmittance times scattering which is not optimal for lights embedded in the volume. To handle the latter, we also allow the main trace call to return a volume segment over which we can compute the direct lighting from local light sources. To simplify the handling of overlapping media and inter-mixed transparent hits, we stochastically select a ray segment. From this point on, we can evaluate direct lighting using equi-angular sampling. The MIS combination of equi-angular sampling and null tracked spectral density sampling was prototyped but found not to bring any improvement for the currently implemented volume types. This will likely have to be revisited in the future.
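
As a rough illustration of the equi-angular sampling step mentioned above, here is a minimal Python sketch of the standard formulation for a point light and a ray segment. It is not the path tracer's shader code; the function name, argument layout, and the requirement that the light does not lie exactly on the ray are assumptions.

```python
import math

def sample_equiangular(u, ray_origin, ray_dir, light_pos, t_min, t_max):
    """Pick a distance in [t_min, t_max] with density proportional to 1/r^2
    as seen from light_pos. ray_dir must be normalized and the light must
    not lie on the ray. Returns (t, pdf)."""
    to_light = tuple(l - o for l, o in zip(light_pos, ray_origin))
    # Ray parameter of the point on the ray closest to the light.
    delta = sum(a * b for a, b in zip(to_light, ray_dir))
    closest = tuple(o + delta * d for o, d in zip(ray_origin, ray_dir))
    dist = math.dist(closest, light_pos)  # shortest light-to-ray distance
    # Angles subtended by the segment end points, measured at the light.
    theta_a = math.atan2(t_min - delta, dist)
    theta_b = math.atan2(t_max - delta, dist)
    theta = theta_a + u * (theta_b - theta_a)  # uniform in angle
    s = dist * math.tan(theta)                 # offset from the closest point
    pdf = dist / ((theta_b - theta_a) * (dist * dist + s * s))
    return delta + s, pdf
```

Sampling uniformly in the angle subtended at the light concentrates samples near the point of the segment closest to the light, which is where the inverse-square falloff makes direct lighting strongest.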

To improve quality and reduce the amount of ray-marching required, the volume API allows for analytic implementations of transmittance for cases where this is possible to do more efficiently than by ray-marching.

Implementation details for Atmosphere: This volume type is a planet sized, spherically symmetric model. Because the default units in UE are centimeters, objects like the planet that are kilometer sized will run into all sorts of numerical precision artifacts. To solve this, an implementation of double-word arithmetic was added which allows enough decimal digits to robustly intersect the planet, as well as carry out the lookups required. The Transmittance is cached in a lookup table indexed by height above the ground and viewing angle cosine. This is similar to the LUT used by the realtime version but with a different parameterization which covers the full range of angles/heights with high precision. This lookup table is automatically baked on demand when atmosphere parameters change. The volumetric sky model is only used when "reference atmosphere" is enabled in the post process volume. This is because the existing approach (cached into a skylight) is generally a bit faster and supports clouds. This toggle may be removed as the support for volumes matures.
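
The idea behind double-word arithmetic can be sketched in a few lines of Python. This is not the shader implementation: it uses Python's 64-bit floats rather than GPU 32-bit floats, and the helper names `two_sum`, `fast_two_sum`, and `dw_add` are assumptions. A value is stored as an unevaluated (hi, lo) pair so that the rounding error of each addition is kept instead of discarded.

```python
def two_sum(a, b):
    """Error-free addition: returns (s, e) with s = fl(a + b) and
    a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def fast_two_sum(a, b):
    """Same as two_sum, but only valid when |a| >= |b|."""
    s = a + b
    return s, b - (s - a)

def dw_add(x, y):
    """Add two double-word numbers given as (hi, lo) pairs."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return fast_two_sum(s, e)

# Plain floats drop each small increment; the double-word pair keeps it.
big = 1e16
plain = (big + 1.0) + 1.0
pair = dw_add(dw_add((big, 0.0), (1.0, 0.0)), (1.0, 0.0))
print(plain == big)  # True
print(pair)          # (1.0000000000000002e+16, 0.0)
```

The same pattern applies at single precision, where kilometer-scale distances expressed in centimeters exhaust the available mantissa bits much sooner.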

Implementation details for ExponentialHeightFog: This volume is represented as a finite slab centered around the camera. Transmittance is easily computed analytically for this volume. We only add this volume when the volumetric fog checkbox is enabled, as the default parameters are not fully physically based. We limit the fog to be present only within a certain radius of the camera, to prevent rays from scattering forever.
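
To show why this case is easy to handle analytically, here is a minimal Python sketch. It assumes the density falls off as density0 * exp(-falloff * height) and ignores the finite slab and radius limit described above; the function and parameter names are hypothetical, not the engine's. A ray-marched version is included only as a cross-check.

```python
import math

def fog_transmittance(origin_height, dir_z, distance, density0, falloff):
    """Closed-form Beer-Lambert transmittance along a ray through an
    exponentially decaying density field."""
    start = density0 * math.exp(-falloff * origin_height)
    k = falloff * dir_z
    if abs(k) < 1e-8:
        # Horizontal ray: density is constant along it.
        optical_depth = start * distance
    else:
        optical_depth = start * (1.0 - math.exp(-k * distance)) / k
    return math.exp(-optical_depth)

def fog_transmittance_marched(origin_height, dir_z, distance, density0,
                              falloff, steps=2000):
    """Midpoint-rule ray march of the same integral, for comparison."""
    dt = distance / steps
    optical_depth = 0.0
    for i in range(steps):
        height = origin_height + dir_z * (i + 0.5) * dt
        optical_depth += density0 * math.exp(-falloff * height) * dt
    return math.exp(-optical_depth)
```

The analytic form costs one or two exponentials per ray, whereas the marched version needs thousands of density evaluations to reach comparable accuracy.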

#preflight 620ad32d583261b0a66af216
#rb Sebastien.Hillaire,Patrick.Kelly
#preflight 620dd2270931bfd925e5936b

[CL 19031069 by chris kulla in ue5-main branch]
2022-02-17 00:21:09 -05:00
Sebastien Hillaire
b8821a1964 Big Strata core update!
- Completely changed the way we evaluate and represent the material topology.
- A topology is now represented as a tree to allow any horizontal/vertical operations (almost like a kdtree).
- This allows for correct translucent material coverage/transmittance evaluation, as well as correct BSDFs luminance weight (finally).
- All the operations to be able to do that are expanded in HLSLTranslator instead of letting the compiler handle it (otherwise it was too slow, leading to bugs, and shaders could not even be debugged). This means no conditional BSDF packing.
- There is no more Material/layer/BSDFs array: only a tree of operators, some of them being BSDFs at the leaves.
- Parameter blending has also been updated: when enabled on a node, it will be forced on all the operator's children / in the subtree.
- Parameter blending is applied on an inlinedbsdf on the StrataData. When it is the root of the parameter blending subtree, it is promoted to a fully fledged BSDF operator node to allow other regular operators.
- The inlinedbsdf is no longer in premultiplied mode. Horizontal/Vertical/Add parameter blending operators have been updated to reflect that. This allows for simpler/unified/clearer code.
- Vertical blending now uses the new simple "layering of two slabs with uncorrelated coverage" math I have RnD'd. See GetVerticalLayeringInfo in ShadingCommon.ush.
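
The tree representation described above can be pictured with a small Python sketch. This is purely illustrative and is not the HLSLTranslator code: the class names, the `mix` convention, and the weight propagation rule are assumptions, and the Vertical operator is left out because its math lives in GetVerticalLayeringInfo and is not reproduced in this message.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Bsdf:
    name: str            # leaf of the tree

@dataclass
class Horizontal:
    a: "Node"
    b: "Node"
    mix: float           # assumed convention: weight given to subtree b

@dataclass
class Add:
    a: "Node"
    b: "Node"

Node = Union[Bsdf, Horizontal, Add]

def leaf_weights(node, weight=1.0, out=None):
    """Walk the operator tree and accumulate a weight for each leaf BSDF."""
    if out is None:
        out = {}
    if isinstance(node, Bsdf):
        out[node.name] = out.get(node.name, 0.0) + weight
    elif isinstance(node, Horizontal):
        leaf_weights(node.a, weight * (1.0 - node.mix), out)
        leaf_weights(node.b, weight * node.mix, out)
    elif isinstance(node, Add):
        leaf_weights(node.a, weight, out)
        leaf_weights(node.b, weight, out)
    return out

tree = Horizontal(Bsdf("metal"), Add(Bsdf("clearcoat"), Bsdf("diffuse")), 0.25)
print(leaf_weights(tree))
# {'metal': 0.75, 'clearcoat': 0.25, 'diffuse': 0.25}
```

Because every operator is a node, any mix of operations can be nested without a separate material/layer/BSDF array.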

Follow up to this CL:
- CompileStrataBlendFunction should use parameter blending.
- Add more compilation debug output
- Update material visualization
- Fix Rough refraction
- Fix decals
- Fix debug probes
- Pack FStrataBSDF
- Move all StrataCompilationInfoCreateSingleBSDFMaterial into the strata tree and rework cost evaluation and material LOD
- Expand graph visit such as UpdateBSDFWeightAfterOperatorVisit to be specific to operators for the compiler to have to do less search to inline what we know already
- Fix STRATA_TODO: operations using parameter blending are not actually discarded, so they still occupy a slot in the operation arrays in the compiler and also in the shader.

#rb charles.derousiers
#preflight https://horde.devtools.epicgames.com/job/620a5f5d803d9066e67de938
#fyi charles.derousiers

[CL 18993400 by Sebastien Hillaire in ue5-main branch]
2022-02-15 03:09:30 -05:00
graham wihlidal
3d89d9a647 Initial submit of Nanite programmable raster binning/framework (disabled by default, and missing some WIP pieces for it to function correctly). Later submits will enable it, and also include numerous optimizations.
#rb brian.karis
#fyi rune.stubbe, ola.olsson
#preflight 6206fdc9c663666c89ba1b9f

[CL 18971638 by graham wihlidal in ue5-main branch]
2022-02-11 22:50:10 -05:00
Charles deRousiers
86ece99784 Add rect light texture atlas support.
This change allows binding a larger number of rect light textures, and allows future support for rect lights in forward & cluster passes.

#jira none
#rb sebastien.hillaire
#preflight 61fcebd0b5092d45ad110db4

[CL 18861192 by Charles deRousiers in ue5-main branch]
2022-02-04 04:18:10 -05:00
andrew lauritzen
a936f96e04 Add clipmap address space visualization and hook up to (advanced) visualization stuff.
Move visualize enum/defines to shared header file.
Rename a few cvars for consistency.

#rb graham.wihlidal
[FYI] ola.olsson
#preflight 61f9d8921d7ca8ed2d628ccd

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 18819821 in //UE5/Release-5.0/... via CL 18819825 via CL 18822904
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v910-18824042)

[CL 18825054 by andrew lauritzen in ue5-main branch]
2022-02-02 08:19:08 -05:00
tiago costa
0d12b0df17 Updated FPathTracingLight to use translated world space
- Store Position and Bounds in translated world space
- Not using TilePosition+RelativeWorldPosition to avoid increasing struct size.
- FPathTracingLight arrays seem to be calculated every frame so shouldn't be a problem.
- LightGrid fully built in translated world space.
- TraceLight/SampleLight/EstimateLight/InitLightPickingCdf now expect rays and positions in translated world space.
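
The precision benefit of storing positions relative to the view can be seen in a short Python sketch. It is not engine code: `to_float32` emulates 32-bit storage with the standard `struct` module, and the assumption that translated world space equals world position plus a pre-view translation (here the negated camera origin) is stated for illustration.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def to_translated_world(world_pos, pre_view_translation):
    """Offset in high precision first, then store the result as float32."""
    return tuple(to_float32(w + t)
                 for w, t in zip(world_pos, pre_view_translation))

light_world = (10_000_000.25, 0.0, 0.0)            # far from the origin, in cm
pre_view_translation = (-10_000_000.0, 0.0, 0.0)   # assumed: -camera origin

absolute = tuple(to_float32(c) for c in light_world)
translated = to_translated_world(light_world, pre_view_translation)
print(absolute[0])    # 10000000.0 (the quarter centimeter is lost)
print(translated[0])  # 0.25
```

Keeping only the translated position avoids the larger struct that a tile plus relative offset pair would need.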

#preflight 61f98c744b0bc1c4176461df
#rb chris.kulla

#ROBOMERGE-AUTHOR: tiago.costa
#ROBOMERGE-SOURCE: CL 18813277 in //UE5/Release-5.0/... via CL 18813296 via CL 18822754
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v910-18824042)

[CL 18824289 by tiago costa in ue5-main branch]
2022-02-02 07:35:04 -05:00
graham wihlidal
bac2a2b090 Moved all Nanite defines shared between C++ and shaders into a common header file, removing all the "keep this define in sync with this file" cases all over the code, and making the code a lot more maintainable. Common definitions now have a NANITE_ prefix to disambiguate global symbols
#rb rune.stubbe
#preflight 61f94f9ea6632a34f372dc39
[FYI] brian.karis, ola.olsson, jamie.hayes, andrew.lauritzen

#ROBOMERGE-AUTHOR: graham.wihlidal
#ROBOMERGE-SOURCE: CL 18808945 in //UE5/Release-5.0/... via CL 18809413 via CL 18822535
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)

[CL 18823295 by graham wihlidal in ue5-main branch]
2022-02-02 05:33:52 -05:00
christopher waters
c08bc8a9cd Intel extensions for 64bit atomics
#jira none
#rb mihnea.balta, graham.wihlidal
#preflight 61eeeb29ba69a4fdb219e68f

#ROBOMERGE-AUTHOR: christopher.waters
#ROBOMERGE-SOURCE: CL 18712534 in //UE5/Release-5.0/... via CL 18712568 via CL 18712835
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)

[CL 18712869 by christopher waters in ue5-main branch]
2022-01-24 14:48:23 -05:00
sebastien hillaire
9d09d19e34 Strata leverage UAV slice start/count to remove the post base pass copy step.
Ran Strata and non Strata code path.
QAGame cooked on Swi.
ShooterGame cooked and ran on PC.
#rb charles.derousiers
#preflight https://horde.devtools.epicgames.com/job/61d47489db0309127dfae2a2

#ROBOMERGE-AUTHOR: sebastien.hillaire
#ROBOMERGE-SOURCE: CL 18509133 in //UE5/Release-5.0/... via CL 18509155
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)

[CL 18509183 by sebastien hillaire in ue5-release-engine-test branch]
2022-01-04 12:18:49 -05:00
sebastien hillaire
956da1713d Fix for a platform to compile when strata is off.
We always compile the tile vertex shader and this includes strata.ush. We will fix the compilation issue there when we get to it.

#jira https://jira.it.epicgames.com/browse/UE-137866
#rb none
#preflight https://horde.devtools.epicgames.com/job/61d42e9f932a02483cb8d000
[FYI] charles.derousiers

#ROBOMERGE-AUTHOR: sebastien.hillaire
#ROBOMERGE-SOURCE: CL 18506891 in //UE5/Release-5.0/... via CL 18506899
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v899-18417669)

[CL 18506901 by sebastien hillaire in ue5-release-engine-test branch]
2022-01-04 06:43:29 -05:00
chris kulla
0281af418c Avoid wave operations in path compaction shader which appears to give a slight speedup despite additional contention on the atomic and does not require running with SM6.
Implement tiled dispatch in the path tracer to reduce the likelihood of GPU timeouts when rendering at high resolution. This also reduces the memory requirements for path state when running with path compaction enabled.
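
A tiled dispatch can be sketched as a simple loop over rectangles. The Python below is a hypothetical illustration rather than the path tracer's dispatch code; the function name and the tile size are assumptions.

```python
def iter_tiles(width, height, tile_size):
    """Yield (x, y, w, h) rectangles that cover a width x height image.
    Tiles on the right and bottom edges are clipped to the image."""
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield (x, y,
                   min(tile_size, width - x),
                   min(tile_size, height - y))

# Each tile would be submitted as its own, shorter GPU dispatch.
for tile in iter_tiles(5, 3, 2):
    print(tile)
```

Since only one tile is in flight at a time, per-path state needs to be allocated for a single tile rather than for the whole frame.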

Change from a uint buffer to a structured buffer for storing path states which gives a small speedup.

Add indirect dispatch support to launch less work for compacted bounces (off by default as it does not seem to provide a speedup so far)

#jira TM-6595
#rb Juan.Canada
#preflight 61b27c6a2b48d03df526ce85
#preflight 61b28773ee0de9822e0f02de

#ROBOMERGE-AUTHOR: chris.kulla
#ROBOMERGE-SOURCE: CL 18426885 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v897-18405271)
#ROBOMERGE[STARSHIP]: UE5-Release-Engine-Staging Release-5.0

[CL 18426911 by chris kulla in ue5-release-engine-test branch]
2021-12-09 18:36:38 -05:00
aleksander netzel
87350eedcd New ray tracing debug visualization modes:
Traversal Node - shows bounding box nodes intersections
Traversal Triangle - shows triangle/leaf nodes intersections
Traversal All - shows total node intersection count

#rb Tiago.Costa
#preflight 61ad4fc4245256036a307c3a

#ROBOMERGE-AUTHOR: aleksander.netzel
#ROBOMERGE-SOURCE: CL 18393728 in //UE5/Release-5.0/... via CL 18393733
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v896-18170469)

[CL 18393742 by aleksander netzel in ue5-release-engine-test branch]
2021-12-07 06:55:02 -05:00
ryan vance
64db820c54 Add a far field visualization mode for ray tracing debugging.
#rb patrick.kelly

#ROBOMERGE-AUTHOR: ryan.vance
#ROBOMERGE-SOURCE: CL 18176711 via CL 18371985 via CL 18372022
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18372050 by ryan vance in ue5-release-engine-test branch]
2021-12-03 15:21:50 -05:00
charles derousiers
b44703f4d2 Remove unnecessary Gbuffer output during base pass when strata is not enabled.
#rb sebastien.hillaire
#jira none

#ROBOMERGE-AUTHOR: charles.derousiers
#ROBOMERGE-SOURCE: CL 18280952 in //UE5/Release-5.0/... via CL 18280965
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18280989 by charles derousiers in ue5-release-engine-test branch]
2021-11-24 04:59:01 -05:00
charles derousiers
f73f7ea590 * Change classification texture to be encoded on 16 bits rather than 32 bits
* Remove unused Strata VGPRs array

#rb none
#jira none
[FYI] sebastien.hillaire

#ROBOMERGE-AUTHOR: charles.derousiers
#ROBOMERGE-SOURCE: CL 18258719 in //UE5/Release-5.0/... via CL 18258732
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18258742 by charles derousiers in ue5-release-engine-test branch]
2021-11-22 10:26:42 -05:00
Marc Audy
0c3be2b6ad Merge Release-Engine-Staging to Test @ CL# 18240298
[CL 18241953 by Marc Audy in ue5-release-engine-test branch]
2021-11-18 14:37:34 -05:00
chris kulla
64cda7600c Implement alpha channel support for the path tracer
To maintain consistency with the rasterizer implementation, keep track of background visibility (1.0 where there is background, 0.0 where there are solid objects). MRQ will flip this quantity before outputting to disk.

#rb Juan.Canada
#preflight 6149fccbe594c90001d9c558

#ROBOMERGE-AUTHOR: chris.kulla
#ROBOMERGE-SOURCE: CL 17584574 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v871-17566257)

[CL 17584582 by chris kulla in ue5-release-engine-test branch]
2021-09-21 13:09:10 -04:00