Commit Graph

101 Commits

Author SHA1 Message Date
tiago costa
fc62d1fc0e Initial support for deferred decals in path tracer.
- Decal materials are evaluated using callable shaders in PathTracingKernel.
- Decals are culled using a 2D grid similar to the existing light grid.
- In order to correctly handle decal blending order, decals are sorted using the same logic as the rasterizer on CPU. The compute shader that builds the decal grid maintains the correct order.
- Decal materials are wrapped in FRayTracingDecalMaterialShader. The instance parameters of each decal are bound using uniform buffers.
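
The grid culling and ordering described in the list above can be illustrated with a minimal CPU-side sketch. This is not the engine's compute shader: the function name `build_decal_grid` and the use of screen-space rectangles as decal bounds are assumptions made for illustration. The point it shows is that if decals arrive already sorted in blend order and each cell appends indices in that order, every per-cell list stays correctly ordered.

```python
def build_decal_grid(decal_rects, width, height, cell_size):
    """Bin decals into a 2D grid of cells.

    decal_rects: list of (x0, y0, x1, y1) screen-space bounds, assumed to be
    sorted in blend order already (hypothetical representation).
    Returns a dict mapping (cell_x, cell_y) -> list of decal indices,
    each list preserving the input order.
    """
    cells_x = (width + cell_size - 1) // cell_size
    cells_y = (height + cell_size - 1) // cell_size
    grid = {(cx, cy): [] for cy in range(cells_y) for cx in range(cells_x)}
    for index, (x0, y0, x1, y1) in enumerate(decal_rects):
        # Clamp the covered cell range to the grid; off-screen decals
        # produce an empty range and are skipped.
        cx0 = max(0, int(x0) // cell_size)
        cy0 = max(0, int(y0) // cell_size)
        cx1 = min(cells_x - 1, int(x1) // cell_size)
        cy1 = min(cells_y - 1, int(y1) // cell_size)
        for cy in range(cy0, cy1 + 1):
            for cx in range(cx0, cx1 + 1):
                grid[(cx, cy)].append(index)
    return grid


if __name__ == "__main__":
    rects = [(0, 0, 10, 10), (5, 5, 40, 20), (30, 0, 31, 1)]
    grid = build_decal_grid(rects, width=64, height=32, cell_size=16)
    print(grid[(0, 0)], grid[(1, 0)])
```

A shading pass would then look up only the cell containing the current pixel and evaluate that cell's decals front to back in list order.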

#preflight 628f3fed2f2409bc1e7a6414
#rb Yuriy.ODonnell, chris.kulla, Jeremy.Moore

[CL 20377336 by tiago costa in ue5-main branch]
2022-05-26 05:59:55 -04:00
Rune Stubbe
190f65419c Added Nanite MaterialCount debug visualization mode
#rb none
#fyi graham.wihlidal
#preflight 628d5fa69263ba4bef03341f

[CL 20362519 by Rune Stubbe in ue5-main branch]
2022-05-25 07:49:44 -04:00
jamie hayes
857e02b9a0 Add ability to specify a camera distance cull range for instances of a primitive and add camera distance culling logic to Nanite.
Hooked the value used by instanced static meshes into it to get nanite ISMs to respect the cull distance.
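
As a rough illustration of a camera distance cull range, the sketch below keeps an instance only when its distance to the camera lies inside a closed interval. The function name and the (min, max) representation are assumptions; the commit does not describe how the range is encoded or where in the Nanite culling passes the test runs.

```python
import math


def is_instance_visible(instance_pos, camera_pos, cull_min, cull_max):
    """Return True when the instance lies inside the camera distance range.

    cull_min / cull_max are hypothetical parameters: distances from the
    camera between which the instance is kept.
    """
    distance = math.dist(instance_pos, camera_pos)
    return cull_min <= distance <= cull_max


if __name__ == "__main__":
    camera = (0.0, 0.0, 0.0)
    print(is_instance_visible((0.0, 0.0, 100.0), camera, 0.0, 500.0))
    print(is_instance_visible((0.0, 0.0, 1000.0), camera, 0.0, 500.0))
```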

#rb brian.karis
#preflight 6287d8a21e478b95c7345866

#ROBOMERGE-AUTHOR: jamie.hayes
#ROBOMERGE-SOURCE: CL 20301693 via CL 20301776 via CL 20301792
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v948-20297126)

[CL 20305738 by jamie hayes in ue5-main branch]
2022-05-20 19:15:15 -04:00
aleksander netzel
ea75809cb8 Add per-pass traversal statistics for inline ray tracing for supporting platforms:
* Added RaytracingTraversalStatistics to create, capture and print TraceRayInline traversal statistics.
* In inline raytracing shaders you only need to call TraceRayInlineAccumulateStatistics() which will gather the results of the most recent TraceRayInline call.
* New debug visualization mode 'Traversal Statistics' that will print to the screen the traversal statistics for primary rays.

#rb Yuriy.Odonnell
#preflight 62860caa2b53e2be4c8ceee2

[CL 20277869 by aleksander netzel in ue5-main branch]
2022-05-19 06:05:59 -04:00
rune stubbe
623a28ee1e Nanite SW raster optimizations
Bake viewport and subpixel transforms into matrix to save ALU and VGPRs
Changed group size back to 64 for non-masked
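
One way to read "bake viewport and subpixel transforms into matrix" is the standard identity that a scale and bias applied after the perspective divide can be folded into the projection matrix rows: (x/w)*s + b equals (s*x + b*w)/w. The sketch below demonstrates that identity with a column-vector convention; the helper names and the example matrix are assumptions, not the engine's code.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def bake_viewport(m, scale, bias):
    """Fold a post-divide per-axis scale/bias into rows 0 and 1 of m."""
    baked = [row[:] for row in m]
    for axis in (0, 1):
        baked[axis] = [scale[axis] * m[axis][c] + bias[axis] * m[3][c]
                       for c in range(4)]
    return baked


def project_unbaked(m, p, scale, bias):
    """Project, divide by w, then apply the viewport/subpixel scale and bias."""
    x, y, _, w = mat_vec(m, p)
    return (x / w * scale[0] + bias[0], y / w * scale[1] + bias[1])


def project_baked(baked, p):
    """Project with the baked matrix; only the divide by w remains."""
    x, y, _, w = mat_vec(baked, p)
    return (x / w, y / w)


if __name__ == "__main__":
    m = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]   # toy projection where w = z
    scale, bias = (128.0, -64.0), (128.0, 64.0)
    p = (1.0, 2.0, 4.0, 1.0)
    print(project_unbaked(m, p, scale, bias))
    print(project_baked(bake_viewport(m, scale, bias), p))
```

The baked path drops two multiply-adds per vertex and the registers that would hold the scale and bias, which is consistent with the ALU and VGPR savings the message mentions.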
#rb graham.wihlidal
#preflight 627a9b2fe713fc6e2c52c9bf

#ROBOMERGE-AUTHOR: rune.stubbe
#ROBOMERGE-SOURCE: CL 20125493 via CL 20125504 via CL 20125508
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v943-19904690)

[CL 20128642 by rune stubbe in ue5-main branch]
2022-05-10 16:03:57 -04:00
chris kulla
f8f584945a Path Tracer: Remove RayCone from path tracer payload as it was not actually used
If we want to re-introduce this notion, we should be able to handle this directly in the material given the world position and camera matrix.

This frees up 4 bytes (in preparation for Strata support) and also reduces the path state structure size by 4 bytes.

I've measured around 3% speedup from this removal.

#rb Yuriy.ODonnell,Charles.deRousiers
#preflight 626ae4926461dd769ffe4394

[CL 19966893 by chris kulla in ue5-main branch]
2022-04-28 15:20:39 -04:00
jamie hayes
292c19fe82 Add functionality for Nanite instances of a certain type to properly be filtered out if disabled by their respective ShowFlags on the view family.
#rb rune.stubbe
#preflight 62680e711638ac249e7b8a5a

#ROBOMERGE-AUTHOR: jamie.hayes
#ROBOMERGE-SOURCE: CL 19921850 via CL 19922083 via CL 19922618
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v943-19904690)

[CL 19925604 by jamie hayes in ue5-main branch]
2022-04-26 14:37:33 -04:00
Sebastien Hillaire
255a2b8a59 Strata - now using force cast of data when sent to backend functions to have a behavior close to the root node. (2d TexCoord can be used as Emissive color for instance)
#rb none
#preflight none
#fyi charles.derousiers

[CL 19806375 by Sebastien Hillaire in ue5-main branch]
2022-04-19 07:13:36 -04:00
chris kulla
703b25cc16 Path Tracer: re-enable mGPU support
Remove coherent sampler since it would be complicated to support in the mGPU case and isn't really necessary for performance anymore

#rb Jason.Hoerner
#preflight 624b79f09f404234149aec8e

[CL 19617353 by chris kulla in ue5-main branch]
2022-04-04 19:23:28 -04:00
Charles deRousiers
9d440f6a81 Remove STRATA_DATA_TILE_XXX to avoid confusion with STRATA_TILE.
#rb none
#jira none
#preflight 6245dbf39f404234145fc039

[CL 19574692 by Charles deRousiers in ue5-main branch]
2022-03-31 13:05:44 -04:00
Rune Stubbe
a86a7ba254 Added Non-persistent culling path. When r.Nanite.PersistentThreadsCulling is disabled, the node and cluster culling is unrolled into a chain of dependent dispatches that are easier to debug.
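
The idea of unrolling hierarchy culling into a chain of dependent passes can be sketched on the CPU as a level-by-level traversal, where each loop iteration stands in for one dispatch that consumes the previous pass's output. Everything here (the function name, the dictionary-based hierarchy, leaves standing in for clusters) is a hypothetical illustration rather than the Nanite implementation.

```python
def cull_hierarchy_in_passes(root, children, is_visible, max_passes):
    """Cull a hierarchy one level per pass.

    children: dict mapping a node to its child nodes (leaves are absent or
    map to an empty list). Returns (visible_leaves, passes_run).
    """
    current = [root]
    visible_leaves = []
    passes = 0
    while current and passes < max_passes:
        next_level = []
        for node in current:  # one "dispatch" over the current level
            if not is_visible(node):
                continue  # culled nodes contribute no work to the next pass
            kids = children.get(node, [])
            if kids:
                next_level.extend(kids)
            else:
                visible_leaves.append(node)
        current = next_level
        passes += 1
    return visible_leaves, passes


if __name__ == "__main__":
    tree = {"root": ["a", "b"], "a": ["a0", "a1"], "b": ["b0"]}
    print(cull_hierarchy_in_passes("root", tree, lambda n: n != "b", 8))
```

Because every pass has a well-defined input and output list, intermediate results can be inspected between passes, which is the debugging benefit the message points at.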
#rb graham.wihlidal
#preflight 6244f400981a2c8eb4791451

[CL 19567899 by Rune Stubbe in ue5-main branch]
2022-03-30 21:21:16 -04:00
graham wihlidal
9c8fa1c395 Implemented a GPU Scene API for primitives explicitly enabling/disabling WPO support driven by events. This will be important for disabling WPO overhead in Nanite and other systems when unnecessary. The material system MayModifyMeshPosition hints are insufficient when using an MICD with static params that ultimately disable WPO, but the material system still reports WPO usage. This hint can also be used in new LOD systems to disable expensive features like WPO in the distance, but without doing a full shader switch. Nanite now supports a debug view that shows WPO off (red) and on (green) for meshes in the scene (r.Nanite.Visualize EvaluateWPO).
This change also remaps the original bEvaluateWorldPositionOffset on SMC into bEvaluateWorldPositionOffsetInRayTracing, because this var was only ever driven by ray tracing specific methods. The original bEvaluateWorldPositionOffset is now used by this more generic API.

Lastly, a new cvar (r.OptimizedWPO) has been added that indicates if the hint should be respected or not (default is false, which means WPO is always active, regardless of hint)

#rb rune.stubbe, marc.audy, derek.ehrman
[FYI] brian.karis, jamie.hayes, ola.olsson, andrew.lauritzen, jian.ru
#preflight 6244a8dcdc6183e3f5f8de98

#ROBOMERGE-AUTHOR: graham.wihlidal
#ROBOMERGE-SOURCE: CL 19564957 via CL 19564973 via CL 19564978
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v937-19513599)

[CL 19566743 by graham wihlidal in ue5-main branch]
2022-03-30 19:41:19 -04:00
Sebastien Hillaire
4100e391ab Strata more defines clean up
#rb none
#preflight none
#fyi charles.derousiers

[CL 19539341 by Sebastien Hillaire in ue5-main branch]
2022-03-29 02:29:00 -04:00
Sebastien Hillaire
811ac749a9 Strata - refactor and clean up of FStrataMaterialCompilationInfo into the strata tree.
#rb none
#preflight https://horde.devtools.epicgames.com/job/62421016b6084b98321ad00e
#fyi charles.derousiers

[CL 19538779 by Sebastien Hillaire in ue5-main branch]
2022-03-29 01:49:06 -04:00
graham wihlidal
150c8f6cf7 Optimized raster binning by merging separate HW and SW dispatches together, and also fixed the cluster thread group count so we don't spawn 64x useless thread groups. Tested on a 2080Ti with AncientGame campfire.
* Separate HW/SW dispatch vs. Merged HW/SW dispatch
Primary 1.28ms -> 1.02ms
Directional VSM 1.66ms -> 1.52ms
Local VSM 2.21ms -> 1.96ms

* Merged HW/SW dispatch -> Fixed Cluster Counts
Primary 1.02ms -> 1.01ms
Directional VSM 1.52ms -> 1.43ms
Local VSM 1.96ms -> 1.92ms
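
For a quick sense of scale, the timings listed above can be turned into relative savings. The sketch below only does arithmetic on the numbers quoted in the message; the helper name is an assumption.

```python
def percent_saved(before_ms, after_ms):
    """Relative time saved, as a percentage of the original timing."""
    return (before_ms - after_ms) / before_ms * 100.0


# (separate dispatch, merged dispatch, merged + fixed cluster counts), in ms,
# copied from the commit message.
timings_ms = {
    "Primary": (1.28, 1.02, 1.01),
    "Directional VSM": (1.66, 1.52, 1.43),
    "Local VSM": (2.21, 1.96, 1.92),
}

if __name__ == "__main__":
    for name, (separate, merged, fixed) in timings_ms.items():
        print(f"{name}: merge saves {percent_saved(separate, merged):.1f}%, "
              f"both changes save {percent_saved(separate, fixed):.1f}%")
```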

#preflight 623912161302f69e9a7744b8
#rb andrew.lauritzen
#fyi brian.karis, rune.stubbe, ola.olsson

[CL 19462469 by graham wihlidal in ue5-main branch]
2022-03-21 21:29:54 -04:00
Charles deRousiers
b993ce8650 Add BSDF tile and BSDF offset computation.
For pixels with complex/multi-BSDF materials, computes the BSDF material byte/index offsets and the 'overflowing' tile.

This will be used by Lumen for handling/parallelizing lighting computation.

#rb none
#jira none
#preflight 6234c38848746817f13c87bb
#fyi sebastien.hillaire

[CL 19438082 by Charles deRousiers in ue5-main branch]
2022-03-18 13:43:33 -04:00
tiago costa
3c7d7a5b4a Support nanite stats in RayTracingDebugTraversal.
#preflight 62348cb3da56b5683ac6fdf9
#rb Aleksander.Netzel

[CL 19434963 by tiago costa in ue5-main branch]
2022-03-18 10:26:29 -04:00
Sebastien Hillaire
781eddedd8 Strata - fixed legacy material issues with opaque rough refractions.
#rb none
#preflight none

[CL 19419681 by Sebastien Hillaire in ue5-main branch]
2022-03-17 08:56:51 -04:00
andrew lauritzen
bd803d7e43 SMRT improvements:
- Add slope-based depth extrapolation which improves the quality of penumbras on angled receivers. Costs ~10% performance in some cases so maintaining a permutation/cvar (default on) for scalability.
- Change screen ray trace to be a simple "space skipping" ray that terminates as soon as it goes behind geometry and continues the VSM trace from that distance. This avoids various contact-shadow-like artifacts and undesirable/inconsistent contact shadows from things that aren't in the VSM. In certain cases, if regular contact shadows are desired on top of VSM, the engine contact shadows can be enabled, as with CSMs.
- Remove a bunch of use of "halfs" in the shaders as they cause some extra ALU on some platforms and don't appear to really be helping with occupancy anymore
- Small bump to minimum normal bias clamp (only affects things very close to the camera)

#rb brian.karis
[FYI] ola.olsson

#ROBOMERGE-AUTHOR: andrew.lauritzen
#ROBOMERGE-SOURCE: CL 19411300 via CL 19411627
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v928-19376421)

[CL 19413239 by andrew lauritzen in ue5-main branch]
2022-03-16 17:30:21 -04:00
Sebastien Hillaire
13414c64e6 Strata opaque rough refraction with placeholder blur process.
Tiles with SSS but no rough refraction are also added onto the scene color buffer separately from the blur process.

#rb none
#preflight
#fyi charles.derousiers

[CL 19404162 by Sebastien Hillaire in ue5-main branch]
2022-03-16 05:45:43 -04:00
Sebastien Hillaire
1ffe7c958d Strata - Refactored indirect dispatches to source from the same indirect draw buffer, thus avoiding running out of UAV slots on DX11.
#rb none
#preflight https://horde.devtools.epicgames.com/job/6230946ce65a7e65d6855346
#fyi charles.derousiers

[CL 19384753 by Sebastien Hillaire in ue5-main branch]
2022-03-15 09:55:16 -04:00
Charles deRousiers
9090280376 Change thin layer to act as a Strata operator.
This simplifies the Strata Slab UI and makes it easier to thin-film coat many BSDFs at the same time.

#rb none
#jira none
#preflight 622fa337c51b66df4c248174
#fyi sebastien.hillaire

[CL 19378172 by Charles deRousiers in ue5-main branch]
2022-03-14 16:46:59 -04:00
graham wihlidal
25f8e29153 Removed a massive number of Nanite rasterizer shader permutations across all platforms/shaderdbs, significantly improving iteration times for the editor and cooker, especially when these numbers get multiplied by the number of materials that utilize programmable features in addition to the default material "fixed function" path.
Reductions *per material*:

SM5
--
FHWRasterizeVS: 832 -> 21
FHWRasterizePS: 104 -> 39

SM6
--
FHWRasterizeVS: 320 -> 9
FHWRasterizeMS: 640 -> 9
FHWRasterizePS: 120 -> 30

Vulkan
--
FHWRasterizeVS: 320 -> 9
FHWRasterizePS: 40 -> 15

Other platforms redacted =)

-- Details

* CLUSTER_PER_PAGE has been fully removed (since we no longer ever run CLUSTER_PER_PAGE=0), which now makes it mutually inclusive with VIRTUAL_TEXTURE_TARGET
* HAS_RASTER_BIN has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* ADD_CLUSTER_OFFSET has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* HAS_PREV_DRAW_DATA has been replaced with a dynamic branch, since this is just a per cluster index offset based on a simple uniform buffer load
* NEAR_CLIP (only change to significantly affect codegen) has been turned into a dynamic branch based on FNaniteView - this lets us merge depth clip/clamp rasterizer calls in VSM together instead of relying on HAS_PREV_DRAW_DATA, and a future optimization can now be done to merge local and directional light full Nanite pipeline calls together.
* VISUALIZE permutation removed from VS/MS since it only loaded uniform values that were passed down per-vertex into the fragment stage as nointerpolation parameters. Pixel shader now constructs this uint2 directly under the VISUALIZE permutation
* NANITE_MESH_SHADER_INTERP removed by default but still left in the code, since it is a work in progress potential optimization for DX12 mesh shaders
* Removed explicit Lumen and VSM usage of NANITE_RENDER_FLAG_HAVE_PREV_DRAW_DATA (now the dynamic branch path is only taken if CullRasterizeMultiPass implicitly breaks the rasterization into multiple calls due to NANITE_MAX_VIEWS_PER_CULL_RASTERIZE_PASS overflow)
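
To see why turning compile-time switches into dynamic branches shrinks the shader count so quickly, note that independent permutation dimensions multiply. The sketch below is only an illustration of that arithmetic: it treats four of the switches named above as independent booleans, which is an assumption, and it does not attempt to reproduce the actual per-material counts reported in this message.

```python
from math import prod


def permutation_count(dimensions):
    """Number of shader variants for independent permutation dimensions.

    dimensions: dict mapping a switch name to its number of possible values.
    """
    return prod(dimensions.values())


if __name__ == "__main__":
    # Hypothetical modelling: each switch is an independent on/off dimension.
    before = {
        "HAS_RASTER_BIN": 2,
        "ADD_CLUSTER_OFFSET": 2,
        "HAS_PREV_DRAW_DATA": 2,
        "NEAR_CLIP": 2,
    }
    after = {}  # all four are now dynamic branches inside a single variant
    print(permutation_count(before), "->", permutation_count(after))
```

Each boolean switch that becomes a runtime branch halves the variant count for every shader it applied to, at the cost of a branch that is cheap when it depends on a simple uniform load.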

Performance was tested on a 2080Ti in AncientGame, and the delta is effectively noise (tested cached and uncached VSM). Further testing on other platforms will occur, but it is important to get this change in for all the benefits, and it is easy to tweak things later if needed.

#rb rune.stubbe
#fyi brian.karis, ola.olsson, andrew.lauritzen, jamie.hayes, daniel.wright, krzysztof.narkowicz
#preflight 622e684c7e2e35638c96a16a
#robomerge FNNC

[CL 19370372 by graham wihlidal in ue5-main branch]
2022-03-13 23:18:25 -04:00
krzysztof narkowicz
e067e0fe2a Lumen - fixed HLOD1 surface cache. HLOD1 meshes, hidden in main view, weren't rasterized by Nanite into Lumen's surface cache.
#jira UE-144367
#preflight 621f0a9fb20446f11c8295a4
#lockdown Juan.Canada
#rb Daniel.Wright


#ROBOMERGE-OWNER: krzysztof.narkowicz
#ROBOMERGE-AUTHOR: krzysztof.narkowicz
#ROBOMERGE-SOURCE: CL 19222428 via CL 19222851 via CL 19222926 via CL 19222934 via CL 19225794
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v921-19075845)

[CL 19229576 by krzysztof narkowicz in ue5-main branch]
2022-03-02 16:22:29 -05:00
aleksander netzel
d513d2de4e Adding NaniteRayTrace.ush which implements functions needed for Nanite geometry intersection for ray tracing:
* Moller-Trumbore and Watertight triangle intersections in RayTriangleIntersect.h
* AABB intersection
* Nanite cluster intersection
* Nanite hierarchy intersection using a stack.
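
For readers unfamiliar with two of the primitives listed above, here is a textbook sketch of the Moller-Trumbore ray/triangle test and a slab-based ray/AABB test. It is not the content of NaniteRayTrace.ush; the function names and tuple-based vectors are assumptions, and the watertight variant and the Nanite cluster and hierarchy traversal are not shown.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_moller_trumbore(origin, direction, v0, v1, v2, eps=1e-8):
    """Return the hit distance t along the ray, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t >= 0.0 else None


def ray_aabb(origin, direction, box_min, box_max):
    """Slab test: True when the ray (t >= 0) overlaps the box."""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        if direction[axis] == 0.0:
            # Ray is parallel to this slab; it must start inside it.
            if not (box_min[axis] <= origin[axis] <= box_max[axis]):
                return False
            continue
        inv = 1.0 / direction[axis]
        t0 = (box_min[axis] - origin[axis]) * inv
        t1 = (box_max[axis] - origin[axis]) * inv
        t_near = max(t_near, min(t0, t1))
        t_far = min(t_far, max(t0, t1))
    return t_near <= t_far


if __name__ == "__main__":
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    print(ray_triangle_moller_trumbore((0.25, 0.25, -1.0), (0.0, 0.0, 1.0), *tri))
    print(ray_aabb((0.5, 0.5, -1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))
```

A stack-based hierarchy traversal would typically use the AABB test to decide which child nodes to push and the triangle test only at the leaves.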

#rb Tiago.Costa
#preflight none

[CL 19196324 by aleksander netzel in ue5-main branch]
2022-03-01 05:14:21 -05:00