UnrealEngineUWP/Engine/Shaders/ReflectionEnvironmentComputeShaders.usf
Marcus Wassmer edea678466 Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3072736)
#lockdown Nick.Penwarden
#rb none

==========================
MAJOR FEATURES + CHANGES
==========================

Change 3055495 on 2016/07/19 by Marc.Olano

	Allow Noise material node on mobile

	No reason to exclude mobile, except for Fast Gradient Noise, which uses 3D textures. Allow this node on ES2 for all of the other noise functions.

	#jira UE-33345

Change 3055602 on 2016/07/19 by Luke.Thatcher

	Fix crash bug in D3D11 RHI when selecting adapters.
	 - Array of adapter descriptors will get out of sync with the adapter index if any adapter is skipped (e.g. the Microsoft Basic Render Device).
	#jira UE-33236

Change 3055890 on 2016/07/19 by Daniel.Wright

	Improved the assert in LoadModuleChecked so we won't have to check the log to see which module it was

Change 3055891 on 2016/07/19 by Daniel.Wright

	Fixed Global Distance Field not dirtying previous object position on UpdateTransform - left behind a phantom shadow on teleports
	* This will effectively double partial distance field update costs until clipping of the update regions is implemented

Change 3055892 on 2016/07/19 by Daniel.Wright

	Higher poly light source shapes drawn into reflection captures

Change 3055893 on 2016/07/19 by Daniel.Wright

	More info to 'Incompatible surface format' GNM assert

Change 3055904 on 2016/07/19 by Daniel.Wright

	Reflection environment normalization improvements
	* Indirect specular from reflection captures is now mixed with indirect diffuse from lightmaps based on roughness, such that a mirror surface will have no mixing.  Reflection captures now match other reflection methods like SSR and planar reflections much more closely.
	* When a stationary skylight is present, Reflection captures are now normalized as if the initial skylight will always be present, giving consistent results with static skylight reflections.  The skylight and reflection captures with sky removed used to be normalized separately, compacting the relative brightness between the sky and scene.
	* Added r.ReflectionEnvironmentLightmapMixing for debugging lightmap mixing issues.  This toggle was previously not possible due to prenormalizing the capture data.
	* The standard deferred reflection path (r.DoTiledReflections 0) can no longer match the results of the compute path or base pass reflections, as it would require MRT to accumulate the average brightness
	* Removed unused r.DiffuseFromCaptures
	* Cost of reflection environment on PS4 increased from 1.52ms -> 1.75ms with this change, but decreased back to 1.58ms by reducing tile size to 8x8

Change 3055905 on 2016/07/19 by Daniel.Wright

	Workaround for RTDF shadows not working on PS4 - manual clear of ObjectIndirectArguments instead of RHICmdList.ClearUAV

Change 3059486 on 2016/07/21 by Nick.Penwarden

	Testing #uecritical

Change 3060558 on 2016/07/21 by Daniel.Wright

	Fixed skylight with specified cubemap being black

Change 3061999 on 2016/07/22 by Marcus.Wassmer

	Disable old AMD driver hacks for DX11.  QA has already tested with them off and given thumbs up.

Change 3062241 on 2016/07/22 by Daniel.Wright

	Fixed bug in RHISupportsSeparateMSAAAndResolveTextures that was preventing MSAA for any non-Vulkan platforms

Change 3062244 on 2016/07/22 by Daniel.Wright

	Discard old prenormalized reflection environment data on load

Change 3062283 on 2016/07/22 by Daniel.Wright

	MSAA support for the forward renderer
	* AntiAliasing method is chosen in Rendering project settings, DefaultSettings category
	* Deferred passes like shadow projection, fogging and decals are only computed per-pixel and can introduce aliasing
	* Added Rendering project setting VertexFoggingForOpaque, which makes height fog cheaper and work properly with MSAA
	* The AntiAliasing method in PostProcessSettings has been removed, this may affect existing content
	* Added r.MSAACount which defaults to 4
	* Integrated wide custom resolve filter from Oculus renderer, controlled by r.WideCustomResolve
	* GBuffer targets are no longer allocated when using the forward renderer
	* Decal blend modes that write to the GBuffer fall back to SceneColor emissive only

Change 3062666 on 2016/07/23 by Uriel.Doyon

	Added legend to streaming accuracy viewmodes
	Added a new helper class FRenderTargetTemp to be reused in different canvas rendering.
	Exposed the pass through pixel shader so that it can be reused.
	#review-3058986 @marcus.wassmer

Change 3063023 on 2016/07/25 by Luke.Thatcher

	Fix "RecompileShaders Changed" when using Cook On The Fly.
	#jira UE-33573

Change 3063078 on 2016/07/25 by Ben.Woodhouse

	Add -emitdrawevents command line option to emit draw events by default. This is useful when capturing with RenderDoc.

Change 3063315 on 2016/07/25 by Ben.Woodhouse

	Fix div 0 in motion blur. This caused artifacts in some fairly common cases
	#jira UE-32331

Change 3063897 on 2016/07/25 by Uriel.Doyon

	Fixed missing qualifier on interpolants

Change 3064559 on 2016/07/26 by Ben.Woodhouse

	Fix for cooker crash with BC6H textures (XB1, but may affect other platforms). Also fixes corruption issue with texture slices not being a multiple of 4 pixels (expanding as necessary), courtesy of Stu McKenna at the Coalition
	Tested fix on xbox, PC and PS4, using QAGame
	#jira UE-28592

Change 3064896 on 2016/07/26 by Ben.Woodhouse

	Fix compile errors on PS4 (the variable "sample" was conflicting with a keyword, causing compile errors). Also making encoding consistent on new shaders (ansi rather than UTF16)

Change 3064913 on 2016/07/26 by Ben.Marsh

	Fix spelling of "Editor, Tools, Monolithics & DDC" node in Dev-Rendering build settings.

Change 3065326 on 2016/07/26 by Uriel.Doyon

	Fixed UnbuiltInstanceBoundsList not being reset correctly, creating broken rendered primitives.
	#jira UE-32585

Change 3065541 on 2016/07/26 by Daniel.Wright

	Materials with a GBuffer SceneTexture lookup will fail to compile with forward shading

Change 3065543 on 2016/07/26 by Daniel.Wright

	Restored DetailMode changes causing a FGlobalComponentRecreateRenderStateContext - accidental removal from cl 2969413

Change 3065545 on 2016/07/26 by Daniel.Wright

	Added material property bNormalCurvatureToRoughness, which can slightly reduce aliasing.  Tweakable impact with r.NormalCurvatureToRoughnessScale.
	Fixed reflection capture feedback with base pass reflections

Change 3066783 on 2016/07/27 by Daniel.Wright

	Moved PreShadowCacheDepthZ out of FSceneRenderTargets and into FScene, which fixes issues with cached preshadows and multiple scenes, including HighResScreenShot
	Disabled GMinScreenRadiusForShadowCaster on per-object shadows, which fixes popping when trying to increase shadow resolution from the defaults (r.Shadow.TexelsPerPixel 3)

Change 3066794 on 2016/07/27 by Daniel.Wright

	Fixed crash rendering planar reflections due to NULL PostProcessSettings

Change 3067412 on 2016/07/27 by Daniel.Wright

	Fix for OpenGL4 with uint interpolator

Change 3068470 on 2016/07/28 by Daniel.Wright

	Fixed crash rendering translucency with translucent shadows which were determined to be invisible

Change 3069046 on 2016/07/28 by Daniel.Wright

	Handle null Family in SetupAntiAliasingMethod

Change 3069059 on 2016/07/28 by Daniel.Wright

	Added r.ReflectionEnvironmentBeginMixingRoughness (.1) and r.ReflectionEnvironmentEndMixingRoughness (.3), which can be used to tweak the lightmap mixing heuristic, or revert to previous behavior (mixing even on a mirror surface)

Change 3069391 on 2016/07/28 by Daniel.Wright

	Fixed AverageBrightness being applied to reflections in gamma space in the mobile base pass, causing ES2 reflections to be overbright

Change 3070369 on 2016/07/29 by Daniel.Wright

	r.ReflectionEnvironmentBeginMixingRoughness and r.ReflectionEnvironmentEndMixingRoughness set to 0 can be used to achieve old non-roughness based lightmap mixing
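
	The two cvars above can be read as the ends of a roughness ramp. What follows is a minimal C++ sketch of that idea, written as a CPU-side reference and not as the shader's actual ComputeMixingWeight, which lives in a shared header not shown here. The smoothstep ramp, the irradiance-to-average-brightness ratio, the LargestWeight clamp and the handling of a zero-width range are all assumptions; MixingParams and ComputeMixingWeightSketch are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical parameter block; field names are illustrative, not engine API.
struct MixingParams {
    float BeginMixingRoughness = 0.1f;     // default quoted for r.ReflectionEnvironmentBeginMixingRoughness
    float EndMixingRoughness   = 0.3f;     // default quoted for r.ReflectionEnvironmentEndMixingRoughness
    float LargestWeight        = 10000.0f; // assumed clamp, not taken from the changelist
};

static float Saturate(float X) { return std::min(std::max(X, 0.0f), 1.0f); }

// Returns the factor applied to reflection-capture radiance.
// 1.0 means "no lightmap mixing" (what a mirror surface should get).
float ComputeMixingWeightSketch(float IndirectIrradiance, float AverageBrightness,
                                float Roughness, const MixingParams& P)
{
    // Assumption: a zero-width range (both cvars set to 0) means "always mix",
    // i.e. the old non-roughness-based behavior.
    float MixingAlpha = 1.0f;
    if (P.EndMixingRoughness > P.BeginMixingRoughness) {
        const float T = Saturate((Roughness - P.BeginMixingRoughness) /
                                 (P.EndMixingRoughness - P.BeginMixingRoughness));
        MixingAlpha = T * T * (3.0f - 2.0f * T); // smoothstep ramp between the two cvars
    }

    // Ratio of lightmap irradiance to the capture's average brightness, clamped.
    const float MixingWeight =
        std::min(IndirectIrradiance / std::max(AverageBrightness, 0.0001f), P.LargestWeight);

    // lerp(1, MixingWeight, MixingAlpha)
    return 1.0f + (MixingWeight - 1.0f) * MixingAlpha;
}
```

	Under these assumptions a surface with roughness below the begin value keeps its capture radiance unscaled, while one above the end value is scaled fully by the lightmap ratio.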

Change 3070370 on 2016/07/29 by Daniel.Wright

	Bumped reflection capture DDC version to get rid of legacy prenormalized data

Change 3070680 on 2016/07/29 by Marcus.Wassmer

	Fix slate ensure that is most likely a timing issue exposed by rendering.
	#ue-33902

Change 3070811 on 2016/07/29 by Marcus.Wassmer

	Fix ProjectLauncher errors when loading old versions
	#ue-33939

Change 3070971 on 2016/07/29 by Uriel.Doyon

	Updated ListTextures outputs to fix cooked vs. non-cooked differences and to put emphasis on disk vs. memory

Change 3071452 on 2016/07/31 by Uriel.Doyon

	Updated the legend description for the (texture streaming) primitive distance accuracy view mode

[CL 3072803 by Marcus Wassmer in Main branch]
2016-08-01 18:56:49 -04:00


// Copyright 1998-2016 Epic Games, Inc. All Rights Reserved.
/*=============================================================================
ReflectionEnvironmentComputeShaders - functionality to apply local cubemaps.
=============================================================================*/
#include "Common.usf"
#include "DeferredShadingCommon.usf"
#include "BRDF.usf"
#include "ReflectionEnvironmentShared.usf"
#include "SkyLightingShared.usf"
#include "ShadingModels.usf"
#if TILED_DEFERRED_CULL_SHADER
/** Cube map array of reflection captures. */
TextureCubeArray ReflectionEnvironmentColorTexture;
SamplerState ReflectionEnvironmentColorSampler;
#define THREADGROUP_TOTALSIZE (THREADGROUP_SIZEX * THREADGROUP_SIZEY)
// Workaround performance issue with shared memory bank collisions in GLSL
#if GL4_PROFILE
#define ATOMIC_REDUCTION 0
#else
#define ATOMIC_REDUCTION 0
#endif
#define AABB_INTERSECT 1
#define VISUALIZE_OVERLAP 0
uint NumCaptures;
/** View rect min in xy, max in zw. */
uint4 ViewDimensions;
/** Min and Max depth for this tile. */
groupshared uint IntegerTileMinZ;
groupshared uint IntegerTileMaxZ;
/** Inner Min and Max depth for this tile. */
groupshared uint IntegerTileMinZ2;
groupshared uint IntegerTileMaxZ2;
/** Number of reflection captures affecting this tile, after culling. */
groupshared uint TileNumReflectionCaptures;
/** Indices into the capture data buffer of captures that affect this tile, computed by culling. */
groupshared uint TileReflectionCaptureIndices[MAX_CAPTURES];
/** Capture indices after sorting. */
groupshared uint SortedTileReflectionCaptureIndices[MAX_CAPTURES];
#if !ATOMIC_REDUCTION
#if THREADGROUP_TOTALSIZE < 107
#define TILE_Z_SIZE 107
#else
#define TILE_Z_SIZE THREADGROUP_TOTALSIZE
#endif
groupshared float TileZ[TILE_Z_SIZE];
#endif
void ComputeTileMinMax(uint ThreadIndex, float SceneDepth, out float MinTileZ, out float MaxTileZ, out float MinTileZ2, out float MaxTileZ2)
{
#if ATOMIC_REDUCTION
// Initialize per-tile variables
if (ThreadIndex == 0)
{
IntegerTileMinZ = 0x7F7FFFFF;
IntegerTileMaxZ = 0;
IntegerTileMinZ2 = 0x7F7FFFFF;
IntegerTileMaxZ2 = 0;
}
GroupMemoryBarrierWithGroupSync();
// Use shared memory atomics to build the depth bounds for this tile
// Each thread is assigned to a pixel at this point
InterlockedMin(IntegerTileMinZ, asuint(SceneDepth));
InterlockedMax(IntegerTileMaxZ, asuint(SceneDepth));
GroupMemoryBarrierWithGroupSync();
MinTileZ = asfloat(IntegerTileMinZ);
MaxTileZ = asfloat(IntegerTileMaxZ);
float HalfZ = .5f * (MinTileZ + MaxTileZ);
// Compute a second min and max Z, clipped by HalfZ, so that we get two depth bounds per tile
// This results in more conservative tile depth bounds and fewer intersections
if (SceneDepth >= HalfZ)
{
InterlockedMin(IntegerTileMinZ2, asuint(SceneDepth));
}
if (SceneDepth <= HalfZ)
{
InterlockedMax(IntegerTileMaxZ2, asuint(SceneDepth));
}
GroupMemoryBarrierWithGroupSync();
MinTileZ2 = asfloat(IntegerTileMinZ2);
MaxTileZ2 = asfloat(IntegerTileMaxZ2);
#else
TileZ[ThreadIndex] = SceneDepth;
GroupMemoryBarrierWithGroupSync();
if (ThreadIndex < 32)
{
float Min = SceneDepth;
float Max = SceneDepth;
for ( int i = ThreadIndex+32; i< THREADGROUP_TOTALSIZE; i+=32)
{
Min = min( Min, TileZ[i]);
Max = max( Max, TileZ[i]);
}
TileZ[ThreadIndex] = Min;
TileZ[ThreadIndex + 32] = Max;
}
GroupMemoryBarrierWithGroupSync();
if (ThreadIndex < 8)
{
float Min = TileZ[ThreadIndex];
float Max = TileZ[ThreadIndex + 32];
Min = min( Min, TileZ[ThreadIndex + 8]);
Max = max( Max, TileZ[ThreadIndex + 40]);
Min = min( Min, TileZ[ThreadIndex + 16]);
Max = max( Max, TileZ[ThreadIndex + 48]);
Min = min( Min, TileZ[ThreadIndex + 24]);
Max = max( Max, TileZ[ThreadIndex + 56]);
TileZ[ThreadIndex + 64] = Min;
TileZ[ThreadIndex + 96] = Max;
}
GroupMemoryBarrierWithGroupSync();
if (ThreadIndex == 0)
{
float Min = TileZ[64];
float Max = TileZ[96];
for ( int i = 1; i< 8; i++)
{
Min = min( Min, TileZ[i+64]);
Max = max( Max, TileZ[i+96]);
}
IntegerTileMinZ = asuint(Min);
IntegerTileMaxZ = asuint(Max);
}
GroupMemoryBarrierWithGroupSync();
MinTileZ = asfloat(IntegerTileMinZ);
MaxTileZ = asfloat(IntegerTileMaxZ);
float HalfZ = .5f * (MinTileZ + MaxTileZ);
MinTileZ2 = HalfZ;
MaxTileZ2 = HalfZ;
#endif
}
bool SphereVsBox( float3 SphereCenter, float SphereRadius, float3 BoxCenter, float3 BoxExtent )
{
float3 ClosestOnBox = max( 0, abs( BoxCenter - SphereCenter ) - BoxExtent );
return dot( ClosestOnBox, ClosestOnBox ) < SphereRadius * SphereRadius;
}
// Culls reflection captures in the scene with the current tile
// Outputs are stored in shared memory
void DoTileCulling(uint3 GroupId, uint ThreadIndex, float MinTileZ, float MaxTileZ, float MinTileZ2, float MaxTileZ2)
{
#if AABB_INTERSECT
float3 TileBoxCenter;
float3 TileBoxExtent;
// can be optimized
// left top front
float2 ScreenUV0 = float2((GroupId.xy + int2(0, 0))* float2(THREADGROUP_SIZEX, THREADGROUP_SIZEY) + .5f) / (ViewDimensions.zw - ViewDimensions.xy);
float3 ScreenPos0 = float3(float2(2.0f, -2.0f) * ScreenUV0 + float2(-1.0f, 1.0f), ConvertToDeviceZ(MinTileZ));
// right bottom back
float2 ScreenUV1 = float2((GroupId.xy + int2(1, 1)) * float2(THREADGROUP_SIZEX, THREADGROUP_SIZEY) - .5f) / (ViewDimensions.zw - ViewDimensions.xy);
float3 ScreenPos1 = float3(float2(2.0f, -2.0f) * ScreenUV1 + float2(-1.0f, 1.0f), ConvertToDeviceZ(MaxTileZ));
// back rect
float4 ViewPos0 = mul(float4(ScreenPos0.x, ScreenPos0.y, ScreenPos1.z, 1), View.ClipToView); ViewPos0.xyz /= ViewPos0.w;
float4 ViewPos1 = mul(float4(ScreenPos0.x, ScreenPos1.y, ScreenPos1.z, 1), View.ClipToView); ViewPos1.xyz /= ViewPos1.w;
float4 ViewPos2 = mul(float4(ScreenPos1.x, ScreenPos0.y, ScreenPos1.z, 1), View.ClipToView); ViewPos2.xyz /= ViewPos2.w;
float4 ViewPos3 = mul(float4(ScreenPos1.x, ScreenPos1.y, ScreenPos1.z, 1), View.ClipToView); ViewPos3.xyz /= ViewPos3.w;
// front point
// Warning: this assumes a point at the near depth, which is not a valid assumption, will cause culling artifacts
float4 ViewPos4 = mul(float4(ScreenPos0.xy, ScreenPos0.z, 1), View.ClipToView); ViewPos4.xyz /= ViewPos4.w;
float3 TileBoxMin = min(ViewPos4.xyz, min(ViewPos0.xyz, ViewPos3.xyz));
float3 TileBoxMax = max(ViewPos4.xyz, max(ViewPos0.xyz, ViewPos3.xyz));
TileBoxCenter = (TileBoxMax + TileBoxMin) * 0.5f;
TileBoxExtent = (TileBoxMax - TileBoxMin) * 0.5f;
#else
// Setup tile frustum planes
float2 TileScale = float2(ViewDimensions.zw - ViewDimensions.xy) * rcp(2 * float2(THREADGROUP_SIZEX, THREADGROUP_SIZEY));
float2 TileBias = TileScale - GroupId.xy;
float4 C1 = float4(View.ViewToClip._11 * TileScale.x, 0.0f, View.ViewToClip._31 * TileScale.x + TileBias.x, 0.0f);
float4 C2 = float4(0.0f, -View.ViewToClip._22 * TileScale.y, View.ViewToClip._32 * TileScale.y + TileBias.y, 0.0f);
float4 C4 = float4(0.0f, 0.0f, 1.0f, 0.0f);
// TODO transform to world space
#if ATOMIC_REDUCTION
float4 frustumPlanes[8];
frustumPlanes[0] = C4 - C1;
frustumPlanes[1] = C4 + C1;
frustumPlanes[2] = C4 - C2;
frustumPlanes[3] = C4 + C2;
frustumPlanes[4] = float4(0.0f, 0.0f, 1.0f, -MinTileZ);
frustumPlanes[5] = float4(0.0f, 0.0f, -1.0f, MaxTileZ2);
frustumPlanes[6] = float4(0.0f, 0.0f, 1.0f, -MinTileZ2);
frustumPlanes[7] = float4(0.0f, 0.0f, -1.0f, MaxTileZ);
#else
float4 frustumPlanes[6];
frustumPlanes[0] = C4 - C1;
frustumPlanes[1] = C4 + C1;
frustumPlanes[2] = C4 - C2;
frustumPlanes[3] = C4 + C2;
frustumPlanes[4] = float4(0.0f, 0.0f, 1.0f, -MinTileZ);
frustumPlanes[5] = float4(0.0f, 0.0f, -1.0f, MaxTileZ);
#endif
// Normalize tile frustum planes
UNROLL
for (uint i = 0; i < 4; ++i)
{
frustumPlanes[i] *= rcp(length(frustumPlanes[i].xyz));
}
#endif
if (ThreadIndex == 0)
{
TileNumReflectionCaptures = 0;
}
GroupMemoryBarrierWithGroupSync();
// Compute per-tile lists of affecting captures through bounds culling
// Each thread now operates on a reflection capture instead of a pixel
LOOP
for (uint CaptureIndex = ThreadIndex; CaptureIndex < NumCaptures && CaptureIndex < MAX_CAPTURES; CaptureIndex += THREADGROUP_TOTALSIZE)
{
float4 CapturePositionAndRadius = ReflectionCapture.PositionAndRadius[CaptureIndex];
float3 BoundsViewPosition = mul(float4(CapturePositionAndRadius.xyz + View.PreViewTranslation.xyz, 1), View.TranslatedWorldToView).xyz;
#if AABB_INTERSECT
// Add this capture to the list of indices if it intersects
BRANCH
if( SphereVsBox( BoundsViewPosition, CapturePositionAndRadius.w, TileBoxCenter, TileBoxExtent ) )
{
uint ListIndex;
InterlockedAdd(TileNumReflectionCaptures, 1U, ListIndex);
TileReflectionCaptureIndices[ListIndex] = CaptureIndex;
}
#else
// Cull the capture against the tile's frustum planes
// Note: this has some false positives, a capture that is intersecting three different axis frustum planes yet not intersecting the volume of the tile will be treated as intersecting
bool bInTile = true;
// Test against the screen x and y oriented planes first
UNROLL
for (uint i = 0; i < 4; ++i)
{
float PlaneDistance = dot(frustumPlanes[i], float4(BoundsViewPosition, 1.0f));
bInTile = bInTile && (PlaneDistance >= -CapturePositionAndRadius.w);
}
BRANCH
if (bInTile)
{
#if ATOMIC_REDUCTION
bool bInNearDepthRange = true;
// Test against the near depth range
UNROLL
for (uint i = 4; i < 6; ++i)
{
float PlaneDistance = dot(frustumPlanes[i], float4(BoundsViewPosition, 1.0f));
bInNearDepthRange = bInNearDepthRange && (PlaneDistance >= -CapturePositionAndRadius.w);
}
bool bInFarDepthRange = true;
// Test against the far depth range
UNROLL
for (uint j = 6; j < 8; ++j)
{
float PlaneDistance = dot(frustumPlanes[j], float4(BoundsViewPosition, 1.0f));
bInFarDepthRange = bInFarDepthRange && (PlaneDistance >= -CapturePositionAndRadius.w);
}
bool bInDepthRange = bInNearDepthRange || bInFarDepthRange;
#else
bool bInDepthRange = true;
// Test against the depth range
UNROLL
for (uint i = 4; i < 6; ++i)
{
float PlaneDistance = dot(frustumPlanes[i], float4(BoundsViewPosition, 1.0f));
bInDepthRange = bInDepthRange && (PlaneDistance >= -CapturePositionAndRadius.w);
}
#endif
// Add this capture to the list of indices if it intersects
BRANCH
if (bInDepthRange)
{
uint ListIndex;
InterlockedAdd(TileNumReflectionCaptures, 1U, ListIndex);
TileReflectionCaptureIndices[ListIndex] = CaptureIndex;
}
}
#endif
}
GroupMemoryBarrierWithGroupSync();
uint NumCapturesAffectingTile = TileNumReflectionCaptures;
// Sort captures by their original capture index
// This is necessary because the culling used InterlockedAdd to generate compacted array indices,
// Which rearranged the original capture order, in which the captures were sorted smallest to largest on the CPU.
//@todo - parallel stream compaction could be faster than this
#define SORT_CAPTURES 1
#if SORT_CAPTURES
// O(N^2) simple parallel sort
LOOP
for (uint CaptureIndex2 = ThreadIndex; CaptureIndex2 < NumCapturesAffectingTile; CaptureIndex2 += THREADGROUP_TOTALSIZE)
{
// Sort by original capture index
int SortKey = TileReflectionCaptureIndices[CaptureIndex2];
uint NumSmaller = 0;
// Count how many items have a smaller key, so we can insert ourselves into the correct position, without requiring interaction between threads
for (uint OtherSampleIndex = 0; OtherSampleIndex < NumCapturesAffectingTile; OtherSampleIndex++)
{
int OtherSortKey = TileReflectionCaptureIndices[OtherSampleIndex];
if (OtherSortKey < SortKey)
{
NumSmaller++;
}
}
// Move this entry into its sorted position
SortedTileReflectionCaptureIndices[NumSmaller] = TileReflectionCaptureIndices[CaptureIndex2];
}
#endif
GroupMemoryBarrierWithGroupSync();
}
float CountOverlap( float3 WorldPosition )
{
float Overlap = 0;
float Opacity = 1;
uint NumCapturesAffectingTile = TileNumReflectionCaptures;
// Count the captures affecting this tile whose influence radius contains WorldPosition
LOOP
for (uint TileCaptureIndex = 0; TileCaptureIndex < NumCapturesAffectingTile; TileCaptureIndex++)
{
BRANCH
if( Opacity < 0.001 )
{
break;
}
#if SORT_CAPTURES
uint CaptureIndex = SortedTileReflectionCaptureIndices[TileCaptureIndex];
#else
uint CaptureIndex = TileReflectionCaptureIndices[TileCaptureIndex];
#endif
float4 CapturePositionAndRadius = ReflectionCapture.PositionAndRadius[CaptureIndex];
float3 CaptureVector = WorldPosition - CapturePositionAndRadius.xyz;
float CaptureVectorLength = length(CaptureVector);
BRANCH
if (CaptureVectorLength < CapturePositionAndRadius.w)
{
float NormalizedDistanceToCapture = saturate(CaptureVectorLength / CapturePositionAndRadius.w);
// Fade out based on distance to capture
float x = saturate( 2.5 * NormalizedDistanceToCapture - 1.5 );
float DistanceAlpha = 1 - x*x*(3 - 2*x);
Overlap += 1;
Opacity *= 1 - DistanceAlpha;
}
}
return Overlap;
}
float3 GatherRadiance(float CompositeAlpha, float3 WorldPosition, float3 RayDirection, float Roughness, float2 ScreenPosition, float IndirectIrradiance, float NoV, uint ShadingModelID)
{
// Indirect occlusion from DFAO, which should be applied to reflection captures and skylight specular, but not SSR
float IndirectSpecularOcclusion = 1.0f;
float3 ExtraIndirectSpecular = 0;
#if SUPPORT_DFAO_INDIRECT_OCCLUSION
float2 ScreenUV = ScreenPosition * View.ScreenPositionScaleBias.xy + View.ScreenPositionScaleBias.wz;
float IndirectDiffuseOcclusion;
GetDistanceFieldAOSpecularOcclusion(ScreenUV, RayDirection, Roughness, ShadingModelID == SHADINGMODELID_TWOSIDED_FOLIAGE, IndirectSpecularOcclusion, IndirectDiffuseOcclusion, ExtraIndirectSpecular);
// Apply DFAO to IndirectIrradiance before mixing with indirect specular
IndirectIrradiance *= IndirectDiffuseOcclusion;
#endif
float Mip = ComputeReflectionCaptureMipFromRoughness(Roughness, View.ReflectionCubemapMaxMip);
uint NumCapturesAffectingTile = TileNumReflectionCaptures;
float4 ImageBasedReflections = float4(0, 0, 0, CompositeAlpha);
float2 CompositedAverageBrightness = float2(0.0f, 1.0f);
// Accumulate reflections from captures affecting this tile, applying smallest captures first; with front-to-back compositing the first capture applied displays on top
LOOP
for (uint TileCaptureIndex = 0; TileCaptureIndex < NumCapturesAffectingTile; TileCaptureIndex++)
{
BRANCH
if (ImageBasedReflections.a < 0.001)
{
break;
}
#if SORT_CAPTURES
uint CaptureIndex = SortedTileReflectionCaptureIndices[TileCaptureIndex];
#else
uint CaptureIndex = TileReflectionCaptureIndices[TileCaptureIndex];
#endif
float4 CapturePositionAndRadius = ReflectionCapture.PositionAndRadius[CaptureIndex];
float4 CaptureProperties = ReflectionCapture.CaptureProperties[CaptureIndex];
float3 CaptureVector = WorldPosition - CapturePositionAndRadius.xyz;
float CaptureVectorLength = sqrt(dot(CaptureVector, CaptureVector));
float NormalizedDistanceToCapture = saturate(CaptureVectorLength / CapturePositionAndRadius.w);
BRANCH
if (CaptureVectorLength < CapturePositionAndRadius.w)
{
float3 ProjectedCaptureVector = RayDirection;
float4 CaptureOffsetAndAverageBrightness = ReflectionCapture.CaptureOffsetAndAverageBrightness[CaptureIndex];
// Fade out based on distance to capture
float DistanceAlpha = 0;
#define PROJECT_ONTO_SHAPE 1
#if PROJECT_ONTO_SHAPE
#if HAS_BOX_CAPTURES
#if HAS_SPHERE_CAPTURES
// Box
BRANCH if (CaptureProperties.b > 0)
#endif
{
ProjectedCaptureVector = GetLookupVectorForBoxCapture(RayDirection, WorldPosition, CapturePositionAndRadius, ReflectionCapture.BoxTransform[CaptureIndex], ReflectionCapture.BoxScales[CaptureIndex], CaptureOffsetAndAverageBrightness.xyz, DistanceAlpha);
}
#endif
#if HAS_SPHERE_CAPTURES
// Sphere
#if HAS_BOX_CAPTURES
else
#endif
{
ProjectedCaptureVector = GetLookupVectorForSphereCapture(RayDirection, WorldPosition, CapturePositionAndRadius, NormalizedDistanceToCapture, CaptureOffsetAndAverageBrightness.xyz, DistanceAlpha);
}
#endif
#else
DistanceAlpha = 1.0;
#endif //PROJECT_ONTO_SHAPE
float CaptureArrayIndex = CaptureProperties.g;
{
float4 Sample = ReflectionEnvironmentColorTexture.SampleLevel(ReflectionEnvironmentColorSampler, float4(ProjectedCaptureVector, CaptureArrayIndex), Mip);
Sample.rgb *= CaptureProperties.r;
Sample *= DistanceAlpha;
// Under operator (front to back)
ImageBasedReflections.rgb += Sample.rgb * ImageBasedReflections.a * IndirectSpecularOcclusion;
ImageBasedReflections.a *= 1 - Sample.a;
float AverageBrightness = CaptureOffsetAndAverageBrightness.w;
CompositedAverageBrightness.x += AverageBrightness * DistanceAlpha * CompositedAverageBrightness.y;
CompositedAverageBrightness.y *= 1 - DistanceAlpha;
}
}
}
#if HAS_SKYLIGHT
BRANCH
if (SkyLightParameters.y > 0)
{
float SkyAverageBrightness = 1.0f;
float3 SkyLighting = GetSkyLightReflectionSupportingBlend(RayDirection, Roughness, SkyAverageBrightness);
// Normalize for static skylight types which mix with lightmaps
bool bNormalize = SkyLightParameters.z < 1 && USE_LIGHTMAPS;
FLATTEN
if (bNormalize)
{
ImageBasedReflections.rgb += ImageBasedReflections.a * SkyLighting * IndirectSpecularOcclusion;
CompositedAverageBrightness.x += SkyAverageBrightness * CompositedAverageBrightness.y;
}
else
{
ExtraIndirectSpecular += SkyLighting * IndirectSpecularOcclusion;
}
}
#endif
#if USE_LIGHTMAPS
// Note: make sure this matches the lightmap mixing done for translucency (BasePassPixelShader.usf)
ImageBasedReflections.rgb *= ComputeMixingWeight(IndirectIrradiance, CompositedAverageBrightness.x, Roughness);
#endif
ImageBasedReflections.rgb += ImageBasedReflections.a * ExtraIndirectSpecular;
return ImageBasedReflections.rgb;
}
Texture2D ScreenSpaceReflections;
Texture2D InSceneColor;
/** Output HDR target. */
RWTexture2D<float4> RWOutSceneColor;
[numthreads(THREADGROUP_SIZEX, THREADGROUP_SIZEY, 1)]
void ReflectionEnvironmentTiledDeferredMain(
uint3 GroupId : SV_GroupID,
uint3 DispatchThreadId : SV_DispatchThreadID,
uint3 GroupThreadId : SV_GroupThreadID)
{
uint ThreadIndex = GroupThreadId.y * THREADGROUP_SIZEX + GroupThreadId.x;
uint2 PixelPos = DispatchThreadId.xy + ViewDimensions.xy;
float2 ViewportUV = (float2(DispatchThreadId.xy) + .5f) / (ViewDimensions.zw - ViewDimensions.xy);
float2 ScreenPosition = float2(2.0f, -2.0f) * ViewportUV + float2(-1.0f, 1.0f);
float SceneDepth = CalcSceneDepth(PixelPos);
float MinTileZ;
float MaxTileZ;
float MinTileZ2;
float MaxTileZ2;
ComputeTileMinMax(ThreadIndex, SceneDepth, MinTileZ, MaxTileZ, MinTileZ2, MaxTileZ2);
DoTileCulling(GroupId, ThreadIndex, MinTileZ, MaxTileZ, MinTileZ2, MaxTileZ2);
// Lookup GBuffer properties once per pixel
FScreenSpaceData ScreenSpaceData = GetScreenSpaceDataUint(PixelPos);
FGBufferData GBuffer = ScreenSpaceData.GBuffer;
float4 Color = float4(0, 0, 0, 1);
float4 HomogeneousWorldPosition = mul(float4(ScreenPosition * SceneDepth, SceneDepth, 1), View.ScreenToWorld);
float3 WorldPosition = HomogeneousWorldPosition.xyz / HomogeneousWorldPosition.w;
float3 CameraToPixel = normalize(WorldPosition - View.WorldCameraOrigin);
float3 ReflectionVector = reflect(CameraToPixel, GBuffer.WorldNormal);
float IndirectIrradiance = GBuffer.IndirectIrradiance;
#if HAS_SKYLIGHT && USE_LIGHTMAPS
BRANCH
// Add in diffuse contribution from dynamic skylights so reflection captures will have something to mix with
if (SkyLightParameters.y > 0 && SkyLightParameters.z > 0)
{
float2 ScreenUV = ScreenPosition * View.ScreenPositionScaleBias.xy + View.ScreenPositionScaleBias.wz;
IndirectIrradiance += GetDynamicSkyIndirectIrradiance(ScreenUV, GBuffer.WorldNormal);
}
#endif
#if VISUALIZE_OVERLAP
float Overlap = CountOverlap( WorldPosition );
#endif
BRANCH
if( GBuffer.ShadingModelID != SHADINGMODELID_UNLIT && GBuffer.ShadingModelID != SHADINGMODELID_HAIR )
{
float3 N = GBuffer.WorldNormal;
float3 V = -CameraToPixel;
float3 R = 2 * dot( V, N ) * N - V;
float NoV = saturate( dot( N, V ) );
// Point lobe in off-specular peak direction
R = GetOffSpecularPeakReflectionDir(N, R, GBuffer.Roughness);
#if 1
// Note: this texture may also contain planar reflections
float4 SSR = ScreenSpaceReflections.Load( int3(PixelPos, 0) );
Color.rgb = SSR.rgb;
Color.a = 1 - SSR.a;
#endif
if( GBuffer.ShadingModelID == SHADINGMODELID_CLEAR_COAT )
{
const float ClearCoat = GBuffer.CustomData.x;
Color = lerp( Color, float4(0,0,0,1), ClearCoat );
#if CLEAR_COAT_BOTTOM_NORMAL
const float2 oct1 = ((float2(GBuffer.CustomData.a, GBuffer.CustomData.z) * 2) - (256.0/255.0)) + UnitVectorToOctahedron(GBuffer.WorldNormal);
const float3 ClearCoatUnderNormal = OctahedronToUnitVector(oct1);
const float3 BottomEffectiveNormal = ClearCoatUnderNormal;
R = 2 * dot( V, ClearCoatUnderNormal ) * ClearCoatUnderNormal - V;
#endif
}
float AO = ScreenSpaceData.AmbientOcclusion;
float RoughnessSq = GBuffer.Roughness * GBuffer.Roughness;
float SpecularOcclusion = GetSpecularOcclusion(NoV, RoughnessSq, AO);
Color.a *= SpecularOcclusion;
//bottom for clearcoat or the only reflection.
Color.rgb += GatherRadiance(Color.a, WorldPosition, R, GBuffer.Roughness, ScreenPosition, IndirectIrradiance, NoV, GBuffer.ShadingModelID);
BRANCH
if( GBuffer.ShadingModelID == SHADINGMODELID_CLEAR_COAT )
{
const float ClearCoat = GBuffer.CustomData.x;
const float ClearCoatRoughness = GBuffer.CustomData.y;
// TODO EnvBRDF should have a mask param
float2 AB = PreIntegratedGF.SampleLevel( PreIntegratedGFSampler, float2( NoV, GBuffer.Roughness ), 0 ).rg;
Color.rgb *= GBuffer.SpecularColor * AB.x + AB.y * saturate( 50 * GBuffer.SpecularColor.g ) * (1 - ClearCoat);
// F_Schlick
float F0 = 0.04;
float Fc = Pow5( 1 - NoV );
float F = Fc + (1 - Fc) * F0;
F *= ClearCoat;
float LayerAttenuation = (1 - F);
Color.rgb *= LayerAttenuation;
Color.a = F;
Color.rgb += SSR.rgb * F;
Color.a *= 1 - SSR.a;
Color.a *= SpecularOcclusion;
float3 TopLayerR = 2 * dot( V, N ) * N - V;
Color.rgb += GatherRadiance(Color.a, WorldPosition, TopLayerR, ClearCoatRoughness, ScreenPosition, IndirectIrradiance, NoV, GBuffer.ShadingModelID);
}
else
{
Color.rgb *= EnvBRDF( GBuffer.SpecularColor, GBuffer.Roughness, NoV );
}
}
// Only write to the buffer for threads inside the view
BRANCH
if (all(DispatchThreadId.xy < ViewDimensions.zw))
{
float4 OutColor = 0;
#if VISUALIZE_OVERLAP
//OutColor.rgb = 0.1 * TileNumReflectionCaptures;
OutColor.rgb = 0.1 * Overlap;
#else
OutColor.rgb = Color.rgb;
#endif
// Transform NaNs to black, transform negative colors to black.
OutColor.rgb = -min(-OutColor.rgb, 0.0);
// alpha channel is also added to keep the alpha channel for screen space subsurface scattering
OutColor += InSceneColor.Load( int3(PixelPos, 0) );
RWOutSceneColor[PixelPos.xy] = OutColor;
}
}
#endif
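
The capture loop in GatherRadiance composites samples front to back with the "under" operator: each sample is weighted by the opacity still remaining, and the loop exits early once almost nothing remains. A minimal C++ sketch of just that accumulation follows, as a CPU-side reference and not engine code. CaptureSample, Composite and CompositeFrontToBack are hypothetical names, each sample is assumed to be already multiplied by its distance fade (as Sample is in the shader), and IndirectSpecularOcclusion is left out for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One capture's contribution, already multiplied by its distance fade.
struct CaptureSample { float R, G, B, A; };

// Accumulated color plus the opacity still available to later samples.
struct Composite { float R = 0.0f, G = 0.0f, B = 0.0f, Remaining = 1.0f; };

// Samples must be ordered front to back (smallest capture first in the shader).
// InitialAlpha plays the role of CompositeAlpha in GatherRadiance.
Composite CompositeFrontToBack(const std::vector<CaptureSample>& Samples, float InitialAlpha)
{
    Composite Out;
    Out.Remaining = InitialAlpha;
    for (const CaptureSample& S : Samples) {
        if (Out.Remaining < 0.001f) {
            break; // nearly opaque: later samples cannot contribute visibly
        }
        // Weight the new sample by what earlier samples left uncovered.
        Out.R += S.R * Out.Remaining;
        Out.G += S.G * Out.Remaining;
        Out.B += S.B * Out.Remaining;
        Out.Remaining *= 1.0f - S.A;
    }
    return Out;
}
```

Because the first sample gets the full remaining opacity, a fully opaque first capture hides everything behind it, which is why the order of the sorted capture list matters.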