#lockdown Nick.Penwarden
#rb none

==========================
MAJOR FEATURES + CHANGES
==========================

Change 3028958 on 2016/06/27 by Ben.Woodhouse

	Fix for perf issue with GetSingleFinalDataConst

	This was caused by the LPV integration/switch to blendables. Now we cache the flag for the directionalocclusion in the LPV class. This reduces calls to GetSingleFinalDataConst on the blendable data (potentially slow), and makes things a bit cleaner and more consistent.

	Tested in QAGame editor (with LPV enabled in ConsoleSettings.ini)

	#jira UE-26179

Change 3029401 on 2016/06/27 by Rolando.Caloca

	DR - More vk logging

Change 3029549 on 2016/06/27 by Uriel.Doyon

	Refactored "r.OnlyStreamInTextures" into "r.Streaming.FullyLoadUsedTextures", making it fully load every used texture, as an alternative to disabling texture streaming.
	New option "r.Streaming.UsePerTextureBias" that assigns a bias between 0 and MipBias to each texture in order to fit in budget.
	Fixed crash when disabling texture streaming.
	Fixed issue when disabling texture streaming that would make currently loaded textures low res.
	New logic to prevent retrying to cancel a streaming request more than once.
	Pending load request of one extra mip will not be cancelled anymore.
	Changed UTexture2D from float to double. Also using FApp::GetCurrentTime() instead of FPlatformTime::Seconds().

	#jira UE-32197
	#jira UE-31102

Change 3029837 on 2016/06/27 by David.Hill

	Fixed Shutter SM4 not working when using compute shader eye-adaptation

	#jira UE-32443

	The default eye adaptation value was missing.

Change 3030039 on 2016/06/27 by Uriel.Doyon

	Fix for crash when landscape materials are used in the Texture Streaming Build.

	#jira UE-32196

Change 3030081 on 2016/06/27 by Uriel.Doyon

	Updated MaterialTexCoordScalesPixelShader to use PackedEyeIndex, preventing crash when building the map with stereo rendering enabled.
Change 3030401 on 2016/06/28 by Ben.Woodhouse

	Perf Monitor: Fix for perf warning due to cvar FindConsoleVariable being called too frequently.

	Tested in QAGame editor (DX11)

	#jira UE-31238

Change 3030607 on 2016/06/28 by Marc.Olano

	Random Number generators: fixed bug in TEA, added integer and float Blum-Blum-Shub. BBS is way cheaper for similar quality, suggest it for future use.

Change 3030627 on 2016/06/28 by Ben.Woodhouse

	Fix for warning. CVar naming scope clash (doesn't appear to happen in vs2015).

Change 3030809 on 2016/06/28 by Marc.Olano

	Noise shader function rename & perf improvement.

	Due to incorrect terminology in internet sources, previous "Perlin" noise was not, in fact, Perlin noise. Now more accurately called "Value" noise. 6x perf improvement for value noise by changing random number function to BBS. Also updated instruction counts in UI tooltips.

Change 3030850 on 2016/06/28 by Marc.Olano

	Rename & redirect noise material enums. At some point these got switched around and no longer accurately described the noise options they selected. Redirect, so all existing content will continue to work as-is. Updated UDN docs to match.

Change 3030981 on 2016/06/28 by Rolando.Caloca

	DR - vk - More logging

Change 3031056 on 2016/06/28 by Marc.Olano

	Introduce new pure-ALU gradient shader noise. Add noise samples to RenderTest map

Change 3031398 on 2016/06/28 by Benjamin.Hyder

	updating TM-Shadermodels (correcting Mt Rushmore)

Change 3031441 on 2016/06/28 by Marc.Olano

	Use only float version of BBS shader rand function for ES2

Change 3031463 on 2016/06/28 by John.Billon

	Fixed F4 changing the viewmode in Fortnite editor. The detailed lighting viewmode (detaillighting) named in DefaultInput.ini differed from the one in BaseInput.ini (lit_detaillighting).

	#Jira UE-32020

Change 3031512 on 2016/06/28 by Zabir.Hoque

	Relax clear flags for DX12 RHIs.
	Properly flush pending commands before residency is updated.
Change 3031517 on 2016/06/28 by Rolando.Caloca

	DR - vk logging using r.Vulkan.DumpLayer

Change 3032359 on 2016/06/29 by Allan.Bentham

	Fix mobile shadows crash.

Change 3032431 on 2016/06/29 by Gil.Gribb

	Merging //UE4/Dev-Main@3032394 to Dev-Rendering (//UE4/Dev-Rendering)

Change 3032757 on 2016/06/29 by Uriel.Doyon

	Fixed global mip bias being applied twice following integration with main.

Change 3033121 on 2016/06/29 by Rolando.Caloca

	DR - vk - Logging

Change 3033529 on 2016/06/29 by Daniel.Wright

	Null world guard on UReflectionCaptureComponent::ReadbackFromGPU

Change 3033668 on 2016/06/29 by Uriel.Doyon

	Grouped texture streaming settings to simplify logic.
	New option "r.Streaming.UseAllMips" to ignore the different lod and cinematic bias

	#jira UE-32118

Change 3034403 on 2016/06/30 by Rolando.Caloca

	DR - Shorten dumped shader debug strings

Change 3034475 on 2016/06/30 by Rolando.Caloca

	DR - Missing logging

Change 3034722 on 2016/06/30 by Uriel.Doyon

	Improved StreamingAccuracy viewmodes with alpha test and translucent materials

	#jira UE-32656

Change 3034797 on 2016/06/30 by Rolando.Caloca

	DR - vk - 'fix' RHIClear but causes a CPU hang on AMD, so disabled again

Change 3034799 on 2016/06/30 by Rolando.Caloca

	DR - vk - missed file

Change 3034905 on 2016/06/30 by Rolando.Caloca

	DR - vk - Fix for render passes being reused with wrong dimensions

Change 3035503 on 2016/07/01 by Simon.Tovey

	Async compute version of translucency lighting volume clear.

Change 3035577 on 2016/07/01 by Marc.Olano

	Tiling noise. Adds tiling option for gradient, gradient texture, and value noise in the noise material node. Tiling is more expensive, but allows noise functions to be baked into a seamless repeating texture.
Change 3035587 on 2016/07/01 by Ben.Woodhouse

	Fix for async SSAO bug (SSAO Async Compute results are used before the async job wait)

	#jira UE-32709

Change 3035618 on 2016/07/01 by Olaf.Piesche

	Asset fixes

Change 3035692 on 2016/07/01 by Rolando.Caloca

	DR - vk - Deferred deletion queue

Change 3035808 on 2016/07/01 by Rolando.Caloca

	DR - vk - Stat for deletion time, fixed some logging

Change 3036012 on 2016/07/01 by John.Billon

	Alpha Coverage Preservation
	- Textures have an Alpha Preservation Vec4 property which dictates how much of that channel to preserve down the mip chain during mip generation.

	#Jira UE-31986

Change 3036041 on 2016/07/01 by Rolando.Caloca

	DR - vk - Fix for 32bit

Change 3036433 on 2016/07/01 by Rolando.Caloca

	DR - More vk logging

Change 3036935 on 2016/07/04 by Simon.Tovey

	Removing Data Objects

Change 3036942 on 2016/07/04 by Ben.Woodhouse

	Fix for decal rendering resource leak

	The cause was that FD3D11BoundRenderTargets doesn't support setting RTs sparsely. So if one element is NULL, it won't release the ones after it. The sparse RT layout happened as a result of a change back in October, which meant that GBuffers for decals could be set sparsely, dependent on whether the decal wrote to the normalbuffer.

	This change adds support for sparsely bound rendertargets in FD3D11BoundRenderTargets.

	#jira UE-32602

Change 3037563 on 2016/07/05 by Chris.Bunner

	HLOD self-shadowing in baked lighting fix.

Change 3037640 on 2016/07/05 by Marcus.Wassmer

	Fix bug in USE_GPU_OVERWRITE_CHECKING

Change 3037927 on 2016/07/05 by Rolando.Caloca

	DR - Fix touch pads not showing on Vulkan

	#jira UE-32062

Change 3038085 on 2016/07/05 by Chris.Bunner

	HLOD dynamic shadowing support.

	#jira UE-22627

Change 3038209 on 2016/07/05 by Rolando.Caloca

	DR - vk - Android compile fix

Change 3038644 on 2016/07/05 by Uriel.Doyon

	Added LerpRange, which allows lerping between two rotators without taking the shortest path.
Change 3038820 on 2016/07/05 by Uriel.Doyon

	Selecting streaming accuracy view modes will not automatically generate missing visualization data.

Change 3039332 on 2016/07/06 by John.Billon

	- Made MaxGPUSkinBonesCvar a FAutoConsoleVariableRef and moved it to mesh utilities from console manager to fix a thread initialization problem.

	#Jira UE-31710

Change 3039454 on 2016/07/06 by Simon.Tovey

	Moved all Niagara files from Engine and UnrealEd to remove dependencies and improve compile times.
	Niagara is now 99.999% decoupled from engine and editor so development should be much streamlined.
	Plus a few other edits to remove Curves/DataObjects that I missed in last CL.

Change 3039517 on 2016/07/06 by Gil.Gribb

	Merging //UE4/Dev-Main@3039013 to Dev-Rendering (//UE4/Dev-Rendering)

Change 3039587 on 2016/07/06 by Rolando.Caloca

	DR - vk logging, submit counter

Change 3039603 on 2016/07/06 by Rolando.Caloca

	DR - Allow more samplers on GL4

	#jira UE-32628
	#jira UE-32744

Change 3039661 on 2016/07/06 by Daniel.Wright

	Fixed non-directional DFAO occlusion on specular 'r.AOSpecularOcclusionMode 0'
	Skylight occlusion tint now applies to specular
	Skylight occlusion tint on diffuse is now correctly affected by DiffuseColor

Change 3039960 on 2016/07/06 by Daniel.Wright

	Forward renderer initial implementation
	* Point and spot lights are culled to a frustum space grid, base pass loops over culled lights.
	* Light culling uses a reverse linked list to avoid a per-cell limit, and the linked list is compacted to an array before the base pass.
	* New cvars to control light culling: r.Forward.MaxCulledLightsPerCell, r.Forward.LightGridSizeZ, r.Forward.LightGridPixelSize
	* A full Z Prepass is forced with forward shading. This allows deferred rendering before the base pass of shadow projection methods that only rely on depth.
	* Dynamic shadows are packed based on the assigned stationary light ShadowMapChannel, since stationary lights are already restricted to 4 overlapping.
	* GBuffer render targets are still allocated
	* Fixed several issues in parallax corrected base pass reflections - not blending out box shape, discontinuity in reflection vector, not blending with stationary skylight properly
	* Forward shading is now used for TLM_SurfacePerPixelLighting translucency in the deferred path
	* Notable missing features: shadowing of translucency, support for various translucency lighting modes, multiple blended reflection captures

Change 3040050 on 2016/07/06 by Daniel.Wright

	Added r.Shadow.WholeSceneShadowCacheMb, which defaults to 150, to limit how much memory can be spent caching whole scene shadowmaps

Change 3040160 on 2016/07/06 by Daniel.Wright

	Fixed tile artifacts in indirect capsule shadows from doing the scaled sphere vs tile bounding sphere intersection in the wrong space

Change 3040163 on 2016/07/06 by Rolando.Caloca

	DR - vk - More logging

Change 3040257 on 2016/07/06 by Daniel.Wright

	Skylights aren't captured until their level is made visible - fixes the case where skylights capture too early

Change 3040316 on 2016/07/06 by Daniel.Wright

	PerObject shadows from point / spot lights do the light source pull back based on subject box size, not subject radius, since the box is used to find a valid < 90 degree projection. Fix from licensee

Change 3040361 on 2016/07/06 by Daniel.Wright

	Fixed TexCreate_UAV being used on translucency volume textures in SM4

Change 3040402 on 2016/07/06 by Rolando.Caloca

	DR - vk - Make host mem accesses coherent

Change 3040486 on 2016/07/06 by Daniel.Wright

	CIS fixes

Change 3041028 on 2016/07/07 by Gil.Gribb

	Merging //UE4/Dev-Main@3040917 to Dev-Rendering (//UE4/Dev-Rendering)

Change 3041235 on 2016/07/07 by Simon.Tovey

	Compile fix for FName conflict on UProperty (hopefully).
Change 3041666 on 2016/07/07 by Daniel.Wright

	Fixed TLM_SurfacePerPixelLighting in SM4, falls back to lighting volume

Change 3041731 on 2016/07/07 by Olaf.Piesche

	Adding Niagara to dynamically loaded module list; should fix UE-32915

Change 3042181 on 2016/07/07 by Daniel.Wright

	CIS fix

[CL 3045471 by Gil Gribb in Main branch]
// Copyright 1998-2016 Epic Games, Inc. All Rights Reserved.

/*=============================================================================
	CapsuleShadowShaders.usf: Tiled deferred culling and shadowing from capsule shapes
=============================================================================*/

#include "Common.usf"
#include "DeferredShadingCommon.usf"
#include "FastMath.usf"

#ifndef THREADGROUP_SIZEX
#define THREADGROUP_SIZEX 1
#endif

#ifndef THREADGROUP_SIZEY
#define THREADGROUP_SIZEY 1
#endif

#ifndef LIGHT_SOURCE_MODE
#define LIGHT_SOURCE_MODE 0
#endif

#define LIGHT_SOURCE_PUNCTUAL 0
#define LIGHT_SOURCE_FROM_CAPSULE 1
#define LIGHT_SOURCE_FROM_RECEIVER 2

#define MAX_INTERSECTING_SHAPES 512
groupshared uint IntersectingShapeIndices[MAX_INTERSECTING_SHAPES * 2];

uint NumShadowCapsules;
Buffer<float4> ShadowCapsuleShapes;

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_FROM_CAPSULE
Buffer<float4> LightDirectionData;
#endif

bool SphereIntersectCone(float4 SphereCenterAndRadius, float3 ConeVertex, float3 ConeAxis, float ConeAngleCos, float ConeAngleSin)
{
	float3 U = ConeVertex - (SphereCenterAndRadius.w / ConeAngleSin) * ConeAxis;
	float3 D = SphereCenterAndRadius.xyz - U;
	float DSizeSq = dot(D, D);
	float E = dot(ConeAxis, D);

	if (E > 0 && E * E >= DSizeSq * ConeAngleCos * ConeAngleCos)
	{
		D = SphereCenterAndRadius.xyz - ConeVertex;
		DSizeSq = dot(D, D);
		E = -dot(ConeAxis, D);

		if (E > 0 && E * E >= DSizeSq * ConeAngleSin * ConeAngleSin)
		{
			return DSizeSq <= SphereCenterAndRadius.w * SphereCenterAndRadius.w;
		}
		else
		{
			return true;
		}
	}

	return false;
}

bool SphereIntersectConeWithMaxDistance(float4 SphereCenterAndRadius, float3 ConeVertex, float3 ConeAxis, float ConeAngleCos, float ConeAngleSin, float MaxDistanceAlongAxis)
{
	if (SphereIntersectCone(SphereCenterAndRadius, ConeVertex, ConeAxis, ConeAngleCos, ConeAngleSin))
	{
		float ConeAxisDistance = dot(SphereCenterAndRadius.xyz - ConeVertex, ConeAxis);
		float ConeAxisDistanceMax = ConeAxisDistance - SphereCenterAndRadius.w;

		return ConeAxisDistanceMax < MaxDistanceAlongAxis;
	}

	return false;
}

bool SphereIntersectSphere(float4 SphereCenterAndRadius, float4 OtherSphereCenterAndRadius)
{
	float CombinedRadii = SphereCenterAndRadius.w + OtherSphereCenterAndRadius.w;
	float3 VectorBetweenCenters = SphereCenterAndRadius.xyz - OtherSphereCenterAndRadius.xyz;
	return dot(VectorBetweenCenters, VectorBetweenCenters) < CombinedRadii * CombinedRadii;
}

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_PUNCTUAL
/** From point being shaded toward light, for directional lights. */
float3 LightDirection;
float4 LightPositionAndInvRadius;
float LightSourceRadius;
float RayStartOffsetDepthScale;
float3 LightAngleAndNormalThreshold;
#endif

uint4 ScissorRectMinAndSize;
float2 NumGroups;

/** Min and Max depth for this tile. */
groupshared uint IntegerTileMinZ;
groupshared uint IntegerTileMaxZ;

/** Inner Min and Max depth for this tile. */
groupshared uint IntegerTileMinZ2;
groupshared uint IntegerTileMaxZ2;

/** Number of capsules affecting the tile, after culling. */
groupshared uint TileNumCapsules0;
groupshared uint TileNumCapsules1;

struct FTileCullingData
{
	float4 BoundingSphere;
#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_PUNCTUAL
	float3 ConeAxis;
	float ConeAngleCos;
	float ConeAngleSin;
#endif
};

void SetupTileCullingData(
	float SceneDepth,
	float MaxDepth,
	uint ThreadIndex,
	uint2 GroupId,
	out FTileCullingData TileCullingData0,
	out FTileCullingData TileCullingData1,
	out bool bTileShouldComputeShadowing,
	out uint GroupIndex)
{
	// Initialize per-tile variables
	if (ThreadIndex == 0)
	{
		IntegerTileMinZ = 0x7F7FFFFF;
		IntegerTileMaxZ = 0;
		IntegerTileMinZ2 = 0x7F7FFFFF;
		IntegerTileMaxZ2 = 0;
		TileNumCapsules0 = 0;
		TileNumCapsules1 = 0;
	}

	GroupMemoryBarrierWithGroupSync();

	// Use shared memory atomics to build the depth bounds for this tile
	// Each thread is assigned to a pixel at this point
	//@todo - move depth range computation to a central point where it can be reused by all the frame's tiled deferred passes!

	if (SceneDepth < MaxDepth)
	{
		InterlockedMin(IntegerTileMinZ, asuint(SceneDepth));
		InterlockedMax(IntegerTileMaxZ, asuint(SceneDepth));
	}

	GroupMemoryBarrierWithGroupSync();

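	// Note on the InterlockedMin / InterlockedMax calls above (an editorial sketch of the reasoning,
	// not part of the original source): for non-negative IEEE 754 floats, the unsigned integer ordering
	// of the bit patterns matches the floating point ordering, so integer atomics can stand in for
	// float min / max. For example:
	//   asuint(1.0f)    = 0x3F800000
	//   asuint(2.0f)    = 0x40000000
	//   asuint(FLT_MAX) = 0x7F7FFFFF   (the initial value written to IntegerTileMinZ)
	// This relies on the assumption that SceneDepth is never negative; negative floats have the sign
	// bit set and would compare as larger than every positive value when viewed as uints.
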
	float MinTileZ = asfloat(IntegerTileMinZ);
	float MaxTileZ = asfloat(IntegerTileMaxZ);

	float HalfZ = .5f * (MinTileZ + MaxTileZ);

	if (SceneDepth < MaxDepth)
	{
		// Compute a second min and max Z, clipped by HalfZ, so that we get two depth bounds per tile
		// This results in more conservative tile depth bounds and fewer intersections
		if (SceneDepth >= HalfZ)
		{
			InterlockedMin(IntegerTileMinZ2, asuint(SceneDepth));
		}

		if (SceneDepth <= HalfZ)
		{
			InterlockedMax(IntegerTileMaxZ2, asuint(SceneDepth));
		}
	}

	GroupMemoryBarrierWithGroupSync();

	float MinTileZ2 = asfloat(IntegerTileMinZ2);
	float MaxTileZ2 = asfloat(IntegerTileMaxZ2);

	bTileShouldComputeShadowing = true;

	if (IntegerTileMinZ == 0x7F7FFFFF && IntegerTileMaxZ == 0)
	{
		bTileShouldComputeShadowing = false;
	}

	float3 ViewTileMin;
	float3 ViewTileMax;

	float3 ViewTileMin2;
	float3 ViewTileMax2;

	bool bCenteredProjection = abs(View.ViewToClip[1][0]) < .00001f && abs(View.ViewToClip[2][0]) < .00001f;

	BRANCH
	// Off center projection path uses 37 more asm instructions
	if (bCenteredProjection)
	{
		float2 TanViewFOV = GetTanHalfFieldOfView();
		// tan(FOV) = HalfUnitPlaneWidth / 1, so TanViewFOV * 2 is the size of the whole unit view plane
		// We are operating on a subset of that defined by ScissorRectMinAndSize
		float2 TileSize = TanViewFOV * 2 * ScissorRectMinAndSize.zw / ((float2)View.ViewSizeAndInvSize.xy * NumGroups);
		float2 UnitPlaneMin = -TanViewFOV + TanViewFOV * 2 * (ScissorRectMinAndSize.xy - View.ViewRectMin.xy) * View.ViewSizeAndInvSize.zw;

		float2 UnitPlaneTileMin = (GroupId.xy * TileSize + UnitPlaneMin) * float2(1, -1);
		float2 UnitPlaneTileMax = ((GroupId.xy + 1) * TileSize + UnitPlaneMin) * float2(1, -1);

		ViewTileMin.xy = min(MinTileZ * UnitPlaneTileMin, MaxTileZ2 * UnitPlaneTileMin);
		ViewTileMax.xy = max(MinTileZ * UnitPlaneTileMax, MaxTileZ2 * UnitPlaneTileMax);
		ViewTileMin.z = MinTileZ;
		ViewTileMax.z = MaxTileZ2;
		ViewTileMin2.xy = min(MinTileZ2 * UnitPlaneTileMin, MaxTileZ * UnitPlaneTileMin);
		ViewTileMax2.xy = max(MinTileZ2 * UnitPlaneTileMax, MaxTileZ * UnitPlaneTileMax);
		ViewTileMin2.z = MinTileZ2;
		ViewTileMax2.z = MaxTileZ;
	}
	else
	{
		float2 TileSize = 2 * ScissorRectMinAndSize.zw / ((float2)View.ViewSizeAndInvSize.xy * NumGroups);
		float2 UnitPlaneMin = -1 + 2 * (ScissorRectMinAndSize.xy - View.ViewRectMin.xy) * View.ViewSizeAndInvSize.zw;

		float2 UnitPlaneTileMin = (GroupId.xy * TileSize + UnitPlaneMin) * float2(1, -1);
		float2 UnitPlaneTileMax = ((GroupId.xy + 1) * TileSize + UnitPlaneMin) * float2(1, -1);

		{
			float MinTileDeviceZ = ConvertToDeviceZ(MinTileZ);
			float4 MinDepthMinCorner = mul(float4(UnitPlaneTileMin.x, UnitPlaneTileMin.y, MinTileDeviceZ, 1), View.ClipToView);
			float4 MinDepthMaxCorner = mul(float4(UnitPlaneTileMax.x, UnitPlaneTileMax.y, MinTileDeviceZ, 1), View.ClipToView);

			float MaxTileDeviceZ = ConvertToDeviceZ(MaxTileZ2);
			float4 MaxDepthMinCorner = mul(float4(UnitPlaneTileMin.x, UnitPlaneTileMin.y, MaxTileDeviceZ, 1), View.ClipToView);
			float4 MaxDepthMaxCorner = mul(float4(UnitPlaneTileMax.x, UnitPlaneTileMax.y, MaxTileDeviceZ, 1), View.ClipToView);

			ViewTileMin.xy = min(MinDepthMinCorner.xy / MinDepthMinCorner.w, MaxDepthMinCorner.xy / MaxDepthMinCorner.w);
			ViewTileMax.xy = max(MinDepthMaxCorner.xy / MinDepthMaxCorner.w, MaxDepthMaxCorner.xy / MaxDepthMaxCorner.w);
			ViewTileMin.z = MinTileZ;
			ViewTileMax.z = MaxTileZ2;
		}

		{
			float MinTileDeviceZ = ConvertToDeviceZ(MinTileZ2);
			float4 MinDepthMinCorner = mul(float4(UnitPlaneTileMin.x, UnitPlaneTileMin.y, MinTileDeviceZ, 1), View.ClipToView);
			float4 MinDepthMaxCorner = mul(float4(UnitPlaneTileMax.x, UnitPlaneTileMax.y, MinTileDeviceZ, 1), View.ClipToView);

			float MaxTileDeviceZ = ConvertToDeviceZ(MaxTileZ);
			float4 MaxDepthMinCorner = mul(float4(UnitPlaneTileMin.x, UnitPlaneTileMin.y, MaxTileDeviceZ, 1), View.ClipToView);
			float4 MaxDepthMaxCorner = mul(float4(UnitPlaneTileMax.x, UnitPlaneTileMax.y, MaxTileDeviceZ, 1), View.ClipToView);

			ViewTileMin2.xy = min(MinDepthMinCorner.xy / MinDepthMinCorner.w, MaxDepthMinCorner.xy / MaxDepthMinCorner.w);
			ViewTileMax2.xy = max(MinDepthMaxCorner.xy / MinDepthMaxCorner.w, MaxDepthMaxCorner.xy / MaxDepthMaxCorner.w);
			ViewTileMin2.z = MinTileZ2;
			ViewTileMax2.z = MaxTileZ;
		}
	}

	float3 ViewGroup0Center = (ViewTileMax + ViewTileMin) / 2;
	TileCullingData0.BoundingSphere.xyz = mul(float4(ViewGroup0Center, 1), View.ViewToTranslatedWorld).xyz - View.PreViewTranslation;
	TileCullingData0.BoundingSphere.w = length(ViewGroup0Center - ViewTileMax);

	float3 ViewGroup1Center = (ViewTileMax2 + ViewTileMin2) / 2;
	TileCullingData1.BoundingSphere.xyz = mul(float4(ViewGroup1Center, 1), View.ViewToTranslatedWorld).xyz - View.PreViewTranslation;
	TileCullingData1.BoundingSphere.w = length(ViewGroup1Center - ViewTileMax2);

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_PUNCTUAL
	#if POINT_LIGHT
		float3 LightVector0 = LightPositionAndInvRadius.xyz - TileCullingData0.BoundingSphere.xyz;
		float LightVector0Length = length(LightVector0);
		float3 LightVector1 = LightPositionAndInvRadius.xyz - TileCullingData1.BoundingSphere.xyz;
		float LightVector1Length = length(LightVector1);
		TileCullingData0.ConeAxis = LightVector0 / LightVector0Length;
		TileCullingData1.ConeAxis = LightVector1 / LightVector1Length;
		float TanLightAngle0 = LightSourceRadius / LightVector0Length;
		float TanLightAngle1 = LightSourceRadius / LightVector1Length;

		TileCullingData0.ConeAngleCos = 1.0f / sqrt(1 + TanLightAngle0 * TanLightAngle0);
		TileCullingData0.ConeAngleSin = TileCullingData0.ConeAngleCos * TanLightAngle0;

		TileCullingData1.ConeAngleCos = 1.0f / sqrt(1 + TanLightAngle1 * TanLightAngle1);
		TileCullingData1.ConeAngleSin = TileCullingData1.ConeAngleCos * TanLightAngle1;

		// Don't operate on tiles completely outside of the light's influence
		bool bTileInLightInfluenceBounds = LightVector0Length < 1.0f / LightPositionAndInvRadius.w + TileCullingData0.BoundingSphere.w
			|| LightVector1Length < 1.0f / LightPositionAndInvRadius.w + TileCullingData1.BoundingSphere.w;

		bTileShouldComputeShadowing = bTileShouldComputeShadowing && bTileInLightInfluenceBounds;

	#else
		TileCullingData0.ConeAxis = TileCullingData1.ConeAxis = LightDirection;
		TileCullingData0.ConeAngleCos = TileCullingData1.ConeAngleCos = cos(LightAngleAndNormalThreshold.x);
		TileCullingData0.ConeAngleSin = TileCullingData1.ConeAngleSin = sin(LightAngleAndNormalThreshold.x);
	#endif
#endif

	GroupIndex = SceneDepth > MaxTileZ2 ? 1 : 0;
}

// Scaled sphere intersection allows capsule shadows to blend together better when penumbras are large, so use for indirect.
// Otherwise an occluder sphere will be extracted from the capsule and used for shadowing.
// This maintains shadow silhouette shapes better but has a discontinuity when the capsule direction is nearly parallel to the light direction.
#define USE_SCALED_SPHERE_INTERSECTION (LIGHT_SOURCE_MODE != LIGHT_SOURCE_PUNCTUAL)

uint CullCapsuleShapesToTile(
	uint ThreadIndex,
	uint GroupIndex,
	float MaxOcclusionDistance,
	FTileCullingData TileCullingData0,
	FTileCullingData TileCullingData1)
{
#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_PUNCTUAL

	float3 ConeAxis0 = TileCullingData0.ConeAxis;
	float ConeAngleCos0 = TileCullingData0.ConeAngleCos;
	float ConeAngleSin0 = TileCullingData0.ConeAngleSin;
	float3 ConeAxis1 = TileCullingData1.ConeAxis;
	float ConeAngleCos1 = TileCullingData1.ConeAngleCos;
	float ConeAngleSin1 = TileCullingData1.ConeAngleSin;

#endif

	LOOP
	for (uint ShapeIndex = ThreadIndex; ShapeIndex < NumShadowCapsules; ShapeIndex += THREADGROUP_SIZEX * THREADGROUP_SIZEY)
	{
#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_FROM_CAPSULE

		float4 LightData = LightDirectionData[ShapeIndex];
		float3 ConeAxis0 = LightData.xyz;
		float LightAngle = LightData.w;
		float ConeAngleCos0 = cos(LightAngle);
		float ConeAngleSin0 = sin(LightAngle);

		float3 ConeAxis1 = ConeAxis0;
		float ConeAngleCos1 = ConeAngleCos0;
		float ConeAngleSin1 = ConeAngleSin0;

#endif

		float4 SphereCenterAndRadius = ShadowCapsuleShapes[ShapeIndex * 2];
		float3 TransformedSphereCenter = SphereCenterAndRadius.xyz;
		float TransformedSphereRadius = SphereCenterAndRadius.w;
		float3 TransformedTileBoundingSphereCenter0 = TileCullingData0.BoundingSphere.xyz;
		float3 TransformedTileBoundingSphereCenter1 = TileCullingData1.BoundingSphere.xyz;

#if LIGHT_SOURCE_MODE != LIGHT_SOURCE_FROM_RECEIVER
		float3 TransformedConeAxis0 = ConeAxis0;
		float3 TransformedConeAxis1 = ConeAxis1;
#endif

#if USE_SCALED_SPHERE_INTERSECTION

		float4 CapsuleCenterAndRadius = SphereCenterAndRadius;
		float4 CapsuleOrientationAndLength = ShadowCapsuleShapes[ShapeIndex * 2 + 1];

		float3 CapsuleSpaceX;
		float3 CapsuleSpaceY;
		float3 CapsuleSpaceZ = CapsuleOrientationAndLength.xyz;
		GenerateCoordinateSystem(CapsuleSpaceZ, CapsuleSpaceX, CapsuleSpaceY);

		// Scale required along the capsule's axis to turn it into a sphere (assuming it was originally a scaled sphere instead of a capsule)
		float CapsuleZScale = CapsuleCenterAndRadius.w / (.5f * CapsuleOrientationAndLength.w + CapsuleCenterAndRadius.w);
		CapsuleSpaceZ *= CapsuleZScale;

		// The capsule is centered at 0 in the scaled sphere space
		TransformedSphereCenter = 0;

		// After scaling along the capsule axis it will become a sphere with the original radius
		TransformedSphereRadius = SphereCenterAndRadius.w;

		// Transform the sphere center and cone axis into the scaled sphere space
		float3 CapsuleCenterToTileCenter0 = TileCullingData0.BoundingSphere.xyz - CapsuleCenterAndRadius.xyz;
		TransformedTileBoundingSphereCenter0 = float3(dot(CapsuleCenterToTileCenter0, CapsuleSpaceX), dot(CapsuleCenterToTileCenter0, CapsuleSpaceY), dot(CapsuleCenterToTileCenter0, CapsuleSpaceZ));

		float3 CapsuleCenterToTileCenter1 = TileCullingData1.BoundingSphere.xyz - CapsuleCenterAndRadius.xyz;
		TransformedTileBoundingSphereCenter1 = float3(dot(CapsuleCenterToTileCenter1, CapsuleSpaceX), dot(CapsuleCenterToTileCenter1, CapsuleSpaceY), dot(CapsuleCenterToTileCenter1, CapsuleSpaceZ));

	#if LIGHT_SOURCE_MODE != LIGHT_SOURCE_FROM_RECEIVER
		// Renormalize the cone axis as it went through a non-uniformly scaled transform
		TransformedConeAxis0 = normalize(float3(dot(ConeAxis0, CapsuleSpaceX), dot(ConeAxis0, CapsuleSpaceY), dot(ConeAxis0, CapsuleSpaceZ)));
		TransformedConeAxis1 = normalize(float3(dot(ConeAxis1, CapsuleSpaceX), dot(ConeAxis1, CapsuleSpaceY), dot(ConeAxis1, CapsuleSpaceZ)));
	#endif

#else

		float CapsuleLength = ShadowCapsuleShapes[ShapeIndex * 2 + 1].w;

		// Add half capsule length to bounding sphere
		TransformedSphereRadius = SphereCenterAndRadius.w + .5f * CapsuleLength;
#endif

		BRANCH
		if (SphereIntersectSphere(float4(TransformedSphereCenter, TransformedSphereRadius + MaxOcclusionDistance), float4(TransformedTileBoundingSphereCenter0, TileCullingData0.BoundingSphere.w))
#if LIGHT_SOURCE_MODE != LIGHT_SOURCE_FROM_RECEIVER
			&& SphereIntersectConeWithMaxDistance(float4(TransformedSphereCenter, TransformedSphereRadius + TileCullingData0.BoundingSphere.w), TransformedTileBoundingSphereCenter0, TransformedConeAxis0, ConeAngleCos0, ConeAngleSin0, MaxOcclusionDistance)
#endif
			)
		{
			uint ListIndex;
			InterlockedAdd(TileNumCapsules0, 1U, ListIndex);
			// Don't overwrite on overflow
			ListIndex = min(ListIndex, (uint)(MAX_INTERSECTING_SHAPES - 1));
			IntersectingShapeIndices[MAX_INTERSECTING_SHAPES * 0 + ListIndex] = ShapeIndex;
		}

		BRANCH
		if (SphereIntersectSphere(float4(TransformedSphereCenter, TransformedSphereRadius + MaxOcclusionDistance), float4(TransformedTileBoundingSphereCenter1, TileCullingData1.BoundingSphere.w))
#if LIGHT_SOURCE_MODE != LIGHT_SOURCE_FROM_RECEIVER
			&& SphereIntersectConeWithMaxDistance(float4(TransformedSphereCenter, TransformedSphereRadius + TileCullingData1.BoundingSphere.w), TransformedTileBoundingSphereCenter1, TransformedConeAxis1, ConeAngleCos1, ConeAngleSin1, MaxOcclusionDistance)
#endif
			)
		{
			uint ListIndex;
			InterlockedAdd(TileNumCapsules1, 1U, ListIndex);
			// Don't write out of bounds on overflow
			ListIndex = min(ListIndex, (uint)(MAX_INTERSECTING_SHAPES - 1));
			IntersectingShapeIndices[MAX_INTERSECTING_SHAPES * 1 + ListIndex] = ShapeIndex;
		}
	}

	GroupMemoryBarrierWithGroupSync();

	return min(GroupIndex == 0 ? TileNumCapsules0 : TileNumCapsules1, (uint)MAX_INTERSECTING_SHAPES);
}

// Approximate the area of intersection of two spherical caps, from 'Ambient Aperture Lighting'
// fRadius0 : First cap's radius (arc length in radians)
// fRadius1 : Second cap's radius (in radians)
// fDist : Distance between caps (radians between centers of caps)
float SphericalCapIntersectionAreaFast(float fRadius0, float fRadius1, float fDist)
{
	float fArea;

	if ( fDist <= max(fRadius0, fRadius1) - min(fRadius0, fRadius1) )
	{
		// One cap is completely inside the other
		fArea = 6.283185308f - 6.283185308f * cos( min(fRadius0,fRadius1) );
	}
	else if ( fDist >= fRadius0 + fRadius1 )
	{
		// No intersection exists
		fArea = 0;
	}
	else
	{
		float fDiff = abs(fRadius0 - fRadius1);
		fArea = smoothstep(0.0f,
			1.0f,
			1.0f - saturate((fDist-fDiff)/(fRadius0+fRadius1-fDiff)));
		fArea *= 6.283185308f - 6.283185308f * cos( min(fRadius0,fRadius1) );
	}
	return fArea;
}

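// Worked example for SphericalCapIntersectionAreaFast (an editorial sketch with illustrative numbers,
// not part of the original source). The solid angle of a spherical cap with angular radius r on the
// unit sphere is 2 * PI * (1 - cos(r)), which is the 6.283185308f - 6.283185308f * cos(r) expression:
//   r = PI / 2 (hemisphere): 2 * PI * (1 - 0)   = 6.2832
//   r = PI / 3:              2 * PI * (1 - 0.5) = 3.1416
// In the partially overlapping branch, fRadius0 = 0.3, fRadius1 = 0.2, fDist = 0.3 gives
//   fDiff = 0.1
//   1 - saturate((0.3 - 0.1) / (0.3 + 0.2 - 0.1)) = 1 - 0.5 = 0.5
//   smoothstep(0, 1, 0.5) = 0.5
// so the function returns half the area of the smaller cap (the cap with radius 0.2).
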
float ShadowConeTraceAgainstCulledCapsuleShapes(
	float3 WorldRayStart,
	float3 UnitRayDirection,
	float LightAngle,
	float InvMaxOcclusionDistance,
	uint CulledDataParameter,
	uint NumIntersectingCapsules,
	uniform bool bUseCulling)
{
	float ConeVisibility = 1;
	float AreaOfLight = 6.283185308f - 6.283185308f * cos(LightAngle);

	LOOP
	for (uint ListObjectIndex = 0; ListObjectIndex < NumIntersectingCapsules; ListObjectIndex++)
	{
		uint ObjectIndex;

		if (bUseCulling)
		{
			uint GroupIndex = CulledDataParameter;
			ObjectIndex = IntersectingShapeIndices[MAX_INTERSECTING_SHAPES * GroupIndex + ListObjectIndex];
		}
		else
		{
			ObjectIndex = ListObjectIndex;
		}

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_FROM_CAPSULE
		float4 LightData = LightDirectionData[ObjectIndex];
		UnitRayDirection = LightData.xyz;
		LightAngle = LightData.w;
		AreaOfLight = 6.283185308f - 6.283185308f * cos(LightAngle);
#endif

#define OVERRIDE_LIGHT_DEBUG 0
#if OVERRIDE_LIGHT_DEBUG
		//UnitRayDirection = normalize(float3(.2f, .2f, .8f));
		UnitRayDirection = float3(0, 0, 1);
		LightAngle = .3f;
#endif

		float4 CapsuleCenterAndRadius = ShadowCapsuleShapes[ObjectIndex * 2];
		float4 CapsuleOrientationAndLength = ShadowCapsuleShapes[ObjectIndex * 2 + 1];

		float DistanceToShadowSphere;
		float3 UnitVectorToShadowSphere;
		float3 UnitRayDirectionInCorrectSpace = UnitRayDirection;

BRANCH
|
|
if (CapsuleOrientationAndLength.w > 0)
|
|
{
|
|
#if USE_SCALED_SPHERE_INTERSECTION
|
|
|
|
float3 CapsuleSpaceX;
|
|
float3 CapsuleSpaceY;
|
|
float3 CapsuleSpaceZ = CapsuleOrientationAndLength.xyz;
|
|
GenerateCoordinateSystem(CapsuleSpaceZ, CapsuleSpaceX, CapsuleSpaceY);
|
|
|
|
float CapsuleZScale = CapsuleCenterAndRadius.w / (.5f * CapsuleOrientationAndLength.w + CapsuleCenterAndRadius.w);
|
|
CapsuleSpaceZ *= CapsuleZScale;
|
|
|
|
float3 CapsuleCenterToRayStart = WorldRayStart - CapsuleCenterAndRadius.xyz;
|
|
float3 CapsuleSpaceRayStart = float3(dot(CapsuleCenterToRayStart, CapsuleSpaceX), dot(CapsuleCenterToRayStart, CapsuleSpaceY), dot(CapsuleCenterToRayStart, CapsuleSpaceZ));
|
|
|
|
float3 CapsuleSpaceRayDirection = float3(dot(UnitRayDirection, CapsuleSpaceX), dot(UnitRayDirection, CapsuleSpaceY), dot(UnitRayDirection, CapsuleSpaceZ));
|
|
|
|
DistanceToShadowSphere = length(CapsuleSpaceRayStart);
|
|
UnitVectorToShadowSphere = -CapsuleSpaceRayStart / DistanceToShadowSphere;
|
|
UnitRayDirectionInCorrectSpace = normalize(CapsuleSpaceRayDirection);
|
|
#else
|
|
float3 VectorToCapsuleCenter = CapsuleCenterAndRadius.xyz - WorldRayStart;
|
|
|
|
// Closest point on line segment to ray
|
|
float3 L01 = CapsuleOrientationAndLength.xyz * CapsuleOrientationAndLength.w;
|
|
float3 L0 = VectorToCapsuleCenter - 0.5 * L01;
|
|
float3 L1 = VectorToCapsuleCenter + 0.5 * L01;
|
|
|
|
// The below is computing the shortest distance between capsule line segment and ray
|
|
float CapsuleOrientationProjectedOntoRay = dot(UnitRayDirection, L01);
|
|
// Vector that spans L01 perpendicular to the ray
|
|
float3 PerpendicularSpanningVector = CapsuleOrientationProjectedOntoRay * UnitRayDirection - L01;
|
|
// Length of PerpendicularSpanningVector using the right triangle formed by L01 and UnitRayDirection * CapsuleOrientationProjectedOntoRay
|
|
float PerpendicularDistance = Square(CapsuleOrientationAndLength.w) - CapsuleOrientationProjectedOntoRay * CapsuleOrientationProjectedOntoRay;
|
|
// Project the vector to a capsule endpoint onto the perpendicular spanning vector, normalized
|
|
float t = saturate(dot(L0, PerpendicularSpanningVector) / PerpendicularDistance);
|
|
// Compute the vector to the shadow sphere which best approximates the capsule's shadowing
|
|
float3 VectorToShadowSphere = L0 + t * L01;
|
|
|
|
DistanceToShadowSphere = length(VectorToShadowSphere);
|
|
UnitVectorToShadowSphere = VectorToShadowSphere / DistanceToShadowSphere;
|
|
|
|
// The above 'best shadow sphere' calculation doesn't take into account the projected solid angle of the potential shadow spheres
|
|
// As a result, there's a discontinuity when the capsule and the ray point in nearly the same direction, where the far end of the capsule gets chosen
|
|
// Here we mitigate the effect by overriding the distance to shadow sphere if one of the capsule end points was closer
|
|
DistanceToShadowSphere = min(DistanceToShadowSphere, length(L0));
|
|
DistanceToShadowSphere = min(DistanceToShadowSphere, length(L1));
|
|
#endif
|
|
}
|
|
else
|
|
{
|
|
DistanceToShadowSphere = length(CapsuleCenterAndRadius.xyz - WorldRayStart);
|
|
UnitVectorToShadowSphere = (CapsuleCenterAndRadius.xyz - WorldRayStart) / DistanceToShadowSphere;
|
|
}
|
|
|
|
float AngleBetween = acosFast(dot(UnitVectorToShadowSphere, UnitRayDirectionInCorrectSpace));
|
|
float ConeConeIntersection = 1 - saturate(SphericalCapIntersectionAreaFast(LightAngle, atanFastPos(CapsuleCenterAndRadius.w / DistanceToShadowSphere), AngleBetween) / AreaOfLight);
|
|
float DistanceFadeAlpha = saturate(DistanceToShadowSphere * InvMaxOcclusionDistance * 3 - 2);
|
|
ConeConeIntersection = lerp(ConeConeIntersection, 1, DistanceFadeAlpha);
|
|
|
|
ConeVisibility *= ConeConeIntersection;
|
|
}
|
|
|
|
return ConeVisibility;
|
|
}
|
|
|
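// Illustrative note (not part of the original shader): DistanceFadeAlpha above is
// saturate(3 * d / MaxOcclusionDistance - 2), where d is DistanceToShadowSphere, so a capsule
// keeps its full occlusion until d reaches 2/3 of MaxOcclusionDistance and then fades linearly
// to no occlusion at d = MaxOcclusionDistance.
// Worked values, assuming a hypothetical MaxOcclusionDistance of 300 units:
//   d = 150 -> saturate(1.5 - 2) = 0    (full occlusion kept)
//   d = 250 -> saturate(2.5 - 2) = 0.5  (halfway to fully visible)
//   d = 300 -> saturate(3.0 - 2) = 1    (no occlusion)
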
|
#if APPLY_TO_BENT_NORMAL
Texture2D ReceiverBentNormalTexture;
RWTexture2D<float4> RWBentNormalTexture;
#endif

float ReduceSelfShadowingIntensity;
uint DownsampleFactor;
RWTexture2D<float2> RWShadowFactors;
float MaxOcclusionDistance;
float MinVisibility;

uint2 TileDimensions;
RWBuffer<uint> RWTileIntersectionCounts;

[numthreads(THREADGROUP_SIZEX, THREADGROUP_SIZEY, 1)]
void CapsuleShadowingCS(
	uint3 GroupId : SV_GroupID,
	uint3 DispatchThreadId : SV_DispatchThreadID,
	uint3 GroupThreadId : SV_GroupThreadID)
{
	uint ThreadIndex = GroupThreadId.y * THREADGROUP_SIZEX + GroupThreadId.x;

	float2 ScreenUV = float2((DispatchThreadId.xy * DownsampleFactor + ScissorRectMinAndSize.xy + .5f) * View.BufferSizeAndInvSize.zw);
	float2 ScreenPosition = (ScreenUV.xy - View.ScreenPositionScaleBias.wz) / View.ScreenPositionScaleBias.xy;

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_FROM_RECEIVER
	float4 ReceiverTextureValue = ReceiverBentNormalTexture.Load(DispatchThreadId.xyz);
	float3 ReceiverBentNormal = ReceiverTextureValue.xyz;
	float SceneDepth = ReceiverTextureValue.w;
#else
	float SceneDepth = CalcSceneDepth(ScreenUV);
#endif

	float4 HomogeneousWorldPosition = mul(float4(ScreenPosition * SceneDepth, SceneDepth, 1), View.ScreenToWorld);
	float3 OpaqueWorldPosition = HomogeneousWorldPosition.xyz / HomogeneousWorldPosition.w;

	uint CulledDataParameter = 0;
	bool bTileShouldComputeShadowing = true;
	FTileCullingData TileCullingData0;
	FTileCullingData TileCullingData1;
	uint NumPixelIntersectingShapes = 0;
	uint NumTileIntersectingShapes = 0;

	// So we can skip skybox pixels / tiles without having to check the GBuffer for shading model
	float MaxDepth = 20000;

#define USE_CULLING 1
#if USE_CULLING

	SetupTileCullingData(SceneDepth, MaxDepth, ThreadIndex, GroupId.xy, TileCullingData0, TileCullingData1, bTileShouldComputeShadowing, CulledDataParameter);

#endif // USE_CULLING

	float Visibility = 1;

	BRANCH
	if (bTileShouldComputeShadowing)
	{
		// World space offset along the start of the ray to avoid incorrect self-shadowing
		float RayStartOffset = 0;

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_PUNCTUAL
#if POINT_LIGHT

		float3 LightVector = LightPositionAndInvRadius.xyz - OpaqueWorldPosition;
		float LightVectorLength = length(LightVector);
		float3 WorldRayStart = OpaqueWorldPosition + LightVector / LightVectorLength * RayStartOffset;
		float3 UnitRayDirection = (LightPositionAndInvRadius.xyz - OpaqueWorldPosition) / LightVectorLength;
		float LightAngle = atanFastPos(LightSourceRadius / LightVectorLength);

#else

		float3 WorldRayStart = OpaqueWorldPosition + LightDirection * RayStartOffset;
		float3 UnitRayDirection = LightDirection;
		float LightAngle = LightAngleAndNormalThreshold.x;

#endif
#elif LIGHT_SOURCE_MODE == LIGHT_SOURCE_FROM_RECEIVER
		float3 WorldRayStart = OpaqueWorldPosition;
		float BentNormalLength = length(ReceiverBentNormal);
		float3 UnitRayDirection = ReceiverBentNormal / max(BentNormalLength, .00001f);
		float LightAngle = max(BentNormalLength * .5f * PI, PI / 8);
#else
		float3 WorldRayStart = OpaqueWorldPosition;
		float3 UnitRayDirection = 0;
		float LightAngle = 0;
#endif

		uint NumIntersectingCapsules = NumShadowCapsules;

#if USE_CULLING

		NumIntersectingCapsules = CullCapsuleShapesToTile(
			ThreadIndex,
			CulledDataParameter,
			MaxOcclusionDistance,
			TileCullingData0,
			TileCullingData1);

		NumTileIntersectingShapes = TileNumCapsules0 + TileNumCapsules1;
#else
		NumTileIntersectingShapes = NumShadowCapsules;
#endif

		NumPixelIntersectingShapes += NumIntersectingCapsules;

		Visibility *= ShadowConeTraceAgainstCulledCapsuleShapes(
			WorldRayStart,
			UnitRayDirection,
			LightAngle,
			1.0f / MaxOcclusionDistance,
			CulledDataParameter,
			NumIntersectingCapsules,
			USE_CULLING ? true : false);

#if !APPLY_TO_BENT_NORMAL
		if (all(GroupThreadId.xy == 0) && all(GroupId.xy < TileDimensions))
		{
			RWTileIntersectionCounts[GroupId.y * TileDimensions.x + GroupId.x] = NumTileIntersectingShapes;
		}
#endif
	}

#if LIGHT_SOURCE_MODE == LIGHT_SOURCE_FROM_CAPSULE && !FORWARD_SHADING
	BRANCH
	if (ReduceSelfShadowingIntensity > 0)
	{
		FGBufferData GBufferData = GetGBufferData(ScreenUV);
		// Reduce self shadowing intensity
		Visibility = lerp(Visibility, 1, HasDistanceFieldRepresentation(GBufferData) ? .8f : 0);
	}
#endif

#if LIGHT_SOURCE_MODE != LIGHT_SOURCE_PUNCTUAL
	// Apply to indirect shadows only
	Visibility = lerp(MinVisibility, 1, Visibility);
#endif
	//Visibility = NumPixelIntersectingShapes / 20.0f;
	//Visibility = bTileShouldComputeShadowing ? 1 : 0;

#if APPLY_TO_BENT_NORMAL
#if LIGHT_SOURCE_MODE != LIGHT_SOURCE_FROM_RECEIVER
	float3 ReceiverBentNormal = ReceiverBentNormalTexture.Load(DispatchThreadId.xyz).xyz;
#endif
	RWBentNormalTexture[DispatchThreadId.xy] = float4(ReceiverBentNormal * Visibility, SceneDepth);
#else
	RWShadowFactors[DispatchThreadId.xy] = float2(Visibility, SceneDepth);
#endif
}

Buffer<uint> TileIntersectionCounts;

// Size of a tile in NDC
float2 TileSize;

#ifndef TILES_PER_INSTANCE
#define TILES_PER_INSTANCE 1
#endif

void CapsuleShadowingUpsampleVS(
	float2 TexCoord : ATTRIBUTE0,
	uint VertexId : SV_VertexID,
	uint InstanceId : SV_InstanceID,
	out float4 OutPosition : SV_POSITION
	)
{
	// Compute the actual instance id for when multiple tiles are packed into the vertex buffer
	uint EffectiveInstanceId = InstanceId * TILES_PER_INSTANCE + VertexId / 4;
	uint NumCapsulesAffectingTile = TileIntersectionCounts[EffectiveInstanceId];
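	// Derive the row from the effective tile index so it stays correct when TILES_PER_INSTANCE > 1
	uint TileY = EffectiveInstanceId / TileDimensions.x;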
	uint2 TileCoordinate = uint2(EffectiveInstanceId - TileY * TileDimensions.x, TileY);
	float2 ScreenUV = ((TileCoordinate + TexCoord) * TileSize + ScissorRectMinAndSize.xy) * View.BufferSizeAndInvSize.zw;
	float2 ScreenPosition = (ScreenUV.xy - View.ScreenPositionScaleBias.wz) / View.ScreenPositionScaleBias.xy;
	OutPosition = float4(ScreenPosition, 0, 1);

	// Cull the tile if no affecting capsules, shadow will not be visible
	if (NumCapsulesAffectingTile == 0)
	{
		OutPosition.xy = 0;
	}
}

Texture2D ShadowFactorsTexture;
SamplerState ShadowFactorsSampler;

float OutputtingToLightAttenuation;

void CapsuleShadowingUpsamplePS(
	in float4 SVPos : SV_POSITION,
	out float4 OutColor : SV_Target0
#if APPLY_TO_SSAO
	,out float4 OutAmbientOcclusion : SV_Target1
#endif
	)
{
	// Distance field shadowing was computed at 0,0 regardless of viewrect min
	float2 DistanceFieldUVs = SvPositionToBufferUV(SVPos) - ScissorRectMinAndSize.xy * View.BufferSizeAndInvSize.zw;
	float SceneDepth = CalcSceneDepth(SvPositionToBufferUV(SVPos));

#define BILATERAL_UPSAMPLE 1
#if BILATERAL_UPSAMPLE && UPSAMPLE_REQUIRED
	float2 LowResBufferSize = floor(View.RenderTargetSize / DOWNSAMPLE_FACTOR);
	float2 LowResTexelSize = 1.0f / LowResBufferSize;
	float2 Corner00UV = floor(DistanceFieldUVs * LowResBufferSize - .5f) / LowResBufferSize + .5f * LowResTexelSize;
	float2 BilinearWeights = (DistanceFieldUVs - Corner00UV) * LowResBufferSize;

	float2 TextureValues00 = Texture2DSampleLevel(ShadowFactorsTexture, ShadowFactorsSampler, Corner00UV, 0).xy;
	float2 TextureValues10 = Texture2DSampleLevel(ShadowFactorsTexture, ShadowFactorsSampler, Corner00UV + float2(LowResTexelSize.x, 0), 0).xy;
	float2 TextureValues01 = Texture2DSampleLevel(ShadowFactorsTexture, ShadowFactorsSampler, Corner00UV + float2(0, LowResTexelSize.y), 0).xy;
	float2 TextureValues11 = Texture2DSampleLevel(ShadowFactorsTexture, ShadowFactorsSampler, Corner00UV + LowResTexelSize, 0).xy;

	float4 CornerWeights = float4(
		(1 - BilinearWeights.y) * (1 - BilinearWeights.x),
		(1 - BilinearWeights.y) * BilinearWeights.x,
		BilinearWeights.y * (1 - BilinearWeights.x),
		BilinearWeights.y * BilinearWeights.x);

	float Epsilon = .0001f;

	float4 CornerDepths = abs(float4(TextureValues00.y, TextureValues10.y, TextureValues01.y, TextureValues11.y));
	float4 DepthWeights = 1.0f / (abs(CornerDepths - SceneDepth.xxxx) + Epsilon);
	float4 FinalWeights = CornerWeights * DepthWeights;

	float InterpolatedResult =
		(FinalWeights.x * TextureValues00.x
		+ FinalWeights.y * TextureValues10.x
		+ FinalWeights.z * TextureValues01.x
		+ FinalWeights.w * TextureValues11.x)
		/ dot(FinalWeights, 1);

	float Output = InterpolatedResult;

#else
	float Output = Texture2DSampleLevel(ShadowFactorsTexture, ShadowFactorsSampler, DistanceFieldUVs, 0).x;
#endif

	if (OutputtingToLightAttenuation > 0)
	{
		OutColor = EncodeLightAttenuation(Output).xxxx;
	}
	else
	{
		OutColor = Output;
	}

#if APPLY_TO_SSAO
	OutAmbientOcclusion = Output;
#endif
}
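
// Illustrative note (not part of the original shader): a worked example of the bilateral weights
// computed in CapsuleShadowingUpsamplePS, using made-up inputs.
// With BilinearWeights = (0.25, 0.5):
//   CornerWeights = (0.375, 0.125, 0.375, 0.125), which sum to 1.
// With SceneDepth = 100 and CornerDepths = (100, 100, 500, 500):
//   DepthWeights ~= (10000, 10000, 0.0025, 0.0025)
//   FinalWeights ~= (3750, 1250, 0.00094, 0.00031)
// After dividing by dot(FinalWeights, 1), the result is approximately
//   0.75 * TextureValues00.x + 0.25 * TextureValues10.x,
// so the two low-resolution samples across the depth discontinuity contribute almost nothing.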