mirror of
https://github.com/izzy2lost/UnrealEngineUWP.git
synced 2026-03-26 18:15:20 -07:00
#lockdown Nick.Penwarden
#rb none

==========================
MAJOR FEATURES + CHANGES
==========================

Change 3055495 on 2016/07/19 by Marc.Olano

	Allow Noise material node on mobile

	No reason to exclude mobile, except for Fast Gradient Noise, which uses 3D textures. Allow this node on ES2 for all of the other noise functions.

	#jira UE-33345

Change 3055602 on 2016/07/19 by Luke.Thatcher

	Fix crash bug in D3D11 RHI when selecting adapters.
	 - Array of adapter descriptors will get out of sync with the adapter index if any adapter is skipped (e.g. the Microsoft Basic Render Device).

	#jira UE-33236

Change 3055890 on 2016/07/19 by Daniel.Wright

	Improved the assert in LoadModuleChecked so we won't have to check the log to see which module it was

Change 3055891 on 2016/07/19 by Daniel.Wright

	Fixed Global Distance Field not dirtying previous object position on UpdateTransform - left behind a phantom shadow on teleports
	 * This will effectively double partial distance field update costs until clipping of the update regions is implemented

Change 3055892 on 2016/07/19 by Daniel.Wright

	Higher poly light source shapes drawn into reflection captures

Change 3055893 on 2016/07/19 by Daniel.Wright

	More info to 'Incompatible surface format' GNM assert

Change 3055904 on 2016/07/19 by Daniel.Wright

	Reflection environment normalization improvements
	 * Indirect specular from reflection captures is now mixed with indirect diffuse from lightmaps based on roughness, such that a mirror surface will have no mixing. Reflection captures now match other reflection methods like SSR and planar reflections much more closely.
	 * When a stationary skylight is present, Reflection captures are now normalized as if the initial skylight will always be present, giving consistent results with static skylight reflections. The skylight and reflection captures with sky removed used to be normalized separately, compacting the relative brightness between the sky and scene.
	 * Added r.ReflectionEnvironmentLightmapMixing for debugging lightmap mixing issues. This toggle was previously not possible due to prenormalizing the capture data.
	 * The standard deferred reflection path (r.DoTiledReflections 0) can no longer match the results of the compute path or base pass reflections, as it would require MRT to accumulate the average brightness
	 * Removed unused r.DiffuseFromCaptures
	 * Cost of reflection environment on PS4 increased from 1.52ms -> 1.75ms with this change, but decreased back to 1.58ms by reducing tile size to 8x8

Change 3055905 on 2016/07/19 by Daniel.Wright

	Workaround for RTDF shadows not working on PS4 - manual clear of ObjectIndirectArguments instead of RHICmdList.ClearUAV

Change 3059486 on 2016/07/21 by Nick.Penwarden

	Testing #uecritical

Change 3060558 on 2016/07/21 by Daniel.Wright

	Fixed skylight with specified cubemap being black

Change 3061999 on 2016/07/22 by Marcus.Wassmer

	Disable old AMD driver hacks for DX11. QA has already tested with them off and given thumbs up.

Change 3062241 on 2016/07/22 by Daniel.Wright

	Fixed bug in RHISupportsSeparateMSAAAndResolveTextures that was preventing MSAA for any non-Vulkan platforms

Change 3062244 on 2016/07/22 by Daniel.Wright

	Discard old prenormalized reflection environment data on load

Change 3062283 on 2016/07/22 by Daniel.Wright

	MSAA support for the forward renderer
	 * AntiAliasing method is chosen in Rendering project settings, DefaultSettings category
	 * Deferred passes like shadow projection, fogging and decals are only computed per-pixel and can introduce aliasing
	 * Added Rendering project setting VertexFoggingForOpaque, which makes height fog cheaper and work properly with MSAA
	 * The AntiAliasing method in PostProcessSettings has been removed, this may affect existing content
	 * Added r.MSAACount which defaults to 4
	 * Integrated wide custom resolve filter from Oculus renderer, controlled by r.WideCustomResolve
	 * GBuffer targets are no longer allocated when using the forward renderer
	 * Decal blend modes that write to the GBuffer fall back to SceneColor emissive only

Change 3062666 on 2016/07/23 by Uriel.Doyon

	Added legend to streaming accuracy viewmodes
	Added a new helper class FRenderTargetTemp to be reused in different canvas rendering.
	Exposed the pass through pixel shader so that it can be reused.

	#review-3058986 @marcus.wassmer

Change 3063023 on 2016/07/25 by Luke.Thatcher

	Fix "RecompileShaders Changed" when using Cook On The Fly.

	#jira UE-33573

Change 3063078 on 2016/07/25 by Ben.Woodhouse

	Add -emitdrawevents command line option to emit draw events by default. This is useful when capturing with Renderdoc

Change 3063315 on 2016/07/25 by Ben.Woodhouse

	Fix div 0 in motion blur. This caused artifacts in some fairly common cases

	#jira UE-32331

Change 3063897 on 2016/07/25 by Uriel.Doyon

	Fixed missing qualifier on interpolants

Change 3064559 on 2016/07/26 by Ben.Woodhouse

	Fix for cooker crash with BC6H textures (XB1, but may affect other platforms). Also fixes corruption issue with texture slices not being a multiple of 4 pixels (expanding as necessary), courtesy of Stu McKenna at the Coalition

	Tested fix on xbox, PC and PS4, using QAGame

	#jira UE-28592

Change 3064896 on 2016/07/26 by Ben.Woodhouse

	Fix compile errors on PS4 (the variable "sample" was conflicting with a keyword, causing compile errors). Also making encoding consistent on new shaders (ansi rather than UTF16)

Change 3064913 on 2016/07/26 by Ben.Marsh

	Fix spelling of "Editor, Tools, Monolithics & DDC" node in Dev-Rendering build settings.

Change 3065326 on 2016/07/26 by Uriel.Doyon

	Fixed UnbuiltInstanceBoundsList not being reset correctly, creating broken rendered primitives.

	#jira UE-32585

Change 3065541 on 2016/07/26 by Daniel.Wright

	Materials with a GBuffer SceneTexture lookup will fail to compile with forward shading

Change 3065543 on 2016/07/26 by Daniel.Wright

	Restored DetailMode changes causing a FGlobalComponentRecreateRenderStateContext - accidental removal from cl 2969413

Change 3065545 on 2016/07/26 by Daniel.Wright

	Added material property bNormalCurvatureToRoughness, which can slightly reduce aliasing. Tweakable impact with r.NormalCurvatureToRoughnessScale.
	Fixed reflection capture feedback with base pass reflections

Change 3066783 on 2016/07/27 by Daniel.Wright

	Moved PreShadowCacheDepthZ out of FSceneRenderTargets and into FScene, which fixes issues with cached preshadows and multiple scenes, including HighResScreenShot
	Disabled GMinScreenRadiusForShadowCaster on per-object shadows, which fixes popping when trying to increase shadow resolution from the defaults (r.Shadow.TexelsPerPixel 3)

Change 3066794 on 2016/07/27 by Daniel.Wright

	Fixed crash rendering planar reflections due to NULL PostProcessSettings

Change 3067412 on 2016/07/27 by Daniel.Wright

	Fix for OpenGL4 with uint interpolator

Change 3068470 on 2016/07/28 by Daniel.Wright

	Fixed crash rendering translucency with translucent shadows which were determined to be invisible

Change 3069046 on 2016/07/28 by Daniel.Wright

	Handle null Family in SetupAntiAliasingMethod

Change 3069059 on 2016/07/28 by Daniel.Wright

	Added r.ReflectionEnvironmentBeginMixingRoughness (.1) and r.ReflectionEnvironmentEndMixingRoughness (.3), which can be used to tweak the lightmap mixing heuristic, or revert to previous behavior (mixing even on a mirror surface)

Change 3069391 on 2016/07/28 by Daniel.Wright

	Fixed AverageBrightness being applied to reflections in gamma space in the mobile base pass, causing ES2 reflections to be overbright

Change 3070369 on 2016/07/29 by Daniel.Wright

	r.ReflectionEnvironmentBeginMixingRoughness and r.ReflectionEnvironmentEndMixingRoughness set to 0 can be used to achieve old non-roughness based lightmap mixing

Change 3070370 on 2016/07/29 by Daniel.Wright

	Bumped reflection capture DDC version to get rid of legacy prenormalized data

Change 3070680 on 2016/07/29 by Marcus.Wassmer

	Fix slate ensure that is most likely a timing issue exposed by rendering.

	#ue-33902

Change 3070811 on 2016/07/29 by Marcus.Wassmer

	Fix ProjectLauncher errors when loading old versions

	#ue-33939

Change 3070971 on 2016/07/29 by Uriel.Doyon

	Updated ListTextures outputs to fix cooked VS non cooked differences and also to put emphasis on disk VS memory

Change 3071452 on 2016/07/31 by Uriel.Doyon

	Updated the legend description for the (texture streaming) primitive distance accuracy view mode

[CL 3072803 by Marcus Wassmer in Main branch]
743 lines
27 KiB
Plaintext
// Copyright 1998-2016 Epic Games, Inc. All Rights Reserved.

/*=============================================================================
	Random.usf: A pseudo-random number generator.
=============================================================================*/

#ifndef __Random_usf__
#define __Random_usf__

// @param xy should be an integer position (e.g. pixel position on the screen), repeats each 128x128 pixels
// similar to a texture lookup but is only ALU
// ~13 ALU operations (3 frac, 6 *, 4 mad)
float PseudoRandom(float2 xy)
{
	float2 pos = frac(xy / 128.0f) * 128.0f + float2(-64.340622f, -72.465622f);

	// found by experimentation
	return frac(dot(pos.xyx * pos.xyy, float3(20.390625f, 60.703125f, 2.4281209f)));
}

// high frequency dither pattern appearing almost random without banding steps
//note: from "NEXT GENERATION POST PROCESSING IN CALL OF DUTY: ADVANCED WARFARE"
//      http://advances.realtimerendering.com/s2014/index.html
// Epic extended by FrameId
// ~7 ALU operations (2 frac, 3 mad, 2 *)
// @return 0..1
float InterleavedGradientNoise( float2 uv, float FrameId )
{
	// magic values are found by experimentation
	uv += FrameId * (float2(47, 17) * 0.695f);

	const float3 magic = float3( 0.06711056f, 0.00583715f, 52.9829189f );
	return frac(magic.z * frac(dot(uv, magic.xy)));
}

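// Usage sketch (illustrative only, not part of the original file): one possible way to
// turn the 0..1 value from InterleavedGradientNoise() into a per-pixel, per-frame
// rotation for a sampling kernel. The name ExampleNoiseRotation2D and its parameters
// are hypothetical.
// @param SvPosition pixel position, e.g. SV_Position.xy
// @param FrameId frame counter, so the pattern changes from frame to frame
// @return unit vector (cos, sin) of a pseudo-random angle
float2 ExampleNoiseRotation2D(float2 SvPosition, float FrameId)
{
	// map 0..1 to 0..2*PI
	float Angle = InterleavedGradientNoise(SvPosition, FrameId) * 6.2831853f;

	float SinAngle, CosAngle;
	sincos(Angle, SinAngle, CosAngle);
	return float2(CosAngle, SinAngle);
}
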
// @return random value in the range [0, 1)
// ~10 ALU operations (2 frac, 5 *, 3 mad)
float RandFast( uint2 PixelPos, float Magic = 3571.0 )
{
	float2 Random2 = ( 1.0 / 4320.0 ) * PixelPos + float2( 0.25, 0.0 );
	float Random = frac( dot( Random2 * Random2, Magic ) );
	Random = frac( Random * Random * (2 * Magic) );
	return Random;
}

// Blum-Blum-Shub-inspired pseudo random number generator
// http://www.umbc.edu/~olano/papers/mNoise.pdf
// real BBS uses ((s*s) mod M) with bignums and M as the product of two huge primes
// instead, we use a single prime M just small enough not to overflow
// This is the largest prime < 2^12 so s*s will fit in a 24-bit floating point mantissa
#define BBS_PRIME24 4093
// This is the largest prime < 2^16 so s*s will fit in a 32-bit unsigned integer
#define BBS_PRIME32 65521

// Blum-Blum-Shub-inspired pseudo random number generator - float version
// @param Integer valued floating point seed
// @return random number in range [0,1)
// uint version RandBBSuint24(intseed) is equivalent to BBS_PRIME24 * RandBBSfloat(float(intseed))
// ~7 ALU operations (4 *, 3 frac)
float RandBBSfloat(float seed)
{
	float s = frac(seed / BBS_PRIME24);
	s = frac(s * s * BBS_PRIME24);
	s = frac(s * s * BBS_PRIME24);
	return s;
}

// BBS random number generator - 24-bit uint version
// This version exists to match the float one, and because some hardware is faster for 24-bit int ops
// @param seed - old seed, repeats every BBS_PRIME24 (so 0 and 4093 produce the same result)
// @return random value (can be used as new seed) in range [0,BBS_PRIME24) = [0,4093)
// ~5 ALU operations (3 %, 2 *)
uint RandBBSuint24(uint seed)
{
	uint s = Mod(seed, BBS_PRIME24);
	s = Mod(s * s, BBS_PRIME24);
	s = Mod(s * s, BBS_PRIME24);
	return s;
}

// BBS random number generator - 32-bit uint version
// uses 65521 as the prime modulus, since it is the largest prime < 2^16
// @param seed - old seed, repeats every BBS_PRIME32 (so 0 and 65521 produce the same result)
// @return random value (can be used as new seed) in range [0,BBS_PRIME32) = [0,65521)
// ~5 ALU operations (3 %, 2 *)
uint RandBBSuint32(uint seed)
{
	uint s = Mod(seed, BBS_PRIME32);
	s = Mod(s * s, BBS_PRIME32);
	s = Mod(s * s, BBS_PRIME32);
	return s;
}

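// Usage sketch (illustrative only, not part of the original file): the uint BBS
// functions return a value that can be fed back in as the next seed, which gives a
// short sequence of values from one starting seed. ExampleRandBBS2 is a hypothetical name.
// Note that a seed which is a multiple of BBS_PRIME32 maps to 0, and 0 stays 0 on
// every following step, so such seeds are best avoided.
// @param Seed - starting seed
// @return two values in the range [0,1)
float2 ExampleRandBBS2(uint Seed)
{
	uint First = RandBBSuint32(Seed);
	uint Second = RandBBSuint32(First);

	// results are in [0,BBS_PRIME32), so dividing by the prime gives [0,1)
	return float2(First, Second) / float(BBS_PRIME32);
}
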
// 3D random number generator inspired by PCGs (permuted congruential generator)
// Using a **simple** Feistel cipher in place of the usual xor shift permutation step
// @param p = 3D integer coordinate
// @return three elements w/ 16 random bits each (0-0xffff).
// ~8 ALU operations for result.x    (7 mad, 1 >>)
// ~10 ALU operations for result.xy  (8 mad, 2 >>)
// ~12 ALU operations for result.xyz (9 mad, 3 >>)
uint3 Rand3DPCG16(int3 p)
{
	// taking a signed int then reinterpreting as unsigned gives good behavior for negatives
	uint3 v = uint3(p);

	// Linear congruential step. These LCG constants are from Numerical Recipes
	// For additional #'s, PCG would do multiple LCG steps and scramble each on output
	// So v here is the RNG state
	v = v * 1664525u + 1013904223u;

	// PCG uses xorshift for the final shuffle, but it is expensive (and cheap
	// versions of xorshift have visible artifacts). Instead, use simple MAD Feistel steps
	//
	// Feistel ciphers divide the state into separate parts (usually by bits)
	// then apply a series of permutation steps one part at a time. The permutations
	// use a reversible operation (usually ^) to combine the part being updated with the
	// result of a permutation function on the other parts and the key.
	//
	// In this case, I'm using v.x, v.y and v.z as the parts, using + instead of ^ for
	// the combination function, and just multiplying the other two parts (no key) for
	// the permutation function.
	//
	// That gives a simple mad per round.
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;

	// only top 16 bits are well shuffled
	return v >> 16u;
}

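// Usage sketch (illustrative only, not part of the original file): converting the
// 16-bit integer output of Rand3DPCG16() to floats, using the same division by 0xffff
// that VoronoiCornerSample() in this file applies. ExampleRand3DPCG16Float is a
// hypothetical name.
// @param p = 3D integer coordinate
// @return three pseudo-random values in the range [0,1] (both ends included)
float3 ExampleRand3DPCG16Float(int3 p)
{
	return float3(Rand3DPCG16(p)) / float(0xffff);
}
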
// 3D random number generator inspired by PCGs (permuted congruential generator)
// Using a **simple** Feistel cipher in place of the usual xor shift permutation step
// @param p = 3D integer coordinate
// @return three elements w/ 32 random bits each (0-0xffffffff).
// ~12 ALU operations for result.x   (10 mad, 3 >>)
// ~14 ALU operations for result.xy  (11 mad, 3 >>)
// ~15 ALU operations for result.xyz (12 mad, 3 >>)
uint3 Rand3DPCG32(int3 p)
{
	// taking a signed int then reinterpreting as unsigned gives good behavior for negatives
	uint3 v = uint3(p);

	// Linear congruential step.
	v = v * 1664525u + 1013904223u;

	// swapping low and high bits makes all 32 bits pretty good
	v = v * (1u << 16u) + (v >> 16u);

	// final shuffle
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;

	return v;
}

/**
 * Find good arbitrary axis vectors to represent U and V axes of a plane,
 * given just the normal. Ported from UnMath.h
 */
void FindBestAxisVectors(float3 In, out float3 Axis1, out float3 Axis2 )
{
	const float3 N = abs(In);

	// Find best basis vectors.
	if( N.z > N.x && N.z > N.y )
	{
		Axis1 = float3(1, 0, 0);
	}
	else
	{
		Axis1 = float3(0, 0, 1);
	}

	Axis1 = normalize(Axis1 - In * dot(Axis1, In));
	Axis2 = cross(Axis1, In);
}

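// Usage sketch (illustrative only, not part of the original file): building a basis
// around a normal with FindBestAxisVectors() and using it to move a vector out of that
// local frame. ExampleLocalToWorld is a hypothetical name, and the sketch assumes
// Normal is already normalized (the projection inside FindBestAxisVectors relies on it).
// @param LocalVec - vector expressed in the (Axis1, Axis2, Normal) frame
// @param Normal - unit length normal
// @return LocalVec expressed in the space Normal is given in
float3 ExampleLocalToWorld(float3 LocalVec, float3 Normal)
{
	float3 Axis1, Axis2;
	FindBestAxisVectors(Normal, Axis1, Axis2);

	return LocalVec.x * Axis1 + LocalVec.y * Axis2 + LocalVec.z * Normal;
}
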
// References for noise:
//
// Improved Perlin noise
//   http://mrl.nyu.edu/~perlin/noise/
//   http://http.developer.nvidia.com/GPUGems/gpugems_ch05.html
// Modified Noise for Evaluation on Graphics Hardware
//   http://www.csee.umbc.edu/~olano/papers/mNoise.pdf
// Perlin Noise
//   http://mrl.nyu.edu/~perlin/doc/oscar.html
// Fast Gradient Noise
//   http://prettyprocs.wordpress.com/2012/10/20/fast-perlin-noise


// -------- ALU based method ---------

/*
 * Pseudo random number generator, based on "TEA, a tiny Encryption Algorithm"
 * http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.45.281&rep=rep1&type=pdf
 * http://www.umbc.edu/~olano/papers/index.html#GPUTEA
 * @param v - old seed (full 32bit range)
 * @param IterationCount - >=1, bigger numbers cost more performance but improve quality
 * @return new seed
 */
uint2 ScrambleTEA(uint2 v, uint IterationCount = 3)
{
	// Start with some random data (numbers can be arbitrary but those have been used by others and seem to work well)
	uint k[4] ={ 0xA341316Cu , 0xC8013EA4u , 0xAD90777Du , 0x7E95761Eu };

	uint y = v[0];
	uint z = v[1];
	uint sum = 0;

	UNROLL for(uint i = 0; i < IterationCount; ++i)
	{
		sum += 0x9e3779b9;
		y += ((z << 4u) + k[0]) ^ (z + sum) ^ ((z >> 5u) + k[1]);
		z += ((y << 4u) + k[2]) ^ (y + sum) ^ ((y >> 5u) + k[3]);
	}

	return uint2(y, z);
}

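// Usage sketch (illustrative only, not part of the original file): deriving two
// pseudo-random floats per pixel and frame from ScrambleTEA(). The name
// ExampleScrambleTEAFloat2 and the way the frame index is folded into the seed are
// assumptions; the fold only keeps seeds distinct while pixel coordinates stay below 65536.
// @param PixelPos - integer pixel position
// @param FrameIndex - frame counter
// @return two values in the range [0,1)
float2 ExampleScrambleTEAFloat2(uint2 PixelPos, uint FrameIndex)
{
	// put the frame index into the upper bits of y so each frame gets a different seed
	uint2 Seed = uint2(PixelPos.x, PixelPos.y ^ (FrameIndex << 16u));

	// default IterationCount (3) is used here
	uint2 Hash = ScrambleTEA(Seed);

	// keep the upper 16 bits of each element and scale 0..65535 to [0,1)
	return float2(Hash >> 16u) / 65536.0f;
}
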
// Wraps noise for tiling texture creation
// @param v = unwrapped texture parameter
// @param bTiling = true to tile, false to not tile
// @param RepeatSize = number of units before repeating
// @return either original or wrapped coord
float3 NoiseTileWrap(float3 v, bool bTiling, float RepeatSize)
{
	return bTiling ? (frac(v / RepeatSize) * RepeatSize) : v;
}

// Evaluate polynomial to get smooth transitions for Perlin noise
// only needed by Perlin functions in this file
// scalar(per component): 2 add, 5 mul
float4 PerlinRamp(float4 t)
{
	return t * t * t * (t * (t * 6 - 15) + 10);
}

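// Worked derivation (illustrative only, not part of the original file): expanding the
// PerlinRamp() polynomial gives
//   f(t)   = 6t^5 - 15t^4 + 10t^3, so f(0) = 0 and f(1) = 1
//   f'(t)  = 30t^4 - 60t^3 + 30t^2  = 30 t^2 (t - 1)^2
//   f''(t) = 120t^3 - 180t^2 + 60t  = 60 t (t - 1) (2t - 1)
// Both derivatives are zero at t = 0 and t = 1, which is why neighboring noise cells
// join without a visible crease. A sketch of the first derivative, under the
// hypothetical name ExamplePerlinRampDerivative, could look like this:
float4 ExamplePerlinRampDerivative(float4 t)
{
	// 30 t^2 (t^2 - 2t + 1)
	return 30 * t * t * (t * (t - 2) + 1);
}
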
// Modified noise gradient term
// @param seed - random seed for integer lattice position
// @param offset - [-1,1] offset of evaluation point from lattice point
// @return gradient contribution from this lattice point
float MGradient(uint seed, float3 offset)
{
	uint rand = RandBBSuint24(seed);
	float3 direction = float3(rand & 1, rand & 2, rand & 4) * float3(2, 1, 0.5) - 1;
	return dot(direction, offset);
}

// compute Perlin and related noise corner seed values
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = true to return seed values for a repeating noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @param seed000-seed111 = hash function seeds for the eight corners
// @return fractional part of v
float3 NoiseSeeds(float3 v, bool bTiling, float RepeatSize,
	out float seed000, out float seed001, out float seed010, out float seed011,
	out float seed100, out float seed101, out float seed110, out float seed111)
{
	float3 fv = frac(v);
	float3 iv = floor(v);

	const float3 primes = float3(19, 47, 101);

	if (bTiling)
	{	// can't algebraically combine with primes
		seed000 = dot(primes, NoiseTileWrap(iv, true, RepeatSize));
		seed100 = dot(primes, NoiseTileWrap(iv + float3(1, 0, 0), true, RepeatSize));
		seed010 = dot(primes, NoiseTileWrap(iv + float3(0, 1, 0), true, RepeatSize));
		seed110 = dot(primes, NoiseTileWrap(iv + float3(1, 1, 0), true, RepeatSize));
		seed001 = dot(primes, NoiseTileWrap(iv + float3(0, 0, 1), true, RepeatSize));
		seed101 = dot(primes, NoiseTileWrap(iv + float3(1, 0, 1), true, RepeatSize));
		seed011 = dot(primes, NoiseTileWrap(iv + float3(0, 1, 1), true, RepeatSize));
		seed111 = dot(primes, NoiseTileWrap(iv + float3(1, 1, 1), true, RepeatSize));
	}
	else
	{	// get to combine offsets with multiplication by primes in this case
		seed000 = dot(iv, primes);
		seed100 = seed000 + primes.x;
		seed010 = seed000 + primes.y;
		seed110 = seed100 + primes.y;
		seed001 = seed000 + primes.z;
		seed101 = seed100 + primes.z;
		seed011 = seed010 + primes.z;
		seed111 = seed110 + primes.z;
	}

	return fv;
}

// Perlin-style "Modified Noise"
// http://www.umbc.edu/~olano/papers/index.html#mNoise
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @return random number in the range -1 .. 1
float GradientNoise3D_ALU(float3 v, bool bTiling, float RepeatSize)
{
	float seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111;
	float3 fv = NoiseSeeds(v, bTiling, RepeatSize, seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111);

	float rand000 = MGradient(int(seed000), fv - float3(0, 0, 0));
	float rand100 = MGradient(int(seed100), fv - float3(1, 0, 0));
	float rand010 = MGradient(int(seed010), fv - float3(0, 1, 0));
	float rand110 = MGradient(int(seed110), fv - float3(1, 1, 0));
	float rand001 = MGradient(int(seed001), fv - float3(0, 0, 1));
	float rand101 = MGradient(int(seed101), fv - float3(1, 0, 1));
	float rand011 = MGradient(int(seed011), fv - float3(0, 1, 1));
	float rand111 = MGradient(int(seed111), fv - float3(1, 1, 1));

	float3 Weights = PerlinRamp(float4(fv, 0)).xyz;

	float i = lerp(lerp(rand000, rand100, Weights.x), lerp(rand010, rand110, Weights.x), Weights.y);
	float j = lerp(lerp(rand001, rand101, Weights.x), lerp(rand011, rand111, Weights.x), Weights.y);
	return lerp(i, j, Weights.z).x;
}

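// Usage sketch (illustrative only, not part of the original file): summing several
// octaves of GradientNoise3D_ALU() into a fractal noise value. The function name, the
// octave loop and the 0.5 / 2.0 amplitude and frequency steps are assumptions chosen
// for the example, not values taken from the engine.
// @param v = 3D noise argument
// @param Octaves = number of noise layers to sum, >= 1
// @return sum of the layers; the amplitudes add up to less than 1
float ExampleFractalGradientNoise3D(float3 v, int Octaves)
{
	float Sum = 0;
	float Amplitude = 0.5f;

	LOOP for (int Octave = 0; Octave < Octaves; ++Octave)
	{
		// non-tiling lookup, so RepeatSize is not used by the noise function
		Sum += Amplitude * GradientNoise3D_ALU(v, false, 1.0f);

		// each following layer has twice the frequency and half the amplitude
		v *= 2.0f;
		Amplitude *= 0.5f;
	}

	return Sum;
}
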
// 3D value noise - used to be incorrectly called Perlin noise
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @return random number in the range -1 .. 1
float ValueNoise3D_ALU(float3 v, bool bTiling, float RepeatSize)
{
	float seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111;
	float3 fv = NoiseSeeds(v, bTiling, RepeatSize, seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111);

	float rand000 = RandBBSfloat(seed000) * 2 - 1;
	float rand100 = RandBBSfloat(seed100) * 2 - 1;
	float rand010 = RandBBSfloat(seed010) * 2 - 1;
	float rand110 = RandBBSfloat(seed110) * 2 - 1;
	float rand001 = RandBBSfloat(seed001) * 2 - 1;
	float rand101 = RandBBSfloat(seed101) * 2 - 1;
	float rand011 = RandBBSfloat(seed011) * 2 - 1;
	float rand111 = RandBBSfloat(seed111) * 2 - 1;

	float3 Weights = PerlinRamp(float4(fv, 0)).xyz;

	float i = lerp(lerp(rand000, rand100, Weights.x), lerp(rand010, rand110, Weights.x), Weights.y);
	float j = lerp(lerp(rand001, rand101, Weights.x), lerp(rand011, rand111, Weights.x), Weights.y);
	return lerp(i, j, Weights.z).x;
}


// -------- TEX based methods ---------

// filtered 3D noise, can be optimized
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @return random number in the range -1 .. 1
float GradientNoise3D_TEX(float3 v, bool bTiling, float RepeatSize)
{
	bTiling = true;
	float3 fv = frac(v);
	float3 iv0 = NoiseTileWrap(floor(v), bTiling, RepeatSize);
	float3 iv1 = NoiseTileWrap(iv0 + 1, bTiling, RepeatSize);

	const int2 ZShear = int2(17, 89);

	float2 OffsetA = iv0.z * ZShear;
	float2 OffsetB = OffsetA + ZShear;	// non-tiling, use relative offset
	if (bTiling)						// tiling, have to compute from wrapped coordinates
	{
		OffsetB = iv1.z * ZShear;
	}

	// Texture size scale factor
	float ts = 1 / 128.0f;

	// texture coordinates for iv0.xy, as offset for both z slices
	float2 TexA0 = (iv0.xy + OffsetA + 0.5f) * ts;
	float2 TexB0 = (iv0.xy + OffsetB + 0.5f) * ts;

	// texture coordinates for iv1.xy, as offset for both z slices
	float2 TexA1 = TexA0 + ts;	// for non-tiling, can compute relative to existing coordinates
	float2 TexB1 = TexB0 + ts;
	if (bTiling)				// for tiling, need to compute from wrapped coordinates
	{
		TexA1 = (iv1.xy + OffsetA + 0.5f) * ts;
		TexB1 = (iv1.xy + OffsetB + 0.5f) * ts;
	}


	// can be optimized to 1 or 2 texture lookups (4 or 8 channel encoded in 8, 16 or 32 bit)
	float3 A = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA0.x, TexA0.y), 0).xyz * 2 - 1;
	float3 B = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA1.x, TexA0.y), 0).xyz * 2 - 1;
	float3 C = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA0.x, TexA1.y), 0).xyz * 2 - 1;
	float3 D = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA1.x, TexA1.y), 0).xyz * 2 - 1;
	float3 E = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB0.x, TexB0.y), 0).xyz * 2 - 1;
	float3 F = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB1.x, TexB0.y), 0).xyz * 2 - 1;
	float3 G = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB0.x, TexB1.y), 0).xyz * 2 - 1;
	float3 H = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB1.x, TexB1.y), 0).xyz * 2 - 1;

	float a = dot(A, fv - float3(0, 0, 0));
	float b = dot(B, fv - float3(1, 0, 0));
	float c = dot(C, fv - float3(0, 1, 0));
	float d = dot(D, fv - float3(1, 1, 0));
	float e = dot(E, fv - float3(0, 0, 1));
	float f = dot(F, fv - float3(1, 0, 1));
	float g = dot(G, fv - float3(0, 1, 1));
	float h = dot(H, fv - float3(1, 1, 1));

	float3 Weights = PerlinRamp(frac(float4(fv, 0))).xyz;

	float i = lerp(lerp(a, b, Weights.x), lerp(c, d, Weights.x), Weights.y);
	float j = lerp(lerp(e, f, Weights.x), lerp(g, h, Weights.x), Weights.y);

	return lerp(i, j, Weights.z);
}

// @return random number in the range -1 .. 1
// scalar: 6 frac, 31 mul/mad, 15 add,
float FastGradientPerlinNoise3D_TEX(float3 xyz)
{
	// needs to be the same value when creating the PerlinNoise3D texture
	float Extent = 16;

	// last texel replicated and needed for filtering
	// scalar: 3 frac, 6 mul
	xyz = frac(xyz / (Extent - 1)) * (Extent - 1);

	// scalar: 3 frac
	float3 uvw = frac(xyz);
	// = floor(xyz);
	// scalar: 3 add
	float3 p0 = xyz - uvw;
//	float3 f = pow(uvw, 2) * 3.0f - pow(uvw, 3) * 2.0f;	// original perlin hermite (ok when used without bump mapping)
	// scalar: 2*3 add 5*3 mul
	float3 f = PerlinRamp(float4(uvw, 0)).xyz;	// new, better with continuous second derivative for bump mapping
	// scalar: 3 add
	float3 p = p0 + f;
	// scalar: 3 mad
	float4 NoiseSample = Texture3DSampleLevel(View.PerlinNoise3DTexture, View.PerlinNoise3DTextureSampler, p / Extent + 0.5f / Extent, 0);		// +0.5f to get rid of bilinear offset

	// reconstruct from 8bit (using mad with 2 constants and dot4 was same instruction count)
	// scalar: 4 mad, 3 mul, 3 add
	float3 n = NoiseSample.xyz * 255.0f / 127.0f - 1.0f;
	float d = NoiseSample.w * 255.f - 127;
	return dot(xyz, n) - d;
}


// 3D jitter offset within a voronoi noise cell
// @param pos - integer lattice corner
// @param Quality - quality level (1 to 4), selects how far the point may be jittered
// @return random offsets vector
float3 VoronoiCornerSample(int3 pos, int Quality)
{
	// random values in [-0.5, 0.5]
	float3 noise = float3(Rand3DPCG16(pos)) / 0xffff - 0.5;

	// quality level 1 or 2: searches a 2x2x2 neighborhood with points distributed on a sphere
	// scale factor to guarantee jittered points will be found within a 2x2x2 search
	if (Quality <= 2)
	{
		return normalize(noise) * 0.2588;
	}

	// quality level 3: searches a 3x3x3 neighborhood with points distributed on a sphere
	// scale factor to guarantee jittered points will be found within a 3x3x3 search
	if (Quality == 3)
	{
		return normalize(noise) * 0.3090;
	}

	// quality level 4: jitter to anywhere in the cell, needs 4x4x4 search
	return noise;
}

// 220 instruction Worley noise
float VoronoiNoise3D_ALU(float3 v, int Quality, bool bTiling, float RepeatSize)
{
	float3 fv = frac(v), fv2 = frac(v + 0.5);
	float3 iv = floor(v), iv2 = floor(v + 0.5);

	// with initial minimum distance = infinity (or at least bigger than 4), first min is optimized away
	float mindist = 100;
	float3 offset;

	// quality level 3: do 3x3x3 search centered on current location
	if (Quality == 3)
	{
		float3 mincell = floor(v - 1), maxcell = floor(v + 1);
		float3 cell;
		LOOP for (cell.x = mincell.x; cell.x <= maxcell.x; ++cell.x)
		{
			LOOP for (cell.y = mincell.y; cell.y <= maxcell.y; ++cell.y)
			{
				LOOP for (cell.z = mincell.z; cell.z <= maxcell.z; ++cell.z)
				{
					float3 p = v - cell - VoronoiCornerSample(NoiseTileWrap(cell, bTiling, RepeatSize), Quality);
					mindist = min(mindist, dot(p, p));
				}
			}
		}
	}

	// all others, do 2x2x2 search (unrolled)
	else
	{
		UNROLL for (offset.x = 0; offset.x <= 1; ++offset.x)
		{
			UNROLL for (offset.y = 0; offset.y <= 1; ++offset.y)
			{
				UNROLL for (offset.z = 0; offset.z <= 1; ++offset.z)
				{
					float3 p = fv - offset - VoronoiCornerSample(NoiseTileWrap(iv + offset, bTiling, RepeatSize), Quality);
					mindist = min(mindist, dot(p, p));

					// quality level 2, do extra set of points, offset by half a cell
					if (Quality == 2)
					{
						// 467 is just an offset to a different area in the random number field to avoid similar neighbor artifacts
						p = fv2 - offset - VoronoiCornerSample(NoiseTileWrap(iv2 + offset, bTiling, RepeatSize) + 467, Quality);
						mindist = min(mindist, dot(p, p));
					}
				}
			}
		}
	}

	// quality level 4: add extra sets of four cells in each direction
	if (Quality >= 4)
	{
		float3 p;
		UNROLL for (offset.x = -1; offset.x <= 2; offset.x += 3)
		{
			UNROLL for (offset.y = 0; offset.y <= 1; ++offset.y)
			{
				UNROLL for (offset.z = 0; offset.z <= 1; ++offset.z)
				{
					// along x axis
					p = fv - offset.xyz - VoronoiCornerSample(NoiseTileWrap(iv + offset.xyz, bTiling, RepeatSize), Quality);
					mindist = min(mindist, dot(p, p));

					// along y axis
					p = fv - offset.yzx - VoronoiCornerSample(NoiseTileWrap(iv + offset.yzx, bTiling, RepeatSize), Quality);
					mindist = min(mindist, dot(p, p));

					// along z axis
					p = fv - offset.zxy - VoronoiCornerSample(NoiseTileWrap(iv + offset.zxy, bTiling, RepeatSize), Quality);
					mindist = min(mindist, dot(p, p));
				}
			}
		}
	}

	// transform to -1 to 1 range as expected by later OutputMin to OutputMax transform
	return (sqrt(mindist) * 2 ) - 1;
}


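// Usage sketch (illustrative only, not part of the original file): VoronoiNoise3D_ALU()
// returns sqrt(mindist) * 2 - 1, so the distance to the nearest feature point can be
// recovered by undoing that remap. ExampleVoronoiDistance3D is a hypothetical name, and
// quality level 1 without tiling is an arbitrary choice for the example.
// @param v = 3D noise argument
// @return distance from v to the nearest jittered cell point found by the search
float ExampleVoronoiDistance3D(float3 v)
{
	// non-tiling lookup, so RepeatSize is not used
	return VoronoiNoise3D_ALU(v, 1, false, 1.0f) * 0.5f + 0.5f;
}
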
// -------- Simplex method (faster in higher dimensions because fewer samples are used, uses gradient noise for quality) ---------
// <Dimensions>D:<Normal>/<Simplex> 1D:2, 2D:4/3, 3D:8/4, 4D:16/5

// Computes weights and sample positions for simplex interpolation
// @return float3(a,b,c) Barycentric coordinate defined as Filtered = Tex(PosA) * a + Tex(PosB) * b + Tex(PosC) * c
float3 ComputeSimplexWeights2D(float2 OrthogonalPos, out float2 PosA, out float2 PosB, out float2 PosC)
{
    float2 OrthogonalPosFloor = floor(OrthogonalPos);
    PosA = OrthogonalPosFloor;
    PosB = PosA + float2(1, 1);

    float2 LocalPos = OrthogonalPos - OrthogonalPosFloor;

    PosC = PosA + ((LocalPos.x > LocalPos.y) ? float2(1,0) : float2(0,1));

    float b = min(LocalPos.x, LocalPos.y);
    float c = abs(LocalPos.y - LocalPos.x);
    float a = 1.0f - b - c;

    return float3(a, b, c);
}

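// Worked example for ComputeSimplexWeights2D (illustrative values, not from the original source):
// for LocalPos = (0.75, 0.25), x > y so PosC = PosA + (1,0), and
//   b = min(0.75, 0.25) = 0.25,  c = abs(0.25 - 0.75) = 0.5,  a = 1 - 0.25 - 0.5 = 0.25.
// The weights sum to 1 and reproduce the input position:
//   a * PosA + b * PosB + c * PosC = PosA + 0.25 * (1,1) + 0.5 * (1,0) = PosA + (0.75, 0.25).
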
// Computes weights and sample positions for simplex interpolation
// @return float4(a,b,c,d) Barycentric coordinate defined as Filtered = Tex(PosA) * a + Tex(PosB) * b + Tex(PosC) * c + Tex(PosD) * d
float4 ComputeSimplexWeights3D(float3 OrthogonalPos, out float3 PosA, out float3 PosB, out float3 PosC, out float3 PosD)
{
    float3 OrthogonalPosFloor = floor(OrthogonalPos);

    PosA = OrthogonalPosFloor;
    PosB = PosA + float3(1, 1, 1);

    OrthogonalPos -= OrthogonalPosFloor;

    float Largest = max(OrthogonalPos.x, max(OrthogonalPos.y, OrthogonalPos.z));
    float Smallest = min(OrthogonalPos.x, min(OrthogonalPos.y, OrthogonalPos.z));

    PosC = PosA + float3(Largest == OrthogonalPos.x, Largest == OrthogonalPos.y, Largest == OrthogonalPos.z);
    PosD = PosA + float3(Smallest != OrthogonalPos.x, Smallest != OrthogonalPos.y, Smallest != OrthogonalPos.z);

    float4 ret;

    float RG = OrthogonalPos.x - OrthogonalPos.y;
    float RB = OrthogonalPos.x - OrthogonalPos.z;
    float GB = OrthogonalPos.y - OrthogonalPos.z;

    ret.b =
          min(max(0, RG), max(0, RB))       // X
        + min(max(0, -RG), max(0, GB))      // Y
        + min(max(0, -RB), max(0, -GB));    // Z

    ret.a =
          min(max(0, -RG), max(0, -RB))     // X
        + min(max(0, RG), max(0, -GB))      // Y
        + min(max(0, RB), max(0, GB));      // Z

    ret.g = Smallest;
    ret.r = 1.0f - ret.g - ret.b - ret.a;

    return ret;
}

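// Worked example for ComputeSimplexWeights3D (illustrative values, not from the original source):
// for a local position of (0.7, 0.5, 0.2): Largest = 0.7 (x) and Smallest = 0.2 (z), so
//   PosC = PosA + (1,0,0) and PosD = PosA + (1,1,0).
// RG = 0.2, RB = 0.5, GB = 0.3 give
//   ret.b = 0.2 (weight c, for PosC) and ret.a = 0.3 (weight d, for PosD),
//   ret.g = 0.2 (weight b, for PosB) and ret.r = 1 - 0.2 - 0.2 - 0.3 = 0.3 (weight a, for PosA).
// Check: 0.2 * (1,1,1) + 0.2 * (1,0,0) + 0.3 * (1,1,0) = (0.7, 0.5, 0.2), which is the local position.
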
float2 GetPerlinNoiseGradientTextureAt(float2 v)
{
    float2 TexA = (v.xy + 0.5f) / 128.0f;

    // todo: storing random 2d unit vectors would be better
    float3 p = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, TexA, 0).xyz * 2 - 1;
    return normalize(p.xy + p.z * 0.33f);
}

float3 GetPerlinNoiseGradientTextureAt(float3 v)
{
    const float2 ZShear = float2(17.0f, 89.0f);

    float2 OffsetA = v.z * ZShear;
    float2 TexA = (v.xy + OffsetA + 0.5f) / 128.0f;

    return Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, TexA, 0).xyz * 2 - 1;
}

float2 SkewSimplex(float2 In)
{
    return In + dot(In, (sqrt(3.0f) - 1.0f) * 0.5f );
}
float2 UnSkewSimplex(float2 In)
{
    return In - dot(In, (3.0f - sqrt(3.0f)) / 6.0f );
}
float3 SkewSimplex(float3 In)
{
    return In + dot(In, 1.0 / 3.0f );
}
float3 UnSkewSimplex(float3 In)
{
    return In - dot(In, 1.0 / 6.0f );
}

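// Note on the skew constants above (a sketch of the arithmetic, not from the original source):
// dot(In, k) with a scalar k broadcasts k to every component, so it evaluates to k * (sum of the components of In).
// 2D: skew factor F = (sqrt(3) - 1) / 2 ~ 0.366, unskew factor G = (3 - sqrt(3)) / 6 ~ 0.211.
//     Skewing scales the component sum by 1 + 2F = sqrt(3), and G * sqrt(3) = F, so UnSkewSimplex undoes SkewSimplex.
// 3D: F = 1/3, G = 1/6; skewing scales the component sum by 1 + 3F = 2, and G * 2 = F.
// Example: SkewSimplex(float2(1, 0)) ~ (1.366, 0.366), and UnSkewSimplex of that result gives back (1, 0).
//
// Hypothetical debug helper, not part of the original file: returns the 2D round-trip error,
// which should be close to 0 up to floating point precision.
float SimplexSkewRoundTripError2D(float2 In)
{
    return length(UnSkewSimplex(SkewSimplex(In)) - In);
}
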
// filtered 2D gradient simplex noise (few texture lookups, high quality)
// @param EvalPos >0
// @return random number in the range -1 .. 1
float GradientSimplexNoise2D_TEX(float2 EvalPos)
{
    float2 OrthogonalPos = SkewSimplex(EvalPos);

    float2 PosA, PosB, PosC;
    float3 Weights = ComputeSimplexWeights2D(OrthogonalPos, PosA, PosB, PosC);

    // can be optimized to 1 or 2 texture lookups (4 or 8 channel encoded in 32 bit)
    float2 A = GetPerlinNoiseGradientTextureAt(PosA);
    float2 B = GetPerlinNoiseGradientTextureAt(PosB);
    float2 C = GetPerlinNoiseGradientTextureAt(PosC);

    PosA = UnSkewSimplex(PosA);
    PosB = UnSkewSimplex(PosB);
    PosC = UnSkewSimplex(PosC);

    float DistanceWeight;

    DistanceWeight = saturate(0.5f - length2(EvalPos - PosA)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float a = dot(A, EvalPos - PosA) * DistanceWeight;
    DistanceWeight = saturate(0.5f - length2(EvalPos - PosB)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float b = dot(B, EvalPos - PosB) * DistanceWeight;
    DistanceWeight = saturate(0.5f - length2(EvalPos - PosC)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float c = dot(C, EvalPos - PosC) * DistanceWeight;

    return 70 * (a + b + c);
}

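// Worked example of the falloff used in GradientSimplexNoise2D_TEX (illustrative values, not from the
// original source; assumes length2 returns the squared length):
// each corner's weight is saturate(0.5 - d2)^4, where d2 is the squared distance from EvalPos to the unskewed corner.
//   d2 = 0.0   ->  0.5^4 = 0.0625
//   d2 = 0.1   ->  0.4^4 = 0.0256
//   d2 >= 0.5  ->  0 (the corner no longer contributes)
// SimplexNoise3D_TEX uses the same form with 0.6 in place of 0.5.
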
// filtered 3D gradient simplex noise (few texture lookups, high quality)
// @param EvalPos >0
// @return random number in the range -1 .. 1
float SimplexNoise3D_TEX(float3 EvalPos)
{
    float3 OrthogonalPos = SkewSimplex(EvalPos);

    float3 PosA, PosB, PosC, PosD;
    float4 Weights = ComputeSimplexWeights3D(OrthogonalPos, PosA, PosB, PosC, PosD);

    // can be optimized to 1 or 2 texture lookups (4 or 8 channel encoded in 32 bit)
    float3 A = GetPerlinNoiseGradientTextureAt(PosA);
    float3 B = GetPerlinNoiseGradientTextureAt(PosB);
    float3 C = GetPerlinNoiseGradientTextureAt(PosC);
    float3 D = GetPerlinNoiseGradientTextureAt(PosD);

    PosA = UnSkewSimplex(PosA);
    PosB = UnSkewSimplex(PosB);
    PosC = UnSkewSimplex(PosC);
    PosD = UnSkewSimplex(PosD);

    float DistanceWeight;

    DistanceWeight = saturate(0.6f - length2(EvalPos - PosA)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float a = dot(A, EvalPos - PosA) * DistanceWeight;
    DistanceWeight = saturate(0.6f - length2(EvalPos - PosB)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float b = dot(B, EvalPos - PosB) * DistanceWeight;
    DistanceWeight = saturate(0.6f - length2(EvalPos - PosC)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float c = dot(C, EvalPos - PosC) * DistanceWeight;
    DistanceWeight = saturate(0.6f - length2(EvalPos - PosD)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
    float d = dot(D, EvalPos - PosD) * DistanceWeight;

    return 32 * (a + b + c + d);
}

float VolumeRaymarch(float3 posPixelWS, float3 posCameraWS)
{
    float ret = 0;
    int cnt = 60;

    LOOP for (int i = 0; i < cnt; ++i)
    {
        ret += saturate(FastGradientPerlinNoise3D_TEX(lerp(posPixelWS, posCameraWS, i / (float)cnt) * 0.01) - 0.2f);
    }

    return ret / cnt * (length(posPixelWS - posCameraWS) * 0.001f);
}

#endif