UnrealEngineUWP/Engine/Shaders/Random.usf
Marcus Wassmer 2826204161 Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411)
#lockdown Nick.Penwarden
#rb none

==========================
MAJOR FEATURES + CHANGES
==========================

Change 3244756 on 2017/01/03 by Marcus.Wassmer

	Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering)

Change 3248667 on 2017/01/05 by Olaf.Piesche

	Resaving default asset because of engine version issue; maybe unnecessary, but resaving niagara engine content to be sure

	#jira UE-40160

Change 3249324 on 2017/01/06 by Marcus.Wassmer

	Resave with an actual version to stop cook warning

Change 3249611 on 2017/01/06 by Marcus.Wassmer

	Just remove warning-causing niagara data for now.

Change 3308052 on 2017/02/16 by Rolando.Caloca

	DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute

Change 3308109 on 2017/02/16 by Rolando.Caloca

	DR - Upgrade glslang to 1.0.39.1

Change 3308111 on 2017/02/16 by Rolando.Caloca

	DR - Update Vulkan distribution to 1.0.39.1

Change 3308153 on 2017/02/16 by Rolando.Caloca

	DR - Updated glslang libs

Change 3308842 on 2017/02/17 by Rolando.Caloca

	DR - Fixed copy/paste

Change 3310007 on 2017/02/17 by Chris.Bunner

	Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971.

	#jira UE-37792

Change 3310154 on 2017/02/17 by Chris.Bunner

	Assert when attempting to add a custom material attribute already in the base attributes list.

Change 3310155 on 2017/02/17 by Chris.Bunner

	PR #3231: Validate material index before accessing (Contributed by projectgheist)

	#jira UE-41774, UE-41788

Change 3310162 on 2017/02/17 by Chris.Bunner

	PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist)

	#jira UE-41823, UE-41950

Change 3310176 on 2017/02/17 by Chris.Bunner

	Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini).
	Update to AGS 5.0.5.
	Partial code tidy up.

Change 3310187 on 2017/02/17 by Chris.Bunner

	Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression.

	#jira UE-41594

Change 3310215 on 2017/02/17 by Chris.Bunner

	Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available).
	More descriptive error for missing Cubemap UV input on TextureSample material node.

	#jira UE-33098

Change 3310838 on 2017/02/18 by Joe.Graf

	Moved some private functions to public for a licensee

	#CodeReview: matt.kuhlenschmidt
	#rb: n/a

Change 3311876 on 2017/02/20 by Rolando.Caloca

	DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB

	#jira UE-42014

Change 3314139 on 2017/02/21 by Rolando.Caloca

	DR - Minor cleanup pass
	- Remove FVulkanPendingState
	- Renamed some classes for clarity
	- Hoist pending UAVs for flush out to pending compute state

Change 3314642 on 2017/02/21 by Rolando.Caloca

	DR - Some more renaming

Change 3315431 on 2017/02/21 by Ben.Salem

	Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time.

	#tests Ran showdown demo several times

Change 3316710 on 2017/02/22 by Rolando.Caloca

	DR - hlslcc - Fix refract intrinsic

Change 3316718 on 2017/02/22 by Rolando.Caloca

	DR - hlslcc - Built libs to pick up change from 3316710 - refract fix

Change 3316820 on 2017/02/22 by Benjamin.Hyder

	updating Tm-TrigNodes map

Change 3317192 on 2017/02/22 by Benjamin.Hyder

	Updating QA-Decals map

Change 3317528 on 2017/02/22 by Benjamin.Hyder

	Updating QA-Decals map

Change 3317639 on 2017/02/22 by Benjamin.Hyder

	Updating Decal on Complex Mesh example in QA-Decals

Change 3317764 on 2017/02/22 by Benjamin.Hyder

	Final updates to QA-Decals

Change 3318319 on 2017/02/22 by Rolando.Caloca

	DR - minor reorg/rename

Change 3318379 on 2017/02/22 by Rolando.Caloca

	DR - more cleanup

Change 3321181 on 2017/02/24 by Rolando.Caloca

	DR - Fix GL bug

Change 3321247 on 2017/02/24 by Rolando.Caloca

	DR - Fix misc bugs

Change 3321898 on 2017/02/24 by Chris.Bunner

	Only issue clear TLV dispatch if required.

	#jira UERNDR-193

Change 3321904 on 2017/02/24 by Chris.Bunner

	Added comment for potential future optimization.

Change 3322013 on 2017/02/24 by Uriel.Doyon

	Fixed separate translucency being affected by Gaussian DOF
	#jira UE-40489

Change 3322517 on 2017/02/24 by Uriel.Doyon

	Fixed issue with InvestigateTexture command removing budget limit.
	Fixed StreamingBounds show flag not working. It now shows the streaming bound for the currently selected textures.
	#jira UE-40485

Change 3323470 on 2017/02/27 by Chad.Garyet

	Removing DDC job from dev-rendering

Change 3323479 on 2017/02/27 by Chad.Garyet

	Removing RDU agent type

Change 3323519 on 2017/02/27 by Chad.Garyet

	removing NCL/LHR/SEA agent types to clean up space

Change 3323639 on 2017/02/27 by Benjamin.Hyder

	More updates to QA-Decals

Change 3324207 on 2017/02/27 by Uriel.Doyon

	Fixed typo ScaleTexturesByGlobalMyBias ->  ScaleTexturesByGlobalMipBias
	Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef

Change 3324396 on 2017/02/27 by Uriel.Doyon

	Fixed an issue with the Streaming Bounds show flag interfering with the static level data initialization
	#jira UE-40485

Change 3325227 on 2017/02/28 by Chris.Bunner

	Fix-up AMD AGS libs.

Change 3325566 on 2017/02/28 by Uriel.Doyon

	Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num

Change 3326009 on 2017/02/28 by Uriel.Doyon

	Better fix for 3325566, as the previous fix would ignore the material instance overrides.

Change 3327058 on 2017/03/01 by Benjamin.Hyder

	Preparing TM_Shadermodels map for automation

Change 3328222 on 2017/03/01 by Chris.Bunner

	Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals.

	#jira UE-42449, UE-42446

Change 3329848 on 2017/03/02 by Uriel.Doyon

	Added some extra logs to help track UE-42168

Change 3329977 on 2017/03/02 by Rolando.Caloca

	DR - Fix bad clear value

Change 3330008 on 2017/03/02 by Benjamin.Hyder

	More preparations for QA-Decals automation

Change 3330754 on 2017/03/02 by Daniel.Wright

	Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything

Change 3331451 on 2017/03/03 by Marc.Olano

	Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal

Change 3331839 on 2017/03/03 by Rolando.Caloca

	DR - hlslcc - add missing file to project

Change 3332247 on 2017/03/03 by Rolando.Caloca

	DR - Fix for integrated intel
	PR #3305
	#jira UE-42393

Change 3332259 on 2017/03/03 by Rolando.Caloca

	DR - Fix bad index into pixel formats
	PR #3237
	#jira UE-41855

Change 3332305 on 2017/03/03 by Rolando.Caloca

	DR - OpenGL SRV for index buffers
	PR #3271
	#jira UE-32618

Change 3332313 on 2017/03/03 by Rolando.Caloca

	DR - Fix for integrated intel (properly)
	PR #3305
	#jira UE-42393

Change 3332317 on 2017/03/03 by Rolando.Caloca

	DR - OpenGL SRV for index buffers (properly)
	PR #3271
	#jira UE-32618

Change 3332368 on 2017/03/03 by Rolando.Caloca

	DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan

Change 3333690 on 2017/03/06 by Daniel.Wright

	[Copy] Changing movable skylight properties no longer affects static draw lists

Change 3333693 on 2017/03/06 by Daniel.Wright

	[Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations

Change 3333705 on 2017/03/06 by Daniel.Wright

	[Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating point with a project setting.
	* 8 bit uses half memory but introduces error for thin surfaces or large meshes.

Change 3333721 on 2017/03/06 by David.Hill

	DecalProxy:
	Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread.  This avoids  pointer chasing to the UDecalComponent (game thread component).

Change 3333772 on 2017/03/06 by Daniel.Wright

	[Copy] Scene motion blur data is only updated for the main renderer frames.  Fixes scene captures and planar reflections breaking object motion blur.

Change 3333790 on 2017/03/06 by Daniel.Wright

	[Copy] Mesh distance field generation uses Embree, for a 2.5x speedup
	* Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging

Change 3333822 on 2017/03/06 by Daniel.Wright

	[Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp
	* Moved FMeshUtilities to its own header so the 8k line MeshUtilities.cpp file can be further split up

Change 3333827 on 2017/03/06 by Daniel.Wright

	[Copy] Range compress 8bit distance fields - gets one extra bit of precision on average

Change 3333828 on 2017/03/06 by Daniel.Wright

	[Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low

Change 3333831 on 2017/03/06 by Daniel.Wright

	Non-editor compile fix

Change 3333836 on 2017/03/06 by Daniel.Wright

	[Copy] Workaround for global distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes.  They now use a 2d tiling mode which avoids the bloat, saving 96Mb.

Change 3333843 on 2017/03/06 by Daniel.Wright

	[Copy] Added OcclusionExponent to skylight component
	* Useful for brightening up indoors without losing contact shadows as MinOcclusion does

Change 3333845 on 2017/03/06 by Daniel.Wright

	[Copy] Capsule shadow BP functions

Change 3333850 on 2017/03/06 by Daniel.Wright

	[Copy] Added OcclusionCombineMode to skylight component

Change 3333854 on 2017/03/06 by Daniel.Wright

	[Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu

Change 3333857 on 2017/03/06 by Daniel.Wright

	[Copy] Clear light attenuation for local lights with a quad covering their screen extents
	* Clearing the entire light attenuation buffer costs .1ms on PS4.  This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms.
	* Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4

Change 3333860 on 2017/03/06 by Daniel.Wright

	[Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory

Change 3333861 on 2017/03/06 by Daniel.Wright

	[Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas

Change 3333869 on 2017/03/06 by Daniel.Wright

	[Copy] Volumetric Fog using a volume texture mapped to the camera frustum
	* Volumetric fog can be enabled on an Exponential Height Fog component with additional controls
	* Lights have a VolumetricScatteringIntensity
	* New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale
	* Lighting features supported:
	   * Directional light with CSM and a light function
	   * Point / spot lights without shadows / light functions / IES profiles
	   * Skylight with occlusion from distance fields
	* Analytical height fog covers the view range past where the volumetric fog ends
	* Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability
	* Translucency integrates properly into volumetric fog
	* Height fog StartDistance is not supported by volumetric fog and should be set to 0.

Change 3333894 on 2017/03/06 by Daniel.Wright

	[Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering

Change 3333902 on 2017/03/06 by Daniel.Wright

	[Copy] Better handling of volumetric fog enabled with distance of 0

Change 3333903 on 2017/03/06 by Daniel.Wright

	[Copy] Fixed volumetric fog trying to render light functions for a point light

Change 3333908 on 2017/03/06 by Daniel.Wright

	[Copy] Volumetric materials
	* Added new material domain Volume, which can output Scattering, Absorption and Emissive.  All properties are in world space densities.
	* Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius
	* Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice.
	* Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely

Change 3334134 on 2017/03/06 by Daniel.Wright

	[Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree.

Change 3334420 on 2017/03/06 by Daniel.Wright

	Fixed RTDF shadows

Change 3335467 on 2017/03/07 by Benjamin.Hyder

	Initial submission of QA-Decals map to EngineTest

Change 3335556 on 2017/03/07 by Daniel.Wright

	Changed mesh distance field default format back to R16f

Change 3338020 on 2017/03/08 by Daniel.Wright

	Disable volumetric fog in vertex shaders for feature levels which don't support it

Change 3339394 on 2017/03/09 by Chris.Bunner

	Correctly handle material texture translation error edge case.

	#jira UE-42579, UE-42670

Change 3339992 on 2017/03/09 by Daniel.Wright

	Only compile volumetric fog shaders on supporting platforms

Change 3341858 on 2017/03/10 by Arne.Schober

	Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering)

	#RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite

Change 3342004 on 2017/03/10 by Arne.Schober

	Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering)
	Fix unity build

	#RB Marcus.Wassmer

Change 3343307 on 2017/03/13 by Marcus.Wassmer

	Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc)

Change 3343732 on 2017/03/13 by Rolando.Caloca

	DR - Vulkan compute pipeline & refactor

Change 3344846 on 2017/03/14 by Rolando.Caloca

	DR - Android compile fixes

Change 3344883 on 2017/03/14 by Rolando.Caloca

	DR - Add missing stencil load/store to PSO initializer

Change 3344985 on 2017/03/14 by Rolando.Caloca

	DR - Made load/store actions uint8

Change 3345141 on 2017/03/14 by Rolando.Caloca

	DR - vk - Rework render pass hash

Change 3345304 on 2017/03/14 by Benjamin.Hyder

	Updating TM-Distancefields map to include TemplateFloor mesh

Change 3345387 on 2017/03/14 by Rolando.Caloca

	DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating

Change 3345388 on 2017/03/14 by Rolando.Caloca

	DR - Do not stall when creating shaders on Vulkan

Change 3345722 on 2017/03/14 by Chris.Bunner

	PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC)

	#jira UE-42752

Change 3345723 on 2017/03/14 by Chris.Bunner

	Reduce log verbosity causing spamming during  landscape editing.

	#jira UE-42714

Change 3345725 on 2017/03/14 by Chris.Bunner

	[Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes.

Change 3345726 on 2017/03/14 by Chris.Bunner

	Typo fixes.

Change 3345732 on 2017/03/14 by Rolando.Caloca

	DR - Decouple vertex declaration off BSS

Change 3345746 on 2017/03/14 by Chris.Bunner

	Added sign() intrinsic material graph node and delisted material function workaround.

Change 3346042 on 2017/03/14 by Chris.Bunner

	Implement missing size query interface for FRenderTargetResources.

	#jira UE-41672

Change 3346387 on 2017/03/14 by Daniel.Wright

	[Copy] Added VolumetricScatteringIntensity to particle lights

Change 3346389 on 2017/03/14 by Daniel.Wright

	[Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs
	Disable volumetric fog when the fog show flag is disabled

Change 3346392 on 2017/03/14 by Daniel.Wright

	[Copy] Fixed skylight being much too bright on volumetric fog

Change 3346406 on 2017/03/14 by Daniel.Wright

	[Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution.
	* Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite

Change 3346412 on 2017/03/14 by Daniel.Wright

	[Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb

Change 3346414 on 2017/03/14 by Daniel.Wright

	[Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb

Change 3346415 on 2017/03/14 by Daniel.Wright

	[Copy] Missing file from cl 3338451

Change 3346421 on 2017/03/14 by Daniel.Wright

	[Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled
	* Volumetric fog converts NaNs to black now so they don't spread

Change 3346422 on 2017/03/14 by Daniel.Wright

	[Copy] Fixed NaN in volumetric fog with low density values

Change 3346423 on 2017/03/14 by Daniel.Wright

	[Copy] Changed default VolumetricFogScatteringDistribution to .2

Change 3346430 on 2017/03/14 by Daniel.Wright

	[Copy] New translucent material option to compute fog per pixel instead of the default per vertex

Change 3346432 on 2017/03/14 by Daniel.Wright

	[Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass
	Fixed lifetimes of temporary Volumetric Fog render targets

Change 3346526 on 2017/03/14 by Daniel.Wright

	[Copy] Volumetric Fog supports point and spot light shadows
	* These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map)
	* Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0'
	* Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing

Change 3347053 on 2017/03/15 by Rolando.Caloca

	DR - android compile fix

Change 3347384 on 2017/03/15 by Rolando.Caloca

	DR - Fix merge issue

Change 3347643 on 2017/03/15 by Marcus.Wassmer

	Fix some bugs with the 'disable stationary skylight for the project' feature.
	Fixes lighting in Persona on Paragon.

Change 3347979 on 2017/03/15 by Rolando.Caloca

	DR - Allow to automatically apply cached rendertargets to PSO initializer

Change 3348024 on 2017/03/15 by Rolando.Caloca

	DR - Remove NullPS on Vulkan to avoid deadlock

Change 3348303 on 2017/03/15 by Rolando.Caloca

	DR - Fix for debugging SCW with material SRT

Change 3348357 on 2017/03/15 by Marcus.Wassmer

	Fix stencildither and a stencilref bug that was probably breaking decals sometimes.

Change 3348549 on 2017/03/15 by Marcus.Wassmer

	Hopefully fix static analysis for potential nullptr access.

Change 3348614 on 2017/03/15 by Marcus.Wassmer

	Duplicate some switch changes to fix crash on launch.

Change 3349369 on 2017/03/16 by Gil.Gribb

	Fixed botched merge

Change 3349947 on 2017/03/16 by Rolando.Caloca

	DR - Fix for mismatched primitive type

Change 3349956 on 2017/03/16 by Benjamin.Hyder

	initial updates to TM-DistanceFields map

Change 3350151 on 2017/03/16 by Rolando.Caloca

	DR - Fix UT compile issue

Change 3350155 on 2017/03/16 by Rolando.Caloca

	DR - Catch mismatched primitive type on PSOs on D3D11

Change 3350192 on 2017/03/16 by Daniel.Wright

	Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor

Change 3350736 on 2017/03/16 by Daniel.Wright

	Fixed formatting from merge

Change 3350881 on 2017/03/16 by Rolando.Caloca

	DR - Fix texture arrays as UAVs on Metal

Change 3350927 on 2017/03/16 by Rolando.Caloca

	DR - Fix warning

Change 3350935 on 2017/03/16 by Daniel.Wright

	Fix for materials with non-Surface domains being skipped in mesh passes

Change 3351583 on 2017/03/17 by Marcus.Wassmer

	Fix clang platforms

Change 3351917 on 2017/03/17 by Marcus.Wassmer

	Fix linux compile

Change 3351973 on 2017/03/17 by Marcus.Wassmer

	Fix mismatched rendertargetformat

Change 3352038 on 2017/03/17 by Daniel.Wright

	Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing

Change 3352110 on 2017/03/17 by Marcus.Wassmer

	Fix missing RT PSO apply

Change 3352695 on 2017/03/17 by Arne.Schober

	DR - Remove PSO Rendertarget check in DX12 Resolve with Shader.
	#RB Rolando.Caloca

Change 3352960 on 2017/03/17 by Arne.Schober

	DR - Fix some things that slipped through the PSO merge
	#RB none

Change 3353150 on 2017/03/18 by Rolando.Caloca

	DR - compile fix

Change 3353205 on 2017/03/18 by Arne.Schober

	DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode

	#RB none

Change 3353207 on 2017/03/18 by Arne.Schober

	DR - Fix Confusion

	#RB none

Change 3355183 on 2017/03/20 by Nick.Bullard

	Fixed up Content organization for Decals automation tests in EngineTest

Change 3355627 on 2017/03/20 by Arne.Schober

	DR - [UE-43094] - removed ensure in composition graph as control of the clear color cannot be guaranteed.

Change 3356342 on 2017/03/21 by Marcus.Wassmer

	Fix clang errors

Change 3356591 on 2017/03/21 by Arne.Schober

	DR - Fix ensure message
	#RB none

Change 3356873 on 2017/03/21 by Arne.Schober

	DR - Fix comparison of undefined values in RendertargetApply Check

Change 3357261 on 2017/03/21 by Marcus.Wassmer

	Fix LinuxEditor compile

Change 3357294 on 2017/03/21 by Marcus.Wassmer

	Add missing SSE functions

Change 3357351 on 2017/03/21 by Frank.Fella

	Fix win32 and linux compiler errors

Change 3357370 on 2017/03/21 by Arne.Schober

	DR - disable ensure in test builds

	#RB Marcus.Wassmer

[CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00

862 lines
32 KiB
Plaintext

// Copyright 1998-2017 Epic Games, Inc. All Rights Reserved.
/*=============================================================================
Random.usf: A pseudo-random number generator.
=============================================================================*/
#ifndef __Random_usf__
#define __Random_usf__
// @param xy should be an integer position (e.g. pixel position on the screen), repeats each 128x128 pixels
// similar to a texture lookup but is only ALU
// ~13 ALU operations (3 frac, 6 *, 4 mad)
float PseudoRandom(float2 xy)
{
	float2 pos = frac(xy / 128.0f) * 128.0f + float2(-64.340622f, -72.465622f);
	// found by experimentation
	return frac(dot(pos.xyx * pos.xyy, float3(20.390625f, 60.703125f, 2.4281209f)));
}
// high frequency dither pattern appearing almost random without banding steps
//note: from "NEXT GENERATION POST PROCESSING IN CALL OF DUTY: ADVANCED WARFARE"
// http://advances.realtimerendering.com/s2014/index.html
// Epic extended by FrameId
// ~7 ALU operations (2 frac, 3 mad, 2 *)
// @return 0..1
float InterleavedGradientNoise( float2 uv, float FrameId )
{
	// magic values are found by experimentation
	uv += FrameId * (float2(47, 17) * 0.695f);
	const float3 magic = float3( 0.06711056f, 0.00583715f, 52.9829189f );
	return frac(magic.z * frac(dot(uv, magic.xy)));
}
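// Usage sketch, not part of the original Random.usf: one common use of
// InterleavedGradientNoise is to dither a color before quantizing it, which breaks up banding.
// The function name is hypothetical. Color is assumed to be in 0..1 and PixelPos is assumed
// to be the pixel coordinate (e.g. SV_Position.xy).
float3 DitherTo8BitExample(float3 Color, float2 PixelPos, float FrameId)
{
	// per-pixel offset in 0..1
	float Noise = InterleavedGradientNoise(PixelPos, FrameId);
	// adding the offset before rounding down lets the average over neighboring pixels
	// approximate the unquantized value
	return floor(Color * 255.0f + Noise) / 255.0f;
}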
// [0, 1[
// ~10 ALU operations (2 frac, 5 *, 3 mad)
float RandFast( uint2 PixelPos, float Magic = 3571.0 )
{
	float2 Random2 = ( 1.0 / 4320.0 ) * PixelPos + float2( 0.25, 0.0 );
	float Random = frac( dot( Random2 * Random2, Magic ) );
	Random = frac( Random * Random * (2 * Magic) );
	return Random;
}
// This is the largest prime < 2^12 so s*s will fit in a 24-bit floating point mantissa
#define BBS_PRIME24 4093
// Blum-Blum-Shub-inspired pseudo random number generator
// http://www.umbc.edu/~olano/papers/mNoise.pdf
// real BBS uses ((s*s) mod M) with bignums and M as the product of two huge Blum primes
// instead, we use a single prime M just small enough not to overflow
// note that the above paper used 61, which fits in a half, but is unusably bad
// @param seed = integer-valued floating point seed
// @return random number in range [0,1)
// ~8 ALU operations (5 *, 3 frac)
float RandBBSfloat(float seed)
{
	float s = frac(seed / BBS_PRIME24);
	s = frac(s * s * BBS_PRIME24);
	s = frac(s * s * BBS_PRIME24);
	return s;
}
// 3D random number generator inspired by PCGs (permuted congruential generator)
// Using a **simple** Feistel cipher in place of the usual xor shift permutation step
// @param p = 3D integer coordinate
// @return three elements w/ 16 random bits each (0-0xffff).
// ~8 ALU operations for result.x (7 mad, 1 >>)
// ~10 ALU operations for result.xy (8 mad, 2 >>)
// ~12 ALU operations for result.xyz (9 mad, 3 >>)
uint3 Rand3DPCG16(int3 p)
{
	// taking a signed int then reinterpreting as unsigned gives good behavior for negatives
	uint3 v = uint3(p);
	// Linear congruential step. These LCG constants are from Numerical Recipes
	// For additional #'s, PCG would do multiple LCG steps and scramble each on output
	// So v here is the RNG state
	v = v * 1664525u + 1013904223u;
	// PCG uses xorshift for the final shuffle, but it is expensive (and cheap
	// versions of xorshift have visible artifacts). Instead, use simple MAD Feistel steps
	//
	// Feistel ciphers divide the state into separate parts (usually by bits)
	// then apply a series of permutation steps one part at a time. The permutations
	// use a reversible operation (usually ^) to combine the part being updated with the result of
	// a permutation function on the other parts and the key.
	//
	// In this case, I'm using v.x, v.y and v.z as the parts, using + instead of ^ for
	// the combination function, and just multiplying the other two parts (no key) for
	// the permutation function.
	//
	// That gives a simple mad per round.
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;
	// only top 16 bits are well shuffled
	return v >> 16u;
}
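// Usage sketch, not part of the original Random.usf: Rand3DPCG16 returns 16 random bits per
// component, so scaling by 1/2^16 maps each component to [0,1).
// The function name is hypothetical.
float3 Rand3DPCG16ToFloatExample(int3 p)
{
	// the largest input is 0xffff, so the result stays strictly below 1
	return float3(Rand3DPCG16(p)) * (1.0f / 65536.0f);
}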
// 3D random number generator inspired by PCGs (permuted congruential generator)
// Using a **simple** Feistel cipher in place of the usual xor shift permutation step
// @param p = 3D integer coordinate
// @return three elements w/ 32 random bits each (0-0xffffffff).
// ~13 ALU operations for result.x (10 mad, 3 >>)
// ~14 ALU operations for result.xy (11 mad, 3 >>)
// ~15 ALU operations for result.xyz (12 mad, 3 >>)
uint3 Rand3DPCG32(int3 p)
{
	// taking a signed int then reinterpreting as unsigned gives good behavior for negatives
	uint3 v = uint3(p);
	// Linear congruential step.
	v = v * 1664525u + 1013904223u;
	// swapping low and high bits makes all 32 bits pretty good
	v = v * (1u << 16u) + (v >> 16u);
	// final shuffle
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;
	v.x += v.y*v.z;
	v.y += v.z*v.x;
	v.z += v.x*v.y;
	return v;
}
/**
* Find good arbitrary axis vectors to represent U and V axes of a plane,
* given just the normal. Ported from UnMath.h
*/
void FindBestAxisVectors(float3 In, out float3 Axis1, out float3 Axis2 )
{
	const float3 N = abs(In);
	// Find best basis vectors.
	if( N.z > N.x && N.z > N.y )
	{
		Axis1 = float3(1, 0, 0);
	}
	else
	{
		Axis1 = float3(0, 0, 1);
	}
	Axis1 = normalize(Axis1 - In * dot(Axis1, In));
	Axis2 = cross(Axis1, In);
}
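// Usage sketch, not part of the original Random.usf: the two axes plus the normal form an
// orthonormal basis, which can carry a vector from plane-local space into the space of the normal.
// The function name is hypothetical. Normal is assumed to be normalized, because
// FindBestAxisVectors projects onto it without renormalizing it.
float3 PlaneLocalToWorldExample(float3 LocalVec, float3 Normal)
{
	float3 AxisU;
	float3 AxisV;
	FindBestAxisVectors(Normal, AxisU, AxisV);
	// x and y run along the plane, z runs along the normal
	return LocalVec.x * AxisU + LocalVec.y * AxisV + LocalVec.z * Normal;
}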
// References for noise:
//
// Improved Perlin noise
// http://mrl.nyu.edu/~perlin/noise/
// http://http.developer.nvidia.com/GPUGems/gpugems_ch05.html
// Modified Noise for Evaluation on Graphics Hardware
// http://www.csee.umbc.edu/~olano/papers/mNoise.pdf
// Perlin Noise
// http://mrl.nyu.edu/~perlin/doc/oscar.html
// Fast Gradient Noise
// http://prettyprocs.wordpress.com/2012/10/20/fast-perlin-noise
// -------- ALU based method ---------
/*
* Pseudo random number generator, based on "TEA, a tiny Encryption Algorithm"
* http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.45.281&rep=rep1&type=pdf
* http://www.umbc.edu/~olano/papers/index.html#GPUTEA
* @param v - old seed (full 32bit range)
* @param IterationCount - >=1, bigger numbers cost more performance but improve quality
* @return new seed
*/
uint2 ScrambleTEA(uint2 v, uint IterationCount = 3)
{
	// Start with some random data (numbers can be arbitrary but those have been used by others and seem to work well)
	uint k[4] ={ 0xA341316Cu , 0xC8013EA4u , 0xAD90777Du , 0x7E95761Eu };
	uint y = v[0];
	uint z = v[1];
	uint sum = 0;
	UNROLL for(uint i = 0; i < IterationCount; ++i)
	{
		sum += 0x9e3779b9;
		y += ((z << 4u) + k[0]) ^ (z + sum) ^ ((z >> 5u) + k[1]);
		z += ((y << 4u) + k[2]) ^ (y + sum) ^ ((y >> 5u) + k[3]);
	}
	return uint2(y, z);
}
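// Usage sketch, not part of the original Random.usf: derive two floats in [0,1) per pixel and
// frame from ScrambleTEA. The function name and the way FrameIndex is mixed into the seed are
// assumptions made for illustration only.
float2 RandTEAFloat2Example(uint2 PixelPos, uint FrameIndex)
{
	// fold the frame index into the second seed word so that each frame gets a different pattern
	uint2 Scrambled = ScrambleTEA(uint2(PixelPos.x, PixelPos.y ^ (FrameIndex << 16u)));
	// keep the top 24 bits: they convert to float exactly, so the result stays strictly below 1
	return float2(Scrambled >> 8u) * (1.0f / 16777216.0f);
}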
// Wraps noise for tiling texture creation
// @param v = unwrapped texture parameter
// @param bTiling = true to tile, false to not tile
// @param RepeatSize = number of units before repeating
// @return either original or wrapped coord
float3 NoiseTileWrap(float3 v, bool bTiling, float RepeatSize)
{
	return bTiling ? (frac(v / RepeatSize) * RepeatSize) : v;
}
// Evaluate polynomial to get smooth transitions for Perlin noise
// only needed by Perlin functions in this file
// scalar(per component): 2 add, 5 mul
float4 PerlinRamp(float4 t)
{
	return t * t * t * (t * (t * 6 - 15) + 10);
}
// Analytical derivative of the PerlinRamp polynomial
// only needed by Perlin functions in this file
// scalar(per component): 2 add, 4 mul
float4 PerlinRampDerivative(float4 t)
{
	return t * t * (t * (t * 30 - 60) + 30);
}
#define MGradientMask int3(0x8000, 0x4000, 0x2000)
#define MGradientScale float3(1. / 0x4000, 1. / 0x2000, 1. / 0x1000)
// Modified noise gradient term
// @param seed - random seed for integer lattice position
// @param offset - [-1,1] offset of evaluation point from lattice point
// @return gradient direction (xyz) and contribution (w) from this lattice point
float4 MGradient(int seed, float3 offset)
{
	uint rand = Rand3DPCG16(int3(seed,0,0)).x;
	float3 direction = float3(rand.xxx & MGradientMask) * MGradientScale - 1;
	return float4(direction, dot(direction, offset));
}
// compute Perlin and related noise corner seed values
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = true to return seed values for a repeating noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @param seed000-seed111 = hash function seeds for the eight corners
// @return fractional part of v
float3 NoiseSeeds(float3 v, bool bTiling, float RepeatSize,
	out float seed000, out float seed001, out float seed010, out float seed011,
	out float seed100, out float seed101, out float seed110, out float seed111)
{
	float3 fv = frac(v);
	float3 iv = floor(v);
	const float3 primes = float3(19, 47, 101);
	if (bTiling)
	{	// can't algebraically combine with primes
		seed000 = dot(primes, NoiseTileWrap(iv, true, RepeatSize));
		seed100 = dot(primes, NoiseTileWrap(iv + float3(1, 0, 0), true, RepeatSize));
		seed010 = dot(primes, NoiseTileWrap(iv + float3(0, 1, 0), true, RepeatSize));
		seed110 = dot(primes, NoiseTileWrap(iv + float3(1, 1, 0), true, RepeatSize));
		seed001 = dot(primes, NoiseTileWrap(iv + float3(0, 0, 1), true, RepeatSize));
		seed101 = dot(primes, NoiseTileWrap(iv + float3(1, 0, 1), true, RepeatSize));
		seed011 = dot(primes, NoiseTileWrap(iv + float3(0, 1, 1), true, RepeatSize));
		seed111 = dot(primes, NoiseTileWrap(iv + float3(1, 1, 1), true, RepeatSize));
	}
	else
	{	// get to combine offsets with multiplication by primes in this case
		seed000 = dot(iv, primes);
		seed100 = seed000 + primes.x;
		seed010 = seed000 + primes.y;
		seed110 = seed100 + primes.y;
		seed001 = seed000 + primes.z;
		seed101 = seed100 + primes.z;
		seed011 = seed010 + primes.z;
		seed111 = seed110 + primes.z;
	}
	return fv;
}
// Perlin-style "Modified Noise"
// http://www.umbc.edu/~olano/papers/index.html#mNoise
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @return random number in the range -1 .. 1
float GradientNoise3D_ALU(float3 v, bool bTiling, float RepeatSize)
{
float seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111;
float3 fv = NoiseSeeds(v, bTiling, RepeatSize, seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111);
float rand000 = MGradient(int(seed000), fv - float3(0, 0, 0)).w;
float rand100 = MGradient(int(seed100), fv - float3(1, 0, 0)).w;
float rand010 = MGradient(int(seed010), fv - float3(0, 1, 0)).w;
float rand110 = MGradient(int(seed110), fv - float3(1, 1, 0)).w;
float rand001 = MGradient(int(seed001), fv - float3(0, 0, 1)).w;
float rand101 = MGradient(int(seed101), fv - float3(1, 0, 1)).w;
float rand011 = MGradient(int(seed011), fv - float3(0, 1, 1)).w;
float rand111 = MGradient(int(seed111), fv - float3(1, 1, 1)).w;
float3 Weights = PerlinRamp(float4(fv, 0)).xyz;
float i = lerp(lerp(rand000, rand100, Weights.x), lerp(rand010, rand110, Weights.x), Weights.y);
float j = lerp(lerp(rand001, rand101, Weights.x), lerp(rand011, rand111, Weights.x), Weights.y);
return lerp(i, j, Weights.z).x;
}
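// Usage sketch (illustrative only, not part of the engine): one way to sum octaves of
// GradientNoise3D_ALU into fractal noise. ExampleFbm, the octave count and the
// per-octave scale factors are assumptions made for this example.
//   float ExampleFbm(float3 p)
//   {
//       float sum = 0;
//       float amp = 0.5;
//       for (int o = 0; o < 4; ++o)
//       {
//           // RepeatSize is ignored by NoiseSeeds when bTiling is false
//           sum += amp * GradientNoise3D_ALU(p, false, 0);
//           p *= 2;     // double the frequency for the next octave
//           amp *= 0.5; // halve the amplitude for the next octave
//       }
//       // amplitudes sum to 0.9375, so the result stays within -1 .. 1 if each octave does
//       return sum;
//   }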
// Coordinates for corners of a Simplex tetrahedron
// Based on McEwan et al., Efficient computation of noise in GLSL, JGT 2011
// @param v = 3D noise argument
// @return 4 corner locations
float4x3 SimplexCorners(float3 v)
{
// find base corner by skewing to tetrahedral space and back
float3 tet = floor(v + v.x/3 + v.y/3 + v.z/3);
float3 base = tet - tet.x/6 - tet.y/6 - tet.z/6;
float3 f = v - base;
// Find offsets to other corners (McEwan did this in tetrahedral space,
// but since skew is along x=y=z axis, this works in Euclidean space too.)
float3 g = step(f.yzx, f.xyz), h = 1 - g.zxy;
float3 a1 = min(g, h) - 1. / 6., a2 = max(g, h) - 1. / 3.;
// four corners
return float4x3(base, base + a1, base + a2, base + 0.5);
}
// Improved smoothing function for simplex noise
// @param f = fractional distance to four tetrahedral corners
// @return weight for each corner
float4 SimplexSmooth(float4x3 f)
{
const float scale = 1024. / 375.; // scale factor to make noise -1..1
float4 d = float4(dot(f[0], f[0]), dot(f[1], f[1]), dot(f[2], f[2]), dot(f[3], f[3]));
float4 s = saturate(2 * d);
return (1 * scale + s*(-3 * scale + s*(3 * scale - s*scale)));
}
// Derivative of simplex noise smoothing function
// @param f = fractional distance to four tetrahedral corners
// @return derivative of smoothing function for each corner by x, y and z
float3x4 SimplexDSmooth(float4x3 f)
{
const float scale = 1024. / 375.; // scale factor to make noise -1..1
float4 d = float4(dot(f[0], f[0]), dot(f[1], f[1]), dot(f[2], f[2]), dot(f[3], f[3]));
float4 s = saturate(2 * d);
s = -12 * scale + s*(24 * scale - s * 12 * scale);
return float3x4(
s * float4(f[0][0], f[1][0], f[2][0], f[3][0]),
s * float4(f[0][1], f[1][1], f[2][1], f[3][1]),
s * float4(f[0][2], f[1][2], f[2][2], f[3][2]));
}
// Simplex noise and its Jacobian derivative
// @param v = 3D noise argument
// @param bTiling = whether to repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension, must be a multiple of 3
// @return float3x3 Jacobian in J[*].xyz, vector noise in J[*].w
// J[0].w, J[1].w, J[2].w is a Perlin-style simplex noise with vector output, e.g. (Nx, Ny, Nz)
// J[i].x is X derivative of the i'th component of the noise so J[2].x is dNz/dx
// You can use this to compute the noise, gradient, curl, or divergence:
// float3x4 J = JacobianSimplex_ALU(...);
// float3 VNoise = float3(J[0].w, J[1].w, J[2].w); // 3D noise
// float3 Grad = J[0].xyz; // gradient of J[0].w
// float3 Curl = float3(J[2][1]-J[1][2], J[0][2]-J[2][0], J[1][0]-J[0][1]);
// float Div = J[0][0]+J[1][1]+J[2][2];
// All of these are confirmed to compile out all unneeded terms.
// So Grad of X doesn't compute Y or Z components, and VNoise doesn't do any of the derivative computation.
float3x4 JacobianSimplex_ALU(float3 v, bool bTiling, float RepeatSize)
{
// corners of tetrahedron
float4x3 T = SimplexCorners(v);
uint3 rand;
float4x3 gvec[3], fv;
float3x4 grad;
// processing of tetrahedral vertices, unrolled
// to compute gradient at each corner
fv[0] = v - T[0];
rand = Rand3DPCG16(int3(floor(NoiseTileWrap(6 * T[0] + 0.5, bTiling, RepeatSize))));
gvec[0][0] = float3(rand.xxx & MGradientMask) * MGradientScale - 1;
gvec[1][0] = float3(rand.yyy & MGradientMask) * MGradientScale - 1;
gvec[2][0] = float3(rand.zzz & MGradientMask) * MGradientScale - 1;
grad[0][0] = dot(gvec[0][0], fv[0]);
grad[1][0] = dot(gvec[1][0], fv[0]);
grad[2][0] = dot(gvec[2][0], fv[0]);
fv[1] = v - T[1];
rand = Rand3DPCG16(int3(floor(NoiseTileWrap(6 * T[1] + 0.5, bTiling, RepeatSize))));
gvec[0][1] = float3(rand.xxx & MGradientMask) * MGradientScale - 1;
gvec[1][1] = float3(rand.yyy & MGradientMask) * MGradientScale - 1;
gvec[2][1] = float3(rand.zzz & MGradientMask) * MGradientScale - 1;
grad[0][1] = dot(gvec[0][1], fv[1]);
grad[1][1] = dot(gvec[1][1], fv[1]);
grad[2][1] = dot(gvec[2][1], fv[1]);
fv[2] = v - T[2];
rand = Rand3DPCG16(int3(floor(NoiseTileWrap(6 * T[2] + 0.5, bTiling, RepeatSize))));
gvec[0][2] = float3(rand.xxx & MGradientMask) * MGradientScale - 1;
gvec[1][2] = float3(rand.yyy & MGradientMask) * MGradientScale - 1;
gvec[2][2] = float3(rand.zzz & MGradientMask) * MGradientScale - 1;
grad[0][2] = dot(gvec[0][2], fv[2]);
grad[1][2] = dot(gvec[1][2], fv[2]);
grad[2][2] = dot(gvec[2][2], fv[2]);
fv[3] = v - T[3];
rand = Rand3DPCG16(int3(floor(NoiseTileWrap(6 * T[3] + 0.5, bTiling, RepeatSize))));
gvec[0][3] = float3(rand.xxx & MGradientMask) * MGradientScale - 1;
gvec[1][3] = float3(rand.yyy & MGradientMask) * MGradientScale - 1;
gvec[2][3] = float3(rand.zzz & MGradientMask) * MGradientScale - 1;
grad[0][3] = dot(gvec[0][3], fv[3]);
grad[1][3] = dot(gvec[1][3], fv[3]);
grad[2][3] = dot(gvec[2][3], fv[3]);
// blend gradients
float4 sv = SimplexSmooth(fv);
float3x4 ds = SimplexDSmooth(fv);
float3x4 jacobian;
jacobian[0] = float4(mul(sv, gvec[0]) + mul(ds, grad[0]), dot(sv, grad[0]));
jacobian[1] = float4(mul(sv, gvec[1]) + mul(ds, grad[1]), dot(sv, grad[1]));
jacobian[2] = float4(mul(sv, gvec[2]) + mul(ds, grad[2]), dot(sv, grad[2]));
return jacobian;
}
// 3D value noise - used to be incorrectly called Perlin noise
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @return random number in the range -1 .. 1
float ValueNoise3D_ALU(float3 v, bool bTiling, float RepeatSize)
{
float seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111;
float3 fv = NoiseSeeds(v, bTiling, RepeatSize, seed000, seed001, seed010, seed011, seed100, seed101, seed110, seed111);
float rand000 = RandBBSfloat(seed000) * 2 - 1;
float rand100 = RandBBSfloat(seed100) * 2 - 1;
float rand010 = RandBBSfloat(seed010) * 2 - 1;
float rand110 = RandBBSfloat(seed110) * 2 - 1;
float rand001 = RandBBSfloat(seed001) * 2 - 1;
float rand101 = RandBBSfloat(seed101) * 2 - 1;
float rand011 = RandBBSfloat(seed011) * 2 - 1;
float rand111 = RandBBSfloat(seed111) * 2 - 1;
float3 Weights = PerlinRamp(float4(fv, 0)).xyz;
float i = lerp(lerp(rand000, rand100, Weights.x), lerp(rand010, rand110, Weights.x), Weights.y);
float j = lerp(lerp(rand001, rand101, Weights.x), lerp(rand011, rand111, Weights.x), Weights.y);
return lerp(i, j, Weights.z).x;
}
// -------- TEX based methods ---------
// filtered 3D noise, can be optimized
// @param v = 3D noise argument, use float3(x,y,0) for 2D or float3(x,0,0) for 1D
// @param bTiling = repeat noise pattern
// @param RepeatSize = integer units before tiling in each dimension
// @return random number in the range -1 .. 1
float GradientNoise3D_TEX(float3 v, bool bTiling, float RepeatSize)
{
bTiling = true;
float3 fv = frac(v);
float3 iv0 = NoiseTileWrap(floor(v), bTiling, RepeatSize);
float3 iv1 = NoiseTileWrap(iv0 + 1, bTiling, RepeatSize);
const int2 ZShear = int2(17, 89);
float2 OffsetA = iv0.z * ZShear;
float2 OffsetB = OffsetA + ZShear; // non-tiling, use relative offset
if (bTiling) // tiling, have to compute from wrapped coordinates
{
OffsetB = iv1.z * ZShear;
}
// Texture size scale factor
float ts = 1 / 128.0f;
// texture coordinates for iv0.xy, as offset for both z slices
float2 TexA0 = (iv0.xy + OffsetA + 0.5f) * ts;
float2 TexB0 = (iv0.xy + OffsetB + 0.5f) * ts;
// texture coordinates for iv1.xy, as offset for both z slices
float2 TexA1 = TexA0 + ts; // for non-tiling, can compute relative to existing coordinates
float2 TexB1 = TexB0 + ts;
if (bTiling) // for tiling, need to compute from wrapped coordinates
{
TexA1 = (iv1.xy + OffsetA + 0.5f) * ts;
TexB1 = (iv1.xy + OffsetB + 0.5f) * ts;
}
// can be optimized to 1 or 2 texture lookups (4 or 8 channel encoded in 8, 16 or 32 bit)
float3 A = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA0.x, TexA0.y), 0).xyz * 2 - 1;
float3 B = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA1.x, TexA0.y), 0).xyz * 2 - 1;
float3 C = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA0.x, TexA1.y), 0).xyz * 2 - 1;
float3 D = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexA1.x, TexA1.y), 0).xyz * 2 - 1;
float3 E = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB0.x, TexB0.y), 0).xyz * 2 - 1;
float3 F = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB1.x, TexB0.y), 0).xyz * 2 - 1;
float3 G = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB0.x, TexB1.y), 0).xyz * 2 - 1;
float3 H = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, float2(TexB1.x, TexB1.y), 0).xyz * 2 - 1;
float a = dot(A, fv - float3(0, 0, 0));
float b = dot(B, fv - float3(1, 0, 0));
float c = dot(C, fv - float3(0, 1, 0));
float d = dot(D, fv - float3(1, 1, 0));
float e = dot(E, fv - float3(0, 0, 1));
float f = dot(F, fv - float3(1, 0, 1));
float g = dot(G, fv - float3(0, 1, 1));
float h = dot(H, fv - float3(1, 1, 1));
float3 Weights = PerlinRamp(frac(float4(fv, 0))).xyz;
float i = lerp(lerp(a, b, Weights.x), lerp(c, d, Weights.x), Weights.y);
float j = lerp(lerp(e, f, Weights.x), lerp(g, h, Weights.x), Weights.y);
return lerp(i, j, Weights.z);
}
// @return random number in the range -1 .. 1
// scalar: 6 frac, 31 mul/mad, 15 add,
float FastGradientPerlinNoise3D_TEX(float3 xyz)
{
// needs to be the same value when creating the PerlinNoise3D texture
float Extent = 16;
// last texel replicated and needed for filtering
// scalar: 3 frac, 6 mul
xyz = frac(xyz / (Extent - 1)) * (Extent - 1);
// scalar: 3 frac
float3 uvw = frac(xyz);
// = floor(xyz);
// scalar: 3 add
float3 p0 = xyz - uvw;
// float3 f = pow(uvw, 2) * 3.0f - pow(uvw, 3) * 2.0f; // original perlin hermite (ok when used without bump mapping)
// scalar: 2*3 add 5*3 mul
float3 f = PerlinRamp(float4(uvw, 0)).xyz; // new, better with continuous second derivative for bump mapping
// scalar: 3 add
float3 p = p0 + f;
// scalar: 3 mad
float4 NoiseSample = Texture3DSampleLevel(View.PerlinNoise3DTexture, View.PerlinNoise3DTextureSampler, p / Extent + 0.5f / Extent, 0); // +0.5f to get rid of bilinear offset
// reconstruct from 8bit (using mad with 2 constants and dot4 was same instruction count)
// scalar: 4 mad, 3 mul, 3 add
float3 n = NoiseSample.xyz * 255.0f / 127.0f - 1.0f;
float d = NoiseSample.w * 255.f - 127;
return dot(xyz, n) - d;
}
// 3D jitter offset within a voronoi noise cell
// @param pos - integer lattice corner
// @return random offsets vector
float3 VoronoiCornerSample(float3 pos, int Quality)
{
// random values in [-0.5, 0.5]
float3 noise = float3(Rand3DPCG16(int3(pos))) / 0xffff - 0.5;
// quality level 1 or 2: searches a 2x2x2 neighborhood with points distributed on a sphere
// scale factor to guarantee jittered points will be found within a 2x2x2 search
if (Quality <= 2)
{
return normalize(noise) * 0.2588;
}
// quality level 3: searches a 3x3x3 neighborhood with points distributed on a sphere
// scale factor to guarantee jittered points will be found within a 3x3x3 search
if (Quality == 3)
{
return normalize(noise) * 0.3090;
}
// quality level 4: jitter to anywhere in the cell, needs 4x4x4 search
return noise;
}
// compare previous best with a new candidate
// not producing point locations makes it easier for compiler to eliminate calculations when they're not needed
// @param minval = location and distance of best candidate seed point before the new one
// @param candidate = candidate seed point
// @param offset = 3D offset to new candidate seed point
// @param bDistanceOnly = if true, only the returned .w (squared distance) is set, otherwise the returned .w is the squared distance and .xyz is the position
// @return position (if bDistanceOnly is false) and squared distance to closest seed point so far
float4 VoronoiCompare(float4 minval, float3 candidate, float3 offset, bool bDistanceOnly)
{
if (bDistanceOnly)
{
return float4(0, 0, 0, min(minval.w, dot(offset, offset)));
}
else
{
float newdist = dot(offset, offset);
return newdist > minval.w ? minval : float4(candidate, newdist);
}
}
// 220 instruction Worley noise
float4 VoronoiNoise3D_ALU(float3 v, int Quality, bool bTiling, float RepeatSize, bool bDistanceOnly)
{
float3 fv = frac(v), fv2 = frac(v + 0.5);
float3 iv = floor(v), iv2 = floor(v + 0.5);
// with initial minimum distance = infinity (or at least bigger than 4), first min is optimized away
float4 mindist = float4(0,0,0,100);
float3 p, offset;
// quality level 3: do a 3x3x3 search
if (Quality == 3)
{
UNROLL for (offset.x = -1; offset.x <= 1; ++offset.x)
{
UNROLL for (offset.y = -1; offset.y <= 1; ++offset.y)
{
UNROLL for (offset.z = -1; offset.z <= 1; ++offset.z)
{
p = offset + VoronoiCornerSample(NoiseTileWrap(iv2 + offset, bTiling, RepeatSize), Quality);
mindist = VoronoiCompare(mindist, iv2 + p, fv2 - p, bDistanceOnly);
}
}
}
}
// everybody else searches a base 2x2x2 neighborhood
else
{
UNROLL for (offset.x = 0; offset.x <= 1; ++offset.x)
{
UNROLL for (offset.y = 0; offset.y <= 1; ++offset.y)
{
UNROLL for (offset.z = 0; offset.z <= 1; ++offset.z)
{
p = offset + VoronoiCornerSample(NoiseTileWrap(iv + offset, bTiling, RepeatSize), Quality);
mindist = VoronoiCompare(mindist, iv + p, fv - p, bDistanceOnly);
// quality level 2, do extra set of points, offset by half a cell
if (Quality == 2)
{
// 467 is just an offset to a different area in the random number field to avoid similar neighbor artifacts
p = offset + VoronoiCornerSample(NoiseTileWrap(iv2 + offset, bTiling, RepeatSize) + 467, Quality);
mindist = VoronoiCompare(mindist, iv2 + p, fv2 - p, bDistanceOnly);
}
}
}
}
}
// quality level 4: add extra sets of four cells in each direction
if (Quality >= 4)
{
UNROLL for (offset.x = -1; offset.x <= 2; offset.x += 3)
{
UNROLL for (offset.y = 0; offset.y <= 1; ++offset.y)
{
UNROLL for (offset.z = 0; offset.z <= 1; ++offset.z)
{
// along x axis
p = offset.xyz + VoronoiCornerSample(NoiseTileWrap(iv + offset.xyz, bTiling, RepeatSize), Quality);
mindist = VoronoiCompare(mindist, iv + p, fv - p, bDistanceOnly);
// along y axis
p = offset.yzx + VoronoiCornerSample(NoiseTileWrap(iv + offset.yzx, bTiling, RepeatSize), Quality);
mindist = VoronoiCompare(mindist, iv + p, fv - p, bDistanceOnly);
// along z axis
p = offset.zxy + VoronoiCornerSample(NoiseTileWrap(iv + offset.zxy, bTiling, RepeatSize), Quality);
mindist = VoronoiCompare(mindist, iv + p, fv - p, bDistanceOnly);
}
}
}
}
// transform squared distance to real distance
return float4(mindist.xyz, sqrt(mindist.w));
}
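// Usage sketch (illustrative only; Pos, Dist and Cell are names assumed for this example):
//   // distance to the nearest seed point only, 2x2x2 search, .xyz is returned as 0
//   float Dist = VoronoiNoise3D_ALU(Pos, 1, false, 0, true).w;
//   // nearest seed point position in .xyz and its distance in .w, 3x3x3 search
//   float4 Cell = VoronoiNoise3D_ALU(Pos, 3, false, 0, false);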
// -------- Simplex method (faster in higher dimensions because fewer samples are used, uses gradient noise for quality) ---------
// <Dimensions>D:<Normal>/<Simplex> 1D:2, 2D:4/3, 3D:8/4, 4D:16/5
// Compute weights and sample positions for simplex interpolation
// @return float3(a,b,c) Barycentric coordinate defined as Filtered = Tex(PosA) * a + Tex(PosB) * b + Tex(PosC) * c
float3 ComputeSimplexWeights2D(float2 OrthogonalPos, out float2 PosA, out float2 PosB, out float2 PosC)
{
float2 OrthogonalPosFloor = floor(OrthogonalPos);
PosA = OrthogonalPosFloor;
PosB = PosA + float2(1, 1);
float2 LocalPos = OrthogonalPos - OrthogonalPosFloor;
PosC = PosA + ((LocalPos.x > LocalPos.y) ? float2(1,0) : float2(0,1));
float b = min(LocalPos.x, LocalPos.y);
float c = abs(LocalPos.y - LocalPos.x);
float a = 1.0f - b - c;
return float3(a, b, c);
}
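// Worked example (illustrative values chosen for this note):
//   OrthogonalPos = float2(0.75, 0.25) gives PosA = (0, 0), PosB = (1, 1), LocalPos = (0.75, 0.25)
//   LocalPos.x > LocalPos.y, so PosC = (1, 0)
//   b = min(0.75, 0.25) = 0.25, c = abs(0.25 - 0.75) = 0.5, a = 1 - 0.25 - 0.5 = 0.25
//   check: PosA * a + PosB * b + PosC * c = (0.25 + 0.5, 0.25) = (0.75, 0.25) = OrthogonalPos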
// Compute weights and sample positions for simplex interpolation
// @return float4(a,b,c,d) Barycentric coordinate defined as Filtered = Tex(PosA) * a + Tex(PosB) * b + Tex(PosC) * c + Tex(PosD) * d
float4 ComputeSimplexWeights3D(float3 OrthogonalPos, out float3 PosA, out float3 PosB, out float3 PosC, out float3 PosD)
{
float3 OrthogonalPosFloor = floor(OrthogonalPos);
PosA = OrthogonalPosFloor;
PosB = PosA + float3(1, 1, 1);
OrthogonalPos -= OrthogonalPosFloor;
float Largest = max(OrthogonalPos.x, max(OrthogonalPos.y, OrthogonalPos.z));
float Smallest = min(OrthogonalPos.x, min(OrthogonalPos.y, OrthogonalPos.z));
PosC = PosA + float3(Largest == OrthogonalPos.x, Largest == OrthogonalPos.y, Largest == OrthogonalPos.z);
PosD = PosA + float3(Smallest != OrthogonalPos.x, Smallest != OrthogonalPos.y, Smallest != OrthogonalPos.z);
float4 ret;
float RG = OrthogonalPos.x - OrthogonalPos.y;
float RB = OrthogonalPos.x - OrthogonalPos.z;
float GB = OrthogonalPos.y - OrthogonalPos.z;
ret.b =
min(max(0, RG), max(0, RB)) // X
+ min(max(0, -RG), max(0, GB)) // Y
+ min(max(0, -RB), max(0, -GB)); // Z
ret.a =
min(max(0, -RG), max(0, -RB)) // X
+ min(max(0, RG), max(0, -GB)) // Y
+ min(max(0, RB), max(0, GB)); // Z
ret.g = Smallest;
ret.r = 1.0f - ret.g - ret.b - ret.a;
return ret;
}
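// 2D gradient from PerlinNoiseGradientTexture (addressed with the same 1/128 texel scale as GradientNoise3D_TEX)
// @param v = integer lattice position
// @return normalized 2D gradient vector
float2 GetPerlinNoiseGradientTextureAt(float2 v)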
{
float2 TexA = (v.xy + 0.5f) / 128.0f;
// todo: storing random 2d unit vectors would be better
float3 p = Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, TexA, 0).xyz * 2 - 1;
return normalize(p.xy + p.z * 0.33f);
}
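// 3D gradient from PerlinNoiseGradientTexture, z slices are offset across the 2D texture by ZShear
// @param v = integer lattice position
// @return 3D gradient vector with components in -1 .. 1 (not normalized)
float3 GetPerlinNoiseGradientTextureAt(float3 v)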
{
const float2 ZShear = int2(17, 89);
float2 OffsetA = v.z * ZShear;
float2 TexA = (v.xy + OffsetA + 0.5f) / 128.0f;
return Texture2DSampleLevel(View.PerlinNoiseGradientTexture, View.PerlinNoiseGradientTextureSampler, TexA , 0).xyz * 2 - 1;
}
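// skew 2D Euclidean coordinates onto the simplex grid
float2 SkewSimplex(float2 In)
{
return In + dot(In, (sqrt(3.0f) - 1.0f) * 0.5f );
}
// inverse of the 2D SkewSimplex
float2 UnSkewSimplex(float2 In)
{
return In - dot(In, (3.0f - sqrt(3.0f)) / 6.0f );
}
// skew 3D Euclidean coordinates onto the simplex grid
float3 SkewSimplex(float3 In)
{
return In + dot(In, 1.0 / 3.0f );
}
// inverse of the 3D SkewSimplex
float3 UnSkewSimplex(float3 In)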
{
return In - dot(In, 1.0 / 6.0f );
}
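// Worked example for the 3D pair (illustrative values chosen for this note):
//   dot(In, scalar) expands the scalar to a vector, so it evaluates to (In.x + In.y + In.z) * scalar
//   SkewSimplex(float3(1, 1, 1))   = 1 + 3 * (1/3) = (2, 2, 2)
//   UnSkewSimplex(float3(2, 2, 2)) = 2 - 6 * (1/6) = (1, 1, 1)
//   in general the skewed components sum to twice the original sum s, so unskewing subtracts 2s/6 = s/3 and recovers In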
// filtered 2D gradient simplex noise (few texture lookups, high quality)
// @param EvalPos = 2D noise argument, must be > 0
// @return random number in the range -1 .. 1
float GradientSimplexNoise2D_TEX(float2 EvalPos)
{
float2 OrthogonalPos = SkewSimplex(EvalPos);
float2 PosA, PosB, PosC, PosD;
float3 Weights = ComputeSimplexWeights2D(OrthogonalPos, PosA, PosB, PosC);
// can be optimized to 1 or 2 texture lookups (4 or 8 channel encoded in 32 bit)
float2 A = GetPerlinNoiseGradientTextureAt(PosA);
float2 B = GetPerlinNoiseGradientTextureAt(PosB);
float2 C = GetPerlinNoiseGradientTextureAt(PosC);
PosA = UnSkewSimplex(PosA);
PosB = UnSkewSimplex(PosB);
PosC = UnSkewSimplex(PosC);
float DistanceWeight;
DistanceWeight = saturate(0.5f - length2(EvalPos - PosA)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float a = dot(A, EvalPos - PosA) * DistanceWeight;
DistanceWeight = saturate(0.5f - length2(EvalPos - PosB)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float b = dot(B, EvalPos - PosB) * DistanceWeight;
DistanceWeight = saturate(0.5f - length2(EvalPos - PosC)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float c = dot(C, EvalPos - PosC) * DistanceWeight;
return 70 * (a + b + c);
}
// filtered 3D gradient simplex noise (few texture lookups, high quality)
// @param EvalPos = 3D noise argument, must be > 0
// @return random number in the range -1 .. 1
float SimplexNoise3D_TEX(float3 EvalPos)
{
float3 OrthogonalPos = SkewSimplex(EvalPos);
float3 PosA, PosB, PosC, PosD;
float4 Weights = ComputeSimplexWeights3D(OrthogonalPos, PosA, PosB, PosC, PosD);
// can be optimized to 1 or 2 texture lookups (4 or 8 channel encoded in 32 bit)
float3 A = GetPerlinNoiseGradientTextureAt(PosA);
float3 B = GetPerlinNoiseGradientTextureAt(PosB);
float3 C = GetPerlinNoiseGradientTextureAt(PosC);
float3 D = GetPerlinNoiseGradientTextureAt(PosD);
PosA = UnSkewSimplex(PosA);
PosB = UnSkewSimplex(PosB);
PosC = UnSkewSimplex(PosC);
PosD = UnSkewSimplex(PosD);
float DistanceWeight;
DistanceWeight = saturate(0.6f - length2(EvalPos - PosA)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float a = dot(A, EvalPos - PosA) * DistanceWeight;
DistanceWeight = saturate(0.6f - length2(EvalPos - PosB)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float b = dot(B, EvalPos - PosB) * DistanceWeight;
DistanceWeight = saturate(0.6f - length2(EvalPos - PosC)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float c = dot(C, EvalPos - PosC) * DistanceWeight;
DistanceWeight = saturate(0.6f - length2(EvalPos - PosD)); DistanceWeight *= DistanceWeight; DistanceWeight *= DistanceWeight;
float d = dot(D, EvalPos - PosD) * DistanceWeight;
return 32 * (a + b + c + d);
}
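// Fixed-step accumulation of FastGradientPerlinNoise3D_TEX along a ray
// @param posPixelWS = start of the ray (pixel position)
// @param posCameraWS = end of the ray (camera position)
// @return noise above the 0.2 threshold averaged over 60 samples from the pixel towards the camera, scaled by the ray length
float VolumeRaymarch(float3 posPixelWS, float3 posCameraWS)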
{
float ret = 0;
int cnt = 60;
LOOP for(int i=0; i < cnt; ++i)
{
ret += saturate(FastGradientPerlinNoise3D_TEX(lerp(posPixelWS, posCameraWS, i/(float)cnt) * 0.01) - 0.2f);
}
return ret / cnt * (length(posPixelWS - posCameraWS) * 0.001f );
}
#endif